They say Artificial Intelligence will enhance the life and work of many of us in the future. But can we actually believe this statement? In the last few months, the neural network DALL-E 2 has been both praised and dreaded. This release from the research lab OpenAI, exciting for those who love the progress of Artificial Intelligence, has a name that combines Pixar's WALL-E with Dalí's, and it is an Artificial Intelligence that generates images from text.
You type a description and the AI, which has been trained on hundreds of millions of captioned pictures found on the Internet, comes up with its own version of that image. Algorithms of this kind learn from massive amounts of data and, in this case, DALL-E 2 does not just recognise an object: it is capable of relating two objects to each other, so it can predict what the image inspired by the caption you entered should look like and then create a new image, even of something unexpected.
Artificial Intelligence is making great progress: images created by AI a few years ago weren't as defined as the ones created by DALL-E 2. Its predecessor, the first DALL-E, was built on a version of OpenAI's GPT-3 model (which can generate texts close to what a human could write), while DALL-E 2 builds on OpenAI's CLIP model, which links images to their text descriptions.
Examples on the DALL-E 2 site show us what the neural network can do when replying to prompts such as "An astronaut riding a horse in a photorealistic style", "A bowl of soup as a planet in the universe as a 1960s poster" and "Teddy bears working on new AI research underwater with 1990s technology", and also introduce the variations it can produce on a theme, generating more images based on the same description.
DALL-E 2 creates scarily clear images and excellent photorealistic content: those who have tried it so far (please note that at the moment DALL-E 2 is not open to all, but you can join a waiting list to try it) state that it has a reasonable understanding of objects and, as regards styles or clothes, it is more precise if you specify whether you want to see the clothes on a runway or on a mannequin.
What's impressive is the way DALL-E 2 can reproduce specific artistic styles, from charcoal or pencil sketches to paintings in the style of various artists and artistic movements. It gets confused, though, when there are several people in an image and, when it creates variations, it finds it challenging to keep the characters' faces consistent.
As happens with innovative technologies, DALL-E 2 has its pros and cons. The main advantage is that DALL-E 2 can generate digital images of all sorts in just a few seconds, based on exactly what we are looking for.
So imagine you had to come up with an idea for a poster or a moodboard, the prototype of an object or an image for a special card or invitation: you could create it in a very short time and also consider variations in colours, scenes and settings.
As regards violent content, hate and adult images, or the risk of creating deepfake images, luckily, researchers limited the ability of DALL-E 2 to generate such images and used advanced techniques to prevent photorealistic generations of real individuals' faces, including those of public figures, without their consent. DALL-E 2's content policy does not allow the AI to generate images with violent, adult or political content, and the AI doesn't generate images if the filters identify text prompts or image uploads that may violate the policies.
The disadvantages of such a system are mainly connected with ethical and copyright issues. First of all, the results produced by an AI depend on what it has been fed during its learning process. There have been cases of "racist algorithms" built on datasets skewed towards young white men, and DALL-E 2 seems to have the same issue and at times shows its limits. When it comes to visualising a business person or a lawyer, it conjures up a man, while it offers images of women when the text prompt asks for a nurse.
But there may be more intricate issues when it comes to trademarked logos and copyright. You may indeed ask DALL-E 2 to produce an image showing "Mickey Mouse drinking from a can of Coca-Cola", and the AI may produce a Mickey Mouse lookalike with a can of Coke bearing its trademarked logo, an image that may be considered a double infringement, of copyright and of trademark.
Which leads to another question about authorship: who is the author of the final image? You write the text and the AI bases the image on your text, so you definitely have the main idea, but it is the AI that executes it. So who owns the copyright in that case? You, the AI or both? Besides, if you ask for the image of a cat doing the dishes and the AI creates a cat based on somebody's cat, would the owner of that pet also own the copyright on the final image?
There's more to ask in connection with fashion as well: will we soon have a DALL-E 2 generated collection? After all, we have already seen start-ups launching lines of dresses created with the help of machine learning algorithms.
While we're not sure about a proper collection, DALL-E 2 has already been used for a cover of Cosmopolitan magazine. The image was created in a collaboration between Cosmopolitan editors, members of OpenAI and digital artist Karen X. Cheng, who spent around 100 hours experimenting with DALL-E 2.
The team first tested the AI with a series of prompts going from "1960s fashionable woman close up, encyclopedia-style illustration" to "a strong female president astronaut warrior walking on the planet Mars, digital art synthwave."
Refining the prompt was the trick in this case and the final prompt - "a wide-angle shot from below of a female astronaut with an athletic feminine body walking with swagger toward camera on Mars in an infinite universe, synthwave digital art" - generated the perfect image that was then used for the cover.
The cover actually looks rather good, which leads us to another key question - will DALL-E 2 put artists, graphic designers and creative minds, maybe even fashion designers, out of work?
Obviously DALL-E 2 will become more precise as time passes, but the problem with text-to-image AI is the same you get with AI producing translations. On a literal level, translations done by an Artificial Intelligence are fine, but the system doesn't manage to grasp certain nuances and expressions, so there will always be a funny sentence or an incomprehensible line that has to be rechecked by a human translator. In much the same way, you may argue, there will always be a need for graphic designers to apply their skills to the image produced by the AI to alter or refine it.
For the time being, though, graphic designers shouldn't worry too much. As stated above, DALL-E 2 is not available to all, but you can join a waiting list to try it. In the meantime, only DALL-E mini, a lower-grade imitation built independently of OpenAI, is openly accessible.
So far, DALL-E mini has proved to be a reliable source of memes: the principle is the same, but the AI in this case often generates distorted and disturbing images of faces, or mangled body parts that seem to protrude from the wrong places.
So type the name of a politician you loathe doing something ridiculous such as riding a whale in a clown costume and you may get a lot of distorted faces of the type you may see in a horror film or in your nightmares. Social media are literally bursting with images of Donald Trump and Boris Johnson doing bizarre things, totally meme-able material. This is generating an aesthetic of its own and you can be sure that at some point Demna Gvasalia at Balenciaga will do a T-shirt with some of these horrific DALL-E mini memes.
That said, if you try and refine your prompt, you will realise that even the mini version of DALL-E is making progress: ask it to visualise a Prada-branded teapot and it will do so, also coming up with plausible models of the object bearing a distorted triangular logo in Prada's style; ask it to come up with Memphis Milano designs or Memphis Milano shoes and, well, it will attempt to create objects in the style, colour schemes and patterns of the Memphis Milano design collective.
So what's the final verdict: will AI help or destroy creativity? I guess it may well help, if used by clever human beings, but, for the time being, let's hope Balenciaga spares us a DALL-E mini T-shirt with some distorted or blurred obnoxious politicians.