Meet Dall·E 2 - OpenAI's latest innovation in the AI industry

Artificial intelligence is the ability of a computer (or a robot controlled by a computer) to do tasks that formerly required human intelligence. Though AI is still in its early stages, it has already competed head-to-head with humans.

An AI called “Deep Blue” beat Garry Kasparov at chess, and an AI built by “Playform AI” completed Beethoven’s unfinished Tenth Symphony. Likewise, the AI we are going to look at in this article can create artist-grade pictures from nothing more than a short natural-language prompt.

Meet Dall·E 2: 

This is an AI research project built by “OpenAI”, a company co-founded by Elon Musk. They describe it as

 “A new AI system that can create realistic images and art from a text description in natural language”.

In January 2021 the company introduced the first version of Dall·E, a name that appears to blend Salvador Dalí (a Spanish artist) and WALL·E (the robot from the Disney/Pixar film). Like Dall·E 2, this system could produce images from a short text prompt. Yet the images it produced were often blurry and fell short of what was intended, so in 2022 the company previewed an improved version in the form of Dall·E 2.

Beyond turning natural language into images, it can do much more.


Reason for its existence:

The main reason for developing software like this is to see the world through the eyes of a computer: to see what the computer thinks about something, and to check whether the world we perceive is the same as the world that computers perceive.

OpenAI’s statement:

“Our hope is that DALL·E 2 will empower people to express themselves creatively. DALL·E 2 also helps us understand how advanced AI systems see and understand our world, which is critical to our mission of creating AI that benefits humanity.”


Abilities of Dall·E 2: 


Create: Just like its predecessor, it can take a text prompt and turn it into an image. It does this with the help of two techniques called CLIP and diffusion.

Example input: “An astronaut riding a horse as a pencil drawing” 

CLIP: This uses the machine’s understanding of different concepts to create a picture based on that understanding. In the given example, it knows what an astronaut is, it knows the meaning of riding, it knows what a horse is, and it knows the look of a pencil drawing. It then combines all of this understanding to describe the picture we asked for. CLIP can only provide the gist of the input for the computer’s understanding; to turn that gist into an actual image, you need diffusion.
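To make the idea of a "gist" a little more concrete, here is a minimal sketch of how a prompt and some pictures could be compared once both are turned into lists of numbers. It is only an illustration: the three-number vectors below are made up by hand (loosely meaning "astronaut", "horse" and "pencil style"), whereas the real system learns much longer vectors from data, and the names cosine_similarity and best_match are my own, not part of any OpenAI tool.

```python
import math


def cosine_similarity(a, b):
    """Return how closely two vectors point in the same direction (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy "gists" (assumption: real embeddings are learned, not written by hand).
# The three numbers loosely stand for (astronaut, horse, pencil style).
prompt_vector = [0.9, 0.8, 0.7]
candidate_images = {
    "astronaut on a horse, pencil sketch": [0.8, 0.9, 0.8],
    "astronaut on a horse, oil painting": [0.9, 0.8, 0.0],
    "cat on a sofa, photo": [0.0, 0.1, 0.0],
}


def best_match(prompt, candidates):
    """Pick the candidate whose vector is most similar to the prompt vector."""
    return max(candidates, key=lambda name: cosine_similarity(prompt, candidates[name]))


print(best_match(prompt_vector, candidate_images))
```

The pencil sketch scores highest because it agrees with the prompt on all three concepts, while the oil painting misses the drawing style and the cat photo misses almost everything.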

Diffusion: Here the canvas starts out as “image noise”, which looks like random dots. The model then rearranges the dots step by step, guided by the understanding it obtained from CLIP, until they form the final image.
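The "noise first, picture later" idea can be sketched in a few lines. This is a toy under heavy assumptions, not how Dall·E 2 is actually implemented: a real diffusion model uses a trained neural network to decide what noise to remove at each step, while here a known target list of numbers simply stands in for that guidance. The function names toy_denoise and distance are hypothetical.

```python
import random


def distance(a, b):
    """Euclidean distance between two equally long lists of numbers."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def toy_denoise(target, steps=50, seed=0):
    """Start from random noise and move a little closer to `target` at every step.

    `target` is a stand-in for the guidance a real model would get from the prompt.
    Returns the final canvas and the list of all intermediate canvases.
    """
    rng = random.Random(seed)
    canvas = [rng.gauss(0.0, 1.0) for _ in target]  # the canvas begins as pure noise
    history = [list(canvas)]
    for step in range(steps):
        remaining = steps - step
        # Close 1/remaining of the gap, so the last step lands on the target.
        canvas = [c + (t - c) / remaining for c, t in zip(canvas, target)]
        history.append(list(canvas))
    return canvas, history


target_picture = [0.0, 0.5, 1.0, 0.5, 0.0]  # a tiny five-pixel "image"
final, history = toy_denoise(target_picture)
print(distance(history[0], target_picture))  # far from the picture at the start
print(distance(final, target_picture))       # essentially zero at the end
```

Each pass removes a bit more of the randomness, which is why the intermediate canvases look like a picture slowly emerging from static.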

How Dall·E creates images from scratch

Edit:

Adding: It can place an object into any scene it has been given. Beyond simply pasting the object in, it understands the surface it is painting on and adds matching textures, shadows and reflections for the added object (if any are needed).

Erasing: Just as with adding, the software can erase any object from the input scene, along with its textures, shadows and reflections.
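One way to picture both kinds of edit is as a change limited to a marked region, with everything else left untouched. The sketch below shows only that bookkeeping, assuming a tiny grid of numbers as the "image" and a True/False mask for the region being edited. The hard part, generating believable new pixels with shadows and reflections, is done by the model and is not shown here. The name edit_region is hypothetical.

```python
def edit_region(image, mask, new_pixels):
    """Return a copy of `image` where masked cells are taken from `new_pixels`.

    All three arguments are grids (lists of rows) of the same size.
    Cells where the mask is False keep their original value.
    """
    return [
        [new if marked else old for old, marked, new in zip(row, mask_row, new_row)]
        for row, mask_row, new_row in zip(image, mask, new_pixels)
    ]


# A 3x3 scene with an "object" (5) in the middle of a plain background (1).
scene = [[1, 1, 1], [1, 5, 1], [1, 1, 1]]
mask = [[False, False, False], [False, True, False], [False, False, False]]
background = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # stand-in for model-generated fill

erased = edit_region(scene, mask, background)
print(erased)
```

Adding an object works the same way in this toy: the mask marks where the object goes and the fill grid holds the new object instead of background.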

 

Variate:
Dall·E 2 can also create variations of an image that it has created or edited. Instead of producing just one version of the input, it can produce multiple versions of the same thing, letting you choose the best one yourself.

In this example, the AI can give you several variations, each containing a different dog breed or the same dog in a different location.


Limitations of Dall·E 2: 

Any AI that works from human input can be misused, so Dall·E 2 has a few built-in limitations to curb misuse.

  1. To prevent harm, Dall·E 2 is intentionally prevented from producing or editing explicit content. During training, its exposure to photorealistic images of people, including public figures and the general public, was reduced.
  2. Whether by design or by accident, Dall·E 2 cannot properly render text in its images, not even simple words like “Stop” and “Go”.
  3. Sadly, for privacy and testing purposes, this software is only available to “trusted personnel” and is not available publicly. It is possible to view some sample creations teasing its launch, but it is not yet possible to go ahead and type in your own prompt.

When asked about this, OpenAI gave this statement:

“We want to put this research into the hands of people but for the time being, we’re just interested in getting feedback on how people use the platform. We are definitely interested in deploying this technology more widely, but we currently have no plans for commercialization.”

This might reach the general public in the near future. Until then, I am waiting.


 


 


 
