DALL·E 2 Explained | OpenAI



Transcript

Have you ever seen a polar bear playing bass?

Or a robot painted like a Picasso?

Didn’t think so.

DALL-E 2 is a new AI system from OpenAI that can take simple text descriptions like "a koala dunking a basketball" and turn them into photorealistic images that have never existed before.

DALL-E 2 can also realistically edit and retouch photos.

Based on a simple natural language description, it can fill in or replace part of an image with AI-generated imagery that blends seamlessly with the original.

It’s called “in-painting”.
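OpenAI has not published the editing model itself, but the final step of in-painting, compositing generated pixels into a masked region so they blend with the original, can be illustrated with a toy sketch. This is a minimal NumPy illustration under stated assumptions, not OpenAI's implementation; `inpaint_blend` is a hypothetical helper name.

```python
import numpy as np

def inpaint_blend(original, generated, mask):
    """Composite generated pixels into the masked region of an image.

    original, generated: float arrays of shape (H, W, 3), values in [0, 1]
    mask: float array of shape (H, W); 1.0 where content is replaced,
          0.0 where the original is kept (fractions blend the two).
    """
    alpha = mask[..., np.newaxis]  # broadcast the mask over color channels
    return alpha * generated + (1.0 - alpha) * original

# Toy example: replace the left half of a gray image with white content.
original = np.full((4, 4, 3), 0.5)
generated = np.ones((4, 4, 3))
mask = np.zeros((4, 4))
mask[:, :2] = 1.0

result = inpaint_blend(original, generated, mask)
```

A soft-edged mask (values between 0 and 1 near the boundary) is what makes the inserted region blend seamlessly rather than showing a hard seam.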

In January 2021, OpenAI introduced DALL-E, a system that could generate images from text, like this "avocado armchair".

DALL-E 2 takes the technology even further with higher resolution, greater comprehension, and new capabilities like in-painting.

It can even start with an image as an input and create variations with different angles and styles.

DALL-E was created by training a neural network on images and their text descriptions.

Through deep learning, it not only understands individual objects, like koala bears and motorcycles, but also learns the relationships between them.

And when you ask DALL-E for an image of a koala bear riding a motorcycle, it knows how to create it, or any other combination of objects and actions.
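Training on images paired with their text descriptions is commonly done with a contrastive objective, as in OpenAI's CLIP model, which DALL-E 2 builds on: embeddings of matching image-caption pairs are pulled together while mismatched pairs are pushed apart. Below is a toy sketch of that objective; random vectors stand in for real encoder outputs, and the names here are illustrative, not OpenAI's code.

```python
import numpy as np

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """CLIP-style loss: each image should match its own caption
    more closely than any other caption in the batch."""
    # Normalize so dot products are cosine similarities.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = image_emb @ text_emb.T / temperature  # (batch, batch) similarities
    # Cross-entropy with the diagonal (the matching pairs) as targets.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
batch, dim = 8, 64
# Stand-ins for encoder outputs; real training learns the encoders.
images = rng.normal(size=(batch, dim))
matched_texts = images + 0.01 * rng.normal(size=(batch, dim))  # nearly aligned
random_texts = rng.normal(size=(batch, dim))                   # unrelated

aligned_loss = contrastive_loss(images, matched_texts)
random_loss = contrastive_loss(images, random_texts)
```

When embeddings of matching pairs line up, the loss is near zero; with unrelated captions it stays high, which is the pressure that teaches the model which words go with which visual content.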

The DALL-E research has three main outcomes:

First, it can help people express themselves visually in ways they may not have been able to before.

Second, an AI-generated image can tell us a lot about whether the system understands us, or is just repeating what it has been taught.

Third, DALL-E helps humans understand how advanced AI systems see and understand our world.

This is a critical part of developing AI that’s useful and safe.

The technology is constantly evolving, and DALL-E 2 has limitations.

If it's taught with objects that are incorrectly labeled, like a plane labeled "car", and a user tries to generate a car, DALL-E may create…a plane.

It’s like talking to a person who learned the wrong word for something.

DALL-E can also be limited by gaps in its training.

For example, if you type "baboon" and DALL-E has learned what a baboon is through images and accurate labels, it will generate a lot of great baboons.

But if you type "howler monkey" and it hasn't learned what a howler monkey is, DALL-E will give you its best idea of what it thinks it could be: like a "howling monkey".

What's exciting about the approach used to train DALL-E is that it can take what it learned from a variety of other labeled images and then apply it to a new image.

Given a picture of a monkey, DALL-E can infer what it would look like doing something it's never done before.

Like paying its taxes while wearing a funny hat.

DALL-E is an example of how imaginative humans and clever systems can work together to make new things, amplifying our creative potential.