Transcript
Infants are born with the ability to imitate what other people do.
Here, a ten-minute-old newborn sees another person stick their tongue out for the first
time.
In response, he sticks his own tongue out.
Imitation allows humans to learn new behaviors rapidly.
We would like our robots to be able to learn this way, too.
We’ve built a proof of concept system, trained entirely in simulation: we teach the robot
a new task by demonstrating how to assemble block towers the way we want, in this
case stacking six virtual blocks into a single tower.
Previously, the robot has seen other examples of manipulating blocks, but not this particular
one.
Our robot has now learned the task, even though its movements have to be different from the
ones in the demonstration.
With a single demonstration of the task, we can replicate it under a variety of initial conditions.
Teaching the robot how to build a different block arrangement requires only a single new
demonstration.
Here’s how our system works: the robot perceives the environment with its camera, and manipulates
the blocks with its arm.
At its core, there are two neural networks, working together.
The camera image is first processed by the vision network.
Then, based on the recorded demonstration, the imitation network figures out what action
to take next.
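A minimal sketch of this two-network pipeline, assuming stand-in models (every function name, shape, and action format below is illustrative, not the actual system):

```python
import numpy as np

def vision_network(image):
    """Stand-in for the vision network: map a camera image to
    (num_blocks, 3) block positions relative to the robot.
    A real model would be a CNN; here we return a dummy output."""
    return np.zeros((6, 3))

def imitation_network(block_positions, demonstration):
    """Stand-in for the imitation network: given the current block
    positions and the recorded demonstration, choose the next action.
    The action format here is a made-up placeholder."""
    return {"pick": 0, "place_on": 1}

def control_step(image, demonstration):
    # The camera image is first processed by the vision network...
    positions = vision_network(image)
    # ...then the imitation network decides what action to take next.
    return imitation_network(positions, demonstration)
```

The point of the structure is the division of labor: perception is solved once, in simulation, and the imitation network only ever sees block positions, never raw pixels.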
Our vision network is a deep neural net that takes a camera image and determines the position
of the blocks relative to the robot.
To train the network, we use only simulated data, using domain randomization to learn
a robust vision model.
We generate thousands of different object locations, light settings and surface textures,
and show these examples to the network.
After training, the network can find the blocks in the physical world, even though it has
never seen images from a real camera before.
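Domain randomization of this kind can be sketched as a scene generator that varies exactly the factors mentioned above: object locations, lighting, and surface textures. The ranges and field names here are illustrative assumptions, not the actual simulator's parameters:

```python
import random

def randomized_scene(num_blocks=6, seed=None):
    """Generate one randomized simulated scene description (a sketch).

    Domain randomization: vary object positions, lighting, and textures
    so a vision model trained only on simulated renders also works on
    real camera images it has never seen."""
    rng = random.Random(seed)
    blocks = [
        {"x": rng.uniform(-0.3, 0.3),          # position on the table (m)
         "y": rng.uniform(-0.3, 0.3),
         "color": rng.choice(["red", "green", "blue", "wood"])}
        for _ in range(num_blocks)
    ]
    light = {"intensity": rng.uniform(0.5, 2.0),
             "azimuth_deg": rng.uniform(0.0, 360.0)}
    texture = rng.choice(["checker", "noise", "solid", "stripes"])
    return {"blocks": blocks, "light": light, "table_texture": texture}

# Thousands of such scenes, each rendered and shown to the network
# together with the ground-truth block positions.
dataset = [randomized_scene(seed=i) for i in range(1000)]
```

Because the simulator knows the true block positions in every scene, labels come for free; the randomization forces the network to rely on cues that also hold in the real world.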
Now that we know the location of the blocks, the imitation network takes over.
Its goal is to mimic the task shown by the demonstrator.
This neural net is trained to predict what action the demonstrator would have taken in
the same situation.
On its own, it has learned to scan through the demonstration and pay attention to the
relevant frames that tell it what to do next.
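Paying attention to the relevant demonstration frames can be sketched as soft attention: score each demo frame against the current state, then take a softmax-weighted summary. This is a generic attention mechanism under assumed embedding shapes, not the network's actual architecture:

```python
import numpy as np

def attend_to_demo(query, demo_frames):
    """Soft attention over demonstration frames (illustrative sketch).

    query:       (d,)  embedding of the robot's current state
    demo_frames: (T, d) embeddings of the T demonstration frames
    Returns the attention weights over frames and the weighted
    summary ("context") that tells the policy what to do next."""
    scores = demo_frames @ query          # similarity of each frame to now
    scores -= scores.max()                # numerical stability for exp
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ demo_frames       # (d,) weighted demo summary
    return weights, context
```

For example, if the current state most resembles the first demo frame, that frame receives the largest weight, so the policy conditions on the step of the demonstration it is currently reproducing.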
Nothing in our technique is specific to blocks.
This system is an early prototype, and will form the backbone of the general-purpose robotic
systems we’re developing at OpenAI.