The following is a conversation with Wojciech Zaremba, cofounder of OpenAI,
which is one of the top organizations in the world doing artificial intelligence
research and development.
Wojciech is the head of language and code generation teams, building and doing
research on GitHub Copilot, OpenAI Codex, and GPT-3, and who knows, 4, 5, 6,
N, and N plus one, and he also previously led OpenAI’s robotics efforts.
These are incredibly exciting projects to me that deeply challenge and expand
our understanding of the structure and nature of intelligence.
The 21st century, I think, may very well be remembered for a handful of
revolutionary AI systems and their implementations.
GPT, Codex, and applications of language models and transformers in general
to the language and visual domains may very well be at the core of these AI
systems. To support this podcast, please check out our sponsors.
They’re listed in the description.
This is the Lex Fridman podcast, and here is my conversation
with Wojciech Zaremba.
You mentioned that Sam Altman asked about the Fermi Paradox, and the people
at OpenAI had really sophisticated, interesting answers, so that’s when you
knew this is the right team to be working with.
So let me ask you about the Fermi Paradox, about aliens.
Why have we not found overwhelming evidence for aliens visiting Earth?
I don’t have a conviction in the answer, but rather kind of probabilistic
perspective on what might be, let’s say, possible answers.
It’s also interesting that the question itself even can touch on the, you
know, your typical question of what’s the meaning of life, because if you
assume that, like, we don’t see aliens because they destroy themselves, that
kind of upweights the focus on making sure that we won’t destroy ourselves.
At the moment, the place where I am actually with my belief, and these
things also change over the time, is I think that we might be alone in
the universe, which actually makes life more, or let’s say, consciousness
life, more kind of valuable, and that means that we should more appreciate it.
Have we always been alone?
So what’s your intuition about our galaxy, our universe?
Is it just sprinkled with graveyards of intelligent civilizations, or are
we truly, is life, intelligent life, truly unique?
At the moment, my belief is that it is unique, but I would say I could also,
you know, there was some footage released with UFOs, which makes
me actually doubt my own belief.
Yes.
Yeah, I can tell you one crazy answer that I have heard.
Yes.
So, apparently, when you look actually at the limits of computation, you
can compute more if the temperature of the universe would drop.
Temperature of the universe would drop down.
So one of the things that aliens might want to do if they are truly optimizing
to maximize amount of compute, which, you know, maybe can lead to, or let’s
say simulations or so, it’s instead of wasting current entropy of the
universe, because, you know, we, by living, we are actually somewhat
wasting entropy, then you can wait for the universe to cool down such that
you have more computation.
So that’s kind of a funny answer.
I’m not sure if I believe in it, but that would be one of the
reasons why you don’t see aliens.
It’s also possible to some people say that maybe there is not that much
point in actually going to other galaxies if you can go inwards.
So there is no limits of what could be an experience if we could, you
know, connect machines to our brains while there are still some limits
if we want to explore the universe.
Yeah, there could be a lot of ways to go inwards too.
Once you figure out some aspect of physics, we haven’t figured out yet.
Maybe you can travel to different dimensions.
I mean, travel in three dimensional space may not be the most fun kind of travel.
There may be like just a huge amount of different ways to travel and it
doesn’t require a spaceship going slowly in 3d space to space time.
It also feels, you know, one of the problems is that speed of light
is low and the universe is vast.
And it seems that actually most likely if we want to travel very far, then
we would, instead of actually sending spaceships with humans that weigh a
lot, we would send something similar to what Yuri Milner is working on.
These are like a huge sail, which is at first powered by a shot of
laser from Earth, and it can propel it to a quarter of the speed of light, and
the sail itself contains a few grams of equipment.
And that might be the way to actually transport matter through universe.
But then when you think what would it mean for humans, it means that we would
need to actually put a 3D printer there and, you know, 3D print the human on
another planet, I don’t know, play them YouTube or, let’s say, 3D
print like a huge human right away, or maybe a womb or so, um, yeah.
With our current techniques of archeology, if, if, if a civilization
was born and died, uh, long, long enough ago on earth, we wouldn’t be able to
tell, and so that makes me really sad.
And so I think about earth in that same way.
How can we leave some remnants if we do destroy ourselves?
How can we leave remnants for aliens in the future to discover?
Like, here’s some nice stuff we’ve done, like Wikipedia and YouTube.
Do we have it like in a satellite orbiting earth with a hard drive?
Like, how, how do we say, how do we back up human civilization?
Uh, the good parts or all of it is good parts so that, uh, it can be
preserved longer than our bodies can.
That’s a, that’s kind of, um, it’s a difficult question.
It also requires the difficult acceptance of the fact that we may die.
And if we die, we may die suddenly as a civilization.
So let’s see, I think it kind of depends on the cataclysm.
We have observed in other parts of the universe bursts of gamma rays, uh,
these are, uh, high energy, uh, rays of light that actually can
apparently kill an entire galaxy.
So there might be actually nothing, even to, nothing to protect us from it.
I’m also looking actually at the past civilizations.
So it’s like Aztecs or so, they disappeared from the surface of the Earth.
And one can ask, why is it the case?
And the way I’m thinking about it is, you know, that definitely they had some
problem that they couldn’t solve and maybe there was a flood and all of a
sudden they couldn’t drink, uh, there was no potable water and they all died.
And, um, I think that, uh, so far the best solution to such problems is, I
guess, technology, so, I mean, if they had known that you can just boil
water and then drink it after, then that would have saved their civilization.
And even now, when we look actually at the current pandemic, it seems
that there, once again, actually science comes to the rescue.
And somehow science increases size of the action space.
And I think that’s a good thing.
Yeah.
But nature has a vastly larger action space, but still it might be a good thing
for us to keep on increasing action space.
Okay.
Uh, looking at past civilizations.
Yes.
But looking at the destruction of human civilization, perhaps expanding the
action space will add, um, actions that are easily acted upon, easily executed
and as a result, destroy us.
So let’s see, I was pondering, uh, why actually even, uh, we have
negative impact on the, uh, globe.
Because, you know, if you ask every single individual, they
would like to have clean air.
They would like a healthy planet, but somehow,
as a collective, we are not going in this direction.
I think that there exists very powerful system to describe what we value.
That’s capitalism.
It assigns actually monetary values to various activities.
At the moment, the problem in the current system is that there are
some things which we value
but there is no cost assigned to them.
So even though we value clean air, or maybe we also, uh,
value lack of distraction on, let’s say, the internet or so, at the moment these
quantities, you know, companies, corporations can pollute them, uh, for free.
So in some sense, I wished or like, and that’s, I guess, purpose of politics
to, to align the incentive systems.
And we are kind of maybe even moving in this direction.
The first issue is even to be able to measure the things that we value.
Then we can actually assign the monetary value to them.
Yeah.
And that’s, so it’s getting the data and also probably through technology,
enabling people to vote and to move money around in a way that is aligned
with their values, and that’s very much a technology question.
So like having one president and Congress and voting that happens every four years
or something like that, that’s a very outdated idea. There could be some
technological improvements to that kind of idea.
So I’m thinking from time to time about these topics, but it also feels to me
that it’s a little bit like, uh, it’s hard for me to actually make
correct predictions.
What is the appropriate thing to do?
I extremely trust, uh, Sam Altman, our CEO on these topics here, um, like, uh,
I’m more on the side of being, I guess, naive hippie.
That, uh, yeah, that’s your life philosophy.
Um, well, like I think self doubt and, uh, I think hippie implies optimism.
Those, those two things are pretty, pretty good way to operate.
I mean, still, it is hard for me to actually understand how the politics
works or like, uh, how this, like, uh, exactly how the things would play out.
And Sam is, uh, really excellent with it.
What do you think is rarest in the universe?
You said we might be alone.
What’s hardest to build, is another engineering way to ask that: life,
intelligence, or consciousness?
So like you said that we might be alone, which is the thing that’s hardest to get
to, is it just the origin of life?
Is it the origin of intelligence?
Is it the origin of consciousness?
So, um, let me at first explain to you my kind of mental model, what I think
is needed for life to appear.
Um, so I imagine that at some point there was this primordial, uh, soup of, uh,
amino acids and maybe some proteins in the ocean and, uh, you know, some
proteins were turning into some other proteins through reaction and, uh, you
can almost think about this, uh, cycle of what, uh, turns into what
as a graph essentially describing which substance turns into
some other substance, and essentially life means that all of a sudden in the
graph a cycle has been created such that the same thing keeps on happening over
and over again. That’s what is needed for life to happen.
And in some sense, you can think almost that you have this gigantic graph and it
needs like a sufficient number of edges for the cycle to appear.
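A minimal sketch of the "cycle in the reaction graph" picture described here, assuming reactions are stored as a directed graph mapping each substance to the substances it turns into. The substance names A to D and the function name `has_cycle` are illustrative, not anything from the conversation; the check itself is a standard depth-first search for a back edge.

```python
from typing import Dict, List


def has_cycle(reactions: Dict[str, List[str]]) -> bool:
    """Return True if the directed 'turns into' graph contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    nodes = set(reactions) | {t for targets in reactions.values() for t in targets}
    color = {node: WHITE for node in nodes}

    def visit(node: str) -> bool:
        color[node] = GRAY
        for nxt in reactions.get(node, []):
            if color[nxt] == GRAY:  # back edge: we returned to the current path
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in nodes)


# Hypothetical substances; the names are illustrative only.
chain = {"A": ["B"], "B": ["C"], "C": ["D"]}        # no self-sustaining loop
loop = {"A": ["B"], "B": ["C"], "C": ["A", "D"]}    # A -> B -> C -> A repeats

print(has_cycle(chain), has_cycle(loop))
```

Adding the single edge from C back to A is what turns the chain into a loop, which is the sense in which a graph with enough edges eventually contains a cycle.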
Um, then, um, from perspective of intelligence and consciousness, uh, my
current intuition is that they might be quite intertwined.
First of all, it might not be that it’s like a binary thing that you
have intelligence or consciousness.
It seems to be, uh, uh, more, uh, continuous component.
Let’s see, if we look for instance at the convnets, uh, recognizing
images, people are able to show that the activations of these networks
correlate very strongly, uh, with activations in the visual cortex, uh, of
some monkeys. The same seems to be true about language models.
Um, also if you, for instance, um, train an agent in a, um, 3D
world, um, at first, you know, it barely recognizes what is going
on. Over time, it kind of recognizes foreground from background. Over
time, it kind of knows where there is food, uh, and it just follows it.
Um, over time it actually starts having 3D perception.
So it is possible for instance, to look inside of the head of an agent and ask,
what would it see if it looks to the right?
And the crazy thing is, you know, initially when the agents are barely
trained, these predictions are pretty bad. Over time they become
better and better. You can still see that if you ask what happens when the
head is turned by 360 degrees, for some time they think that a different
thing appears, and then at some stage they understand actually that the same
thing is supposed to appear.
So they get that understanding of 3d structure.
It’s also, you know, very likely that they have inside some level of, like,
symbolic reasoning, in particular symbols for other agents.
So when you look at Dota agents, they collaborate together and, uh,
you know, they have some anticipation of whether they would
win a battle, they have some expectations with respect to other
agents.
I might be, you know, anthropomorphizing too much, um,
how the things look to me, but then the fact that they have a
symbol for other agents, uh, makes me believe that, uh, at some stage,
uh, you know, as they are optimizing for skills, they would also have a symbol
to describe themselves.
Uh, this is like a very useful symbol to have.
And this in particular I would call self-consciousness or
self-awareness, uh, and, uh, still it might be different from consciousness.
So I guess the, the way how I’m understanding the word consciousness,
I’d say the experience of drinking a coffee or let’s say experience of being
a bat, that’s the meaning of the word consciousness.
It doesn’t mean to be awake.
Uh, yeah, it feels, it might be also somewhat related to memory and
recurrent connections.
So, um, it’s kind of like, if you look at anesthetic drugs, they might be, uh,
like, uh, they essentially disturb, uh, brainwaves, uh, such
that, um, maybe memories do not form.
And so there’s a lessening of consciousness when you do that.
Correct.
And so that’s the one way to intuit what is consciousness.
There’s also kind of another element here.
It could be that it’s, you know, this kind of self awareness
module that you described, plus the actual subjective experience is a
storytelling module that tells us a story about, uh, what we’re experiencing.
The crazy thing.
So let’s say, I mean, in meditation, they teach people not to speak
story inside of their head.
And there is also some fraction of population who doesn’t have actually
a narrator, I know people who don’t have a narrator and, you know, they have
to use external people in order to, um, kind of solve tasks that
require internal narrator.
Um, so it seems that it’s possible to have the experience without the talk.
What are we talking about when we talk about the internal narrator?
Is that the voice when you’re like, yeah, I thought that that’s what you are
referring to while I was referring more on the, like, not an actual voice.
I meant like, there’s some kind of like subjective experience feels like it’s.
It’s fundamentally about storytelling to ourselves.
It feels like, like the feeling is a story that is much, uh, much
simpler abstraction than the raw sensory information.
So there feels like it’s a very high level of abstraction that, uh, is useful
for me to feel like entity in this world.
The most useful aspect of it is that because I’m conscious, I think there’s
an intricate connection to me not wanting to die.
So like, it’s a useful hack to really prioritize not dying, like those
seem to be somehow connected.
So I’m telling the story of like, it
richly feels like something to be me, and the fact that me exists in this world.
I want to preserve me.
And so that makes it a useful agent hack.
So I will just refer maybe to that first part, as you said, about that kind
of story of describing who you are.
Um, I was, uh, thinking about that even, so, you know, obviously I’m, I, I like
thinking about consciousness, uh, I like thinking about AI as well, and I’m trying
to see analogies of these things in AI, what would it correspond to?
So, um, you know, OpenAI trained, uh, a model called GPT, uh, which, uh,
can generate, uh, pretty amusing texts on arbitrary topics and, um, one
way to control GPT is, uh, by putting into a prefix at the beginning of the text
some information about what the story would be about. Uh, you can even have a
chat with, uh, you know, with GPT by saying that the chat is with Lex or Elon
Musk or so, and, uh, GPT would just pretend to be you or Elon Musk or so, and,
uh, it almost feels that this, uh, story that we give ourselves to describe our
life, it’s almost like, uh, the things that you put into the context of GPT.
Yeah.
The primary, it’s the, and so, but the context we provide to GPT is, uh, is multimodal.
It’s more so GPT itself is multimodal.
GPT itself, uh, hasn’t learned actually from experience of single human, but from the
experience of humanity, it’s a chameleon.
You can turn it into anything and in some sense, by providing context, um, it, you
know, behaves as the thing that you wanted it to be.
Um, it’s interesting that, you know, people have stories of who they are.
And, uh, as you said, these stories, they help them to operate in the world.
Um, but it’s also, you know, interesting, I guess, various people find it out through
meditation or so that, uh, there might be some patterns that you have learned when
you were a kid that actually are not serving you anymore.
And you also might be thinking that that’s who you are and that’s actually just a story.
Mm hmm.
Yeah.
So it’s a useful hack, but sometimes it gets us into trouble.
It’s a local optima.
It’s a local optima.
You wrote that Stephen Hawking, he tweeted, Stephen Hawking asked what
breathes fire into equations, which meant what makes given mathematical
equations realize the physics of a universe.
Similarly, I wonder what breathes fire into computation.
What makes given computation conscious?
Okay.
So how do we engineer consciousness?
How do you breathe fire and magic into the machine?
So, um, it seems clear to me that not every computation is conscious.
I mean, you can, let’s say, just keep on multiplying one matrix over and over
again and might be gigantic matrix.
You can put a lot of computation.
I don’t think it would be conscious.
So in some sense, the question is, uh, what are the computations which could be
conscious? Uh, I mean, so one assumption is that it has to do purely
with computation, that you can abstract away matter, and the other possibility
is that what’s very important is the realization of the computation, that it
has to do with some, uh, force fields or so, and they bring consciousness.
At the moment, my intuition is that it can be fully abstracted away.
So in case of computation, you can ask yourself, what are the mathematical
objects or so that could bring such properties?
So for instance, if we think about the, uh, AI models, what they
truly try to do, uh, or models like GPT, is, uh, you know, they try
to predict, uh, the next word or so.
And this turns out to be equivalent to, uh, compressing, uh, text.
Um, and, uh, because in some sense, compression means that, uh, you learn
the model of reality and you just have to, uh, remember where your mistakes are.
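One way to make the link between prediction and compression concrete: if a model assigns probability p to the symbol that actually comes next, an ideal entropy coder spends about -log2(p) bits on it, so better predictions mean a shorter encoding. The sketch below is an illustration under that assumption, using a simple adaptive frequency model as a stand-in predictor (it has nothing to do with how GPT works internally); the function name `code_length_bits` and the sample text are hypothetical.

```python
import math
from collections import Counter


def code_length_bits(text: str, alphabet: str) -> float:
    """Ideal code length in bits when each character costs -log2 of the
    probability an adaptive frequency model assigned to it beforehand."""
    counts = Counter({ch: 1 for ch in alphabet})  # start every count at 1
    total = len(alphabet)
    bits = 0.0
    for ch in text:
        p = counts[ch] / total      # model's prediction before seeing ch
        bits += -math.log2(p)       # cost of encoding ch under that prediction
        counts[ch] += 1             # then learn from what actually happened
        total += 1
    return bits


text = "a" * 15 + "b"
bits = code_length_bits(text, alphabet="ab")
baseline_bits = len(text) * math.log2(2)  # 1 bit per character with no model

print(f"with model: {bits:.2f} bits, without: {baseline_bits:.0f} bits")
```

The model quickly learns that "a" is likely, so the long run of "a" costs well under one bit per character, while the surprising final "b" is expensive.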
The better you are at predicting, the, and in some sense, when we
look at our experience, also, when you look, for instance, at a car driving,
you know in which direction it will go, you are good at prediction.
And, um, you know, it might be the case that consciousness is intertwined
with, uh, compression. It might be also the case that self-consciousness, uh,
has to do with the compressor trying to compress itself.
So, um, okay.
I was just wondering, what are the objects in, you know, mathematics or
computer science, which are mysterious that could, uh, that, that, that could
have to do with consciousness.
And then I thought, um, you know, you see in mathematics, there is
something called Gödel’s theorem, uh, which means, okay, if you have a
sufficiently complicated mathematical system, it is possible to point the
mathematical system back on itself.
In computer science, there is, uh, something called the halting problem.
It’s, it’s somewhat similar construction.
So I thought that, you know, under the
assumption that consciousness has to do with, uh, compression, uh, then
you could imagine that, as you keep on compressing things, then
at some point, it actually makes sense for the compressor to compress itself.
Metacompression consciousness is metacompression.
That’s a, that’s an I, an, an, an idea.
And in some sense, you know, the crazy, thank you.
So, uh, but do you think if we think of a Turing machine, a universal
Turing machine, can that achieve consciousness?
So is there some thing beyond our traditional definition
of computation that’s required?
So it’s a specific computation.
And I said, this computation has to do with compression and, uh, the compression
itself, maybe other way of putting it is like, uh, you are internally creating
the model of reality in order, like, uh, it’s like a, you try inside to simplify
reality in order to predict what’s going to happen.
And, um, that also feels somewhat similar to how I think actually about my own
conscious experience, though clearly I don’t have access to reality.
The only access to reality is through, you know, cable going to my brain and my
brain is creating a simulation of reality and I have access to the simulation of
reality.
Are you by any chance, uh, aware of, uh, the Hutter Prize, Marcus Hutter?
He, uh, he made this prize for compression
of, uh, Wikipedia pages, and, uh, there’s a few qualities to it.
One, I think it has to be perfect compression, which makes, I think that
little quirk makes it much less, um, applicable to the general task of
intelligence, because it feels like intelligence is always going to be messy.
Uh, like perfect compression feels like it’s not the right goal, but
it’s nevertheless a very interesting goal.
So for him, intelligence equals compression.
And so the smaller you make the file, given a large Wikipedia page, the
more intelligent the system has to be.
Yeah, that makes sense.
So you can make perfect compression if you store errors.
And I think that actually what he meant is you have algorithm plus errors.
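The "algorithm plus errors" idea can be sketched as follows: run a predictor over the data, store only a flag wherever the prediction was wrong, and replay the same predictor at decode time to recover the data exactly. This is a toy illustration, not the Hutter Prize setup; `predict_next` is a hypothetical stand-in model that simply guesses the previous bit again.

```python
from typing import List


def predict_next(history: List[int]) -> int:
    """Hypothetical stand-in model: guess that the next bit repeats the last."""
    return history[-1] if history else 0


def encode(bits: List[int]) -> List[int]:
    """Store 1 where the model's guess was wrong and 0 where it was right."""
    return [b ^ predict_next(bits[:i]) for i, b in enumerate(bits)]


def decode(errors: List[int]) -> List[int]:
    """Replay the same model and flip its guess wherever an error was stored."""
    bits: List[int] = []
    for e in errors:
        bits.append(e ^ predict_next(bits))
    return bits


data = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
errors = encode(data)

print(errors)                  # mostly zeros when the model predicts well
print(decode(errors) == data)  # the round trip is lossless
```

The better the model, the sparser the error list, and a sparse list is cheap to store; the reconstruction stays exact however bad the model is.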
Uh, by the way, Hutter, Hutter is, uh, he was the PhD advisor of Shane
Legg, who is a, uh, DeepMind cofounder.
Yeah.
Yeah.
So there’s an interesting, uh, and now he’s at DeepMind, there’s an
interesting, uh, network of people.
And he’s one of the people that I think seriously took on the task of
what would an AGI system look like?
Uh, I think for the longest time, the question of AGI was not taken
seriously or rather rigorously.
And he did just that, like mathematically speaking, what
would the model look like if you remove the constraints of it having to have
a reasonable amount of memory, a reasonable amount
of, uh, running time, complexity, uh, computation time, what would it look
like, and essentially it’s a half math, half philosophical discussion
of, uh, what would a reinforcement learning type of
framework look like for an AGI?
Yeah.
So he developed the framework even to describe what’s optimal with
respect to reinforcement learning.
Like there is a theoretical framework, which is, as you said, under the
assumption that there is an infinite amount of memory and compute.
Um, there was actually one person before, his name is Solomonoff. Hutter
extended, uh, Solomonoff’s work to reinforcement learning, but there
exists the, uh, theoretical algorithm, which is the optimal algorithm to build
intelligence, and I can actually explain the algorithm to you.
Yes.
Let’s go.
Let’s go.
So the task itself, can I just pause on how absurd it is for a brain in a
skull to be trying to explain the algorithm for intelligence? Just go ahead.
It is pretty crazy.
It is pretty crazy that, you know, the brain itself is actually so
small and it can ponder, uh, how to design algorithms that optimally
solve the problem of intelligence.
Okay.
All right.
So what’s the algorithm?
So let’s see.
So first of all, the task itself is, uh, described as, uh, you have an infinite
sequence of zeros and ones.
Okay.
Okay. You read, uh, N bits and you are about to predict the N plus one bit.
So that’s the task.
And you could imagine that every task could be cast as such a task.
So if for instance, you have images and labels, you can just turn every image
into a sequence of zeros and ones, then the label, you concatenate labels,
and you could start by
having training data first, and then afterwards you have test data.
So theoretically any problem could be cast as a problem of predicting
zeros and ones on this, uh, infinite tape.
So, um, so let’s say you read already N bits and you want to predict N plus
one bit, and I will ask you to write every possible program that generates
these N bits.
Okay.
So, um, and you can choose the programming language.
It can be Python or C plus plus.
And the difference between programming languages, uh, might be, there is
a difference by a constant; asymptotically, your predictions will be equivalent.
So you read N bits, you enumerate all the programs that produce
these N bits in their output.
And then in order to predict the N plus one bit, you actually weight the
programs according to their length.
And there is, like, some specific formula for how you weight them.
And then the N plus, uh, one bit prediction is the prediction, uh, from each
of these programs, according to that weight.
Like statistically, you pick, so the smaller the program, the more likely
you are to pick its output.
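The procedure just described can be illustrated with a toy version. The real algorithm enumerates every program of a universal language and is not computable, so the sketch below swaps in a deliberately tiny, hypothetical "language": a program is a non-empty bit string, and running it prints that string repeated forever. Programs that reproduce the observed bits are kept and weighted by 2 to the minus their length, which is an assumed stand-in for the "specific formula" mentioned above.

```python
from itertools import product
from typing import Sequence


def predict_next_bit(observed: Sequence[int], max_len: int = 8) -> float:
    """Weighted probability that the next bit is 1, over all toy programs
    of length <= max_len whose output starts with the observed bits."""
    n = len(observed)
    weight_one = 0.0
    weight_total = 0.0
    for length in range(1, max_len + 1):
        for program in product((0, 1), repeat=length):
            # "Run" the toy program: repeat its bits to get n + 1 outputs.
            output = [program[i % length] for i in range(n + 1)]
            if output[:n] != list(observed):
                continue  # program does not reproduce what we have read
            weight = 2.0 ** (-length)  # shorter programs count for more
            weight_total += weight
            if output[n] == 1:
                weight_one += weight
    if weight_total == 0.0:
        raise ValueError("no program of length <= max_len fits the observation")
    return weight_one / weight_total


p_one = predict_next_bit([0, 1, 0, 1, 0, 1])
print(f"P(next bit = 1) = {p_one:.4f}")
```

After reading 0, 1, 0, 1, 0, 1 the two-bit program "01" dominates the weighted vote, so the prediction leans heavily toward 0; longer programs that happen to agree with the data but then emit a 1 contribute only a small share.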
So, uh, that’s, that algorithm is grounded in the hope or the intuition
that the simple answer is the right one.
It’s a formalization of it.
Um, it also means like, if you would ask the question after how many years
would, you know, the sun explode, uh, you can say, hmm, it’s more likely
the answer is two to some power because there’s a shorter program.
Yeah.
Um, then other, well, I don’t have a good intuition about, uh, how different
the space of short programs are from the space of large programs.
Like, what is the universe where short programs, uh, like run things?
Uh, so, so I said, the things have to agree with N bits.
So even if you have, you need to store, okay.
If you have a very short program and there are still some, if
it’s not a perfect prediction of the N bits, you have to store errors.
What are the errors?
And that gives you the full program that agrees on N bits.
Oh, so you don’t agree with the N bits.
And you store, that’s like a longer, a longer program, slightly longer program
because it can take these extra bits of errors.
That’s fascinating.
What’s your intuition about the programs that are able to do cool
stuff like intelligence and consciousness? Are they, uh, perfectly, like,
uh, is there a lot of if-then statements in them?
So like, is there a lot of exceptions that they’re storing?
So, um, you could imagine if there would be tremendous amount of if statements,
then they wouldn’t be that short.
In case of neural networks, you could imagine that, um, what happens is, uh,
when you start with an initialized neural network, uh, it stores
internally many possibilities for how the, uh, problem can be solved.
And SGD is kind of magnifying some, uh, paths, which are slightly
similar to the correct answer.
So it’s kind of magnifying correct programs.
And in some sense, SGD is a search algorithm in the program space and the
program space is represented by, uh, you know, kind of the wiring inside of the
neural network and there’s like an insane number of ways how the features can be
computed.
Let me ask you the high level, basic question that’s not so basic.
What is deep learning?
Is there a way you’d like to think of it that is different than like
a generic textbook definition?
The thing that I hinted just a second ago is maybe that, uh, closest to how I’m
thinking these days about deep learning.
So, uh, now the statement is, uh, neural networks can represent some programs.
Uh, it seems that various modules that we are actually adding up to, or like, uh,
you know, we, we want networks to be deep because we, we want multiple
steps of the computation and, uh, uh, and deep learning provides the way to
represent space of programs, which is searchable and it’s searchable with,
uh, stochastic gradient descent.
So we have an algorithm to search over humongous number of programs and
gradient descent kind of bubbles up the things that are, uh, tend to give correct
answers.
So a neural network with a, with fixed weights that’s optimized, do you think
of that as a single program?
Um, so there is a, uh, work by Christopher Olah, where he, uh, so he works on
interpretability of neural networks, and he was able to, uh, identify in the
neural network, for instance, a detector of a wheel for a car, or the detector of
a mask for a car, and then he was able to separate them out and assemble them, uh,
together using a simple program, uh, for a car detector.
That’s like, uh, if you think of traditionally defined programs, that’s
like a function within a program that this particular neural network was able
to find, and you can tear that out, just like you can copy and paste from
Stack Overflow. So, uh, any program is a composition of smaller programs.
Yeah.
I mean, the nice thing about the neural networks is that it allows the things
to be more fuzzy than in case of programs.
Uh, in case of programs, you have this, like a branching this way or that way.
And the neural networks, they, they have an easier way to, to be somewhere in
between or to share things.
What is the most beautiful or surprising idea in deep learning and the utilization
of these neural networks, which by the way, for people who are not familiar,
neural networks is a bunch of, uh, what would you say, it’s inspired by the human
brain, there’s neurons, there’s connections between those neurons, there’s inputs
and there’s outputs, and there’s millions or billions of those
neurons, and the learning happens, uh, by adjusting the weights on the
edges that connect these neurons.
Thank you for giving the definition that I was supposed to give, but I guess you
have enough empathy for listeners to actually know that that might be useful.
No, that’s like, so I’m asking Plato of like, what is the meaning of life?
He’s not going to answer.
You’re being philosophical and deep and quite profound talking about the space
of programs, which is very interesting, but also for people who are
just not familiar with what the hell we’re talking about when we talk about deep
learning. Anyway, sorry, what is the most beautiful or surprising idea to you
in, um, in all the time you’ve worked in deep learning? And you worked on a lot
of fascinating projects, applications of neural networks.
It doesn’t have to be big and profound.
It can be a cool trick.
Yeah.
I mean, I’m thinking about the trick, but like, uh, it’s still, uh, amazing
to me that it works at all, that, let’s say, the extremely simple algorithm,
stochastic gradient descent, which is something that I would be able to derive
on a piece of paper for a high school student, uh, when put at the
scale of, you know, thousands of machines, actually, uh, can create the
behaviors which we call kind of human-like behaviors.
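To show how little there is to the algorithm itself, here is a minimal sketch of stochastic gradient descent fitting a line. It is a toy under stated assumptions (one weight and one bias instead of a neural network, squared error as the loss, made-up noiseless data); the function name `sgd_fit` and the hyperparameter values are illustrative choices.

```python
import random
from typing import List, Tuple


def sgd_fit(xs: List[float], ys: List[float], lr: float = 0.05,
            epochs: int = 500, seed: int = 0) -> Tuple[float, float]:
    """Fit y ~ w * x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # "stochastic": visit examples in random order
        for i in indices:
            err = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * err**2 with respect to w and b.
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]  # noiseless data generated from y = 2x + 1
w, b = sgd_fit(xs, ys)

print(f"w = {w:.3f}, b = {b:.3f}")
```

The whole update is two lines: compute the error on one example, then nudge each parameter against its gradient. Training a large network keeps this same loop and changes the model and the scale.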
So in general, any application of stochastic gradient descent
to neural networks is amazing to you.
So that, or is there a particular application in natural language,
reinforcement learning, uh, and also what do you attribute that success to?
Is it just scale?
What profound insight can we take from the fact that the thing works
for gigantic, uh, sets of variables?
I mean, the interesting thing is this algorithms, they were invented decades
ago and, uh, people actually, uh, gave up on the idea and, um, you know, back
then they thought that we need profoundly different algorithms and they spent a lot
of cycles on very different algorithms.
And I believe that, uh, you know, we have seen that various, uh, various innovations
that say like transformer or, or dropout or so they can, uh, you know, vastly
help, but it’s also remarkable to me that this algorithm from sixties or so, uh, or,
I mean, you can even say that the gradient descent was invented by Leibniz in, I
guess, 18th century or so that actually is the core of learning.
In the past people are, it’s almost like a, out of the, maybe an ego, people are
saying that it cannot be the case that such a simple algorithm is there, you
know, uh, could solve complicated problems.
So they were in search for the other algorithms.
And as I’m saying, like, I believe that actually we are in the game where there
is, there are actually frankly three levers.
There is compute, there are algorithms and there is data.
And, uh, if we want to build intelligent systems, we have to pull, uh, all three
levers and they are actually multiplicative.
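As a toy illustration of what "multiplicative" would mean for the three levers, the sketch below assumes progress is simply the product of three factors. That formula, the function name `capability`, and the numbers are assumptions made for the example, not a model anyone in the conversation proposed.

```python
def capability(compute, algorithms, data):
    """Toy model: the three levers multiply (assumed for illustration)."""
    return compute * algorithms * data

baseline = capability(1.0, 1.0, 1.0)
one_lever = capability(2.0, 1.0, 1.0)    # double compute only
all_levers = capability(2.0, 2.0, 2.0)   # double all three levers

print(one_lever / baseline, all_levers / baseline)
```

Under this assumption, doubling one lever doubles the result, while doubling all three gives an eightfold gain, which is why pulling every lever matters.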
Um, it’s also interesting.
So you ask, is it only compute?
Uh, people internally, they did the studies to determine how much gains they
were coming from different levers.
And so far we have seen that more gains came from compute than algorithms, but
also we are in the world that in case of compute, there is a kind of, you know,
exponential increase in funding and at some point it’s impossible to, uh, invest
more, it’s impossible to, you know, invest $10 trillion as we are speaking about
the, let’s say all taxes in US.
Uh, but you’re talking about money, there could be innovation in the compute.
That’s that’s true as well.
Uh, so I mean, they’re like a few pieces.
So one piece is human brain is an incredible supercomputer and, like
a, it, it, it has a hundred trillion parameters or like a, if you try to count
the various quantities in the brain, there are like neurons, synapses; there is a smaller
number of neurons, there is a lot of synapses. It’s unclear even how to map, uh,
synapses to, uh, to parameters of neural networks, but it’s clear that there are
many more.
Yeah. Um, so it might be the case that our networks are still somewhat small.
Uh, it also might be the case that they are more efficient than brain or less
efficient by some, by some huge factor.
Um, I also believe that there will be like a, you know, at the moment we are at
the stage that the, these neural networks, they require thousand X or, or like a
huge factor of more data than humans do.
And it will be a matter of, uh, um, there will be algorithms that vastly decrease
sample complexity, I believe so, but that place where we are heading today is
there are domains which contains million X more data.
And even though computers might be 1000 times slower than humans in learning,
that’s not a problem.
Like, uh, for instance, uh, I believe that, uh, it should be possible to create
super human therapist, uh, by, uh, and, and the, the, like, uh, even simple
steps of, of, of doing what, of, of doing it.
And, you know, the, the core reason is there is just machine will be able to
read way more transcripts of therapies, and then it should be able to speak
simultaneously with many more people and it should be possible to optimize it,
uh, all in parallel.
And, uh, well, there’s now you’re touching on something I deeply care about
and think is way harder than we imagine.
Um, what’s the goal of a therapist?
What’s the goal of therapies?
So, okay, so one goal now this is terrifying to me, but there’s a lot of
people that, uh, contemplate suicide, suffer from depression, uh, and they
could significantly be helped with therapy and the idea that an AI algorithm
might be in charge of that, it’s like a life and death task.
It’s, uh, the stakes are high.
So one goal for a therapist, whether human or AI is to prevent suicide
ideation to prevent suicide.
How do you achieve that?
So let’s see.
So to be clear, I don’t think that the current models are good enough for such
a task because it requires insane amount of understanding, empathy, and the
models are far from this place, but it’s.
But do you think that understanding empathy, that signal is in the data?
Um, I think there is some signal in the data.
Yes.
I mean, there are plenty of transcripts of conversations and it is possible to,
it is possible from it to understand personalities.
It is possible from it to understand, uh, if conversation is, uh,
friendly, uh, amicable, uh, uh, antagonistic, it is, I believe that the,
you know, given the fact that the models that we train now, they can, uh, they
can have, they are chameleons that they can have any personality, they might
turn out to be better in understanding, uh, personality of other people than
anyone else.
And to be empathetic.
Yeah.
Interesting.
Yeah, interesting. Uh, but I wonder if there’s some level of, uh, multiple
modalities required to be able to, um, be empathetic of the human experience,
whether language is not enough to understand death, to understand fear,
to understand, uh, childhood trauma, to understand, uh, wit and humor required
when you’re dancing with a person who might be depressed or suffering both
humor and hope and love and all those kinds of things.
So there’s another underlying question, which is self supervised versus
supervised.
So can you get that from the data by just reading a huge number of transcripts?
I actually, so I think that reading huge number of transcripts is a step one.
It’s like at the same way as you cannot learn to dance if just from YouTube by
watching it, you have to actually try it out yourself.
And so I think that here that’s a similar situation.
I also wouldn’t deploy the system in the high stakes situations right away, but
kind of see gradually where it goes.
And, uh, obviously initially, uh, it would have to go hand in hand with humans.
But, uh, at the moment we are in the situation that actually there is many
more people who actually would like to have a therapy or, or speak with, with
someone than there are therapists out there.
I can, you know, I was so, so fundamentally I was thinking, what are
the things that, uh, can vastly increase people’s well being? Therapy is one of
them, meditation is other one, I guess maybe human connection is a third
one, and I guess pharmacologically it’s also possible, maybe direct brain
stimulation or something like that.
But these are pretty much options out there.
Then let’s say the way I’m thinking about the AGI endeavor is by default,
that’s an endeavor to, uh, increase amount of wealth.
And I believe that we can vastly increase amount of wealth for everyone
and simultaneously.
So, I mean, there are like a two endeavors that make sense to me.
One is like essentially increase amount of wealth.
And second one is, uh, increase overall human wellbeing.
And those are coupled together and they, they can, like, uh, I would
say these are different topics.
One can help another and, uh, you know, therapist is a, is a funny word
because I see friendship and love as therapy.
I mean, so therapist broadly defined as just friendship as a friend.
So like therapist is, has a very kind of clinical sense to it, but what
is human connection you’re like, uh, not to get all Camus and Dostoevsky on you,
but you know, life is suffering and we draw, we seek connection with the
humans as we, uh, desperately try to make sense of this world in a deep
overwhelming loneliness that we feel inside.
So I think connection has to do with understanding.
And I think that almost like a lack of understanding causes suffering.
If you speak with someone and do you, do you feel ignored that actually causes pain?
If you are feeling deeply understood that actually they, they, they might
not even tell you what to do in life, but like a pure understanding
or just being heard, understanding is a kind of, that’s a lot, you know,
just being heard, feel like you’re being heard, like somehow that’s a
alleviation temporarily of the loneliness that if somebody knows
you’re here with their body language, with the way they are, with the way
they look at you, with the way they talk, do you feel less alone for a brief moment?
Yeah, very much agree.
So I thought in the past about, um, somewhat similar question to yours,
which is what is love, uh, rather what is connection.
Yes. And, um, and obviously I think about these things from AI perspective.
What would it mean?
Um, so I said that, um, you know, intelligence has to do with some compression,
which is more or less like I can say, almost understanding of what is going around.
It seems to me that, uh, other aspect is there seem to be reward functions and you
can have, uh, uh, you know, reward for, uh, food, for maybe human connection, for,
uh, let’s say warmth, uh, sex and so on.
And, um, and it turns out that the various people might be optimizing slightly
different, uh, reward functions.
They essentially might care about different things.
And, uh, uh, in case of, uh, love at least the love between two people, you can say
that the, um, you know, boundary between people dissolves to such extent that, uh,
they end up optimizing each other reward functions and yeah, oh, that’s interesting.
Um, celebrate the success of each other.
Yeah.
In some sense, I would say love means, uh, helping others to optimize their, uh,
reward functions, not your reward functions, not the things that you think are
important, but the things that the person cares about, you try to help them to,
uh, optimize it.
So love is, uh, if you think of two reward functions, you just, it’s a condition.
You combine them together, pretty much maybe like with a weight and it depends
like the dynamic of the relationship.
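The idea of combining two reward functions "with a weight" can be written as a weighted sum. This is only a sketch of that one sentence: the function name `combined_reward`, the restriction of the weight to the range 0 to 1, and the example numbers are assumptions.

```python
def combined_reward(my_reward, partner_reward, weight):
    """Blend two reward values.

    weight is the share given to the partner's reward, between 0 and 1
    (an assumed convention for this illustration).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return (1.0 - weight) * my_reward + weight * partner_reward

# Equal weighting of my reward (1.0) and my partner's reward (3.0).
print(combined_reward(1.0, 3.0, 0.5))
```

A weight of 0 means optimizing only your own reward, and a weight of 1 means optimizing only the other person's.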
Yeah.
I mean, you could imagine that if you’re fully, uh, optimizing someone’s reward
function without yours, then, then maybe are creating codependency or something
like that, but I’m not sure what’s the appropriate weight, but the interesting
thing is I even, I even think that the, uh, individual reward function is
saying that the individual person, uh, uh, we ourselves, we are actually less
of a unified insight.
So for instance, if you look at, at the donut on the one level, you might think,
oh, this is like, it looks tasty.
I would like to eat it on other level.
You might tell yourself, I shouldn’t be doing it because I want to gain muscles.
So, and you know, you might do it regardless kind of against yourself.
So it seems that even within ourselves, they’re almost like a kind of intertwined
personas and, um, I believe that the self love means that, uh, the love between all
these personas, which also means being able to love, love yourself when we are
angry or stressed or so combining all those reward functions of the different
selves you have and accepting that they are there, like, uh, you know, often
people, they have a negative self talk or they say, I don’t like when I’m angry.
And like, I try to imagine, try to imagine if there would be like a small
baby Lex, like a five years old, angry, and then they are like, you shouldn’t
be angry.
Like stop being angry.
Yeah.
But like, instead, actually you want Lex to come over, give him a hug and
just like, I say, it’s fine.
Okay.
You can be angry as long as you want.
And then he would stop or, or maybe not, or maybe not, but you cannot expect it
even.
Yeah.
But still, that doesn’t explain the why of love.
Like why is love part of the human condition?
Why is it useful to combine the reward functions?
It seems like that doesn’t, I mean, I don’t think reinforcement learning
frameworks can give us answers to why even, even the Hutter framework has
an objective function that’s static.
So we came to existence as a consequence of evolutionary process.
And in some sense, the purpose of evolution is survival.
And then the, this complicated optimization objective baked into us, let’s
say compression, which might help us operate in the real world and it baked
into us various reward functions.
Yeah.
Then to be clear at the moment we are operating in the regime, which is somewhat
out of distribution, where even evolution optimized us.
It’s almost like love is a consequence of a cooperation that we’ve discovered is
useful.
Correct.
In some way it’s even the case.
If you, I just love the idea that love is like the out of distribution.
Or it’s not out of distribution.
It’s like, as you said, it evolved for cooperation.
Yes.
And I believe that the cop, like in some sense, cooperation ends up helping each
of us individually, so it makes sense evolutionary and there is a, in some
sense, and, you know, love means there is this dissolution of boundaries that you
have a shared reward function and we evolve to actually identify ourselves with
larger groups, so we can identify ourselves, you know, with a family, we can
identify ourselves with a country to such extent that people are willing to give
away their life for country.
So there is, we are wired actually even for love.
And at the moment, I guess, the, maybe it would be somewhat more beneficial if you
will, if we would identify ourselves with all the humanity as a whole.
So you can clearly see when people travel around the world, when they run into
person from the same country, they say, oh, which city, and all this, like all of a
sudden they find all these similarities.
They find some, they befriended those folks earlier than others.
So there is like a sense, some sense of the belonging. And I would say, I think
it would be overall good thing to the world for people to move towards, I think
it’s even called open individualism, move toward the mindset of a larger and
larger groups.
So the challenge there, that’s a beautiful vision and I share it to expand
that circle of empathy, that circle of love towards the entirety of humanity.
But then you start to ask, well, where do you draw the line?
Because why not expand it to other conscious beings?
And then finally, for our discussion, something I think about is why not
expand it to AI systems?
Like we, we start respecting each other when the, the person, the entity on the
other side has the capacity to suffer.
Cause then we develop a capacity to sort of empathize.
And so I could see AI systems that are interacting with humans more and more
having conscious, like displays.
So like they display consciousness through language and through other means.
And so then the question is like, well, is that consciousness?
Because they’re acting conscious.
And so, you know, the reason we don’t like torturing animals is because
they look like they’re suffering when they’re tortured and if AI looks like
it’s suffering when it’s tortured, how is that not requiring of the same kind
of empathy from us and respect and rights that animals do and other humans do?
I think it requires empathy as well.
I mean, I would like, I guess us or humanity or so make a progress in
understanding what consciousness is, because I don’t want just to be speaking
about that, the philosophy, but rather actually make a scientific, uh, to have
a, like, you know, there was a time that people thought that there is a force of
life and, uh, the things that have this force, they are alive.
And, um, I think that there is actually a path to understand exactly what
consciousness is and how it works.
Understand exactly what consciousness is.
And, uh, um, in some sense, it might require essentially putting
probes inside of a human brain, uh, what Neuralink, uh, does.
So the goal there, I mean, there’s several things with consciousness
that make it a real discipline, which is one is rigorous
measurement of consciousness.
And then the other is the engineering of consciousness,
which may or may not be related.
I mean, you could also run into trouble.
Like, for example, in the United States for the department, DOT,
department of transportation, and a lot of different places
put a value on human life.
I think DOT is, uh, values $9 million per person.
Sort of in that same way, you can get into trouble.
If you put a number on how conscious a being is, because then you can start
making policy, if a cow is a 0.1 or like, um, 10% as conscious as a human,
then you can start making calculations and it might get you into trouble.
But then again, that might be a very good way to do it.
I would like, uh, to move to that place that actually we have scientific
understanding what consciousness is.
And then we’ll be able to actually assign value.
And I believe that there is even the path for the experimentation in it.
So, uh, you know, w we said that, you know, you could put the
probes inside of the brain.
There is actually a few other things that you could do with
devices like Neuralink.
So you could imagine that the way even to measure if AI system is conscious
is by literally just plugging into the brain.
Um, I mean, that, that seems like it’s kind of easy, but the plugging
into the brain and asking person if they feel that their consciousness
expanded, um, this direction of course has some issues.
You can say, you know, if someone takes a psychedelic drug, they might
feel that their consciousness expanded, even though that drug
itself is not conscious.
Right.
So like, you can’t fully trust the self report of a person saying their,
their consciousness is expanded or not.
Let me ask you a little bit about psychedelics is, uh, there’ve been
a lot of excellent research on, uh, different psychedelics, psilocybin,
MDMA, even DMT drugs in general, marijuana too.
Uh, what do you think psychedelics do to the human mind?
It seems they take the human mind to some interesting places.
Is that just a little, uh, hack, a visual hack, or is there some
profound expansion of the mind?
So let’s see, I don’t believe in magic.
I believe in, uh, I believe in, uh, in science in, in causality, um, still,
let’s say, and then as I said, like, I think that the brain, that the, our
subjective experience of reality is, uh, we live in the simulation run by our
brain and the simulation that our brain runs, they can be very pleasant or very
hellish. Drugs, they are changing some hyper parameters of the simulation.
It is possible thanks to change of these hyper parameters to actually look back
on your experience and even see that the given things that we took for
granted, they are changeable.
So they allow to have an amazing perspective.
There is also, for instance, the fact that after DMT people can see the
full movie inside of their head, gives me further belief that the brain can generate
that full movie, that the brain is actually learning the model of reality
to such extent that it tries to predict what’s going to happen next.
Yeah.
Very high resolution.
So it can replay reality.
Extremely high resolution.
Yeah.
It’s also kind of interesting to me that somehow there seems to be some similarity
between these, uh, drugs and meditation itself.
And I actually started even these days to think about meditation as a psychedelic.
Do you practice meditation?
I practice meditation.
I mean, I went a few times on the retreats and it feels after like after
second or third day of meditation, uh, there is a, there is almost like a
sense of, you know, tripping.
What does the meditation retreat entail?
So you w you wake up early in the morning and you meditate for extended
period of time, uh, and yeah, so it’s optimized, even though there are other
people, it’s optimized for isolation.
So you don’t speak with anyone.
You don’t actually look into other people’s eyes and, uh, you know, you sit
on the chair and, let’s say, Vipassana meditation tells you, uh, to focus on the breath.
So you try to put, uh, all the, all attention into breathing and, uh,
breathing in and breathing out.
And the crazy thing is that as you focus attention like that, uh, after some
time, stuff starts coming back, like some memories that you completely
forgotten, it almost feels like, uh, that you’ll have a mailbox and then you know,
you are just like a archiving email one by one.
And at some point, at some point there is this like a amazing feeling of getting
to mailbox zero, zero emails.
And, uh, it’s very pleasant.
It’s, it’s kind of, it’s, it’s, it’s crazy to me that, um, that once you
resolve these, uh, inner stories or like inner traumas, then once there is
nothing, uh, left that default, uh, state of human mind is extremely peaceful and
happy, extreme, like, uh, some sense it, it feels that the, it feels at least to
me that way, how, when I was a child that I can look at any object and it’s very
beautiful, I have a lot of curiosity about the simple things and that’s where
the usual meditation takes me.
Are you, what are you experiencing?
Are you just taking in simple sensory information and they’re just enjoying
the rawness of that sensory information?
So there’s no, there’s no memories or all that kind of stuff.
You’re just enjoying being.
Yeah, pretty much.
I mean, still there is, uh, that it’s, it’s thoughts are slowing down.
Sometimes they pop up, but it’s also somehow the extended meditation takes you
to the space that they are way more friendly, way more positive.
Um, there is also this, uh, this thing that, uh, it almost feels that the,
we are constantly getting a little bit of a reward
function and we are just spreading this reward function on various activities.
But if you’ll stay still for extended period of time, it kind of accumulates,
accumulates, accumulates, and, uh, there is a, there is a sense, there is a sense
that some point it passes some threshold and it feels as drop is falling into kind
of ocean of love and this, and that’s like, uh, this is like a very pleasant.
And that’s, I’m saying like, uh, that corresponds to the subjective experience.
Some people, uh, I guess in spiritual community, they describe it that that’s
the reality, and I would say, I believe that they’re like, uh, all sorts of
subjective experience that one can have.
And, uh, I believe that for instance, meditation might take you to the
subjective experiences, which are very pleasant, collaborative.
And I would like the world to move toward a more collaborative, uh, uh, place.
Yeah.
I would say that’s very pleasant and I enjoy doing stuff like that.
I, um, I wonder how that maps to your, uh, mathematical model of love with, uh,
the reward function, combining a bunch of things, it seems like our life then is
just, we have this reward function and we’re accumulating a bunch of stuff
in it with weights, it’s like, um, like multi objective and what meditation
is, is you just remove them, remove them until the weight on one, uh, or
just a few is very high and that’s where the pleasure comes from.
Yeah.
So something similar, how I’m thinking about this.
So I told you that there is this like, uh, that there is a story of who you are.
And I think almost about it as a, you know, text prepended to GPT.
Yeah.
And, uh, some people refer to it as ego.
Okay.
There’s like a story who, who, who you are.
Okay.
So ego is the prompt for GPT three or GPT.
Yes.
And that’s description of you.
And then with meditation, you can get to the point that actually you experience
things without the prompt and you experience things like as they are, you
are not biased over the description, how they supposed to be, uh, that’s very
pleasant.
And then with respect to the reward function,
uh, it’s possible to get to the point that the, there is the dissolution of self.
And therefore you can say that the, or you’re having a, your, or like a, your
brain attempts to simulate the reward function of everyone else or like
everything that’s that there is this like a love, which feels like a oneness with
everything.
And that’s also, you know, very beautiful, very pleasant.
At some point you might have a lot of altruistic thoughts during that moment.
And then the self, uh, always comes back.
How would you recommend if somebody is interested in meditation, like a big
thing to take on as a project, would you recommend a meditation retreat?
How many days, what kind of thing would you recommend?
I think that actually retreat is the way to go.
Um, it almost feels that, uh, um, as I said, like a meditation is a psychedelic,
but, uh, when you take it in the small dose, you might barely feel it.
Once you get the high dose, actually you’re going to feel it.
Um, so even cold turkey, if you haven’t really seriously meditated for a long
period of time, just go to a retreat.
Yeah.
How many days, how many days?
Start weekend one weekend.
So like two, three days.
And it’s like, uh, it’s interesting that first or second day, it’s hard.
And at some point it becomes easy.
There’s a lot of seconds in a day.
How hard is the meditation retreat just sitting there in a chair?
So the thing is actually, it literally just depends on your, uh, on the,
your own framing, like if you are in the mindset that you are waiting for it to
be over, or you are waiting for a Nirvana to happen, you are waiting
it will be very unpleasant.
And in some sense, even the difficulty, it’s not even in the lack of being
able to speak with others, like, uh, you’re sitting there, your legs
will hurt from sitting in terms of like the practical things.
Do you experience kind of discomfort, like physical discomfort of just
sitting, like your, your butt being numb, your legs being sore, all that kind of
stuff?
Yes.
You experience it.
And then the, the, they teach you to observe it rather.
And it’s like, uh, the crazy thing is you at first might have a feeling
toward trying to escape it and that becomes very apparent that that’s
extremely unpleasant.
And then you just, just observe it.
And then at some point it just becomes, uh, it just is, it’s like, uh, I remember
that, uh, Ilya told me some time ago that, uh, you know, he takes a cold
shower and his mindset of taking a cold shower was to embrace suffering.
Yeah.
Excellent.
I do the same.
This is your style?
Yeah, it’s my style.
I like this.
So my style is actually, I also sometimes take cold showers.
It is purely observing how the water goes through my body, like a purely being
present, not trying to escape from there.
Yeah.
And I would say then it actually becomes pleasant.
It’s not like, ah, well, that that’s interesting.
Um, I I’m also that mean that’s, that’s the way to deal with anything really
difficult, especially in the physical space is to observe it to say it’s pleasant.
Hmm.
It’s a, I would use a different word.
You’re, um, you’re accepting of the full beauty of reality.
I would say, cause say pleasant.
But yeah, I mean, in some sense it is pleasant.
That’s the only way to deal with a cold shower is to, to, uh, become an
observer and to find joy in it.
Um, same with like really difficult, physical, um, exercise or like running
for a really long time, endurance events, just anytime you’re, any kind of pain.
I think the only way to survive it is not to resist it is to observe it.
You mentioned Ilya.
Ilya Sutskever, he’s our chief scientist, but also
he’s very close friend of mine.
He cofounded OpenAI with you.
I’ve spoken with him a few times.
He’s brilliant.
I really enjoy talking to him.
His mind, just like yours works in fascinating ways.
Now, both of you are not able to define deep learning simply.
Uh, what’s it like having him as somebody you have technical discussions with on
in the space of machine learning, deep learning, AI, but also life.
What’s it like when these two, um, agents get into a self play situation in a room?
What’s it like collaborating with him?
So I believe that we have, uh, extreme, uh, respect to each other.
So, uh, in, I love Ilya’s insight, both like, uh, I guess about
consciousness, uh, life AI, but, uh, in terms of the, it’s interesting to
me, cause you’re a brilliant, uh, thinker in the space of machine
learning, like intuition, like digging deep in what works, what doesn’t,
why it works, why it doesn’t, and so is Ilya.
I’m wondering if there’s interesting deep discussions you’ve had with him in the
past or disagreements that were very productive.
So I can say, I also understood over the time, where are my strengths?
So obviously we have plenty of AI discussions and, um, um, and do you
know, I myself have plenty of ideas, but like I consider Ilya, uh, one
of the most prolific AI scientists in the entire world.
And, uh, I think that, um, I realized that maybe my super skill, um, is, uh,
being able to bring people to collaborate together, that I have some level of
empathy that is unique in AI world.
And that might come, you know, from either meditation, psychedelics, or
let’s say I read just hundreds of books on this topic.
So, and I also went through a journey of, you know, I developed a
lot of, uh, algorithms, so I think that maybe I can, that’s my super human skill.
Uh, Ilya is, uh, one of the best AI scientists, but then I’m pretty
good in assembling teams and I’m also not holding to people.
Like I’m growing people and then people become managers at OpenAI.
I grew many of them, like a research managers.
So you, you find, you find places where you’re excellent and he finds like his,
his, his deep scientific insights is where he is and you find ways you can,
the puzzle pieces fit together.
Correct.
Like, uh, you know, ultimately, for instance, let’s say Ilya, he doesn’t
manage people, uh, that’s not what he likes or so.
Um, I like, I like hanging out with people.
By default, I’m an extrovert and I care about people.
Oh, interesting. Okay. All right. Okay, cool.
So that, that fits perfectly together.
But I mean, uh, I also just like your intuition about various
problems in machine learning.
He’s definitely one I really enjoy.
I remember talking to him about something I was struggling with, which
is coming up with a good model for pedestrians, for human beings that cross
the street in the context of autonomous vehicles.
And he immediately started to like formulate a framework within which you
can evolve a model for pedestrians, like through self play, all that kind of
mechanisms, the depth of thought on a particular problem, especially problems
he doesn’t know anything about is, is fascinating to watch.
It makes you realize like, um, yeah, the, the, the limits of the, that the human
intellect may be limitless, or it’s just impressive to see a descendant of
apes come up with clever ideas.
Yeah.
I mean, so even in the space of deep learning, when you look at various
people, there are people now who invented some breakthroughs once, but
there are very few people who did it multiple times.
And you can think if someone invented it once, that might be just a sheer luck.
And if someone invented it multiple times, you know, if a probability of
inventing it once is one over a million, then probability of inventing it twice
or three times would be one over a million squared or, or to the power of
three, which, which would be just impossible.
So it literally means that it’s, it’s given that, uh, it’s not the luck.
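The arithmetic behind this argument is short enough to check directly. The sketch below assumes each breakthrough is an independent event with the same one-in-a-million chance, which is the speaker's illustrative figure and not a measured quantity; the function name `prob_by_luck` is made up for the example.

```python
from fractions import Fraction

def prob_by_luck(p_once, times):
    """Chance of repeating an independent lucky event `times` times."""
    return p_once ** times

p_once = Fraction(1, 10**6)          # "one over a million"
twice = prob_by_luck(p_once, 2)      # one over a million squared
three_times = prob_by_luck(p_once, 3)

print(twice, three_times)
```

Using exact fractions avoids floating point rounding, so the results are exactly one over ten to the twelfth and one over ten to the eighteenth.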
Yeah.
And Ilya is one of these few people who, uh, uh, who have, uh, a lot of
these inventions in his arsenal.
It also feels that, um, you know, for instance, if you think about folks
like Gauss or Euler, uh, you know, at first they read a lot of books and then
they did thinking and then they figure out math and that’s how it feels with
Ilya, you know, at first he read stuff and then like he spent his thinking cycles.
And that’s a really good way to put it.
When I talk to him, I, I see thinking.
He’s actually thinking, like, he makes me realize that there’s like deep
thinking that the human mind can do.
Like most of us are not thinking deeply.
Uh, like you really have to put in a lot of effort to think deeply.
Like I have to really put myself in a place where I think deeply about a
problem, it takes a lot of effort.
It’s like, uh, it’s like an airplane taking off or something.
You have to achieve deep focus.
He’s just, uh, he’s, what is it?
His brain is like a vertical takeoff in
terms of airplane analogy.
So it’s interesting, but it, I mean, Cal Newport talks about
this as ideas of deep work.
It’s, you know, most of us don’t work much at all in terms of like, like deeply
think about particular problems, whether it’s a math engineering, all that kind
of stuff, you want to go to that place often and that’s real hard work.
And some of us are better than others at that.
So I think that the big piece has to do with actually even engineering
your environment such that it’s conducive to that.
Yeah.
So, um, see both Ilya and I, uh, on the frequent basis, we kind of disconnect
ourselves from the world in order to be able to do extensive amount of thinking.
Yes.
So Ilya usually, he just, uh, leaves iPad at hand.
He loves his iPad.
And, uh, for me, I’m even sometimes, you know, just going for a few days
to different location to Airbnb, I’m turning off my phone and there is no
access to me and, uh, that’s extremely important for me to be able to actually
just formulate new thoughts, to do deep work rather than to be reactive.
And the, the, the older I am, the more of these random tasks are at hand.
Before I go on to that, uh, thread, let me return to our friend, GPT.
And let me ask you another ridiculously big question.
Can you give an overview of what GPT three is, or like you say in
your Twitter bio, GPT N plus one, how it works and why it works.
So, um, GPT three is a humongous neural network.
Um, let’s assume that we know what is neural network, the definition, and it
is trained on the entire internet and just to predict next word.
So let’s say it sees part of the, uh, article and it, the only task that it
has at hand, it is to say what would be the next word and what would be the next
word and it becomes a really exceptional at the task of figuring out what’s the
next word. So you might ask, why would, uh, this be an important, uh, task?
Why would it be important to predict what’s the next word?
And it turns out that a lot of problems, uh, can be formulated, uh, as a text
completion problem.
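To make the "predict the next word" idea concrete, here is a minimal sketch. It is a toy bigram counter, assumed purely for illustration, and not how GPT-3 is built (GPT-3 is a large neural network); the function names and the tiny corpus are made up. It only shows the shape of the task: given the text so far, output the most likely next word, then repeat.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def complete(counts, prompt, max_words=5):
    """Extend a non-empty prompt one predicted word at a time."""
    words = prompt.split()
    for _ in range(max_words):
        nxt = predict_next(counts, words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)

# Toy corpus standing in for "the entire internet".
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)
```

Calling `complete(model, "sat", 2)` extends the prompt by two predicted words. A real language model replaces the counting table with a neural network that looks at the whole preceding text, not just the last word.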
So GPT is purely, uh, learning to complete the text.
And you could imagine, for instance, if you are asking a question, uh, who is
the president of the United States, then GPT can give you an answer to it.
It turns out that many more things can be formulated this way.
You can format text in the way that you have sentence in English.
You make it even look like some content of a website, uh, elsewhere, which would
be teaching people how to translate things between languages.
So it would be EN colon, uh, text in English, FR colon, and then you
ask the model to, to continue.
And it turns out that the, such a model is predicting translation from English
to French.
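The EN colon / FR colon formatting described here can be sketched as a small prompt builder. This is only an illustration under assumptions: the helper name and the example sentences are invented, and no model is actually called. The resulting string is what you would hand to a text-completion model, which is then expected to continue after the final "FR:".

```python
def build_translation_prompt(examples, sentence):
    """Format English/French pairs as 'EN: ...' / 'FR: ...' lines,
    ending with an open 'FR:' for the model to complete."""
    lines = []
    for english, french in examples:
        lines.append(f"EN: {english}")
        lines.append(f"FR: {french}")
    lines.append(f"EN: {sentence}")
    lines.append("FR:")
    return "\n".join(lines)

# Made-up example pairs that make the text look like a translation page.
examples = [("Good morning.", "Bonjour."), ("Thank you.", "Merci.")]
prompt = build_translation_prompt(examples, "Where is the library?")
```

The same trick works for the conversation case: lay the text out as alternating speaker lines and leave the last speaker's line open.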
The crazy thing is that this model can be used for way more sophisticated tasks.
So you can format text such that it looks like a conversation between two people.
And that might be a conversation between you and Elon Musk.
And because the model read all the texts about Elon Musk, it will be able to
predict Elon Musk words as it would be Elon Musk.
It will speak about colonization of Mars, about sustainable future and so on.
And it’s also possible to, to even give arbitrary personality to the model.
You can say, here is a conversation with a friendly AI bot.
And the model, uh, will complete the text as a friendly AI bot.
So, I mean, how do I express how amazing this is?
So just to clarify, uh, a conversation, generating a conversation between me and
Elon Musk, it wouldn’t just generate good examples of what Elon would say.
It would get the same results as the conversation between Elon Musk and me.
Say it would get the syntax all correct.
So like interview style, it would say like Elon colon and Lex colon, like it,
it’s not just like, uh, inklings of, um, semantic correctness.
It’s like the whole thing, grammatical, syntactic, semantic, it’s just really,
really impressive, uh, generalization.
Yeah.
I mean, I also want to, you know, provide some caveats so it can generate
few paragraphs of coherent text, but as you go to, uh, longer pieces,
it, uh, it actually goes off the rails.
Okay.
If you try to write a book, it won’t work out this way.
What way does it go off the rails, by the way?
Is there interesting ways in which it goes off the rails?
Like what falls apart first?
So the model is trained on the, all the existing data, uh, that is out there,
which means that it is not trained on its own mistakes.
So for instance, if it would make a mistake, then, uh, I kept,
so to give you, give you an example.
So let’s say I have a conversation with a model pretending that is Elon Musk.
And then I start putting some, uh, I’m start actually making up
things which are not factual.
Um, I would say like Twitter, but I got you.
Sorry.
Yeah.
Um, like, uh, I don’t know.
I would say that Elon is my wife and the model will just keep on carrying it on.
And as if it’s true.
Yes.
And in some sense, if you would have a normal conversation with Elon,
he would be what the fuck.
Yeah.
There’ll be some feedback between, so the model is trained on things
that humans have written, but through the generation process, there’s
no human in the loop feedback.
Correct.
That’s fascinating.
Makes sense.
So it’s magnified.
It’s like the errors get magnified and magnified and it’s also interesting.
I mean, first of all, humans have the same problem.
It’s just that we, uh, we’ll make fewer errors and magnify the errors slower.
I think that actually what happens with humans is if you have a wrong
belief about the world as a kid, then very quickly we’ll learn that it’s
not correct because they are grounded in reality and they are learning
from your new experience.
Yes.
But do you think the model can correct itself too?
Won’t it through the power of the representation.
And so the absence of Elon Musk being your wife information on the
internet, won’t it correct itself?
There won’t be examples like that.
So the errors will be subtle at first.
Subtle at first.
And in some sense, you can also say that the data that is not out there is
the data, which would represent how the human learns and maybe model would
be learned, trained on such a data.
Then it would be better off.
How intelligent is GPT3 do you think?
Like when you think about the nature of intelligence, it
seems exceptionally impressive.
But then if you think about the big AGI problem, is this
footsteps along the way to AGI?
So let’s see, it seems that intelligence itself is, there are multiple axis of it.
And I would expect that the systems that we are building, they might end up being
superhuman on some axis and subhuman on some other axis.
It would be surprising to me on all axis simultaneously, they would become superhuman.
Of course, people ask this question, is GPT a spaceship that would take us to
the moon or are we putting a, building a ladder to heaven that we are just
building bigger and bigger ladder.
And we don’t know in some sense, which one of these two.
Which one is better?
I’m trying to, I like stairway to heaven.
It’s a good song.
So I’m not exactly sure which one is better, but you’re saying like the
spaceship to the moon is actually effective.
Correct.
So people who criticize GPT, they say, you guys just building a
taller, a ladder, and it will never reach the moon.
And at the moment, I would say the way I’m thinking is, is like a scientific question.
And I’m also in heart, I’m a builder creator and like, I’m thinking, let’s try out, let’s
see how far it goes.
And so far we see constantly that there is a progress.
Yeah.
So do you think GPT four, GPT five, GPT N plus one will, um, there’ll be a phase
shift, like a transition to a, to a place where we’ll be truly surprised.
Then again, like GPT three is already very like truly surprising.
The people that criticize GPT three as a stair, as a, what is it?
Ladder to heaven.
I think too quickly get accustomed to how impressive it is that the
prediction of the next word can achieve such depth of
semantics, accuracy of syntax, grammar, and semantics.
Um, do you, do you think GPT four and five and six will continue to surprise us?
I mean, definitely there will be more impressive models that there is a
question of course, if there will be a phase shift and, uh, the, also even the
way I’m thinking about the, about these models is that when we build these
models, you know, we see some level of the capabilities, but we don’t even fully
understand everything that the model can do.
And actually one of the best things to do is to allow other people to probe the
model to even see what is possible.
Hence the, the using GPT as an API and opening it up to the world.
Yeah.
I mean, so when I’m thinking from perspective of like, uh, obviously
various people are, that have concerns about AGI, including myself.
Um, and then when I’m thinking from perspective, what’s the strategy even to
deploy these things to the world, then the one strategy that I have seen many
times working is that iterative deployment that you deploy, um, slightly
better versions and you allow other people to criticize you.
So you actually, or try it out, you see where are their fundamental issues.
And it’s almost, you don’t want to be in that situation that you are holding
into powerful system and there’s like a huge overhang, then you deploy it and it
might have a random chaotic impact on the world.
So you actually want to be in the situation that they are
gradually deploying systems.
I asked this question of Ilya, let me ask you, uh, this question.
I’ve been reading a lot about Stalin and power.
If you’re in possession of a system that’s like AGI, that’s exceptionally
powerful, do you think your character and integrity might become corrupted?
Like famously power corrupts and absolute power corrupts
absolutely.
So I believe that the, you want at some point to work toward distributing the power.
I think that the, you want to be in the situation that actually AGI is not
controlled by a small number of people, uh, but, uh, essentially, uh, by a larger
collective.
So the thing is that requires a George Washington style move in the ascent to
power, there’s always a moment when somebody gets a lot of power and they
have to have the integrity and, uh, the moral compass to give away that power.
That humans have been good and bad throughout history at this particular
step.
And I wonder, I wonder we like blind ourselves in a, for example, between
nations, a race, uh, towards, um, they, yeah, AI race between nations, we might
blind ourselves and justify to ourselves the development of AI without distributing
the power because we want to defend ourselves against China, against Russia,
that kind of, that kind of logic.
And, um, I wonder how we, um, how we design governance mechanisms that, um,
prevent us from becoming power hungry and in the process, destroying ourselves.
So let’s see, I have been thinking about this topic quite a bit, but I also want
to admit that, uh, once again, I actually want to rely way more on Sam Altman on it.
He wrote an excellent blog on how even to distribute wealth.
Um, and he proposed in his blog, uh, to tax, uh, equity of the companies
rather than profit and to distribute it.
And this is, this is an example of, uh, Washington move.
I guess I personally have insane trust in Sam. He already spent plenty of money
running, uh, universal basic income, uh, project.
That like, uh, gives me, I guess, maybe some level of trust to him, but I also,
I guess love him as a friend.
Yeah.
I wonder because we’re sort of summoning a new set of technologies.
I wonder if we’ll be, um, cognizant, like you’re describing the process of OpenAI,
but it could also be at other places like in the U S government, right?
Uh, both China and the U S are now full steam ahead on autonomous
weapons systems development.
And that’s really worrying to me because in the framework of something being a
national security danger or military danger, you can do a lot of pretty dark
things that blind our moral compass.
And I think AI will be one of those things, um, in some sense, the, the mission
and the work you’re doing in OpenAI is like the counterbalance to that.
So you want to have more OpenAI and less autonomous weapons systems.
I, I, I, I like these statements, like to be clear, like this interesting and I’m
thinking about it myself, but, uh, this is a place that I, I, I put my trust
actually in Sam’s hands, because it’s extremely hard for me to reason about it.
Yeah.
I mean, one important statement to make is, um, it’s good to think about this.
Yeah.
No question about it.
No question, even like low level quote unquote engineer, like there’s such a,
um, I remember I, I programmed a car, uh, our RC car, um, and it
went really fast, like 30, 40 miles an hour.
And I remember I was like sleep deprived.
So I programmed it pretty crappily and it like, uh, the, the, the code froze.
So it’s doing some basic computer vision and it’s going around on track,
but it’s going full speed.
And, uh, there was a bug in the code that, uh, the car just went, it didn’t turn.
Went straight full speed and smash into the wall.
And I remember thinking the seriousness with which you need to approach the
design of artificial intelligence systems and the programming of artificial
intelligence systems is high because the consequences are high, like that
little car smashing into the wall.
For some reason, I immediately thought of like an algorithm that controls
nuclear weapons, having the same kind of bug.
And so like the lowest level engineer and the CEO of a company all need to
have the seriousness, uh, in approaching this problem and thinking
about the worst case consequences.
So I think that is true.
I mean, the, what I also recognize in myself and others even asking this
question is that it evokes a lot of fear and fear itself ends up being
actually quite debilitating.
The place where I arrived at the moment might sound cheesy or so, but it’s
almost to build things out of love rather than fear, like a focus on how, uh, I can,
you know, maximize the value, how the systems that I’m building might be, uh,
useful.
I’m not saying that the fear doesn’t exist out there and like it totally
makes sense to minimize it, but I don’t want to be working because, uh, I’m
scared, I want to be working out of passion, out of curiosity, out of the,
you know, uh, looking forward for the positive future.
With, uh, the definition of love arising from a rigorous practice of empathy.
So not just like your own conception of what is good for the world, but
always listening to others.
Correct.
Like the love where I’m considering reward functions of others.
Others to limit to infinity is like a sum of like one to N where N is, uh,
7 billion or whatever it is.
Not, not projecting my reward functions on others.
Yeah, exactly.
Okay.
Can we just take a step back to something else?
Super cool, which is, uh, OpenAI Codex.
Can you give an overview of what OpenAI Codex and GitHub Copilot is, how it works
and why the hell it works so well?
So with GPT three, we noticed that the system, uh, you know, that system trained
on all the language out there started having some rudimentary coding capabilities.
So we’re able to ask it, you know, to implement addition function between
two numbers and indeed it can write Python or JavaScript code for that.
And then we thought, uh, we might as well just go full steam ahead and try to
create a system that is actually good at what we are doing every day ourselves,
which is programming.
We optimize models for proficiency in coding.
We actually even created models that both have a comprehension of language and code.
And Codex is API for these models.
So it’s first pre trained on language and then code.
Then I don’t know if you can say fine tuned because there’s a lot of code,
but it’s language and code.
It’s language and code.
It’s also optimized for various things.
I can, let’s say low latency and so on.
Codex is the API, similar to GPT three.
We expect that there will be proliferation of the potential products that can use
coding capabilities and I can, I can speak about it in a second.
Copilot is a first product and developed by GitHub.
So as we’re building, uh, models, we wanted to make sure that these
models are useful and we work together with GitHub on building the first product.
Copilot is actually, as you code, it suggests you code completions.
And we have seen in the past, there are like a various tools that can suggest
how to like a few characters of the code or a line of code.
Then the thing about Copilot is it can generate 10 lines of code.
You, it’s often the way how it works is you often write in the comment
what you want to happen because people in comments, they describe what happens next.
So, um, these days when I code, instead of going to Google to search, uh, for
the appropriate code to solve my problem, I say, Oh, for this area, could you
smooth it and then, you know, it imports some appropriate libraries and say it
uses NumPy convolution or so I, that I was not even aware that exists and
it does the appropriate thing.
Um, so you, uh, you write a comment, maybe the header of a function
and it completes the function.
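As a rough picture of that workflow, here is a sketch of a comment plus function header and the kind of body a completion tool might fill in. This is an assumed illustration, not actual Copilot output; the function name and the default window size are made up. It is written in plain Python so it runs without extra libraries.

```python
# Smooth the array with a moving average over `window` neighbouring values.
def smooth(values, window=3):
    # Everything below the header is the part a completion tool would propose.
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    smoothed = []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        smoothed.append(sum(chunk) / window)
    return smoothed
```

With NumPy available, `numpy.convolve(values, numpy.ones(window) / window, mode="valid")` computes the same moving average, which is the sort of library call mentioned above.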
Of course, you don’t know what is the space of all the possible small
programs that can generate.
What are the failure cases?
How many edge cases, how many subtle errors there are, how many big errors
there are, it’s hard to know, but the fact that it works at all in a large
number of cases is incredible.
It’s like, uh, it’s a kind of search engine into code that’s
been written on the internet.
Correct.
So for instance, when you search things online, then usually you get to the,
some particular case, like if you go to stack overflow and people describe
that one particular situation, uh, and then they seek for a solution.
But in case of Copilot, it’s aware of your entire context and in
context is, Oh, these are the libraries that they are using.
That’s the set of the variables that is initialized.
And on the spot, it can actually tell you what to do.
So the interesting thing is, and we think that Copilot is one
possible product using Codex, but there is a place for many more.
So internally we tried out, you know, to create other fun products.
So it turns out that a lot of tools out there, let’s say Google
Calendar or Microsoft Word or so, they all have an internal API
to build plugins around them.
So there is a way in the sophisticated way to control calendar or Microsoft word.
Today, if you want, if you want more complicated behaviors from these
programs, you have to add the new button for every behavior.
But it is possible to use Codex and tell for instance, to calendar, uh,
could you schedule an appointment with Lex next week after 2 PM and it
writes corresponding piece of code.
And that’s the thing that actually you want.
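To illustrate what "writes corresponding piece of code" could mean, here is a sketch against a hypothetical calendar API. `ToyCalendar` and `schedule_next_week_after` are invented stand-ins, not the API of Google Calendar or any real product, and "next week" is assumed to mean the coming Monday.

```python
from datetime import datetime, timedelta

class ToyCalendar:
    """Hypothetical stand-in for a calendar plugin API; not a real product's API."""

    def __init__(self):
        self.events = []

    def add_event(self, title, start, duration_minutes=30):
        end = start + timedelta(minutes=duration_minutes)
        self.events.append((title, start, end))
        return start

def schedule_next_week_after(calendar, title, now, earliest_hour):
    """The kind of code a model might emit for the request
    'schedule an appointment with Lex next week after 2 PM'."""
    days_until_monday = 7 - now.weekday()  # weekday(): Monday is 0
    next_monday = (now + timedelta(days=days_until_monday)).replace(
        hour=earliest_hour, minute=0, second=0, microsecond=0)
    # Assumption: take the earliest allowed slot, exactly at `earliest_hour`.
    return calendar.add_event(title, next_monday)

calendar = ToyCalendar()
start = schedule_next_week_after(
    calendar, "Appointment with Lex", datetime(2021, 8, 25, 10, 30), 14)
```

The point is that the natural-language request is translated into ordinary calls on an existing API, rather than into a new hard-coded button.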
So interesting.
So you figure out is there’s a lot of programs with which
you can interact through code.
And so there you can generate that code from natural language.
That’s fascinating.
And that’s somewhat like also closest to what was the promise of Siri or Alexa.
So previously all these behaviors, they were hard coded and it seems
that Codex on the fly can pick up the API of let’s say, given software.
And then it can turn language into use of this API.
So without hard coding, you can find, it can translate to machine language.
Correct.
To, uh, so for example, this would be really exciting for me, like for, um,
Adobe products, like Photoshop, uh, which I think action scripted, I think
there’s a scripting language that communicates with them, same with Premiere.
And you could imagine that that allows even to do coding by voice on your phone?
So for instance, in the past, okay.
As of today, I’m not editing Word documents on my phone because it’s
just the keyboard is too small.
But if I would be able to tell, uh, to my phone, you know, uh, make the
header large, then move the paragraphs around and that’s actually what I want.
So I can tell you one more cool thing, or even how I’m thinking about Codex.
So if you look actually at the evolution of, uh, of computers, we started with
very primitive interfaces, which is a punch card. And punch card,
essentially, you make holes in the, in the plastic card to indicate zeros and ones.
And, uh, during that time, there was a small number of specialists
who were able to use computers.
And by the way, people even suspected that there is no need for many
more people to use computers.
Um, but then we moved from punch cards to at first assembly and C, and
these programming languages, they were slightly higher level.
They allowed many more people to code and they also, uh, led to more
of a proliferation of technology.
And, uh, you know, further on, there was a jump to say from C++ to Java and Python.
And every time it has happened, more people are able to code
and we build more technology.
And it’s even, you know, hard to imagine now, if someone will tell you that you
should write code in assembly instead of let’s say, Python or Java or JavaScript.
And Codex is yet another step toward kind of bringing computers closer to
humans such that you communicate with a computer with your own language rather
than with a specialized language, and, uh, I think that it will lead to an
increase of number of people who can code.
Yeah.
And then, and the kind of technologies that those people will create is it’s
innumerable, it could, you know, it could be a huge number of technologies.
We’re not predicting at all because that’s less and less requirement
of having a technical mind, a programming mind, you’re now opening it to the world
of, um, other kinds of minds, creative minds, artistic minds, all that kind of stuff.
I would like, for instance, biologists who work on DNA to be able to program
and not to need to spend a lot of time learning it.
And I, I believe that’s a good thing to the world.
And I would actually add, I would add, so at the moment I’m managing the Codex
team and also language team, and I believe that there is like plenty
of brilliant people out there and they should apply.
Oh, okay.
Yeah.
Awesome.
So what’s the language and the Codex is, so those are kind of,
they’re overlapping teams.
It’s like GPT, the raw language, and then the Codex is like applied to programming.
Correct.
And they are quite intertwined.
There are many more things involved making this, uh, models,
uh, extremely efficient and deployable.
Okay.
For instance, there are people who are working to, you know, make our data
centers, uh, amazing, or there are people who work on putting these
models into production or, uh, or even pushing it at the very limit of the scale.
So all aspects from, from the infrastructure to the actual machine.
So I’m just saying there are multiple teams while the, and the team working
on Codex and language, uh, I guess I’m, I’m directly managing them.
I would like, I would love to hire more interested in machine learning.
This is probably one of the most exciting problems and like systems
to be working on is it’s actually, it’s, it’s, it’s pretty cool.
Like what, what, uh, the program synthesis, like generating a
programs is very interesting, very interesting problem that has echoes
of reasoning and intelligence in it.
It’s and I think there’s a lot of fundamental questions that you might
be able to sneak, uh, sneak up to by generating programs.
Yeah, that one more exciting thing about the programs is that, so I said
that the, um, you know, the, in case of language, that one of the troubles
is even evaluating language.
So when the things are made up, you, you need somehow either a human to,
to say that this doesn’t make sense or so in case of program, there is one extra
lever that we can actually execute programs and see what they evaluate to.
So that process might be somewhat, uh, more automated in, in order to improve
the, uh, qualities of generations.
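The "execute programs and see what they evaluate to" lever can be sketched as a filter over candidate programs. This is an assumed illustration, not OpenAI's actual pipeline: the candidate strings stand in for model generations, and the helper name and test cases are made up.

```python
def passes_tests(source, function_name, test_cases):
    """Execute candidate source and check it against (args, expected) pairs."""
    namespace = {}
    try:
        exec(source, namespace)  # runs untrusted code: sandbox this in practice
        candidate = namespace[function_name]
        return all(candidate(*args) == expected for args, expected in test_cases)
    except Exception:
        return False  # syntax errors, missing function, or runtime failures

# Stand-ins for three generated attempts at an addition function.
candidates = [
    "def add(a, b):\n    return a - b\n",   # runs, but gives wrong answers
    "def add(a, b)\n    return a + b\n",    # does not even parse
    "def add(a, b):\n    return a + b\n",   # correct
]
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
survivors = [src for src in candidates if passes_tests(src, "add", tests)]
```

Only the candidate that both runs and passes every check survives, with no human needed to judge it. That automatic signal is what is missing when evaluating plain language.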
Oh, that’s fascinating.
So like the, wow, that’s really interesting.
So, so for the language, the, you know, the simulation to actually
execute it as a human mind.
Yeah.
For programs, there is a, there is a computer on which you can evaluate it.
Wow.
That’s a brilliant little insight.
Insight that the thing compiles and runs that’s first and second, you can evaluate
on a, like do automated unit testing and in some sense, it seems to me that we’ll
be able to make a tremendous progress.
You know, we are in the paradigm that there is way more data.
There is like a transcription of millions of, uh, of, uh, software engineers.
Yeah.
Yeah.
So, uh, I mean, you just mean, cause I was going to ask you about reliability.
The thing about programs is you don’t know if they’re going to, like a program
that’s controlling a nuclear power plant has to be very reliable.
So I wouldn’t start with controlling nuclear power plant maybe one day,
but that’s not actually, that’s not on the current roadmap.
That’s not the step one.
And you know, it’s the Russian thing.
You just want to go to the most powerful, destructive
thing right away run by JavaScript.
But I got you.
So this is a lower impact, but nevertheless, when you make me
realize it is possible to achieve some levels of reliability by doing testing.
And you could, you could imagine that, you know, maybe there are ways for
model to write even code for testing itself and so on, and there exists
a ways to create the feedback loops that the model could keep on improving.
Yeah. By writing programs that generate tests, for instance.
And that’s how we get consciousness, because it’s metacompression.
That’s what you’re going to write.
That’s the comment.
That’s the prompt that generates consciousness.
Compressor of compressors.
You just write that.
Do you think the code that generates consciousness will be simple?
So let’s see.
I mean, ultimately, the core idea behind will be simple,
but there will be also decent amount of engineering involved.
Like in some sense, it seems that, you know, spreading these models
on many machines, it’s not that trivial.
Yeah.
And we find all sorts of innovations that make our models more efficient.
I believe that first models that I guess are conscious or like a truly intelligent,
they will have all sorts of tricks, but then again, there’s a Richard Sutton
argument that maybe the tricks are
temporary things and in some sense, it’s also even important to, to know
that even the cost of a trick.
So sometimes people are eager to put the trick while forgetting that
there is a cost of maintenance or like a long term cost,
or maybe even flexibility of code to actually implement new ideas.
So even if you have something that gives you 2x, but it requires, you know,
1000 lines of code, I’m not sure if it’s actually worth it.
So in some sense, you know, if it’s five lines of code and 2x, I would take it.
And we see many of this, but also, you know, that requires some level of,
I guess, lack of attachment to code that we are willing to remove it.
Yeah.
So you led the OpenAI robotics team.
Can you give an overview of the cool things you were able to
accomplish, what are you most proud of?
So when we started robotics, we knew that actually reinforcement learning works
and it is possible to solve fairly complicated problems.
Like for instance, AlphaGo is an evidence that it is possible to build superhuman
Go players, DOTA2 is an evidence that it’s possible to build superhuman agents
playing DOTA, so I asked myself a question, you know, what about robots out there?
Could we train machines to solve arbitrary tasks in the physical world?
Our approach was, I guess, let’s pick a complicated problem that if we would
solve it, that means that we made some significant progress in the domain.
And then we went after the problem.
So we noticed that actually the robots out there, they are kind of at the moment
optimized per task, so you can have a robot that it’s like, if you have a robot
opening a bottle, it’s very likely that the end effector is a bottle opener.
And the, and in some sense, that’s a hack to be able to solve a task,
which makes any task easier and I asked myself, so what would be a robot that
can actually solve many tasks?
And we conclude that human hands have such a quality that indeed they are, you
know, you have five kind of tiny arms attached individually.
They can manipulate pretty broad spectrum of objects.
So we went after a single hand, like trying to solve Rubik’s cube single handed.
We picked this task because we thought that there is no way to hard code it.
And it’s also, we picked the robot on which it would be hard to hard code it.
And we went after the solution such that it could generalize to other problems.
And just to clarify, it’s one robotic hand solving the Rubik’s cube.
The hard part isn’t the solution to the Rubik’s cube is the manipulation of the,
of like having it not fall out of the hand, having it use the, uh, five baby
arms to, uh, what is it like rotate different parts of the Rubik’s cube to
achieve the solution.
Correct.
Yeah.
So what, uh, what was the hardest part about that?
What was the approach taken there?
What are you most proud of?
Obviously we have like a strong belief in reinforcement learning.
And, uh, you know, one path is to do reinforcement learning in the real
world. The other path is, uh, simulation. In some sense, the tricky
part about the real world is at the moment, our models, they require a lot
of data and there is essentially no data.
And, uh, we decided to go through the path of the simulation.
And in simulation, you can have infinite amount of data.
The tricky part is the fidelity of the simulation.
And also can you in simulation represent everything that you represent
otherwise in the real world.
And, you know, it turned out that, uh, that, you know, because there is
lack of fidelity, it is possible to what we, what we arrived at is training
a model that doesn’t solve one simulation, but it actually solves the
entire range of simulations, which, uh, uh, in terms of like, uh, what’s
the, exactly the friction of the cube or the weight or so, and the single AI
that can solve all of them ends up working well with the reality.
How do you generate the different simulations?
So, uh, you know, there’s plenty of parameters out there.
We just pick them randomly.
And, uh, and in simulation model just goes for thousands of years and keeps
on solving Rubik’s cube in each of them.
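Picking simulation parameters randomly for every run can be sketched as follows. This is a minimal illustration under assumptions: the parameter names, their ranges, and the `run_episode` callback are all hypothetical, and no real physics simulator or learning algorithm is involved.

```python
import random

def sample_simulation_params(rng):
    """Draw one randomized set of physics parameters (ranges are made up)."""
    return {
        "cube_friction": rng.uniform(0.5, 1.5),
        "cube_mass_kg": rng.uniform(0.05, 0.15),
        "actuator_delay_s": rng.uniform(0.0, 0.04),
    }

def run_training(num_episodes, run_episode, seed=0):
    """Run each episode in a freshly randomized simulation."""
    rng = random.Random(seed)
    history = []
    for _ in range(num_episodes):
        params = sample_simulation_params(rng)
        outcome = run_episode(params)  # hypothetical: simulate and update the policy
        history.append((params, outcome))
    return history

# Dummy episode function so the sketch runs end to end.
history = run_training(3, run_episode=lambda params: params["cube_mass_kg"] < 0.1)
```

Because the policy never sees the same friction or weight twice, it cannot overfit to one simulation and has to cope with the whole range, which is the property that ends up transferring to the real hand.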
And the thing is that neural network that we used, it has a memory.
And as it presses, for instance, the side of the, of the cube, it can sense,
oh, that’s actually, this side was, uh, difficult to press.
I should press it stronger and throughout this process kind of, uh, learns even
how to, uh, how to solve this particular instance of the Rubik’s cube, like even
mass, it’s kind of like, uh, you know, sometimes when you go to a gym and after,
um, after bench press, you try to lift the glass and you kind of forgot, uh, and,
and your hand goes like up right away because kind of you got used to maybe
different weight and it takes a second to adjust and this kind of, of a memory,
the model gained through the process of interacting with the cube in the
simulation.
I appreciate you speaking to the audience with the bench press,
all the bros in the audience, probably working out right now.
There’s probably somebody listening to this actually doing bench press.
Um, so maybe, uh, put the bar down and pick up the water bottle and you’ll
know exactly what, uh, what Wojciech is talking about.
Okay.
So what, uh, what was the hardest part of getting the whole thing to work?
So the hardest part is, at the moment, when it comes to, uh, the physical world, when it
comes to robots, uh, they require maintenance, it’s hard to replicate them a
million times, and, uh, it’s also hard to replay things exactly.
I remember this situation that one guy at our company, he had like a model that
performs way better than other models in solving Rubik’s cube.
And, uh, you know, we kind of didn’t know what’s going on, why it’s that.
And, uh, it turned out, uh, that, you know, he was running it from his laptop
that had better CPU or better, maybe local GPU as well.
And, uh, because of that, there was less of a latency and the model was the same.
And that actually made solving Rubik’s cube more reliable.
So in some sense, there might be some subtle bugs like that when it comes
to running things in the real world.
Even hinting on that, you could imagine that initially you would like
to have models which are insanely huge neural networks, and you would like to
give them even more time for thinking.
And when you have these real time systems, uh, then you might be constrained
actually by the amount of latency.
And, uh, ultimately I would like to build a system that it is worth for you to wait
five minutes because it gives you the answer that you’re willing to wait for
five minutes.
So latency is a very unpleasant constraint under which to operate.
Correct.
And also there is actually one more thing, which is tricky about robots.
Uh, there is actually, uh, not much data.
So the data that I’m speaking about would be data of, uh, first person
experience from the robot, like gigabytes of data like that. If we would
have gigabytes of data like that, of robots solving various problems, it would
be very easy to make progress on robotics.
And you can see that in case of text or code, there is a lot of data, like
first person perspective data on writing code.
Yeah. So you had this, you mentioned this really interesting idea that if you were
to build like a successful robotics company, so OpenAI’s mission is much
bigger than robotics, this is one of the things you’ve worked on, but
if it was a robotics company, you wouldn’t so quickly dismiss supervised
learning, uh, correct, that you would build a robot that, uh, was perhaps
like, um, an empty shell, like dumb, and they would operate under teleoperation.
So you would invest, that’s just one way to do it, invest in human supervision,
like direct human control of the robots as it’s learning and over time, add
more and more automation.
That’s correct.
So let’s say that’s how I would build a robotics company today.
If I would be building a robotics company, which is, you know, spend 10
million dollars or so recording human trajectories, controlling a robot.
After you find a thing that the robot should be doing, that there’s a market
fit for, like you can make a lot of money with that product.
Correct.
Yeah.
Uh, so I would record data and then I would essentially train a supervised
learning model on it.
That might be the path today.
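As a rough illustration of training a supervised model on recorded human trajectories, here is a minimal sketch that fits a one-dimensional linear policy to (state, action) pairs by least squares. The demonstration data and the linear form are assumptions chosen to keep the example tiny; a real system would use a neural network and far richer observations.

```python
def fit_linear_policy(demos):
    """Least-squares fit of action = w * state + b to (state, action) pairs."""
    n = len(demos)
    mean_s = sum(s for s, _ in demos) / n
    mean_a = sum(a for _, a in demos) / n
    var_s = sum((s - mean_s) ** 2 for s, _ in demos)
    cov = sum((s - mean_s) * (a - mean_a) for s, a in demos)
    w = cov / var_s
    b = mean_a - w * mean_s
    return w, b

# Hypothetical teleoperation log: the operator always commands 2 * state + 1.
demos = [(float(s), 2.0 * s + 1.0) for s in range(10)]

w, b = fit_linear_policy(demos)

def policy(state):
    """The cloned policy: imitates what the human operator did in the log."""
    return w * state + b
```

On this made-up log the fit recovers the operator's rule, so `policy(20.0)` comes out at about 41 even though state 20 never appeared in the recordings.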
Long term.
I think that actually what is needed is to have a robot that can
train powerful models over video.
So, um, you have seen maybe models that can generate images, like DALL-E, and people
are looking into models generating videos. There are like, uh, various
algorithmic questions, even how to do it.
And it’s unclear if there is enough compute for this purpose, but, uh, I
suspect that models which would have a level of understanding of video,
same as GPT has a level of understanding of text, could be used, uh, to train
robots to solve tasks.
They would have a lot of common sense.
If one day, I’m pretty sure one day there will be a robotics company, by robotics
company I mean the primary source of income is from robots, that is worth
over $1 trillion.
What do you think that company will do?
I think self driving cars.
No, it’s interesting.
Cause my mind went to personal robotics, robots in the home.
It seems like there’s much more market opportunity there.
I think it’s very difficult to achieve.
I mean, this, this, this might speak to something important, which is I understand
self driving much better than I understand robotics in the home.
So I understand how difficult it is to actually solve self driving to a, to a
level, not just the actual computer vision and the control problem and just the
basic problem of self driving, but creating a product that would undeniably
be, um, that will cost less money.
Like it will save you a lot of money, like orders of magnitude, less money
that could replace Uber drivers, for example.
So car sharing that’s autonomous, that creates a similar or better
experience in terms of how quickly you get from A to B or just whatever, the
pleasantness of the experience, the efficiency of the experience, the value
of the experience, and at the same time, the car itself costs less.
I think that’s very difficult to achieve.
I think there’s a lot more, um, low hanging fruit in the home.
That, that, that could be…
I also want to give you a perspective on like how
challenging it would be at home, or like, it maybe kind of depends on the exact
problem that you’d be solving.
Like if we’re speaking about these robotic arms and hands, these things,
they cost tens of thousands of dollars or maybe a hundred K and, um, you know,
maybe, obviously, maybe there would be economy of scale.
These things would be cheaper, but actually for any household to buy it,
the price would have to go down to maybe a thousand bucks.
Yeah.
I personally think that, uh, so a self driving car, it provides a clear service.
With robots in the home, I don’t think a trillion dollar company
will just be all about service, meaning it will not necessarily be about like
a robotic arm that helps you,
I don’t know, open a bottle or wash the dishes or, uh, any of that kind of stuff.
It has to be able to take care of that whole, the therapist thing
you mentioned. I think that’s, um, of course there’s a line between what
is a robot and what is not, like, does it really need a body?
But you know, some, um, uh, AI system with some embodiment, I think.
So the tricky part, when you think actually what’s the difficult part, is,
um, when there is a diversity of the environment
with which the robot has to interact, that becomes hard.
So, you know, on the one end of the spectrum, you have, uh, industrial robots. As they
are doing over and over the same thing, it is possible to some extent to
prescribe the movements, and with a very small amount of intelligence, the
movement can be repeated millions of times.
Um, there are also, you know, various pieces of industrial robots
where it becomes harder and harder.
Like, for instance, in case of Tesla, it might be a matter of putting a
rack inside of a car and, you know, because the rack kind of moves around,
it’s, uh, it’s not that easy.
It’s not exactly the same every time.
That ends up being the case that you need actually humans to do it.
Uh, while, you know, welding cars together, it’s a very repetitive process.
Um, then in case of self driving itself, uh, that difficulty has to do with the
diversity of the environment, but still the car itself, um, the problem
that they are solving is you try to avoid even interacting with things.
You are not touching anything around because touching itself is hard.
And then if you would have in the home, uh, a robot that, you know, has to
touch things, and like if these things, they change the shape, if there is a huge
variety of things to be touched, then that’s difficult.
If you are speaking about a robot which is, you know, a head that
is smiling in some way, with cameras, that doesn’t, you know, touch things,
that’s relatively simple.
Okay. So to both agree and to push back.
So you’re referring to touch, like soft robotics, like the actual touch, but
I would argue that you could formulate just basic interaction between, um, like
non contact interaction is also a kind of touch and that might be very difficult
to solve. That’s the basic, not disagreement, but that’s the basic open
question to me with self driving cars and disagreement with Elon, which
is how much interaction is required to solve self driving cars.
How much touch is required?
You said that in your intuition, touch is not required.
And my intuition is that to create a product that’s compelling to use, you’re going
to have to, uh, interact with pedestrians, not just avoid pedestrians,
but interact with them. When we drive around
in major cities, we’re constantly threatening everybody’s life with
our movements, um, and that’s how they respect us.
There’s a game theoretic thing going on with pedestrians, and I’m afraid you can’t
just formulate autonomous driving as a collision avoidance problem.
So I think it goes beyond, like, collision avoidance is the
first order approximation.
Uh, but then at least in case of Tesla, they are gathering data from people driving their
cars, and I believe that’s an example of supervised data that they can train
their models, uh, on, and they are doing it, uh, which, you know, can give
the model this, like, uh, another level of, uh, behavior that is needed
to actually interact with the real world.
Yeah.
It’s interesting how much data is required to achieve that.
Um, what do you think of the whole Tesla autopilot approach, the computer
vision based approach with multiple cameras and there’s a data engine.
It’s a multitask, multiheaded neural network, and it’s this fascinating
process of, uh, similar to what you’re talking about with the robotics
approach, uh, which is, you know, you deploy a neural network and
then there’s humans that use it and then it runs into trouble in a bunch
of places and that stuff is sent back.
So like the deployment discovers a bunch of edge cases and those edge
cases are sent back for supervised annotation, thereby improving the
neural network and that’s deployed again.
It goes over and over until the network becomes really good at the task of
driving, becomes safer and safer.
What do you think of that kind of approach to robotics?
I believe that’s the way to go.
So in some sense, even when I was speaking about, you know, collecting
trajectories from humans, that’s like a first step, and then you deploy
the system and then you have humans revising all the issues.
And in some sense, this approach converges to a system that doesn’t make
mistakes, because for the cases where there are mistakes, you get the
data on how to fix them, and the system will keep on improving.
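The deploy, collect failures, annotate, retrain loop discussed in this exchange can be sketched as a toy simulation. Everything here is an assumption for illustration: the "model" is just a lookup table, the situations are integers, and the labeling budget of three cases per round is arbitrary. It is not a description of how Tesla or anyone else implements such a pipeline.

```python
def data_engine(situations, human_label, batch_size=3, max_rounds=10):
    """Deploy, collect failures, have humans label a batch, retrain, repeat."""
    model = {}      # toy "model": lookup table from situation to action
    history = []    # number of failing situations seen in each deployment round
    for _ in range(max_rounds):
        # Deployment: find the situations the current model gets wrong.
        failures = [s for s in situations if model.get(s) != human_label(s)]
        history.append(len(failures))
        if not failures:
            break
        # Annotation and "retraining": only a limited batch is labeled per round.
        for s in failures[:batch_size]:
            model[s] = human_label(s)
    return model, history

def human_label(situation):
    """Hypothetical ground truth supplied by human annotators."""
    return "brake" if situation % 2 else "go"

situations = list(range(8))
model, history = data_engine(situations, human_label)
print(history)  # [8, 5, 2, 0]
```

With eight situations and three labels per round, the failure count drops from 8 to 5 to 2 to 0. A lookup table cannot generalize to unseen situations, so the sketch says nothing about how long convergence takes in an open-ended environment, which is exactly the open question raised next.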
So there’s a very, to me, difficult question of, you know,
how long that converging takes, how hard it is.
The other aspect of autonomous vehicles, this probably applies to certain
robotics applications, is society, right,
as the quality of the system converges.
So one, there’s a human factors perspective of psychology of humans being
able to supervise, even with teleoperation, those robots.
And the other is society being willing to accept robots.
Currently society is much harsher on self driving cars than it is on human
driven cars in terms of the expectation of safety.
So the bar is set much higher than for humans.
And so if there’s a death in an autonomous vehicle, that’s seen as much,
much more dramatic than a death in a human driven vehicle.
Part of the success of deployment of robots is figuring out how to make robots
part of society, both on the, just the human side, on the media side, on the
media journalist side, and also on the policy government side.
And that seems to be, maybe you can put that into the objective function to
optimize, but that is, that is definitely a tricky one.
And I wonder if that is actually the trickiest part for self driving cars or
any system that’s safety critical.
It’s not the algorithm, it’s the society accepting it.
Yeah, I would say, I believe that part of the process of deployment is actually
showing people that the given things can be trusted and, you know, trust is also
like a glass that is actually really easy to crack and damage.
And I think that’s actually very common with, with innovation, that there’s
some resistance toward it and it’s just the natural progression.
So in some sense, people will have to keep on proving that indeed these
systems are worth being used.
And I would say, I also found out that often the best way to convince people
is by letting them experience it.
Yeah, absolutely.
That’s the case with Tesla autopilot, for example, that’s the case with, yeah,
with basically robots in general.
It’s kind of funny to hear people talk about robots.
Like there’s a lot of fear, even with like legged robots, but when they
actually interact with them, there’s joy.
I love interacting with them.
And the same with the car, with a robot, if it starts being useful, I think
people immediately understand.
And if the product is designed well, they fall in love.
You’re right.
It’s actually even similar when I’m thinking about Copilot, the GitHub Copilot.
There was a spectrum of responses that people had.
And ultimately the important piece was to let people try it out.
And then many people just loved it.
Especially like programmers.
Yeah, programmers, but like some of them, you know, they came with a fear.
Yeah.
But then you try it out and you think, actually, that’s cool.
And, you know, you can try to resist the same way as, you know, you could
resist moving from punch cards to, let’s say, C++ or so.
And it’s a little bit futile.
So we talked about generation of programs, generation of language, even
self supervised learning in the visual space for robotics and then
reinforcement learning.
What do you, in like this whole beautiful spectrum of AI, do you think is a
good benchmark, a good test to strive for to achieve intelligence?
That’s a strong test of intelligence.
You know, it started with Alan Turing and the Turing test.
Maybe you think natural language conversation is a good test.
So, you know, it would be nice if, for instance, a machine would be able to
solve the Riemann hypothesis in math.
That would be, I think that would be very impressive.
So theorem proving, is that to you, proving theorems is a good, oh, oh,
like one thing that the machine did, you would say, damn.
Exactly.
Okay.
That would be quite, quite impressive.
I mean, the tricky part about the benchmarks is, you know, as we are
getting closer with them, we have to invent new benchmarks.
There is actually no ultimate benchmark out there.
Yeah.
See, my thought with the Riemann hypothesis would be the moment the
machine proves it, we would say, okay, well then the problem was easy.
That’s what happens.
And I mean, in some sense, that’s actually what happens over the years
in AI that like, we get used to things very quickly.
You know something, I talked to Rodney Brooks.
I don’t know if you know who that is.
He called AlphaZero a homework problem.
Cause he was saying like, there’s nothing special about it.
It’s not a big leap.
And I didn’t, well, he’s coming from, one of the aspects that we referred
to is, he was part of the founding of iRobot, which deployed now tens
of millions of robots in the home.
So if you see robots that are actually in the homes of people as the
legitimate instantiation of artificial intelligence, then yes, maybe an AI
that plays a silly game like Go and chess is not a real accomplishment,
but to me it’s a fundamental leap.
But I think we as humans then say, okay, well then that game of
chess or Go wasn’t that difficult compared to the thing that’s currently
unsolved.
So my intuition is that from the perspective of the evolution of these AI
systems, we will at first see tremendous progress in digital space.
And, you know, the main thing about digital space is also that
there is a lot of recorded data.
Plus you can very rapidly deploy things to billions of people.
While in case of physical space, the deployment part takes multiple
years.
You have to manufacture things and, you know, delivering them to actual
people, it’s very hard.
So I’m expecting that at first prices of goods in digital space
would go, you know, down to, let’s say, marginal costs,
which are close to zero.
And also the question is how much of our life will be in digital because
it seems like we’re heading towards more and more of our lives being in
the digital space.
So like innovation in the physical space might become less and less
significant.
Like why do you need to drive anywhere if most of your life is spent in
virtual reality?
I still would like, you know, to at least at the moment, my impression
is that I would like to have a physical contact with other people.
And that’s very important to me.
We don’t have a way to replicate it in the computer.
It might be the case that over the time it will change.
Like in 10 years from now, why not have like an arbitrary infinite number
of people you can interact with?
Some of them are real, some are not with arbitrary characteristics that
you can define based on your own preferences.
I think that’s maybe where we are heading and maybe I’m resisting the
future.
Yeah, I’m telling you, if I got to choose, if I could live in Elder
Scrolls Skyrim versus the real world, I’m not so sure I would stay with
the real world.
Yeah, I mean, the question is, so will VR be sufficient to get us there
or do you need to, you know, plug electrodes in the brain?
And it would be nice if these electrodes wouldn’t be invasive.
Or at least like provably non destructive.
But in the digital space, do you think we’ll be able to solve the
Turing test, the spirit of the Turing test, which is, do you think we’ll
be able to achieve compelling natural language conversation between
people, like have friends that are AI systems on the internet?
I totally think it’s doable.
Do you think the current approach of GPT will take us there?
So there is, you know, the part of at first learning all the content
out there, and I think that still the system should keep on learning as
it speaks with you.
Yeah.
Yeah, and I think that should work.
The question is how exactly to do it.
And, you know, obviously we have people at OpenAI asking these
questions and kind of at first pre training on all existing content
is like a backbone and is a decent backbone.
Do you think AI needs a body connecting to our robotics question to
truly connect with humans or can most of the connection be in the
digital space?
So let’s see, we know that there are people who met each other online
and they fell in love.
Yeah.
So it seems that it’s conceivable to establish connection, which is
purely through internet.
Of course, it might be more compelling the more modalities you add.
So it would be like you’re proposing like a Tinder, but for AI, you
like swipe right and left and half the systems are AI and the other is
humans and you don’t know which is which.
That would be our formulation of Turing test.
The moment AI is able to achieve more swipe right or left, whatever,
the moment it’s able to be more attractive than other humans, it
passes the Turing test.
Then you would pass the Turing test in attractiveness.
That’s right.
Well, no, like attractiveness just to clarify.
There will be conversation.
Not just visual.
Right, right.
It’s also attractiveness with wit and humor and whatever makes
conversation pleasant for humans.
Okay.
All right.
So you’re saying it’s possible to achieve in the digital space.
In some sense, I would almost ask that question.
Why wouldn’t that be possible?
Well, I have this argument with my dad all the time.
He thinks that touch and smell are really important.
So they can be very important.
And I’m saying the initial systems, they won’t have it.
Still, there are people being born without these senses and I believe
that they can still fall in love and have meaningful life.
Yeah.
I wonder if it’s possible to go close to all the way by just training
on transcripts of conversations.
I wonder how far that takes us.
So I think that actually still you want images, like, I would like…
So I don’t have kids, but like I could imagine having an AI tutor.
It has to see, you know, kids drawing some pictures on the paper.
And also facial expressions, all that kind of stuff.
Dogs and humans use their eyes to communicate with each other.
I think that’s a really powerful mechanism of communication.
Body language too, that words are much lower bandwidth.
And for body language, we still, you know, we can kind of have a system
that displays an image of its facial expression on the computer.
It doesn’t have to move, you know, mechanical pieces or so.
So I think that, you know, that there is like kind of a progression.
You can imagine that text might be the simplest to tackle.
But this is not a complete human experience at all.
You expand it to, let’s say images, both for input and output.
And what you describe is actually the final, I guess, frontier.
What makes us human, the fact that we can touch each other or smell or so.
And it’s the hardest from perspective of data and deployment.
And I believe that these things might happen gradually.
Are you excited by that possibility?
This particular application of human to AI friendship and interaction?
So let’s see.
Like would you, do you look forward to a world?
You said you’re living with a few folks and you’re very close friends with them.
Do you look forward to a day where one or two of those friends are AI systems?
So if the system would be truly wishing me well, rather than being in the situation
that it optimizes for my time to interact with the system.
The line between those is, it’s a gray area.
I think that’s the distinction between love and possession.
And these things, they might be often correlated for humans, but you might find that there are
some friends with whom you haven’t spoken for months.
Yeah.
And then you pick up the phone, it’s as if the time hasn’t passed.
They are not holding on to you.
And I wouldn’t like to have an AI system that, you know, is trying to convince me
to spend time with it.
I would like the system to optimize for what I care about and help me in achieving my own goals.
But there’s some, I mean, I don’t know, there’s some manipulation, there’s some possessiveness,
there’s some insecurities, this fragility, all those things are necessary to form a close
friendship over time, to go through some dark shit together, some bliss and happiness together.
I feel like there’s a lot of greedy self centered behavior within that process.
My intuition, but I might be wrong, is that human computer interaction doesn’t have to
go through a computer being greedy, possessive, and so on.
It is possible to train systems, maybe, that
are, I guess, prompted or fine tuned or so to truly optimize for what you care about.
And you could imagine that, you know, the way the process would look is, at
some point, we as humans, we look at the transcript of the conversation or like an entire
interaction and we say, actually here, there was a more loving way to go about it.
And we supervise the system toward being more loving, or maybe we train the system such
that it has a reward function toward being more loving.
Yeah.
Or maybe the possibility of the system being an asshole and manipulative and possessive
every once in a while is a feature, not a bug.
Because some of the happiness that we experience when two souls meet each other, when two humans
meet each other, is a kind of break from the assholes in the world.
And so you need assholes in AI as well, because, like, it’ll be like a breath of fresh air
to discover an AI when the three previous AIs you had were too friendly, or no, or cruel
or whatever.
It’s like some kind of mix.
And then this one is just right, but you need to experience the full spectrum.
Like, I think you need to be able to engineer assholes.
So let’s see.
Because there’s some level to us being able to appreciate the human experience.
We need the dark and the light.
So that kind of reminds me.
I met a while ago, at a meditation retreat, one woman, and she told me, you know,
beautiful, beautiful woman, and she had a crutch.
Okay.
She had trouble walking on one leg.
I asked her what has happened.
And she said that five years ago she was in Maui, Hawaii, and she was eating a salad and
some snail fell into the salad.
And apparently there are neurotoxic snails over there.
And she got into a coma for a year.
Okay.
And apparently there is, you know, high chance of even just dying.
But she was in the coma.
At some point, she partially regained consciousness.
She was able to hear people in the room.
People behaved as if she wasn’t there.
You know, at some point she started being able to speak, but she was mumbling, like,
barely able to express herself.
Then at some point she got into a wheelchair.
Then at some point she actually noticed that she can move her toe and then she knew that
she will be able to walk.
And then, you know, that’s where she was five years after.
And she said that since then she appreciates the fact that she can move her toe.
And I was thinking, hmm, do I need to go through such an experience to appreciate that
I can move my toe?
Wow, that’s a really good story and really deep example.
Yeah.
And in some sense, it might be the case that we don’t see light if we haven’t gone through
the darkness.
But I wouldn’t say that we should.
We shouldn’t assume that that’s the case. We may be able to engineer shortcuts.
Yeah.
Ilya had this, you know, belief that maybe one has to go for a week or six months to
do some challenging camp to just experience, you know, a lot of difficulties and then comes
back and actually everything is bright, everything is beautiful.
I’m with Ilya on this.
It must be a Russian thing.
Where are you from originally?
I’m Polish.
Polish.
Okay.
I’m tempted to say that explains a lot.
But yeah, there’s something about the Russian, the necessity of suffering.
I believe suffering or rather struggle is necessary.
I believe that struggle is necessary.
I mean, in some sense, you even look at the story of any superhero in the movie.
It’s not that it was like everything goes easy, easy, easy, easy.
I like how that’s your ground truth is the story of superheroes.
Okay.
You mentioned that you used to do research at night and go to bed at like 6 a.m.
or 7 a.m.
I still do that often.
What sleep schedules have you tried to make for a productive and happy life?
Like, is there some interesting wild sleeping patterns that you engaged that you
found that works really well for you?
I tried at some point decreasing the number of hours of sleep, like, gradually, like half
an hour every few days.
You know, I was hoping to just save time.
That clearly didn’t work for me.
Like at some point, there’s like a phase shift and I felt tired all the time.
You know, there was a time that I used to work during the nights.
The nice thing about the nights is that no one disturbs you.
And even I remember when I was meeting for the first time with Greg Brockman, he’s
CTO and chairman of OpenAI, our meeting was scheduled for 5 p.m.
And I overslept for the meeting.
Overslept for the meeting at 5 p.m.
Yeah, now you sound like me.
That’s hilarious.
OK, yeah.
And at the moment, in some sense, my sleeping schedule also has to do with the fact that
I’m interacting with people.
I sleep without an alarm.
So, yeah, the team thing you mentioned, the extrovert thing, because most humans operate
during a certain set of hours, you’re forced to then operate at the same set of hours.
But I’m not quite there yet.
I found a lot of joy, just like you said, working through the night because it’s quiet
because the world doesn’t disturb you.
And there’s some aspect counter to everything you’re saying.
There’s some joyful aspect to sleeping through the mess of the day because
people are having meetings and sending emails and there’s drama, meetings.
I can sleep through all the meetings.
You know, I have meetings every day and they prevent me from having sufficient amount of
time for focused work.
And then I modified my calendar and I said that I’m out of office Wednesday, Thursday
and Friday every week and I’m having meetings only Monday and Tuesday.
And that vastly positively influenced my mood, that I have literally like three days for
fully focused work.
Yeah.
So there’s better solutions to this problem than staying awake all night.
OK, you’ve been part of development of some of the greatest ideas in artificial intelligence.
What would you say is your process for developing good novel ideas?
You have to be aware that clearly there are many other brilliant people around.
So you have to ask yourself a question, why the given idea, let’s say, wasn’t tried by
someone else. And in some sense, it has to do with, you know, kind of simple,
it might sound simple, but like thinking outside of the box.
And what do I mean here?
So, for instance, for a while, people in academia, they assumed that
you have a fixed data set and then you optimize the algorithms in order to get the best performance.
And that was such an ingrained assumption that no one thought about training models on
the entire internet or the like.
Maybe some people thought about it, but it felt to many as unfair.
And in some sense, that’s almost like, it’s not my idea or so, but that’s an example of
breaking the typical assumption.
So you want to be in the paradigm that you’re breaking the typical assumption.
In the context of the community, getting to pick your data set is cheating.
Correct.
And in some sense, so that was the assumption that many people had out there.
And then if you free yourself from assumptions, then you are likely to achieve something
that others cannot do.
And in some sense, if you are trying to do exactly the same things as others, it’s very
likely that you’re going to have the same results.
Yeah, but there’s also that kind of tension, which is asking yourself the question, why
haven’t others done this?
Because, I mean, I get a lot of good ideas, but I think probably most of them suck when
they meet reality.
So actually, I think the other big piece is getting into the habit of generating ideas,
training your brain towards generating ideas and even suspending judgment of the ideas.
So in some sense, I noticed myself that even if I’m in the process of generating ideas,
if I tell myself, oh, that was a bad idea, then that actually interrupts the process
and I cannot generate more ideas because I’m actually focused on the negative part, why
it won’t work.
Yes.
But I also created an environment in a way that it’s very easy for me to store new ideas.
So, for instance, next to my bed, I have a voice recorder and it happens to me often
like I wake up during the night and I have some idea.
In the past, I was writing them down on my phone, but that means, you know, turning on
the screen and that wakes me up or like pulling a paper, which requires, you know, turning
on the light.
These days, I just start recording it.
What do you think, I don’t know if you know who Jim Keller is.
I know Jim Keller.
He’s a big proponent of thinking harder on a problem right before sleep so that he can
sleep through it and solve it in his sleep, or like come up with radical stuff in his
sleep. He’s trying to get me to do this.
So from my experience, it happened to me many times during the high
school days when I was doing mathematics that I had a solution to my problem as I woke up.
At the moment, regarding thinking hard about a given problem, I’m trying to actually
devote a substantial amount of time to think about important problems, not just before
sleep.
I’m organizing huge chunks of time such that I’m not constantly working
on the urgent problems, but I actually have time to think about the important ones.
So you do it naturally.
But his idea is that you kind of prime your brain to make sure that that’s the focus.
Oftentimes people have other worries in their life that are not fundamentally deep problems
like I don’t know, just stupid drama in your life and even at work, all that kind of stuff.
He wants to kind of pick the most important problem that you’re thinking about and go
to bed on that.
I think that’s wise.
I mean, the other thing that comes to my mind is also I feel the most fresh in the morning.
So during the morning, I try to work on the most important things rather than just being
pulled by urgent things or checking email or so.
What do you do with the…
Because I’ve been doing the voice recorder thing too, but I end up recording so many
messages it’s hard to organize.
I have the same problem.
Now I have heard that Google Pixel is really good at transcribing text and I might get
a Google Pixel just for the sake of transcribing text.
Yeah, people listening to this, if you have a good voice recorder suggestion that transcribes,
please let me know.
Some of it has to do with OpenAI Codex too.
Like some of it is simply like the friction.
I need apps that remove that friction between voice and the organization of the resulting
transcripts and all that kind of stuff.
But yes, you’re right.
Absolutely. For me it’s walking, sleep too, but walking and running, especially
running. I get a lot of thoughts during running and there’s no good mechanism for recording
thoughts.
So one more thing that I do, I have a separate phone which has no apps.
Maybe it has like Audible or let’s say Kindle.
No one has this phone number, this is kind of my meditation phone.
Yeah.
And I try to expand the amount of time that that’s the phone I have with me.
It also has Google Maps if I need to go somewhere, and I also use this phone to write down ideas.
Ah, that’s a really good idea.
That’s a really good idea.
Often actually what I end up doing is even sending a message from that phone to the other
phone.
So that’s actually my way of recording messages or I just put them into notes.
I love it.
What advice would you give to a young person, high school, college, about how to be successful?
You’ve done a lot of incredible things in the past decade, so maybe, maybe have some.
There’s something, there might be something.
There might be something.
I mean, it might sound simplistic or so, but I would say literally just follow your
passion, double down on it.
And if you don’t know what your passion is, just figure out what could be
a passion.
So that might be an exploration.
When I was in elementary school, it was math and chemistry.
And I remember for some time I gave up on math because my school teacher, she told me
that I’m dumb.
And I guess maybe the advice would be: just ignore people if they tell you that you’re
dumb.
You mentioned something offline about chemistry and explosives.
What was that about?
So let’s see.
So the story goes like this.
I got into chemistry.
Maybe I was in like second grade of elementary school, third grade.
I started going to chemistry classes.
I really love building stuff.
And I did all the experiments that they describe in the book, like, you know, how to create
oxygen with vinegar and baking soda or so.
Okay.
So I did all the experiments and at some point I was, you know, so what’s next?
What can I do?
And with explosives, you also have a clear reward signal, you know, if the
thing worked or not.
So I remember at first I got interested in producing hydrogen.
That was a kind of funny experiment from school.
You can just burn it.
And then I moved to nitroglycerin.
So that’s also relatively easy to synthesize.
I started producing essentially dynamite and detonating it with a friend.
I remember there was a, you know, there was at first like maybe two attempts that I went
with a friend to detonate what we built and it didn’t work out.
And like a third time he was like, ah, it won’t work.
Like, let’s not waste time.
And, you know, I was carrying this, you know, tube with dynamite, I
don’t know, a pound or so of dynamite, in my backpack, and we were riding on bikes to the edges
of the city.
Yeah, and attempt number three, this would be attempt number three.
Attempt number three.
And now we dig a hole to put it inside.
It actually had the, you know, electrical detonator.
We draw a cable behind the tree.
I had never even seen an explosion before.
So I thought that there would be a lot of sound.
But, you know, we’re like laying down and I’m holding the cable and the battery.
At some point, you know, we kind of counted, like, three, two, one, and I just connected it, and it
felt like the ground shook.
It was like more like a sound.
And then the soil started kind of lifting up and started falling on us.
Yeah.
Wow.
And then, you know, the friend said, let’s make sure the next time we have helmets.
But also, you know, I’m happy that nothing happened to me.
It could have been the case that I lost a limb or so.
Yeah, but that’s the childhood of an engineering mind with a strong reward signal of an
explosion.
I love it.
There’s some aspect of the chemists I know, like my dad with plasma
chemistry, plasma physics, he was very much into explosives, too.
It’s a worrying quality of people that work in chemistry that they love.
I think it is exactly that, the strong signal that the thing worked.
There is no doubt.
There’s no doubt.
There’s some magic.
It’s almost like a reminder that physics works, that chemistry works.
It’s cool.
It’s almost like a little glimpse at nature that you yourself engineer.
That’s why I really like artificial intelligence, especially robotics: you create a little
piece of nature.
And in some sense, even for me with explosives, the motivation was creation
rather than destruction.
Yes, exactly.
In terms of advice, I forgot to ask about just machine learning and deep learning for
people who are specifically interested in machine learning, how would you recommend
they get into the field?
So I would say reimplement everything, and also there are plenty of courses.
So like from scratch?
So on different levels of abstraction in some sense, but I would say reimplement something
from scratch, reimplement something from a paper, reimplement something, you know,
from podcasts that you have heard about.
I would say that’s a powerful way to understand things.
So it’s often the case that you read the description and you think you understand, but you truly
understand once you build it. Then you actually know what really matters in the description.
Are there particular topics that you find people just fall in love with?
I’ve seen.
I tend to really enjoy reinforcement learning because it’s much more, it’s much easier
to get to a point where you feel like you created something special, like fun games
kind of things that are rewarding.
It’s rewarding.
Yeah.
As opposed to like reimplementing from scratch, more like supervised learning kind of things.
It’s yeah.
So, you know, if someone would optimize for things to be rewarding, then it feels that
the things that are somewhat generative, they have such a property.
So you have, for instance, adversarial networks, or you have even just generative language
models.
And you can even see, internally, we have seen this thing with our releases.
So we recently released two models.
There is one model called DALL-E that generates images, and there is another model called CLIP
where you provide various possibilities for what could be the answer to what is in the
picture, and it can tell you which one is the most likely.
And in some sense, in the case of the first one, DALL-E, it is very easy for you to understand
that actually there is magic going on.
And in the case of the second one, even though it is insanely powerful, and you know, people
from the vision community, as they started probing it inside, actually understood
how far it goes,
it’s difficult for a person at first to see how well it works.
And that’s the same, as you said, that in case of supervised learning models, you might
not kind of see, or it’s not that easy for you to understand the strength.
Even though you don’t believe in magic, to see the magic.
To see the magic, yeah.
It’s generative.
That’s really brilliant.
So anything that’s generative, because then you are at the core of the creation.
You get to experience creation without much effort.
Unless you have to do it from scratch, but.
And it feels that, you know, humans are wired.
There is some level of reward for creating stuff.
Yeah.
Of course, different people have a different weight on this reward.
Yeah.
In the big objective function.
In the big objective function of a person.
Of a person.
You wrote that beautiful is what you intensely pay attention to.
Even a cockroach is beautiful.
If you look very closely, can you expand on this?
What is beauty?
So what I wrote there actually corresponds to my subjective experience that I had through
extended periods of meditation.
It’s, it’s pretty crazy that at some point the meditation gets you to the place that
you have really increased focus, increased attention.
Increased attention.
And then you look at the very simple objects that were around you all the time. You can look
at the table or at a pen or at nature.
And you notice more and more details and it becomes very pleasant to look at it.
And it, once again, it kind of reminds me of my childhood.
Like, just the pure joy of being.
It’s also, I have seen even the reverse effect that by default, regardless of what we possess,
we very quickly get used to it.
And you know, you can have a very beautiful house and if you don’t put sufficient effort,
you’re just going to get used to it and it doesn’t bring any more joy,
regardless of what you have.
Yeah.
Well, I actually, I find that material possessions get in the way of that experience of pure
joy.
So I’ve always, I’ve been very fortunate to just find joy in simple things.
Just, just like you’re saying, just like, I don’t know, objects in my life, just stupid
objects like this cup, like thing, you know, just objects sounds okay.
I’m not being eloquent, but literally objects in the world, they’re just full of joy.
Because it’s like, one, I can’t believe that I’m fortunate enough to be alive
to experience these objects.
And then two, I can’t believe humans are clever enough to have built these objects.
The hierarchy of pleasure that that provides is infinite.
I mean, even if you look at the cup of water, so, you know, you see first like a level of
like a reflection of light, but then you think, you know, man, there’s like trillions upon
trillions of particles bouncing against each other.
There is also the tension on the surface so that, you know, a bug could, like, stand
on it and move around.
And you think it also has this, like, magical property that as you decrease temperature,
it actually expands in volume, which allows for, you know, lakes to freeze on
the surface and at the bottom to actually not freeze, which allows for life. Like, crazy.
Yeah.
You look in detail at some object and you think actually, you know, this table, that
was just a figment of someone’s imagination at some point.
And then there were like thousands of people involved to actually manufacture it and put
it here.
And by default, no one cares.
And then you can start thinking about evolution, how it all started from single cell organisms
that led to this table.
And these thoughts, they give me life appreciation and even lack of thoughts, just the pure raw
signal also gives the life appreciation.
See, the thing is, and then that’s coupled for me with the sadness that the whole ride
ends and perhaps is deeply coupled in that the fact that this experience, this moment
ends, gives it, gives it an intensity that I’m not sure I would otherwise have.
So in that same way, I try to meditate on my own death.
Often.
Do you think about your mortality?
Are you afraid of death?
So fear of death is like one of the most fundamental fears that each of us has.
We might be not even aware of it.
It requires looking inside to even recognize that it’s out there, and there is still, let’s
say, this property of nature that if things would last forever, then they would be also
boring to us.
The fact that things change in some way gives meaning to them.
I also, you know, found out that it seems to be very healing to people to have these
short experiences, like, I guess, psychedelic experiences in which they experience death
of self in which they let go of this fear, and then maybe that can even increase the intensity,
can even increase the appreciation of the moment.
It seems that many people, they can easily comprehend the fact that the money is finite
while they don’t see that time is finite.
I have this discussion with Ilya from time to time.
He’s like, you know, man, like the life will pass very fast.
At some point I will be 40, 50, 60, 70 and then it’s over.
This is true, which also makes me believe that, you know, every single moment
is so unique that it should be appreciated.
And this also makes me think that I should be acting on my life because otherwise it
will pass.
I also like this framework of thinking from Jeff Bezos on regret minimization, that
I would like, if I am on my deathbed, to look back on my life and not regret that
I haven’t done something.
It’s usually you might regret that you haven’t tried.
I’m fine with failing.
I haven’t tried.
What’s the Nietzsche eternal recurrence?
Try to live a life that if you had to live it infinitely many times,
you’d be okay with that kind of life.
So try to live it optimally.
I can say that it’s almost unbelievable to me where I am in my life.
I’m extremely grateful for actually people whom I met.
I would say I think that I’m decently smart and so on.
But I think that actually to a great extent where I am has to do with the people who I
met.
Would you be okay if after this conversation you died?
So if I’m dead, then I kind of don’t have a choice anymore.
So in some sense, there’s like plenty of things that I would like to try out in my life.
I feel that I’m gradually going one by one and I’m just doing them.
I think that the list will be always infinite.
Yeah, so might as well go today.
Yeah, I mean, to be clear, I’m not looking forward to dying.
I would say if there is no choice, I would accept it.
But in some sense, if there would be a choice, if there would be a possibility
to live, I would fight for living.
I find it’s more honest and real to think about, you know, dying today at the end of
the day.
That seems to me, at least to my brain, a more honest slap in the face as opposed to, I still
have 10 years. Like, today, then I’m much more about appreciating the cup and the table and
so on and less about like silly worldly accomplishments and all those kinds of things.
But we have in the company a person who, say, at some point found out that they have cancer
and that also gives, you know, huge perspective with respect to what matters now.
Yeah.
And, you know, often people in situations like that, they conclude that actually what
matters is human connection.
And love. And that’s what people conclude. Also, if you have kids, kids and family.
You, I think, tweeted, we don’t assign the minus infinity reward to our death.
Such a reward would prevent us from taking any risk.
We wouldn’t be able to cross the road in fear of being hit by a car.
So in the objective function, you mentioned fear of death might be fundamental to the
human condition.
So, as I said, let’s assume that there are, like, reward functions in our brain.
And the interesting thing is even the realization of how different reward functions can play with
your behavior.
As a matter of fact, I wouldn’t say that you should assign infinite negative reward to
anything because that messes up the math.
The math doesn’t work out.
It doesn’t work out.
And as you said, even, you know, government or some insurance companies, you said they
assign $9 million to human life.
And I’m just saying it with respect to, that might be a hard statement to ourselves, but
in some sense that there is a finite value of our own life.
I’m trying to put it from perspective of being less, of being more egoless and realizing
fragility of my own life.
And in some sense, the fear of death might prevent you from acting because anything can
cause death.
Yeah.
And I’m sure actually, if you were to put death in the objective function, there’s probably
so many aspects to death and fear of death and realization of death and mortality.
There’s just whole components of finiteness of not just your life, but every experience
and so on that you’re going to have to formalize mathematically.
And also, you know, that might lead to you spending a lot of compute cycles on, like,
deliberating this terrible future instead of experiencing now.
And then in some sense, it’s also kind of an unpleasant simulation to run in your head.
Yeah.
Do you think there’s an objective function that describes the entirety of human life?
So, you know, usually the way you ask that is what is the meaning of life?
Is there a universal objective function that captures the why of life?
So, yeah, I mean, I suspected that you would ask this question, but it’s also a question
that I ask myself many, many times.
See, I can tell you a framework that I have these days to think about this question.
So I think that fundamentally, meaning of life has to do with some of the reward functions
that we have in the brain, and they might have to do with, let’s say, for instance, curiosity
or human connection, which might mean understanding others.
It’s also possible for a person to slightly modify their reward function.
Usually they mostly stay fixed, but it’s possible to modify the reward function and you can pretty
much choose.
So in some sense, optimizing the reward functions will give you life
satisfaction.
Is there some randomness in the function?
I think when you are born, there is some randomness.
You can see that some people, for instance, they care more about building stuff.
Some people care more about caring for others.
Some people, there are all sorts of default reward functions.
And then in some sense, you can ask yourself, what is the satisfying way for you to go after
this reward function?
And you just go after this reward function.
And, you know, some people also ask, are you satisfied with your life?
And, you know, some people also ask, are these reward functions real?
I almost think about it as, let’s say, if you would have to discover mathematics, in
mathematics, you are likely to run into various objects like complex numbers or differentiation,
some other objects.
And these are very natural objects that arise.
And similarly, the reward functions that we have in our brain, they are somewhat
very natural, that, you know, there is a reward function for understanding, like comprehension,
curiosity, and so on.
So in some sense, they are natural in the same way as there are natural objects in mathematics.
Interesting.
So, you know, there’s the old sort of debate, is mathematics invented or discovered?
You’re saying reward functions are discovered.
So nature.
So nature provided some, you can still, let’s say, expand it throughout the life.
Some of the reward functions, they might be futile.
Like, for instance, there might be a reward function, maximize amount of wealth.
Yeah.
And this is more like a learned reward function.
But we know also that some reward functions, if you optimize them, you won’t be quite satisfied.
Well, I don’t know which part of your reward function resulted in you coming today, but
I am deeply appreciative that you did spend your valuable time with me.
Wojtek, it’s really fun talking to you.
You’re brilliant.
You’re a good human being.
And it’s an honor to meet you and an honor to talk to you.
Thanks for talking today, brother.
Thank you a lot, Lex.
I appreciated your questions, curiosity.
I had a great time being here.
Thanks for listening to this conversation with Wojtek Zaremba.
To support this podcast, please check out our sponsors in the description.
And now, let me leave you with some words from Arthur C. Clarke, who is the author of
2001: A Space Odyssey.
It may be that our role on this planet is not to worship God, but to create him.
Thank you for listening, and I hope to see you next time.