The following is a conversation with Jeff Hawkins.
He founded the Redwood Center
for Theoretical Neuroscience in 2002, and Numenta in 2005.
In his 2004 book, titled On Intelligence,
and in the research before and after,
he and his team have worked to reverse engineer
the neocortex, and propose artificial intelligence
architectures, approaches, and ideas
that are inspired by the human brain.
These ideas include Hierarchical Temporal Memory,
HTM, from 2004, and new work,
the Thousand Brains Theory of Intelligence
from 2017, 18, and 19.
Jeff’s ideas have been an inspiration
to many who have looked for progress
beyond the current machine learning approaches,
but they have also received criticism
for lacking a body of empirical evidence
supporting the models.
This is always a challenge when seeking more
than small incremental steps forward in AI.
Jeff is a brilliant mind, and many of the ideas
he has developed and aggregated from neuroscience
are worth understanding and thinking about.
There are limits to deep learning,
as it is currently defined.
Forward progress in AI is shrouded in mystery.
My hope is that conversations like this
can help provide an inspiring spark for new ideas.
This is the Artificial Intelligence Podcast.
If you enjoy it, subscribe on YouTube, iTunes,
or simply connect with me on Twitter
at Lex Fridman, spelled F R I D.
And now, here’s my conversation with Jeff Hawkins.
Are you more interested in understanding the human brain
or in creating artificial systems
that have many of the same qualities
but don’t necessarily require that you actually understand
the underpinning workings of our mind?
So there’s a clear answer to that question.
My primary interest is understanding the human brain.
No question about it.
But I also firmly believe that we will not be able
to create fully intelligent machines
until we understand how the human brain works.
So I don’t see those as separate problems.
I think there’s limits to what can be done
with machine intelligence if you don’t understand
the principles by which the brain works.
And so I actually believe that studying the brain
is actually the fastest way to get to machine intelligence.
And within that, let me ask the impossible question,
how do you, not define, but at least think
about what it means to be intelligent?
So I didn’t try to answer that question first.
We said, let’s just talk about how the brain works
and let’s figure out how certain parts of the brain work,
mostly the neocortex, but some other parts too,
the parts of the brain most associated with intelligence.
And let’s discover the principles of how they work.
Because intelligence isn’t just some mechanism
and it’s not just some capability.
It’s like, okay, we don’t even know
where to begin on this stuff.
And so now that we’ve made a lot of progress
on how the neocortex works, and we can talk about that,
I now have a very good idea what’s gonna be required
to make intelligent machines.
I can tell you today some of the things that
are gonna be necessary, I believe,
to create intelligent machines.
Well, so we’ll get there.
We’ll get to the neocortex and some of the theories
of how the whole thing works.
And you’re saying, as we understand more and more
about the neocortex, about our own human mind,
we’ll be able to start to more specifically define
what it means to be intelligent.
It’s not useful to really talk about that until then.
I don’t know if it’s not useful.
Look, there’s a long history of AI, as you know.
And there’s been different approaches taken to it.
And who knows, maybe they’re all useful.
So the good old fashioned AI, the expert systems,
the current convolutional neural networks,
they all have their utility.
They all have a value in the world.
But I would think almost everyone would agree
that none of them are really intelligent
in a sort of a deep way that humans are.
And so it’s just the question of how do you get
from where those systems were or are today
to where a lot of people think we’re gonna go.
And there’s a big, big gap there, a huge gap.
And I think the quickest way of bridging that gap
is to figure out how the brain does that.
And then we can sit back and look and say,
oh, which of these principles that the brain works on
are necessary and which ones are not?
Clearly, we don’t have to build this in,
and intelligent machines aren’t gonna be built
out of organic living cells.
But there’s a lot of stuff that goes on in the brain
that’s gonna be necessary.
So let me ask maybe, before we get into the fun details,
let me ask maybe a depressing or a difficult question.
Do you think it’s possible that we will never
be able to understand how our brain works,
that maybe there’s aspects to the human mind,
like we ourselves cannot introspectively get to the core,
that there’s a wall you eventually hit?
Yeah, I don’t believe that’s the case.
I have never believed that’s the case.
There’s not been a single thing humans have ever put
their minds to that we’ve said, oh, we reached the wall,
we can’t go any further.
It’s just, people keep saying that.
People used to believe that about life.
Élan vital, right?
What’s the difference between living matter
and nonliving matter? Something special
that we’d never understand.
We no longer think that.
So there’s no historical evidence that suggests this
is the case, and I just never even consider
that’s a possibility.
I would also say, today, we understand so much
about the neocortex.
We’ve made tremendous progress in the last few years
that I no longer think of it as an open question.
The answers are very clear to me.
The pieces we don’t know are clear to me,
but the framework is all there, and it’s like,
oh, okay, we’re gonna be able to do this.
This is not a problem anymore, just takes time and effort,
but there’s no mystery, a big mystery anymore.
So then let’s get into it for people like myself
who are not very well versed in the human brain,
except my own.
Can you describe to me, at the highest level,
what are the different parts of the human brain,
and then zooming in on the neocortex,
the parts of the neocortex, and so on,
a quick overview.
Yeah, sure.
The human brain, we can divide it roughly into two parts.
There’s the old parts, lots of pieces,
and then there’s the new part.
The new part is the neocortex.
It’s new because it didn’t exist before mammals.
Only mammals have a neocortex,
and in humans and primates, it’s very large.
In the human brain, the neocortex occupies
about 70 to 75% of the volume of the brain.
It’s huge.
And the old parts of the brain are,
there’s lots of pieces there.
There’s the spinal cord, and there’s the brain stem,
and the cerebellum, and the different parts
of the basal ganglia, and so on.
In the old parts of the brain,
you have the autonomic regulation,
like breathing and heart rate.
You have basic behaviors, so like walking and running
are controlled by the old parts of the brain.
All the emotional centers of the brain
are in the old part of the brain,
so when you feel anger or hungry, lust,
or things like that, those are all
in the old parts of the brain.
And we associate with the neocortex
all the things we think about as sort of
high level perception and cognitive functions,
anything from seeing and hearing and touching things
to language to mathematics and engineering
and science and so on.
Those are all associated with the neocortex,
and they’re certainly correlated.
Our abilities in those regards are correlated
with the relative size of our neocortex
compared to other mammals.
So that’s like the rough division,
and you obviously can’t understand
the neocortex completely isolated,
but you can understand a lot of it
with just a few interfaces to the old parts of the brain,
and so it gives you a system to study.
The other remarkable thing about the neocortex,
compared to the old parts of the brain,
is that the neocortex is extremely uniform.
Visibly, anatomically, it’s very uniform.
I always like to say
it’s like the size of a dinner napkin,
about two and a half millimeters thick.
Everywhere you look in that two and a half millimeters
is this detailed architecture,
and it looks remarkably the same everywhere,
and that’s across species.
A mouse versus a cat and a dog and a human.
Where if you look at the old parts of the brain,
there are lots of little pieces that do specific things.
So it’s like the old parts of our brain evolved,
like this is the part that controls heart rate,
and this is the part that controls this,
and this is this kind of thing,
and that’s this kind of thing,
and these evolved over eons, a long, long time,
and they have their specific functions,
and all of a sudden mammals come along,
and they got this thing called the neocortex,
and it got large by just replicating the same thing
over and over and over again.
This is like, wow, this is incredible.
So all the evidence we have,
and this is an idea that was first articulated
in a very cogent and beautiful argument
by a guy named Vernon Mountcastle in 1978, I think it was,
that the neocortex all works on the same principle.
So language, hearing, touch, vision, engineering,
all these things are basically underlying,
are all built on the same computational substrate.
They’re really all the same problem.
So the low level of the building blocks all look similar.
Yeah, and they’re not even that low level.
We’re not talking about like neurons.
We’re talking about this very complex circuit
that exists throughout the neocortex.
It’s remarkably similar.
It’s like, yes, you see variations of it here and there,
more of some cell types, less of others, and so on.
But what Mountcastle argued was, he says,
you know, if you take a section of neocortex,
why is one a visual area and one an auditory area?
And his answer was,
it’s because one is connected to eyes
and one is connected to ears.
Literally, you mean it’s just closest
in terms of number of connections
to the sensor? Literally, literally,
if you took the optic nerve and attached it
to a different part of the neocortex,
that part would become a visual region.
This experiment was actually done
by Mriganka Sur in developing animals,
ferrets, I believe.
And there’s a lot of evidence to this.
You know, if you take a blind person,
a person who’s blind from birth,
they’re born with a visual neocortex.
It may not get any input from the eyes
because of some congenital defect or something.
And that region comes to do something else.
It picks up another task.
So it’s this very complex thing.
It’s not like, oh, they’re all built of neurons.
No, they’re all built of this very complex circuit
and somehow that circuit underlies everything.
And so this is the, it’s called
the common cortical algorithm, if you will.
Some scientists just find it hard to believe.
They say, I can’t believe that’s true,
but the evidence is overwhelming in this case.
And so a large part of what it means
to figure out how the brain creates intelligence
and what is intelligence in the brain
is to understand what that circuit does.
If you can figure out what that circuit does,
as amazing as it is, then you can,
then you understand what all these
other cognitive functions are.
So, setting your book On Intelligence aside,
if you wrote a giant tome, a textbook
on the neocortex, and you looked at it
maybe a couple of centuries from now,
how much of what we know now would still be accurate
two centuries from now?
So how close are we in terms of understanding?
I have to speak from my own particular experience here.
So I run a small research lab here.
It’s like any other research lab.
I’m sort of the principal investigator.
There’s actually two of us
and there’s a bunch of other people.
And this is what we do.
We study the neocortex and we publish our results
and so on.
So about three years ago,
we had a real breakthrough in this field.
Just tremendous breakthrough.
We’ve now published, I think, three papers on it.
And so I have a pretty good understanding
of all the pieces and what we’re missing.
I would say that almost all the empirical data
we’ve collected about the brain is enormous.
If you don’t know the neuroscience literature,
it’s just incredibly big.
And for the most part, it’s all correct.
It’s facts and experimental results and measurements
and all kinds of stuff.
But none of that has been really assimilated
into a theoretical framework.
It’s data without a framework.
In the language of Thomas Kuhn, the historian,
it would be a sort of pre-paradigm science:
lots of data, but no way to fit it together.
I think almost all of that’s correct.
There’s just gonna be some mistakes in there.
And for the most part,
there aren’t really good cogent theories about it,
how to put it together.
It’s not like we have two or three competing good theories,
which ones are right and which ones are wrong.
It’s like, nah, people are just scratching their heads.
Some people have given up
on trying to figure out what the whole thing does.
In fact, there are very, very few labs that do what we do,
that really focus on theory
and all this unassimilated data and try to explain it.
So it’s not like we’ve got it wrong.
It’s just that we haven’t got it at all.
So it’s really, I would say, pretty early days
in terms of understanding the fundamental theory
of how our mind works.
I don’t think so.
I would have said that’s true five years ago.
So as I said,
we had some really big breakthroughs on this recently
and we started publishing papers on this.
So we’ll get to that.
But so I don’t think it’s,
I’m an optimist and from where I sit today,
most people would disagree with this,
but from where I sit today, from what I know,
it’s not super early days anymore.
We are, the way these things go
is it’s not a linear path, right?
You don’t just start accumulating
and get better and better and better.
No, all this stuff you’ve collected,
none of it makes sense.
All these different things are just sort of around.
And then you’re gonna have some breaking points
where all of a sudden, oh my God, now we got it right.
That’s how it goes in science.
And I personally feel like we went through
that breaking point, that big thing,
a couple of years ago.
So we can talk about that.
Time will tell if I’m right,
but I feel very confident about it.
That’s why I’m willing to say it on tape like this.
At least very optimistic.
So let’s, before those few years ago,
let’s take a step back to HTM,
the hierarchical temporal memory theory,
which you first proposed in On Intelligence
and went through a few different generations.
Can you describe what it is,
how it evolved through the three generations
since you first put it on paper?
Yeah, so one of the things that neuroscientists
just sort of missed for many, many years,
and especially people who were thinking about theory,
was the nature of time in the brain.
Brains process information through time.
The information coming into the brain
is constantly changing.
The patterns from my speech right now,
if you were listening to it at normal speed,
would be changing on your ears
about every 10 milliseconds or so, you’d have a change.
This constant flow, when you look at the world,
your eyes are moving constantly,
three to five times a second,
and the input’s completely changing.
If I were to touch something like a coffee cup,
as I move my fingers, the input changes.
So this idea that the brain works on time changing patterns
is almost completely, or was almost completely missing
from a lot of the basic theories,
like theories of vision and so on.
It’s like, oh no, we’re gonna put this image
in front of you and flash it and say, what is it?
Convolutional neural networks work that way today, right?
Classify this picture.
But that’s not what vision is like.
Vision is this sort of crazy time based pattern
that’s going all over the place,
and so is touch and so is hearing.
So the first part of hierarchical temporal memory
was the temporal part.
It’s to say, you won’t understand the brain,
nor will you understand intelligent machines
unless you’re dealing with time based patterns.
The second thing, the memory component of it,
was to say that we aren’t just processing input,
we learn a model of the world.
And the memory stands for that model.
The point is, the neocortex learns a model of the world.
it learns a model of the world.
We have to store things, our experiences,
in a form that leads to a model of the world.
So we can move around the world,
we can pick things up and do things and navigate
and know what’s going on.
So that’s what the memory referred to.
And many people just, they were thinking about
like certain processes without memory at all.
They’re just like processing things.
And then finally, the hierarchical component
was a reflection of the fact that the neocortex,
although it’s this uniform sheet of cells,
different parts of it project to other parts,
which project to other parts.
And there is a sort of rough hierarchy in terms of that.
So the hierarchical temporal memory is just saying,
look, we should be thinking about the brain
as time based, model memory based,
and hierarchical processing.
And that was a placeholder for a bunch of components
that we would then plug into that.
We still believe all those things I just said,
but we now know so much more that I’ve stopped using
the term hierarchical temporal memory,
because it’s insufficient to capture the stuff we know.
So again, it’s not incorrect; it’s just that I now know more
and would rather describe it more accurately.
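As a very loose sketch of the temporal and memory components described here, this is what a memory that learns time-based transitions and predicts the next input might look like; it is an illustration of the idea only, not Numenta's HTM algorithm, and the SequenceMemory class is hypothetical:

```python
# Minimal sketch: learn transitions between consecutive inputs,
# then predict what should come next. Illustrative only; this is
# not Numenta's HTM algorithm.
from collections import defaultdict

class SequenceMemory:
    def __init__(self):
        self.transitions = defaultdict(set)  # pattern -> patterns seen next
        self.prev = None

    def observe(self, pattern):
        """Feed one time step of input and learn the transition."""
        if self.prev is not None:
            self.transitions[self.prev].add(pattern)
        self.prev = pattern

    def predict(self):
        """Return the set of patterns expected at the next time step."""
        return self.transitions.get(self.prev, set())

memory = SequenceMemory()
for note in ["C", "E", "G", "C", "E", "G"]:  # a repeating 'melody'
    memory.observe(note)
print(memory.predict())  # {'C'} -- after G, the model expects C
```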
Yeah, so basically, we could think of HTM
as emphasizing that there are three aspects of intelligence
that are important to think about,
whatever the eventual theory converges to.
So in terms of time, how do you think of the nature of time
across different timescales?
So you mentioned things changing,
sensory inputs changing every 10, 20 milliseconds.
What about every few minutes, every few months and years?
Well, if you think about it as a neuroscience problem,
a brain problem, neurons themselves can stay active
for certain periods of time. There are parts of the brain
where they stay active for minutes.
You could hold a certain perception or an activity
for a certain period of time,
but most of them don’t last that long.
And so if you think about your thoughts
are the activity of neurons,
if you’re gonna wanna invoke something
that happened a long time ago,
even just this morning, for example,
the neurons haven’t been active throughout that time.
So you have to store that.
So if I ask you, what did you have for breakfast today?
That is memory, that is something you’ve built into your model
of the world now, you remember that.
And that memory is in the synapses,
is basically in the formation of synapses.
And so you’re sliding into, you know,
the different timescales.
There are timescales at which we’re understanding
my language and moving about and seeing things rapidly
and over time, that’s the timescales
of activities of neurons.
But if you wanna get in longer timescales,
then it’s more memory.
And we have to invoke those memories to say,
oh yes, well now I can remember what I had for breakfast
because I stored that someplace.
I may forget it tomorrow, but I’d store it for now.
So does memory also need to have,
so the hierarchical aspect of reality
is not just about concepts, it’s also about time?
Do you think of it that way?
Yeah, time is infused in everything.
It’s like you really can’t separate it out.
If I ask you, you know,
how does the brain learn a model of this coffee cup here?
I have a coffee cup and I’m holding the coffee cup.
I say, well, time is not an inherent property
of the model I have of this cup,
whether it’s a visual model or a tactile model.
I can sense it through time,
but the model itself doesn’t really have much time.
If I asked you,
well, what is the model of my cell phone?
My brain has learned a model of the cell phone.
So if you have a smartphone like this,
and I said, well, this has time aspects to it.
I have expectations when I turn it on,
what’s gonna happen, what or how long it’s gonna take
to do certain things, if I bring up an app,
what sequences occur.
And it’s like melodies in the world, you know?
Melody has a sense of time.
So many things in the world move and act,
and there’s a sense of time related to them.
Some don’t, but most things do actually.
So it’s sort of infused throughout the models of the world.
You build a model of the world,
you’re learning the structure of the objects in the world,
and you’re also learning how those things change
through time.
Okay, so it really is just a fourth dimension
that’s infused deeply, and you have to make sure
that your models of intelligence incorporate it.
So, like you mentioned, the state of neuroscience
is deeply empirical, a lot of data collection.
It’s, you know, that’s where it is.
You mentioned Thomas Kuhn, right?
Yeah.
And then you’re proposing a theory of intelligence,
and which is really the next step,
the really important step to take,
but why is HTM, or what we’ll talk about soon,
the right theory?
So is it more in the, is it backed by intuition?
Is it backed by evidence?
Is it backed by a mixture of both?
Is it kind of closer to where string theory is in physics,
where there’s mathematical components
which show that, you know what,
it fits together too well
for it not to be true, which is where string theory is.
Is that how you’re seeing it?
It’s a mixture of all those things,
although definitely where we are right now
is definitely much more on the empirical side
than, let’s say, string theory.
The way this goes about, we’re theorists, right?
So we look at all this data, and we’re trying to come up
with some sort of model that explains it, basically,
and, unlike string theory,
there are vastly greater amounts of empirical data here
than, I think, most physicists deal with.
And so our challenge is to sort through that
and figure out what kind of constructs would explain this.
And when we have an idea,
you come up with a theory of some sort,
you have lots of ways of testing it.
First of all, there are 100 years
of unassimilated empirical data from neuroscience.
So we go back and read papers,
and we say, oh, did someone find this already?
We can predict X, Y, and Z,
and maybe no one’s even talked about it
since 1972 or something, but we go back and find that,
and we say, oh, either it can support the theory
or it can invalidate the theory.
And we say, okay, we have to start over again.
Oh, no, it’s supportive, let’s keep going with that one.
So the way I kind of view it, when we do our work,
we look at all this empirical data,
and what I call it is a set of constraints.
We’re not interested in something
that’s biologically inspired.
We’re trying to figure out how the actual brain works.
So every piece of empirical data
is a constraint on a theory.
If you have the correct theory,
it needs to explain every one of them, right?
So we have this huge number of constraints on the problem,
which initially makes it very, very difficult.
If you don’t have many constraints,
you can make up stuff all day long.
You can say, oh, here’s an answer on how you can do this,
you can do that, you can do this.
But if you consider all biology as a set of constraints,
all neuroscience as a set of constraints,
and even if you’re working in one little part
of the neocortex, for example,
there are hundreds and hundreds of constraints.
These are empirical constraints
that it’s very, very difficult initially
to come up with a theoretical framework for that.
But when you do, and it solves all those constraints
at once, you have a high confidence
that you got something close to correct.
It’s just mathematically almost impossible not to be.
So that’s the curse and the advantage of what we have.
The curse is we have to solve,
we have to meet all these constraints, which is really hard.
But when you do meet them,
then you have a great confidence
that you’ve discovered something.
In addition, then we work with scientific labs.
So we’ll say, oh, there’s something we can’t find,
we can predict something,
but we can’t find it anywhere in the literature.
So then, with people we’ve collaborated with,
sometimes they’ll say, you know what?
I have some collected data, which I didn’t publish,
but we can go back and look at it
and see if we can find that,
which is much easier than designing a new experiment.
You know, neuroscience experiments take a long time, years.
Although some people are doing that now too.
So, but between all of these things,
I think it’s a reasonable,
actually a very, very good approach.
We are blessed with the fact that we can test our theories
out the yin yang here because there’s so much
unassimilated data, and we can also falsify our theories
very easily, which we do often.
So it’s kind of reminiscent of Copernicus,
you know, when you figure out
that the sun’s at the center of the solar system
as opposed to earth, the pieces just fall into place.
Yeah, I think that’s the general nature of aha moments.
It’s Copernicus, you could say the same thing about Darwin,
you could say the same thing about the double helix:
people have been working on a problem for so long
and have all this data and they can’t make sense of it,
they can’t make sense of it.
But when the answer comes to you
and everything falls into place,
it’s like, oh my gosh, that’s it.
That’s got to be right.
I asked both Jim Watson and Francis Crick about this.
I asked them, you know, when you were working on
trying to discover the structure of the double helix,
and when you came up with the sort of the structure
that ended up being correct, but it was sort of a guess,
you know, it wasn’t really verified yet.
I said, did you know that it was right?
And they both said, absolutely.
So we absolutely knew it was right.
And it doesn’t matter if other people didn’t believe it
or not, we knew it was right.
They’d come around to it
and agree with it eventually anyway.
And that’s the kind of thing you hear a lot with scientists
who really are studying a difficult problem.
And I feel that way too about our work.
Have you talked to Crick or Watson about the problem
you’re trying to solve, of finding the DNA of the brain?
Yeah, in fact, Francis Crick was very interested in this
in the latter part of his life.
And in fact, I got interested in brains
by reading an essay he wrote in 1979
called Thinking About the Brain.
And that was when I decided I’m gonna leave my profession
of computers and engineering and become a neuroscientist.
Just reading that one essay from Francis Crick.
I got to meet him later in life.
I spoke at the Salk Institute and he was in the audience.
And then I had a tea with him afterwards.
He was interested in a different problem.
He was focused on consciousness.
The easy problem, right?
Well, I think it’s the red herring.
And so we weren’t really overlapping a lot there.
Jim Watson, who’s still alive,
is also interested in this problem.
And he was, when he was director
of the Cold Spring Harbor Laboratories,
he was really sort of behind moving in the direction
of neuroscience there.
And so he had a personal interest in this field.
And I have met with him numerous times.
And in fact, the last time was a little bit over a year ago,
I gave a talk at Cold Spring Harbor Labs
about the progress we were making in our work.
And it was a lot of fun because he said,
well, you wouldn’t be coming here
unless you had something important to say.
So I’m gonna go attend your talk.
So he sat in the very front row.
Next to him was the director of the lab, Bruce Stillman.
So these guys are in the front row of this auditorium.
Nobody else in the auditorium wants to sit in the front row
because there’s Jim Watson and there’s the director.
And I gave a talk and then I had dinner with him afterwards.
But there’s a great picture taken by my colleague Subutai Ahmad
where I’m up there sort of like screaming the basics
of this new framework we have.
And Jim Watson’s on the edge of his chair.
He’s literally on the edge of his chair,
like intently staring up at the screen.
And when he discovered the structure of DNA,
the first public talk he gave
was at Cold Spring Harbor Labs.
And there’s a picture, a famous picture
of Jim Watson standing at the whiteboard,
pointing at the double helix with his pointer.
And it actually looks a lot like the picture of me.
So it was sort of funny.
There I am talking about the brain,
and there’s Jim Watson staring intently at it.
And of course, whatever, 60 years earlier,
he was standing there pointing at the double helix.
That’s one of the great discoveries in all of,
whatever, biology, all of science: DNA.
So it’s funny that there’s echoes of that in your presentation.
Do you think, in terms of evolutionary timeline and history,
the development of the neocortex was a big leap?
Or is it just a small step?
So like, if we ran the whole thing over again,
from the birth of life on Earth,
how likely would we develop the mechanism of the neocortex?
Okay, well those are two separate questions.
One is, was it a big leap?
And one was how likely it is, okay?
They’re not necessarily related.
Maybe correlated.
Maybe correlated, maybe not.
And we don’t really have enough data
to make a judgment about that.
I would say definitely it was a big leap.
And I can tell you why.
I don’t think it was just another incremental step.
I’ll get to that in a moment.
I don’t really have any idea how likely it is.
If we look at evolution,
we have one data point, which is Earth, right?
Life formed on Earth billions of years ago,
whether it was created here
or introduced here, we don’t really know,
but it was here early.
It took a long, long time to get to multicellular life.
And then for multicellular life,
it took a long, long time to get the neocortex.
And we’ve only had the neocortex for a few hundred thousand years.
So that’s like nothing, okay?
So is it likely?
Well, it certainly isn’t something
that happened right away on Earth.
And there were multiple steps to get there.
So I would say it’s probably not gonna be something
that would happen instantaneously
on other planets that might have life.
It might take several billion years on average.
Is it likely?
I don’t know, but you’d have to survive
for several billion years to find out.
Probably.
Is it a big leap?
Yeah, I think it is a qualitative difference
from all other evolutionary steps.
I can try to describe that if you’d like.
Sure, in which way?
Yeah, I can tell you how.
Let me start with a little preface.
Many of the things that humans are able to do
do not have obvious survival advantages.
We create music, is that,
is there really a survival advantage to that?
Maybe, maybe not.
What about mathematics?
Is there a real survival advantage to mathematics?
Well, you could stretch it.
You can try to figure these things out, right?
But most of evolutionary history,
everything had immediate survival advantages to it.
So, I’ll tell you a story, which I like,
may or may not be true, but the story goes as follows.
Organisms have been evolving for,
since the beginning of life here on Earth,
and adding this sort of complexity onto that,
and this sort of complexity onto that,
and the brain itself is evolved this way.
In fact, there are old parts, and older parts,
and older, older parts of the brain,
and evolution just keeps adding new things on top,
and we keep adding capabilities.
When we got to the neocortex,
initially it had a very clear survival advantage
in that it produced better vision,
and better hearing, and better touch,
and so on.
But what I think happened is that evolution discovered,
and this is in our recent theories,
it took a mechanism that evolved a long time ago
for navigating in the world, for knowing where you are.
These are the so called grid cells and place cells
of an old part of the brain.
And it took that mechanism for building maps of the world,
and knowing where you are on those maps,
and how to navigate those maps,
and turned it into a sort of slimmed-down,
idealized version of it.
And that idealized version could now apply
to building maps of other things.
Maps of coffee cups, and maps of phones,
maps of mathematics.
Concepts almost.
Concepts, yes, and not just almost, exactly.
And it just started replicating this stuff, right?
Just more, and more, and more of it.
So we went from being sort of dedicated purpose
neural hardware to solve certain problems
that are important to survival,
to a general purpose neural hardware
that could be applied to all problems.
And now it’s escaped the orbit of survival.
We are now able to apply it to things
which we find enjoyment,
but aren’t really clearly survival characteristics.
And that seems to only have happened in humans,
to a large extent.
And so that’s what’s going on:
we’ve sort of escaped the gravity of evolutionary pressure,
in some sense, in the neocortex.
And it now does things that are really interesting,
discovering models of the universe,
which may not really help us survive.
Does it matter?
How does it help us survive,
knowing that there might be multiverses,
or knowing the age of the universe,
or how various stellar things occur?
It doesn’t really help us survive at all.
But we enjoy it, and that’s what happened.
Or at least not in the obvious way, perhaps.
It is required,
if you look at the entire universe in an evolutionary way,
it’s required for us to do interplanetary travel,
and therefore survive past our own sun.
But you know, let’s not get too far ahead of ourselves.
Yeah, but evolution works at one time frame,
it’s survival, if you think of survival of the phenotype,
survival of the individual.
What you’re talking about there spans well beyond that.
So there’s no genetic,
I’m not transferring any genetic traits to my children
that are gonna help them survive better on Mars.
Totally different mechanism, that’s right.
So let’s get into the new work, as you’ve mentioned,
this idea of the, I don’t know if you have a nice name,
thousand…
We call it the thousand brains theory of intelligence.
I like it.
Can you talk about this idea of a spatial view of concepts
and so on?
Yeah, so can I just describe sort of the,
there’s an underlying core discovery
from which everything else comes.
It’s very simple; this is really what happened.
We were deep into problems about understanding
how we build models of stuff in the world
and how we make predictions about things.
And I was holding a coffee cup just like this in my hand.
And my finger was touching the side, my index finger.
And then I moved it to the top
and I was gonna feel the rim at the top of the cup.
And I asked myself a very simple question.
I said, well, first of all,
I know that my brain predicts what it’s gonna feel
before it touches it.
You can just think about it and imagine it.
And so we know that the brain’s making predictions
all the time.
So the question is, what does it take to predict that?
And there’s a very interesting answer.
First of all, it says the brain has to know
it’s touching a coffee cup.
It has to have a model of a coffee cup.
It needs to know where the finger currently is
on the cup relative to the cup.
Because when I make a movement,
it needs to know where it’s going to be on the cup
after the movement is completed relative to the cup.
And then it can make a prediction
about what it’s gonna sense.
So this told me that the neocortex,
which is making this prediction,
needs to know that it’s touching a cup.
And it needs to know the location of my finger
relative to that cup in a reference frame of the cup.
It doesn’t matter where the cup is relative to my body.
It doesn’t matter its orientation.
None of that matters.
It’s where my finger is relative to the cup,
which tells me then that the neocortex
has a reference frame that’s anchored to the cup.
Because otherwise I wouldn’t be able to say the location
and I wouldn’t be able to predict my new location.
And then very quickly, almost instantly, we can say,
well, every part of my skin could touch this cup.
And therefore every part of my skin is making predictions
and every part of my skin must have a reference frame
that it’s using to make predictions.
So the big idea is that throughout the neocortex,
everything is being stored
and referenced in reference frames.
You can think of them like XYZ reference frames,
but they’re not like that.
We know a lot about the neural mechanisms for this,
but the brain thinks in reference frames.
And as an engineer, if you’re an engineer,
this is not surprising.
You’d say, if I wanted to build a CAD model
of the coffee cup, well, I would bring it up
in some CAD software, and I would assign
some reference frame and say this feature is
at this location, and so on.
But the idea that this is occurring
everywhere throughout the neocortex, that was a novel idea.
And then a zillion things fell into place after that,
a zillion.
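As a minimal sketch of that prediction loop, assuming a hypothetical object model that stores features at locations in the cup's own reference frame (the coordinates and feature names here are invented for illustration):

```python
# Sketch of prediction with an object-anchored reference frame.
# Features are stored at locations relative to the cup itself, so
# where the cup sits in the room never enters the computation.

# hypothetical model of a cup: location in cup coordinates -> feature
cup_model = {
    (0, 0, 5): "smooth side",
    (0, 0, 10): "rim",
    (3, 0, 5): "handle",
}

def predict_sensation(model, finger_location, movement):
    """Predict what the finger will feel after a movement,
    with everything expressed relative to the object."""
    new_location = tuple(p + m for p, m in zip(finger_location, movement))
    return new_location, model.get(new_location, "unknown")

# finger on the side of the cup, about to move up to the rim
new_loc, prediction = predict_sensation(cup_model, (0, 0, 5), (0, 0, 5))
print(new_loc, prediction)  # (0, 0, 10) rim
```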
So now we think about the neocortex
as processing information quite differently
than we used to do it.
We used to think about the neocortex
as processing sensory data and extracting features
from that sensory data and then extracting features
from the features, very much like a deep learning network
does today.
But that’s not how the brain works at all.
The brain works by assigning everything,
every input, everything to reference frames.
And there are thousands, hundreds of thousands
of them active at once in your neocortex.
It’s a surprising thing to think about,
but once you sort of internalize this,
you understand that it explains
almost all the mysteries we’ve had about this structure.
So one of the consequences of that
is that every small part of the neocortex,
say a millimeter square, and there’s 150,000 of those.
So it’s about 150,000 square millimeters.
If you take every little square millimeter of the cortex,
it’s got some input coming into it
and it’s gonna have reference frames
where it’s assigned that input to.
And each square millimeter can learn
complete models of objects.
So what do I mean by that?
If I’m touching the coffee cup,
well, if I just touch it in one place,
I can’t learn what this coffee cup is
because I’m just feeling one part.
But if I move it around the cup
and touch it at different areas,
I can build up a complete model of the cup
because I’m now filling in that three dimensional map,
which is the coffee cup.
I can say, oh, what am I feeling
at all these different locations?
That’s the basic idea, it’s more complicated than that.
But so through time, and we talked about time earlier,
through time, even a single column,
a single part of the cortex,
which is only looking at a small part of the world,
can build up a complete model of an object.
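A tiny sketch of that accumulation, under the assumption that a column receives one (location, feature) pair per movement; the Column class is a made-up illustration, not the neural mechanism:

```python
# Sketch: one cortical column integrating (location, feature) pairs
# over successive touches to build a complete object model.

class Column:
    def __init__(self):
        self.model = {}  # location in the object's reference frame -> feature

    def touch(self, location, feature):
        """One sensation at one location, accumulated over time."""
        self.model[location] = feature

column = Column()
# a single finger moving around the cup, one touch at a time
for location, feature in [((0, 0, 0), "base"),
                          ((0, 0, 5), "smooth side"),
                          ((0, 0, 10), "rim")]:
    column.touch(location, feature)
print(len(column.model))  # 3 -- the model fills in with every movement
```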
And so if you think about the part of the brain,
which is getting input from all my fingers,
so they’re spread across the top of your head here.
This is the somatosensory cortex.
There’s columns associated
with all the different areas of my skin.
And what we believe is happening
is that all of them are building models of this cup,
every one of them, or of other things.
Not every column or every part of the cortex
builds models of everything,
but they’re all building models of something.
And so you have, so when I touch this cup with my hand,
there are multiple models of the cup being invoked.
If I look at it with my eyes,
there are, again, many models of the cup being invoked,
because each part of the visual system,
the brain doesn’t process an image.
That’s a misleading idea.
It’s just like your fingers touching the cup,
so different parts of my retina
are looking at different parts of the cup.
And thousands and thousands of models of the cup
are being invoked at once.
And they’re all voting with each other,
trying to figure out what’s going on.
So that’s why we call it the thousand brains theory
of intelligence, because there isn’t one model of a cup.
There are thousands of models of this cup.
There are thousands of models of your cellphone
and about cameras and microphones and so on.
It’s a distributed modeling system,
which is very different
than the way people have thought about it.
And so that’s a really compelling and interesting idea.
I have two first questions.
So one, on the ensemble part of everything coming together,
you have these thousand brains.
How do you know which one has done the best job
of forming the…
Great question.
Let me try to explain it.
There’s a problem that’s known in neuroscience
called the sensor fusion problem.
Yes.
And so the idea is there’s something like,
oh, the image comes from the eye.
There’s a picture on the retina
and then it gets projected to the neocortex.
Oh, by now it’s all spread out all over the place
and it’s kind of squirrely and distorted
and pieces are all over the…
It doesn’t look like a picture anymore.
When does it all come back together again?
Or you might say, well, yes,
but I also have sounds or touches associated with the cup.
So I’m seeing the cup and touching the cup.
How do they get combined together again?
So it’s called the sensor fusion problem.
As if all these disparate parts
have to be brought together into one model someplace.
That’s the wrong idea.
The right idea is that you’ve got all these guys voting.
There’s auditory models of the cup.
There’s visual models of the cup.
There’s tactile models of the cup.
In the vision system,
there might be ones that are more focused on black and white
and ones focusing on color.
It doesn’t really matter.
There’s just thousands and thousands of models of this cup.
And they vote.
They don’t actually come together in one spot.
Just literally think of it this way.
Imagine you have these columns
that are like about the size of a little piece of spaghetti.
Like two and a half millimeters tall
and about a millimeter wide.
They’re not physical, but you could think of them that way.
And each one’s trying to guess what this thing is
that it’s seeing or touching.
Now, they can do a pretty good job
if they’re allowed to move over time.
So I can reach my hand into a black box
and move my finger around an object.
And if I touch enough spaces, I go, okay,
now I know what it is.
But often we don’t do that.
Often I can just reach and grab something with my hand
all at once and I get it.
Or if I had to look through the world through a straw,
so I’m only invoking one little column,
I can only see part of something
because I have to move the straw around.
But if I open my eyes, I see the whole thing at once.
So what we think is going on
is all these little pieces of spaghetti,
if you will, all these little columns in the cortex,
are all trying to guess what it is that they’re sensing.
They’ll do a better guess if they have time
and can move over time.
So if I move my eyes, I move my fingers.
But if they don’t, they have a poor guess.
It’s a probabilistic guess of what they might be touching.
Now, imagine they can post their probability
at the top of a little piece of spaghetti.
Each one of them says,
I think, and it’s not really a probability distribution.
It’s more like a set of possibilities.
In the brain, it doesn’t work as a probability distribution.
It works more like what we call a union.
So one column says,
I think it could be a coffee cup,
a soda can, or a water bottle.
And another column says,
I think it could be a coffee cup
or a telephone or a camera or whatever, right?
And all these guys are saying what they think it might be.
And there’s these long range connections
in certain layers in the cortex.
So some cell types in some layers
in each column send projections across the brain,
and that’s where the voting occurs.
And so there’s a simple associative memory mechanism.
We’ve described this in a recent paper
and we’ve modeled this that says,
they can all quickly settle on the only
or the one best answer for all of them.
If there is a single best answer,
they all vote and say, yep, it’s gotta be the coffee cup.
And at that point, they all know it’s a coffee cup.
And at that point, everyone acts as if it’s a coffee cup.
They’re like, yep, we know it’s a coffee cup,
even though I’ve only seen one little piece of this world,
I know it’s a coffee cup I’m touching
or I’m seeing or whatever.
And so you can think of all these columns
are looking at different parts in different places,
different sensory input, different locations,
they’re all different.
But this layer that’s doing the voting, it solidifies.
It’s just like it crystallizes and says,
oh, we all know what we’re doing.
And so you don’t bring these models together in one model,
you just vote and there’s a crystallization of the vote.
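A minimal sketch of that voting, assuming each column posts a union (a set of candidate objects) and settling amounts to intersecting them; the real mechanism is an associative memory over long-range connections, which this does not model:

```python
# Sketch of voting across columns: each column posts a union of
# objects consistent with what it alone has sensed; the vote
# settles on what is common to all of them.

def vote(column_guesses):
    """Intersect the unions posted by all columns."""
    return set.intersection(*column_guesses)

columns = [
    {"coffee cup", "soda can", "water bottle"},  # one finger's guess
    {"coffee cup", "telephone", "camera"},       # another finger's guess
    {"coffee cup", "soda can"},                  # a patch of retina's guess
]
print(vote(columns))  # {'coffee cup'} -- the one answer they all agree on
```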
Great, that’s at least a compelling way
to think about the way you form a model of the world.
Now, you talk about a coffee cup.
Do you see this, as far as I understand,
you are proposing this as well,
that this extends to much more than coffee cups?
Yeah.
It does.
Or at least beyond the physical world,
it extends to the world of concepts.
Yeah, it does.
Well, first, the primary evidence for that
is that the regions of the neocortex
that are associated with language
or high level thought or mathematics
or things like that,
they look like the regions of the neocortex
that process vision, hearing, and touch.
They don’t look any different.
Or they look only marginally different.
And so one would say, well, if Vernon Mountcastle,
who proposed that all the parts of the neocortex
do the same thing, if he’s right,
then the parts that are doing language
or mathematics or physics
are working on the same principle.
They must be working on the principle of reference frames.
So that’s a little odd thought.
But of course, we had no prior idea
how these things happen.
So let’s go with that.
And we, in our recent paper,
we talked a little bit about that.
I’ve been working on it more since.
I have better ideas about it now.
I’m sitting here very confident
that that’s what’s happening.
And I can give you some examples
that help you think about that.
It’s not that we understand it completely,
but I understand it better than I’ve described it
in any paper so far.
But we did put that idea out there.
It says, okay,
it’s a good place to start, you know?
And the evidence would suggest it’s how it’s happening.
And then we can start tackling that problem
one piece at a time.
Like, what does it mean to do high level thought?
What does it mean to do language?
How would that fit into a reference frame framework?
Yeah, so there’s a,
I don’t know if you could tell me if there’s a connection,
but there’s an app called Anki
that helps you remember different concepts.
And they talk about like a memory palace
that helps you remember completely random concepts
by trying to put them in a physical space in your mind
and putting them next to each other.
It’s called the method of loci.
Loci, yeah.
For some reason, that seems to work really well.
Now, that’s a very narrow kind of application
of just remembering some facts.
But that’s a very, very telling one.
Yes, exactly.
So this seems like you’re describing a mechanism
why this seems to work.
So basically, what we think is going on
is that all the things you know, all concepts, all ideas,
words, everything you know, are stored in reference frames.
And so if you want to remember something,
you have to basically navigate through a reference frame
the same way a rat navigates through a maze
and the same way my finger navigates to this coffee cup.
You are moving through some space.
And so if you have a random list of things
you were asked to remember,
you can assign them to a reference frame
you already know very well, say your house, right?
And the idea of the method of loci is you can say,
okay, in my lobby, I’m going to put this thing.
And then the bedroom, I put this one.
I go down the hall, I put this thing.
And then you want to recall those facts
or recall those things.
You just walk mentally, you walk through your house.
You’re mentally moving through a reference frame
that you already had.
And that tells you,
there’s two things that are really important about that.
It tells us the brain prefers to store things
in reference frames.
And that the method of recalling things
or thinking, if you will,
is to move mentally through those reference frames.
You could move physically through some reference frames,
like I could physically move through the reference frame
of this coffee cup.
I can also mentally move through the reference frame
of the coffee cup, imagining me touching it.
But I can also mentally move through my house.
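As a toy sketch of the method of loci, assuming a fixed walk through the house; the rooms and items are invented for illustration:

```python
# Sketch of the method of loci: assign arbitrary items to locations
# in a reference frame you already know (your house), then recall
# them by mentally walking through it in order.

house_path = ["lobby", "bedroom", "hallway", "kitchen"]  # a fixed walk

def memorize(items):
    """Assign each item to the next location along the walk."""
    return dict(zip(house_path, items))

def recall(palace):
    """Walk the house in order, reading off whatever is stored there."""
    return [palace[room] for room in house_path if room in palace]

palace = memorize(["eggs", "passport", "umbrella"])
print(recall(palace))  # ['eggs', 'passport', 'umbrella'], in walk order
```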
And so now we can ask ourselves,
are all concepts stored this way?
There was some recent research using human subjects
in fMRI, and I’m going to apologize for not knowing
the name of the scientists who did this.
But what they did is they put humans in this fMRI machine,
which is one of these imaging machines.
And they gave the humans tasks to think about birds.
So they had different types of birds,
and birds that look big and small,
and long necks and long legs, things like that.
It was a very clever experiment.
What they could tell from the fMRI,
when the humans were thinking about the birds,
was that the knowledge of birds
was arranged in a reference frame,
similar to the ones that are used
when you navigate in a room.
These are called grid cells,
and there are grid-cell-like patterns of activity
in the neocortex when they do this.
So it’s a very clever experiment.
And what it basically says,
that even when you’re thinking about something abstract,
and you’re not really thinking about it as a reference frame,
it tells us the brain is actually using a reference frame.
And it’s using the same neural mechanisms.
These grid cells, which exist in an old part of the brain,
the entorhinal cortex, are the same basic neural mechanism
that we propose is used, in a similar form,
throughout the neocortex.
Nature preserved this interesting way
of creating reference frames.
And so now they have empirical evidence
that when you think about concepts like birds,
that you’re using reference frames
that are built on grid cells.
So that’s similar to the method of loci,
but in this case, the birds are related.
So they create their own reference frame,
which is consistent with bird space.
And when you think about something, you go through that.
You can make the same example,
let’s take mathematics.
Let’s say you wanna prove a conjecture.
What is a conjecture?
A conjecture is a statement you believe to be true,
but you haven’t proven it.
And so it might be an equation.
I wanna show that this is equal to that.
And you have some places you start with.
You say, well, I know this is true,
and I know this is true.
And I think that maybe to get to the final proof,
I need to go through some intermediate results.
What I believe is happening is literally these equations
or these points are assigned to a reference frame,
a mathematical reference frame.
And when you do mathematical operations,
a simple one might be multiply or divide,
or it might be a Laplace transform or something else.
That is like a movement in the reference frame of the math.
And so you’re literally trying to discover a path
from one location to another location
in a space of mathematics.
And if you can get to these intermediate results,
then you know your map is pretty good,
and you know you’re using the right operations.
Much of what we think about is solving hard problems
is designing the correct reference frame for that problem,
figuring out how to organize the information
and what behaviors I wanna use in that space
to get me there.
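As a loose analogy only, that search for a path of operations can be written as graph search; the statements here are just numbers and the operations are made up, so this illustrates movement through a space rather than real theorem proving:

```python
# Loose sketch: proving a conjecture as finding a path through a
# space, where each allowed operation is a movement. Numbers stand
# in for statements; the operations are invented for illustration.
from collections import deque

def find_path(start, goal, operations):
    """Breadth-first search for a sequence of movements from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for name, op in operations:
            nxt = op(state)
            if nxt not in seen and abs(nxt) < 1000:  # keep the space bounded
                seen.add(nxt)
                queue.append((nxt, path + [name]))
    return None

ops = [("double", lambda x: 2 * x), ("add one", lambda x: x + 1)]
print(find_path(1, 10, ops))  # ['double', 'double', 'add one', 'double']
```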
Yeah, so to dig into this idea of a reference frame,
with the math, you start with a set of axioms
to try to get to proving the conjecture.
Can you try to describe, maybe take a step back,
how you think of the reference frame in that context?
Is it the reference frame that the axioms live in?
Is it the reference frame that might contain everything?
Is it a changing thing as you go?
You have many, many reference frames.
I mean, in fact, the way the theory,
the thousand brains theory of intelligence says
that every single thing in the world
has its own reference frame.
So every word has its own reference frames.
And we can talk about this.
The mathematics works out;
it’s no problem for neurons to do this.
But how many reference frames does a coffee cup have?
Well, it’s on a table.
Let’s say you ask how many reference frames
could a column in my finger
that’s touching the coffee cup have?
Because there are many, many copies,
many, many models of the coffee cup.
There is no one model of a coffee cup.
There are many models of a coffee cup.
And you could say, well,
how many different things can my finger learn?
Is this the question you want to ask?
Imagine that every concept, every idea,
everything you’ve ever known, anything about which you can say,
I know that thing, has a reference frame
associated with it.
And what we do when we build composite objects is
we assign reference frames
to points in other reference frames.
So my coffee cup has multiple components to it.
It’s got a rim, it’s got a cylinder, it’s got a handle.
And those things have their own reference frames
and they’re assigned to a master reference frame,
which is called this cup.
And now I have this Numenta logo on it.
Well, that’s something that exists elsewhere in the world.
It’s its own thing.
So it has its own reference frame.
So we now have to say,
how can I assign the Numenta logo reference frame
onto the cylinder or onto the coffee cup?
So it’s all, we talked about this in the paper
that came out in December of this last year.
The idea of how you can assign reference frames
to reference frames, how neurons could do this.
So, well, my question is,
even though you mentioned reference frames a lot,
I almost feel it’s really useful to dig into
how you think of what a reference frame is.
I mean, it was already helpful for me to understand
that you think of reference frames
as something there is a lot of.
Okay, so let’s just say that we’re gonna have
some neurons in the brain, not many, actually,
10,000, 20,000 are gonna create
a whole bunch of reference frames.
What does it mean?
What is a reference frame?
First of all, these reference frames are different
than the ones you might be used to.
We know lots of reference frames.
For example, we know the Cartesian coordinates, X, Y, Z,
that’s a type of reference frame.
We know longitude and latitude,
that’s a different type of reference frame.
If I look at a printed map,
you might have columns A through M,
and rows one through 20,
that’s a different type of reference frame.
It’s kind of a Cartesian coordinate reference frame.
The interesting thing about the reference frames
in the brain, and we know this because these
have been established through neuroscience
studying the entorhinal cortex.
So I’m not speculating here, okay?
This is known neuroscience in an old part of the brain.
The way these cells create reference frames,
they have no origin.
It’s more like, you have a point,
a point in some space, and,
given a particular movement,
you can tell what the next point should be,
and then what the next point after that would be,
and so on.
You can use this to calculate
how to get from one point to another.
So how do I get from my house to my office,
or how do I get my finger from the side of my cup
to the top of the cup?
How do I get from the axioms to the conjecture?
So it’s a different type of reference frame,
and I can, if you want, I can describe in more detail,
I can paint a picture of how you might want
to think about that.
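Here is one way to picture an origin-free reference frame: there is no absolute coordinate, only a current point and an update rule for movements. The wrap-around grid below is a crude stand-in for the repeating firing pattern of grid cells, not a neural model:

```python
# Sketch of an origin-free reference frame: keep only a current
# point, and a rule that tells you the next point given a movement.
# Positions wrap around, loosely like the repeating grid cell pattern.

GRID = 10  # period of the repeating pattern (arbitrary here)

def move(point, movement):
    """Given a point and a movement, return the next point."""
    return tuple((p + m) % GRID for p, m in zip(point, movement))

# 'where am I' is only ever answered relative to where I started
point = (3, 7)
for step in [(1, 0), (0, 2), (-4, 1)]:
    point = move(point, step)
print(point)  # (0, 0) -- a usable representation, but no privileged origin
```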
It’s really helpful to think of it as something
you can move through, but is it helpful
to think of it as spatial in some sense,
or is it something more?
No, it’s definitely spatial.
It’s spatial in a mathematical sense.
How many dimensions?
Can it be a crazy number of dimensions?
Well, that’s an interesting question.
In the old part of the brain, the entorhinal cortex,
they studied rats, and initially it looks like,
oh, this is just two dimensional.
It’s like the rat is in some box in the maze or whatever,
and they know where the rat is using
these two dimensional reference frames
to know where it is in the maze.
We said, well, okay, but what about bats?
That’s a mammal, and they fly in three dimensional space.
How do they do that?
They seem to know where they are, right?
So this is a current area of active research,
and it seems like somehow the neurons
in the entorhinal cortex can learn three dimensional space.
Two members of our team,
along with Ila Fiete from MIT,
just released a paper, literally last week.
It’s on bioRxiv, and they show there that,
and I won’t get into the detail
unless you want to,
grid cells can represent any n-dimensional space.
It’s not inherently limited.
You can think of it this way.
The way it works is you have
a bunch of two-dimensional slices,
a whole bunch of two-dimensional models,
and you can slice up any n-dimensional space
with two-dimensional projections.
And you could have one-dimensional models too.
So there’s nothing inherent in the mathematics
of the way the neurons do this
that constrains the dimensionality of the space,
which I think is important.
So obviously I have a three dimensional map of this cup.
Maybe it’s even more than that, I don’t know.
But it’s clearly a three dimensional map of the cup.
I don’t just have a projection of the cup.
But when I think about birds,
or when I think about mathematics,
perhaps it’s more than three dimensions.
Who knows?
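The two-dimensional-slices idea can be sketched in a few lines. This is an illustrative toy, not the model in the paper just mentioned: each "module" projects an n-dimensional point onto its own 2D plane and wraps it onto a torus, the way a grid cell’s firing repeats periodically; any single module is ambiguous, but the combined code across modules pins down a location.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_modules(n_dims, n_modules=8):
    """Each module: a random 2D projection plus its own spatial scale."""
    return [
        (rng.standard_normal((2, n_dims)), scale)
        for scale in np.geomspace(1.0, 4.0, n_modules)
    ]

def encode(point, modules):
    """Phase of the point within each module: 2 numbers in [0, 1) per module."""
    return np.concatenate(
        [np.mod(proj @ point / scale, 1.0) for proj, scale in modules]
    )

# Distinct points in 5-dimensional space get distinct combined codes,
# even though every individual 2D module is highly ambiguous.
modules = make_modules(n_dims=5)
a = encode(rng.standard_normal(5), modules)
b = encode(rng.standard_normal(5), modules)
print(a.shape, np.allclose(a, b))  # (16,) False
```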
So in terms of each individual column
building up more and more information over time,
do you think that mechanism is well understood?
In your mind, you’ve proposed a lot of architectures there.
Is that a key piece, or is the big piece
the thousand brains theory of intelligence,
the ensemble of it all?
Well, I think they’re both big.
I mean, clearly the concept, as a theorist,
the concept is most exciting, right?
The high level concept.
This is a totally new way of thinking
about how the neocortex works.
So that is appealing.
It has all these ramifications.
And with that, as a framework for how the brain works,
you can make all kinds of predictions
and solve all kinds of problems.
Now we’re trying to work through
many of these details right now.
Okay, how do the neurons actually do this?
Well, it turns out, if you think about grid cells
and place cells in the old parts of the brain,
there’s a lot that’s known about them,
but there’s still some mysteries.
There’s a lot of debate about the exact details
of how these work.
And we have that still, that same level of detail,
that same level of concern.
What we spend most of our time here doing
is trying to make a very good list
of the things we don’t understand yet.
That’s the key part here.
What are the constraints?
It’s not like, oh, this thing seems to work, we’re done.
No, it’s like, okay, it kind of works,
but these are other things we know it has to do
and it’s not doing those yet.
I would say we’re well on the way here.
We’re not done yet.
There’s a lot of trickiness to this system,
but the basic principles about how different layers
in the neocortex are doing much of this, we understand.
But there’s some fundamental parts
that we don’t understand as well.
So what would you say is one of the harder open problems
or one of the ones that have been bothering you,
keeping you up at night the most?
Oh, well, right now, this is a detailed thing
that wouldn’t apply to most people, okay?
Sure.
But you want me to answer that question?
Yeah, please.
We’ve talked about it as if, oh,
to predict what you’re going to sense on this coffee cup,
I need to know where my finger is gonna be
on the coffee cup.
That is true, but it’s insufficient.
Think about when my finger touches the edge of the coffee cup.
My finger can touch it at different orientations.
I can rotate my finger around here, and that doesn’t change it.
I can still make that prediction somehow.
So it’s not just the location.
There’s an orientation component of this as well.
This is known in the old parts of the brain too.
There’s things called head direction cells,
which way the rat is facing.
It’s the same kind of basic idea.
So my finger, you know, in three dimensions,
has a three-dimensional orientation
and a three-dimensional location.
If I was a rat, I would have,
you might think of it as,
a two-dimensional location
and a one-dimensional orientation,
like just which way is it facing?
So how the two components work together,
how it is that I combine orientation,
the orientation of my sensor,
as well as the location is a tricky problem.
And I think I’ve made progress on it.
So, a bigger version of that.
That perspective is super interesting, but super specific.
Yeah, I warned you.
No, no, no, that’s really good,
but there’s a more general version of that.
Do you think context matters?
The fact that we’re in a building in North America,
in the day and age where we have mugs?
I mean, there’s all this extra information
that you bring to the table about everything else
in the room that’s outside of just the coffee cup.
How does it get connected, do you think?
Yeah, and that is another really interesting question.
I’m gonna throw that under the rubric
or the name of attentional problems.
First of all, we have this model,
I have many, many models.
And also the question, does it matter?
Well, it matters for certain things, of course it does.
Maybe what we think of as a coffee cup
is viewed in another part of the world
as something completely different.
Or maybe our logo, which is very benign
in this part of the world,
means something very different
in another part of the world.
So those things do matter.
I think the way to think about it is the following,
one way to think about it,
is we have all these models of the world, okay?
And we model everything.
And as I said earlier, I kind of snuck it in there,
our models actually build composite structure.
So every object is composed of other objects,
which are composed of other objects,
and they become members of other objects.
So this room has chairs and a table
and walls and so on.
Now we can just arrange these things in a certain way
and go, oh, that’s the Numenta conference room.
So, and what we do is when we go around the world
and we experience the world,
by walking into a room, for example,
the first thing I do is I can say,
oh, I’m in this room, do I recognize the room?
Then I can say, oh, look, there’s a table here.
And by attending to the table,
I’m then assigning this table in the context of the room.
Then I can say, oh, on the table, there’s a coffee cup.
Oh, and on the table, there’s a logo.
And in the logo, there’s the word Numenta.
Oh, and look in the logo, there’s the letter E.
Oh, and look, it has an unusual serif.
It doesn’t actually, but pretend it does.
So the point is your attention is kind of drilling
deep in and out of these nested structures.
And I can pop back up and I can pop back down.
So when I attend to the coffee cup,
I haven’t lost the context of everything else,
but it’s sort of, there’s this sort of nested structure.
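One way to picture this nesting is as a tree of reference frames, with attention as a path that descends and pops back up. Here is a toy sketch of that flavor, with purely illustrative names and poses, not Numenta’s mechanism.

```python
from dataclasses import dataclass, field

# Composite objects as nested reference frames: each object stores child
# objects at poses (a location within the parent's own frame).
@dataclass
class Thing:
    name: str
    children: dict = field(default_factory=dict)  # name -> (pose, Thing)

    def add(self, pose, child):
        self.children[child.name] = (pose, child)

room = Thing("conference room")
table, cup, logo = Thing("table"), Thing("coffee cup"), Thing("logo")
room.add((3.0, 1.0, 0.0), table)   # table's pose in the room's frame
table.add((0.2, 0.4, 0.0), cup)    # cup's pose in the table's frame
cup.add((0.0, 0.05, 0.03), logo)   # logo's pose in the cup's frame

# Attention as a path through nested frames; popping up restores context.
path = [room]
for name in ["table", "coffee cup", "logo"]:
    _, child = path[-1].children[name]
    path.append(child)
print(" > ".join(t.name for t in path))
path.pop()  # pop back up from the logo to the cup; the room is still there
```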
So the attention filters the reference frame information
for that particular period of time?
Yes. Basically, moment to moment,
you attend to sub-components,
and then you can attend to sub-components
of those sub-components.
And you can move up and down.
You can move up and down.
We do that all the time.
You’re not even aware of it.
Now that I’m aware of it, I’m very conscious of it,
but most people don’t even think about this.
You just walk in a room and you don’t say,
oh, I looked at the chair and I looked at the board
and looked at that word on the board
and I looked over here, what’s going on, right?
So what percent of your day are you deeply aware of this?
And what part can you actually relax and just be Jeff?
Me personally, like my personal day?
Yeah.
Unfortunately, I’m afflicted with too much of the former.
Well, fortunately or unfortunately.
Yeah.
You don’t think it’s useful?
Oh, it is useful, totally useful.
I think about this stuff almost all the time.
And one of my primary ways of thinking
is when I’m asleep at night.
I always wake up in the middle of the night.
And then I stay awake for at least an hour
with my eyes shut in sort of a half sleep state
thinking about these things.
I come up with answers to problems very often
in that sort of half sleeping state.
I think about it on my bike ride, I think about it on walks.
I’m just constantly thinking about this.
I have to almost schedule time
to not think about this stuff
because it’s very, it’s mentally taxing.
Are you, when you’re thinking about this stuff,
are you thinking introspectively,
like almost taking a step outside of yourself
and trying to figure out what is your mind doing right now?
I do that all the time, but that’s not all I do.
I’m constantly observing myself.
So as soon as I started thinking about grid cells,
for example, and getting into that,
I started saying, oh, well, grid cells
give me my sense of place in the world.
That’s how you know where you are.
And it’s interesting, we always have a sense
of where we are unless we’re lost.
And so I started at night when I got up
to go to the bathroom, I would start trying to do it
completely with my eyes closed all the time.
And I would test my sense of grid cells.
I would walk five feet and say, okay, I think I’m here.
Am I really there?
What’s my error?
And then I would calculate my error again
and see how the errors could accumulate.
So even something as simple as getting up
in the middle of the night to go to the bathroom,
I’m testing these theories out.
It’s kind of fun.
I mean, the coffee cup is an example of that too.
So I find that these sort of everyday introspections
are actually quite helpful.
It doesn’t mean you can ignore the science.
I mean, I spend hours every day
reading ridiculously complex papers.
That’s not nearly as much fun,
but you have to sort of build up those constraints
and the knowledge about the field and who’s doing what
and what exactly they think is happening here.
And then you can sit back and say,
okay, let’s try to piece this all together.
Let’s come up with something.
In this group here, people know
I do this all the time.
I come in with these introspective ideas and say,
well, have you ever thought about this?
Now watch, well, let’s all do this together.
And it’s helpful.
If all you did was that,
you’d just be making up stuff.
But if you’re constraining it by the reality
of the neuroscience, then it’s really helpful.
So let’s talk a little bit about deep learning
and the successes in the applied space of neural networks,
the idea of training a model on data
with these simple computational units,
artificial neurons, that with backpropagation,
statistical ways of learning,
are able to generalize from the training set
onto data that’s similar to that training set.
So where do you think are the limitations
of those approaches?
What do you think are its strengths
relative to your major efforts
of constructing a theory of human intelligence?
Well, I’m not an expert in this field.
I’m somewhat knowledgeable.
So, but I’m not.
Some of it is just your intuition, then.
What are your intuitions?
Well, I have a little bit more than intuition,
but I just want to say like,
you know, one of the things that you asked me,
do I spend all my time thinking about neuroscience?
I do.
That’s to the exclusion of thinking about things
like convolutional neural networks.
But I try to stay current.
So look, I think it’s great, the progress they’ve made.
It’s fantastic.
And as I mentioned earlier,
it’s very highly useful for many things.
The models that we have today are actually derived
from a lot of neuroscience principles.
They’re distributed processing systems
and distributed memory systems,
and that’s how the brain works.
They use things that we might call neurons,
but they’re really not neurons at all.
They’re distributed processing systems.
And that nature of hierarchy,
that came also from neuroscience.
And so there’s a lot of things,
the learning rules, basically,
not backprop, but other, you know,
sort of Hebbian-like rules, layered on top of that.
You say they’re not neurons at all.
I’d be curious, can you describe in which way?
I mean, some of it is obvious,
but I’d be curious if you have specific ways
in which you think are the biggest differences.
Yeah, we had a paper in 2016 called
Why Neurons Have Thousands of Synapses.
And if you read that paper,
you’ll know what I’m talking about here.
A real neuron in the brain is a complex thing.
And let’s just start with the synapses on it,
which is a connection between neurons.
Real neurons can have anywhere
from 5,000 to 30,000 synapses on them.
The ones near the cell body,
the ones that are close to the soma of the cell body,
those are like the ones that people model
in artificial neurons.
There are a few hundred of those,
and they can affect the cell.
They can make the cell become active.
95% of the synapses can’t do that.
They’re too far away.
So if you activate one of those synapses,
it just doesn’t affect the cell body enough
to make any difference.
Any one of them individually.
Any one of them individually,
or even if you do a mass of them.
What real neurons do is the following.
If you get 10 to 20 of them
active at the same time,
meaning they’re all receiving an input at the same time,
and those 10 to 20 synapses
are within a very short distance on the dendrite,
like 40 microns, a very small area.
So if you activate a bunch of these
right next to each other at some distant place,
what happens is it creates
what’s called the dendritic spike.
And the dendritic spike travels through the dendrites
and can reach the soma or the cell body.
Now, when it gets there, it changes the voltage,
which sort of starts to make the cell fire,
but it’s never enough to make the cell fire.
It’s what we call depolarizing the cell:
you raise the voltage a little bit,
but not enough to do anything.
It’s like, well, what good is that?
And then it goes back down again.
So we proposed a theory,
and I’m very confident the basics of it are right,
that what’s happening there is
those 95% of the synapses are recognizing
dozens to hundreds of unique patterns.
They recognize them with about 10 to 20 synapses at a time,
and they’re acting like predictions.
So the neuron actually is a predictive engine on its own.
It can fire when it gets enough of
what they call proximal input
from those synapses near the cell body,
but it can get ready to fire from dozens to hundreds
of patterns that it recognizes from the other guys.
And the advantage of this to the neuron
is that when it actually does produce a spike
in action potential,
it does so slightly sooner than it would have otherwise.
And so what good is slightly sooner?
Well, the slightly sooner part is this:
all the excitatory neurons in the brain
are surrounded by these inhibitory neurons,
and they’re very fast, the inhibitory neurons,
these basket cells.
And if I get my spike out
a little bit sooner than someone else,
I inhibit all my neighbors around me, right?
And what you end up with is a different representation.
You end up with a representation that matches your prediction.
It’s a sparser representation,
meaning fewer neurons are active,
but it’s much more specific.
And so we showed how networks of these neurons
can do very sophisticated temporal prediction, basically.
So, to summarize this:
real neurons in the brain are time-based prediction engines,
and there’s no concept of this at all
in artificial, what we call point, neurons.
I don’t think you can build a brain without them.
I don’t think you can build intelligence without them,
because it’s where a large part of the time comes from.
These are predictive models, and the time is,
there’s a prior and a prediction and an action,
and it’s inherent through every neuron in the neocortex.
So I would say that point neurons sort of model
a piece of that, and not very well at that either.
But like for example, synapses are very unreliable,
and you cannot assign any precision to them.
So even one digit of precision is not possible.
So the way real neurons work is
they don’t change these weights accurately
like artificial neural networks do.
They basically form new synapses,
and so what you’re always trying to do is
detect the presence of some 10 to 20
active synapses at the same time.
And they’re almost binary,
because you can’t really represent
anything much finer than that.
So these are the kind of,
and I think that’s actually another essential component,
because the brain works on sparse patterns,
and all that mechanism is based on sparse patterns,
and I don’t actually think you could build real brains
or machine intelligence without
incorporating some of those ideas.
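A toy rendering of this neuron model may help; it is inspired by the 2016 paper but greatly simplified and not their exact model. Distal synapses are binary and grouped into dendritic segments, and a segment that sees roughly 10 to 20 coincident inputs produces a "dendritic spike" that puts the cell into a predictive state.

```python
import numpy as np

SEGMENT_THRESHOLD = 15  # coincident active synapses needed for a dendritic spike

class ToyNeuron:
    def __init__(self, n_inputs, n_segments=20, synapses_per_segment=40, seed=0):
        rng = np.random.default_rng(seed)
        # Each dendritic segment connects to a sparse random subset of inputs.
        self.segments = [
            rng.choice(n_inputs, size=synapses_per_segment, replace=False)
            for _ in range(n_segments)
        ]

    def is_predicted(self, active_inputs):
        """True if any segment sees enough coincident activity (a dendritic
        spike), which depolarizes the cell so it can fire slightly sooner."""
        active = set(active_inputs)
        return any(
            sum(1 for s in seg if s in active) >= SEGMENT_THRESHOLD
            for seg in self.segments
        )

# A pattern overlapping one segment strongly puts the neuron into the
# predictive state; a random input almost surely does not.
neuron = ToyNeuron(n_inputs=2048)
pattern = neuron.segments[0][:20]      # 20 synapses on segment 0
print(neuron.is_predicted(pattern))    # True
print(neuron.is_predicted(range(20)))  # almost surely False
```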
It’s hard to even think about the complexity
that emerges from the fact that
the timing of the firing matters in the brain,
the fact that you form new synapses,
and I mean, everything you just mentioned
in the past couple minutes.
Trust me, if you spend time on it,
you can get your mind around it.
It’s not like, it’s no longer a mystery to me.
No, but sorry, as a function, in a mathematical way,
can you start getting an intuition about
what gets it excited and what doesn’t,
and what kind of representation comes out?
Yeah, it’s not as easy as,
there’s many other types of neural networks
that are more amenable to pure analysis,
especially very simple networks.
Oh, I have four neurons, and they’re doing this.
Can we describe mathematically
what they’re doing, type of thing?
Even with the complexity of convolutional neural networks today,
it’s sort of a mystery.
We can’t really describe the whole system.
And so it’s different.
My colleague Subutai Ahmad, he did a nice paper on this.
You can get all this stuff on our website
if you’re interested,
talking about sort of the mathematical properties
of sparse representations.
And so what we can do is we can show mathematically,
for example, why 10 to 20 synapses to recognize a pattern
is the right number you’d wanna use.
And by the way, that matches biology.
We can show mathematically some of these concepts,
show why the brain is so robust
to noise and error and fallout and so on.
We can show that mathematically
as well as empirically in simulations.
But the system can’t be analyzed completely.
Any complex system can’t, and so that’s out of the realm.
But there is mathematical benefits and intuitions
that can be derived from mathematics.
And we try to do that as well.
Most of our papers have a section about that.
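As a back-of-envelope version of that kind of math, here is the hypergeometric calculation behind "10 to 20 synapses is the right number", under assumed illustrative numbers (2,000 cells, 40 active, a segment with 20 synapses and a firing threshold); the real derivations are in Ahmad’s papers.

```python
from math import comb

def false_match_prob(n=2000, a=40, s=20, theta=10):
    """P(a random a-of-n active pattern overlaps >= theta of a segment's
    s synapses): the hypergeometric tail, i.e. the false-positive rate."""
    return sum(
        comb(s, k) * comb(n - s, a - k) for k in range(theta, min(s, a) + 1)
    ) / comb(n, a)

print(false_match_prob())         # astronomically small: robust recognition
print(false_match_prob(theta=2))  # far larger: too low a threshold is fragile
```

With a threshold around 10 of 20 synapses, accidental matches are vanishingly rare, while a handful of failed synapses still leaves the pattern recognizable; that is the noise-and-fallout robustness being described.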
So I think it’s refreshing and useful for me
to be talking to you about deep neural networks,
because your intuition basically says
that we can’t achieve anything like intelligence
with artificial neural networks.
Well, not in the current form.
Not in the current form.
I’m sure we can do it in the ultimate form, sure.
So let me dig into it
and see what your thoughts are there a little bit.
So I’m not sure if you read this little blog post
called Bitter Lesson by Rich Sutton recently.
He’s a reinforcement learning pioneer.
I’m not sure if you’re familiar with him.
His basic idea is that all the stuff we’ve done in AI
in the past 70 years, he’s one of the old school guys.
The biggest lesson learned is that all the tricky things
we’ve done, they benefit in the short term,
but in the long term, what wins out
is a simple general method that just relies on Moore’s law,
on computation getting faster and faster.
This is what he’s saying.
This is what has worked up to now.
If we’re talking about building a system,
he’s not concerned about intelligence.
He’s concerned about a system that works
in terms of making predictions
on applied narrow AI problems, right?
That’s what this discussion is about.
That you just try to go as general as possible
and wait years or decades for the computation
to make it actually work.
Is he saying that as a criticism
or is he saying this is a prescription
of what we ought to be doing?
Well, it’s very difficult.
He’s saying this is what has worked
and yes, a prescription, but it’s a difficult prescription
because it says all the fun things
you guys are trying to do, that we are trying to do,
he’s part of the community,
are only going to be short-term gains.
So this all leads up to a question, I guess,
on artificial neural networks
and maybe our own biological neural networks
is do you think if we just scale things up significantly,
so take these dumb artificial neurons,
the point neurons, I like that term.
If we just have a lot more of them,
do you think some of the elements
that we see in the brain may start emerging?
No, I don’t think so.
We can do bigger problems of the same type.
I mean, it’s been pointed out by many people
that today’s convolutional neural networks
aren’t really much different
than the ones we had quite a while ago.
They’re bigger and trained more,
and we have more labeled data and so on.
But I don’t think you can get to the kind of things
I know the brain can do and that we think about
as intelligence by just scaling it up.
So that may be, it’s a good description
of what’s happened in the past,
what’s happened recently with the reemergence
of artificial neural networks.
It may be a good prescription
for what’s gonna happen in the short term.
But I don’t think that’s the path.
I’ve said that earlier.
There’s an alternate path.
I should mention to you, by the way,
that we’ve made sufficient progress
on the whole cortical theory in the last few years
that last year we decided to start actively pursuing
how do we get these ideas embedded into machine learning?
Well, that’s, again, being led by my colleague,
Subutai Ahmad, and he’s more of a machine learning guy.
I’m more of a neuroscience guy.
So this is now, I wouldn’t say our focus,
but it is now an equal focus here
because we need to proselytize what we’ve learned
and we need to show how it’s beneficial
to the machine learning world.
So we have a plan in place right now.
In fact, we just did our first paper on this.
I can tell you about that.
But one of the reasons I wanna talk to you
is because I’m trying to get more people
in the machine learning community to say,
I need to learn about this stuff.
And maybe we should just think about this a bit more,
about what we’ve learned about the brain,
and that team at Numenta, what have they done?
Is that useful for us?
Yeah, so are there elements of the cortical theory,
the things we’ve been talking about,
that may be useful in the short term?
Yes, in the short term, yes.
This is the, sorry to interrupt,
but the open question is,
it certainly feels from my perspective
that in the long term,
some of the ideas we’ve been talking about
will be extremely useful.
The question is whether in the short term.
Well, this is always what I would call
the entrepreneur’s dilemma.
So you have this long term vision,
oh, we’re gonna all be driving electric cars
or we’re all gonna have computers
or we’re all gonna, whatever.
And you’re at some point in time and you say,
I can see that long term vision,
I’m sure it’s gonna happen.
How do I get there without killing myself?
Without going out of business, right?
That’s the challenge.
That’s the dilemma.
That’s the really difficult thing to do.
So we’re facing that right now.
So ideally what you’d wanna do
is find some steps along the way
that you can get there incrementally.
You don’t have to like throw it all out
and start over again.
The first thing that we’ve done
is we focus on the sparse representations.
So just in case you don’t know what that means
or some of the listeners don’t know what that means,
in the brain, if I have like 10,000 neurons,
what you would see is maybe 2% of them active at a time.
You don’t see 50%, you don’t see 30%,
you might see 2%.
And it’s always like that.
For any set of sensory inputs?
It doesn’t matter what it is,
it doesn’t matter what part of the brain.
But which neurons differs?
Which neurons are active?
Yeah, so let’s say I take 10,000 neurons
that are representing something.
They’re sitting there in a little block together.
It’s a teeny little block of neurons, 10,000 neurons.
And they’re representing a location,
they’re representing a cup,
they’re representing the input from my sensors.
I don’t know, it doesn’t matter.
It’s representing something.
The way the representations occur,
it’s always a sparse representation.
Meaning it’s a population code.
So which 200 cells are active tells me what’s going on.
It’s not, individual cells aren’t that important at all.
It’s the population code that matters.
And when you have sparse population codes,
then all kinds of beautiful properties come out of them.
So the brain uses sparse population codes.
We’ve written and described these benefits
in some of our papers.
So they give this tremendous robustness to the systems.
Brains are incredibly robust.
Neurons are dying all the time and spasming
and synapses are falling apart all the time.
And it keeps working.
So what Subutai and Luiz, one of our other engineers here,
have done is show how to introduce sparseness
into convolutional neural networks.
Now other people are thinking along these lines,
but we’re going about it in a more principled way, I think.
And we’re showing that if you enforce sparseness
throughout these convolutional neural networks,
in both the activations,
which neurons are active, and the connections between them,
you get some very desirable properties.
So one of the current hot topics in deep learning right now
are these adversarial examples.
So, you know, you give me any deep learning network
and I can give you a picture that looks perfect
and you’re going to call it, you know,
you’re going to say the monkey is, you know, an airplane.
So that’s a problem.
And DARPA just announced some big thing.
They’re trying to, you know, have some contest for this.
But if you enforce sparse representations here,
many of these problems go away.
They’re much more robust and they’re not easy to fool.
So we’ve already shown some of those results,
just literally in January or February,
just like last month we did that.
And you can read about it;
I think it’s on bioRxiv or arXiv right now.
But, so that’s like a baby step, okay?
That’s taking something from the brain.
We know about sparseness.
We know why it’s important.
We know what it gives the brain.
So let’s try to enforce that onto this.
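One simple way to enforce sparse activations, loosely in the spirit of what is being described here (the papers describe the real version; this is just the core idea), is a k-winners-take-all step: keep the top k activations in a layer and zero the rest.

```python
import numpy as np

def k_winners(x, sparsity=0.02):
    """Keep only the top `sparsity` fraction of activations, zero the rest.
    A toy stand-in for a k-winners-take-all layer."""
    k = max(1, int(sparsity * x.size))
    threshold = np.partition(x.ravel(), -k)[-k]  # k-th largest value
    return np.where(x >= threshold, x, 0.0)

activations = np.random.default_rng(1).standard_normal(1000)
sparse = k_winners(activations)
print(int((sparse != 0).sum()))  # 20 of 1000 units remain active, ~2%
```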
What’s your intuition why sparsity leads to robustness?
Because it feels like it would be less robust.
Why would it feel less robust to you?
So it just feels like if the fewer neurons are involved,
the more fragile the representation.
But I didn’t say there were only a few neurons.
I said, let’s say 200.
That’s a lot.
There’s still a lot, it’s just.
So here’s an intuition for it.
This is a bit technical, so for engineers,
machine learning people, this will be easy,
but all the listeners, maybe not.
If you’re trying to classify something,
you’re trying to divide some very high dimensional space
into different pieces, A and B.
And you’re trying to create some line where you say,
all these points in this high-dimensional space are A,
and all these points in this high-dimensional space are B.
And if you have points that are close to that line,
it’s not very robust.
It works for all the points you know about,
but it’s not very robust,
because you can just move a little bit
and you’ve crossed over the line.
When you have sparse representations,
imagine I pick, I’m gonna pick 200 cells active
out of 10,000, okay?
So I have 200 cells active.
Now let’s say I randomly pick another,
different representation of 200.
The overlap between those is gonna be very small,
just a few.
I can pick millions of samples randomly of 200 neurons,
and not one of them will overlap more than just a few.
So one way to think about it is,
if I wanna fool one of these representations
to look like one of those other representations,
I can’t move just one cell, or two cells,
or three cells, or four cells.
I have to move 100 cells.
And that makes them robust.
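The overlap claim is easy to check numerically. A quick sketch with the numbers from above, 200 active cells out of 10,000:

```python
import numpy as np

rng = np.random.default_rng(42)
n_cells, n_active, n_trials = 10_000, 200, 10_000

# One fixed sparse representation, compared against many random others.
a = rng.choice(n_cells, n_active, replace=False)
overlaps = np.array([
    np.isin(rng.choice(n_cells, n_active, replace=False), a).sum()
    for _ in range(n_trials)
])

# Expected overlap is 200 * 200 / 10,000 = 4 cells; even the maximum over
# thousands of random draws stays tiny compared to 200, so flipping a few
# cells can never turn one representation into another.
print(overlaps.mean(), overlaps.max())
```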
In terms of further, so you mentioned sparsity.
What would be the next thing?
Yeah.
Okay, so we have, we picked one.
We don’t know if it’s gonna work well yet.
So again, we’re trying to come up with incremental ways
to moving from brain theory to add pieces
to machine learning, current machine learning world,
and one step at a time.
So the next thing we’re gonna try to do
is sort of incorporate some of the ideas
of the thousand brains theory,
that you have many, many models that are voting.
Now that idea is not new.
Mixtures of models have been around
for a long time.
But the way the brain does it is a little different.
And the way it votes is different.
And the kind of way it represents uncertainty
is different.
So we’re just starting this work,
but we’re gonna try to see if we can sort of incorporate
some of the principles of voting,
or principles of the thousand brain theory.
Like lots of simple models that talk to each other
in a certain way.
And can we build machines, systems, that learn faster
and also, well, mostly are multimodal
and robust to multimodal types of issues.
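As a cartoon of the voting idea, not Numenta’s actual mechanism: each column keeps its own candidate set given what its sensor has observed so far, and voting amounts to intersecting those sets. The objects and columns below are purely illustrative.

```python
# Each column's sensations so far leave it with a set of candidate objects.
columns = [
    {"coffee cup", "bowl", "can"},     # what column 1's inputs allow
    {"coffee cup", "can", "stapler"},  # column 2
    {"coffee cup", "bowl"},            # column 3
]

# Voting: only objects consistent with every column's evidence survive.
consensus = set.intersection(*columns)
print(consensus)  # {'coffee cup'}: agreement without any central model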
So one of the challenges there
is the machine learning computer vision community
has certain sets of benchmarks,
sets of tests based on which they compete.
And I would argue, especially from your perspective,
that those benchmarks aren’t that useful
for testing the aspects that the brain is good at,
or intelligence.
They’re not really testing intelligence.
And that’s fine in a way.
And it’s been extremely useful
for developing specific mathematical models,
but it’s not useful in the long term
for creating intelligence.
So you think you also have a role in proposing
better tests?
Yeah, this is a very,
you’ve identified a very serious problem.
First of all, the tests that they have
are the tests that they want,
not tests of the other things
that we’re trying to do, right?
And so on.
The second thing is sometimes these,
to be competitive in these tests,
you have to have huge data sets and huge computing power.
And so, you know, we don’t have that here.
We don’t have it the way the big teams
at big companies do.
So there’s numerous issues there.
You know, our approach to this is all based on,
in some sense, you might argue, elegance.
We’re coming at it from like a theoretical base
that we think, oh my God, this is so clearly elegant.
This is how brains work.
This is what intelligence is.
But the machine learning world has gotten in this phase
where they think it doesn’t matter.
Doesn’t matter what you think,
as long as you do, you know, 0.1% better on this benchmark,
that’s all that matters.
And that’s a problem.
You know, we have to figure out how to get around that.
That’s a challenge for us.
That’s one of the challenges that we have to deal with.
So I agree, you’ve identified a big issue.
It’s difficult for those reasons.
But you know, part of the reasons I’m talking to you here
today is I hope I’m gonna get some machine learning people
to say, I’m gonna read those papers.
Those might be some interesting ideas.
I’m tired of doing this 0.1% improvement stuff, you know?
Well, that’s why I’m here as well,
because I think machine learning now as a community
is at a place where the next step needs to be orthogonal
to what has received success in the past.
Well, you see other leaders saying this,
machine learning leaders, you know,
Jeff Hinton with his capsules idea.
Many people have gotten up to say, you know,
we’re gonna hit roadblocks, maybe we should look at the brain,
you know, things like that.
So hopefully that thinking will occur organically.
And then we’re in a nice position for people to come
and look at our work and say,
well, what can we learn from these guys?
Yeah, MIT is launching a billion dollar computing college
that’s centered around this idea, so.
Is it on this idea of what?
Well, the idea that, you know,
the humanities, psychology, and neuroscience
all have to work together to get to build AI.
Yeah, I mean, Stanford just did
this Human Centered AI Center.
I’m a little disappointed in these initiatives
because, you know, they’re focusing
on sort of the human side of it,
and it could very easily slip into
how humans interact with intelligent machines,
which is nothing wrong with that,
but that’s not, that is orthogonal
to what we’re trying to do.
We’re trying to say, like,
what is the essence of intelligence?
I don’t care.
In fact, I wanna build intelligent machines
that aren’t emotional, that don’t smile at you,
that, you know, that aren’t trying to tuck you in at night.
Yeah, there is that pattern that you,
when you talk about understanding humans
is important for understanding intelligence,
that you start slipping into topics of ethics
or, yeah, like you said,
the interactive elements as opposed to,
no, no, no, we have to zoom in on the brain,
study what the human brain, the baby, the…
Let’s study what a brain does.
Does.
And then we can decide which parts of that
we wanna recreate in some system,
but until you have that theory about what the brain does,
what’s the point, you know, it’s just,
you’re gonna be wasting time, I think.
Right, just to break it down
on the artificial neural network side,
maybe you could speak to this
on the biological neural network side,
the process of learning versus the process of inference.
Maybe you can explain to me,
is there a difference between,
you know, in artificial neural networks,
there’s a difference between the learning stage
and the inference stage.
Do you see the brain as something different?
One of the big distinctions that people often say,
I don’t know how correct it is,
is artificial neural networks need a lot of data.
They’re very inefficient learning.
Do you see that as a correct distinction
from the biology of the human brain,
that the human brain is very efficient,
or is that just something we deceive ourselves?
No, it is efficient, obviously.
We can learn new things almost instantly.
And so what elements do you think are useful?
Yeah, I can talk about that.
You brought up two issues there.
So remember I talked early about the constraints
we always feel, well, one of those constraints
is the fact that brains are continually learning.
That’s not something we said, oh, we can add that later.
That’s something that was upfront,
had to be there from the start,
made our problems harder.
But we showed, going back to the 2016 paper
on sequence memory, we showed how that happens,
how the brain infers and learns at the same time.
And our models do that.
And they’re not two separate phases,
or two separate periods of time.
I think that’s a big, big problem in AI,
at least for many applications, not for all.
So I can talk about that.
There are some, it gets detailed,
there are some parts of the neocortex in the brain
where actually what’s going on,
there’s these cycles of activity in the brain.
And there’s very strong evidence
that you’re doing more inference
on one part of the cycle,
and more learning on the other part of the cycle.
So the brain can actually sort of separate
different populations of cells,
going back and forth like this.
But in general, I would say that’s an important problem.
We have all of our networks that we’ve come up with do both.
And they’re continuous learning networks.
And you mentioned benchmarks earlier.
Well, there are no benchmarks about that.
So we have to get on our little soapbox and say,
hey, by the way, this is important,
and here’s a mechanism for doing that.
But until you can prove it to someone
in some commercial system or something, it’s a little harder.
So yeah, one of the things I have to linger on is that,
in some ways, to learn the concept of a coffee cup,
you only need this one coffee cup
and maybe some time alone in a room with it.
Well, the first thing is,
imagine I reach my hand into a black box
and I’m reaching, I’m trying to touch something.
I don’t know upfront if it’s something I already know
or if it’s a new thing.
And I have to, I’m doing both at the same time.
I don’t say, oh, let’s see if it’s a new thing.
Oh, let’s see if it’s an old thing.
I don’t do that.
As I go, my brain says, oh, it’s new or it’s not new.
And if it’s new, I start learning what it is.
And by the way, it starts learning from the get go,
even if it’s gonna recognize it.
So they’re not separate problems.
And so that’s the thing there.
The other thing you mentioned was the fast learning.
So I was just talking about continuous learning,
but there’s also fast learning.
Literally, I can show you this coffee cup
and I say, here’s a new coffee cup.
It’s got the logo on it.
Take a look at it, done, you’re done.
You can predict what it’s gonna look like,
you know, in different positions.
So I can talk about that too.
In the brain, the way learning occurs,
I mentioned this earlier, but I’ll mention it again.
The way learning occurs,
imagine I am a section of a dendrite of a neuron,
and I’m gonna learn something new.
Doesn’t matter what it is.
I’m just gonna learn something new.
I need to recognize a new pattern.
So what I’m gonna do is I’m gonna form new synapses.
New synapses, we’re gonna rewire the brain
onto that section of the dendrite.
Once I’ve done that, everything else that neuron has learned
is not affected by it.
That’s because it’s isolated
to that small section of the dendrite.
They’re not all being added together, like a point neuron.
So if I learn something new on this segment here,
it doesn’t change any of the learning
that occurred anywhere else in that neuron.
So I can add something without affecting previous learning.
And I can do it quickly.
Now let’s talk, we can talk about the quickness,
how it’s done in real neurons.
You might say, well, doesn’t it take time to form synapses?
Yes, it can take maybe an hour to form a new synapse.
We can form memories quicker than that,
and I can explain that how it happens too, if you want.
But it’s getting a bit neurosciencey.
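The isolation property just described, that learning on one dendritic segment leaves everything else the neuron knows untouched, is easy to sketch. Illustrative numbers only:

```python
import numpy as np

rng = np.random.default_rng(7)

# Five previously learned patterns, each living on its own dendritic
# segment (a set of synapses onto a subset of 2,048 input cells).
segments = [rng.choice(2048, 40, replace=False) for _ in range(5)]
before = [seg.copy() for seg in segments]

# Learning something new grows synapses on a fresh segment only;
# no existing weight is nudged, so no prior learning is disturbed.
segments.append(rng.choice(2048, 40, replace=False))

print(all((a == b).all() for a, b in zip(before, segments)), len(segments))
# True 6: everything learned before is intact
```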
That’s great, but is there an understanding
of these mechanisms at every level?
Yeah.
So from the short term memories to the forming of new connections.
So this idea of synaptogenesis, the growth of new synapses,
that’s well described, it’s well understood.
And that’s an essential part of learning.
That is learning.
Okay.
Going back many, many years,
there was, what’s his name,
the psychologist who proposed this, Hebb, Donald Hebb.
He proposed that learning was the modification
of the strength of a connection between two neurons.
People interpreted that as the modification
of the strength of a synapse.
He didn’t say that.
He just said there’s a modification
between the effect of one neuron and another.
So synaptogenesis is totally consistent
with what Donald Hebb said.
But anyway, there’s these mechanisms,
the growth of new synapses.
You can go online, you can watch a video
of a synapse growing in real time.
It’s literally, you can see this little thing going boop.
It’s pretty impressive.
So those mechanisms are known.
Now there’s another thing that we’ve speculated
and we’ve written about,
which is consistent with known neuroscience,
but it’s less proven.
And this is the idea, how do I form a memory
really, really quickly?
Like instantaneous.
If it takes an hour to grow a synapse,
like that’s not instantaneous.
So there are types of synapses called silent synapses.
They look like a synapse, but they don’t do anything.
They’re just sitting there.
It’s like if an action potential comes in,
it doesn’t release any neurotransmitter.
Some parts of the brain have more of these than others.
For example, the hippocampus has a lot of them,
and that’s the part we most associate with short-term memory.
So what we speculated, again, in that 2016 paper,
we proposed that the way we form very quick memories,
very short term memories, or quick memories,
is that we convert silent synapses into active synapses.
It’s like saying a synapse has either a zero weight
or a one weight,
but the long-term memory has to be formed by synaptogenesis.
So you can remember something really quickly
by just flipping a bunch of these guys from silent to active.
It’s not from 0.1 to 0.15.
It’s like, it doesn’t do anything
till it releases transmitter.
And if I do that over a bunch of these,
I’ve got a very quick short term memory.
So I guess the lesson behind this
is that most neural networks today are fully connected.
Every neuron connects every other neuron
from layer to layer.
That’s not correct in the brain.
We don’t want that.
We actually don’t want that.
It’s bad.
You want a very sparse connectivity
so that any neuron connects to some subset of the neurons
in the other layer.
And it does so on a dendrite by dendrite segment basis.
So it’s a very parceled-out type of thing.
And then learning is not adjusting all these weights;
learning is just saying,
okay, connect to these 10 cells here right now.
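A toy sketch of the silent-synapse idea, with illustrative numbers: weights are binary, and a fast memory is formed by flipping a few silent synapses to active rather than nudging real-valued weights.

```python
import numpy as np

rng = np.random.default_rng(3)

# 200 synapses on a dendrite; weights are binary, and most start silent
# (weight 0): anatomically present, but releasing no neurotransmitter.
weights = np.zeros(200, dtype=np.int8)

# An "instant" memory: flip the ~20 silent synapses that happen to line
# up with the new pattern to active (weight 1). Not 0.1 to 0.15; 0 to 1.
matching = rng.choice(200, 20, replace=False)
weights[matching] = 1

print(int(weights.sum()))  # 20 synapses now participate
```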
In that process, you know, with artificial neural networks,
it’s a very simple process of backpropagation
that adjusts the weights.
The process of synaptogenesis.
Synaptogenesis.
It’s even easier.
Backpropagation requires something
that really can’t happen in brains.
This backpropagation of this error signal,
that really can’t happen.
People are trying to make it happen in brains,
but it doesn’t happen in brains.
This is pure Hebbian learning.
Well, synaptogenesis is pure Hebbian learning.
It’s basically saying,
there’s a population of cells over here
that are active right now.
And there’s a population of cells over here
active right now.
How do I form connections between those active cells?
And it’s literally saying this guy became active.
These 100 neurons here became active
before this neuron became active.
So form connections to those ones.
That’s it.
There’s no propagation of error, nothing.
All the networks we do,
all the models we have,
work almost completely on Hebbian learning,
but on dendritic segments,
with multiple synapses at the same time.
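The rule as stated fits in a few lines. A minimal sketch, with illustrative sizes: cells active in one population just before cells in another become active get connected, and no error signal is ever propagated.

```python
import numpy as np

n_pre, n_post = 1000, 1000
connected = np.zeros((n_post, n_pre), dtype=bool)  # formed synapses, binary

def hebbian_grow(pre_active, post_active, connected, new_per_cell=10, seed=0):
    """For each newly active post cell, grow synapses to a sample of the
    cells that were active just before it. That's the whole rule."""
    rng = np.random.default_rng(seed)
    for post in post_active:
        pick = rng.choice(pre_active, size=new_per_cell, replace=False)
        connected[post, pick] = True  # grow synapses; no error propagated

hebbian_grow(pre_active=np.arange(100), post_active=[7, 42], connected=connected)
print(connected[7].sum(), connected[42].sum())  # 10 10
```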
So now let’s sort of turn the question
that you already answered,
and maybe you can answer it again.
If you look at the history of artificial intelligence,
where do you think we stand?
How far are we from solving intelligence?
You said you were very optimistic.
Can you elaborate on that?
Yeah, it’s always the crazy question to ask
because no one can predict the future.
Absolutely.
So I’ll tell you a story.
I used to run a different neuroscience institute
called the Redwood Neuroscience Institute,
and we would hold these symposiums
and we’d get like 35 scientists
from around the world to come together.
And I used to ask them all the same question.
I would say, well, how long do you think it’ll be
before we understand how the neocortex works?
And everyone went around the room
and they had introduced the name
and they have to answer that question.
So I got, the typical answer was 50 to 100 years.
Some people would say 500 years.
Some people said never.
I said, then why are you a neuroscientist?
Well, it’s good pay.
It’s interesting.
So, you know, but it doesn’t work like that.
As I mentioned earlier, these are step functions.
Things happen and then bingo, they happen.
You can’t predict that.
I feel I’ve already passed a step function.
So if I can do my job correctly over the next five years,
meaning I can proselytize these ideas,
I can convince other people they’re right,
we can show machine learning people
that they should pay attention to these ideas,
then we’re definitely in an under-20-year timeframe.
If I can do those things, if I’m not successful in that,
and this is the last time anyone talks to me
and no one reads our papers and you know,
and I’m wrong or something like that,
then I don’t know.
But it’s not 50 years.
Think about electric cars.
How quickly are they gonna populate the world?
It probably takes about a 20 year span.
It’ll be something like that.
But I think if I can do what I said, we’re starting it.
And of course there could be other,
you said step functions.
It could be everybody gives up on your ideas for 20 years
and then all of a sudden somebody picks it up again.
Wait, that guy was onto something.
Yeah, so that would be a failure on my part, right?
Think about Charles Babbage.
Charles Babbage, he’s the guy who invented the computer
back in the 18 something, 1800s.
And everyone forgot about it until 100 years later,
when they said, hey, this guy figured this stuff out
a long time ago.
But he was ahead of his time.
I don’t think, as I said,
I recognize this is part of any entrepreneur’s challenge.
I use entrepreneur broadly in this case.
I’m not meaning like I’m building a business
or trying to sell something.
I mean, I’m trying to sell ideas.
And this is the challenge as to how you get people
to pay attention to you, how do you get them
to give you positive or negative feedback,
how do you get the people to act differently
based on your ideas.
So we’ll see how well we do on that.
So you know that there’s a lot of hype
behind artificial intelligence currently.
Do you, as you look to spread the ideas
that are of neocortical theory, the things you’re working on,
do you think there’s some possibility
we’ll hit an AI winter once again?
Yeah, it’s certainly a possibility.
No question about it.
Is that something you worry about?
Yeah, well, I guess, do I worry about it?
I haven’t decided yet if that’s good or bad for my mission.
That’s true, that’s very true.
Because it’s almost like you need the winter
to refresh the palate.
Yeah, here’s what you want to have.
To the extent that everyone is so thrilled
about the current state of machine learning and AI
and they don’t imagine they need anything else,
it makes my job harder.
If everything crashed completely
and every student left the field
and there was no money for anybody to do anything
and it became an embarrassment
to talk about machine intelligence and AI,
that wouldn’t be good for us either.
You want sort of the soft landing approach, right?
You want enough people, the senior people in AI
and machine learning to say, you know,
we need other approaches.
We really need other approaches.
Damn, we need other approaches.
Maybe we should look to the brain.
Okay, let’s look to the brain.
Who’s got some brain ideas?
Okay, let’s start a little project on the side here
trying to do brain idea related stuff.
That’s the ideal outcome we would want.
So I don’t want a total winter
and yet I don’t want it to be sunny all the time either.
So what do you think it takes to build a system
with human level intelligence
where once demonstrated you would be very impressed?
So does it have to have a body?
Does it have to have the C word we used before,
consciousness as an entirety in a holistic sense?
First of all, I don’t think the goal
is to create a machine that is human level intelligence.
I think it’s a false goal.
Back to Turing, I think it was a false statement.
We want to understand what intelligence is
and then we can build intelligent machines
of all different scales, all different capabilities.
A dog is intelligent.
I don’t need that, though it’d be pretty good to have a dog.
But what about something that doesn’t look
like an animal at all, in different spaces?
So my thinking about this is that
we want to define what intelligence is,
agree upon what makes an intelligent system.
We can then say, okay, we’re now gonna build systems
that work on those principles or some subset of them
and we can apply them to all different types of problems.
And the idea is, it’s like computing.
We don’t ask, if I take a little one chip computer,
I don’t say, well, that’s not a computer
because it’s not as powerful as this big server over here.
No, no, because we know what the principles
of computing are, and I can apply those principles
to a small problem or to a big problem.
And the same is true for intelligence; it needs to get there.
We have to say, these are the principles.
I can make a small one, a big one.
I can make them distributed.
I can put them on different sensors.
They don’t have to be human like at all.
Now, you did bring up a very interesting question
about embodiment.
Does it have to have a body?
It has to have some concept of movement.
It has to be able to move through these reference frames
I talked about earlier.
Whether it’s physically moving,
like I need, if I’m gonna have an AI
that understands coffee cups,
it’s gonna have to pick up the coffee cup
and touch it and look at it with its eyes and hands
or something equivalent to that.
If I have a mathematical AI,
maybe it needs to move through mathematical spaces.
I could have a virtual AI that lives in the internet
and its movements are traversing links
and digging into files,
but it’s got a location that it’s traveling
through some space.
You can’t have an AI that just takes some flash input thing.
We call it flash inference.
Here’s a pattern, done.
No, it’s movement pattern, movement pattern,
movement pattern, attention, digging, building structure,
figuring out the model of the world.
So some sort of embodiment,
whether it’s physical or not, has to be part of it.
So self awareness and the way to be able to answer
where am I?
Well, you’re bringing up self,
that’s a different topic, self awareness.
No, the very narrow definition of self,
meaning having a sense of self enough to know
where am I in the space where it’s actually operating.
Yeah, basically the system needs to know its location
or each component of the system needs to know
where it is in the world at that point in time.
So self awareness and consciousness.
Do you think, from the perspective of neuroscience
and the neocortex, these are interesting topics,
solvable topics?
Do you have any ideas of why the heck it is
that we have a subjective experience at all?
Yeah, I have a lot of thoughts on that.
And is it useful or is it just a side effect of us?
It’s interesting to think about.
I don’t think it’s useful as a means to figure out
how to build intelligent machines.
It’s something that systems do,
and we can talk about what that is, like,
well, if I build a system like this,
then it would be self aware.
Or if I build it like this, it wouldn’t be self aware.
So that’s a choice I can have.
It’s not like, oh my God, it’s self aware,
I can’t turn it off.
I heard an interview recently
with this philosopher from Yale,
I can’t remember his name, I apologize for that.
But he was talking about,
well, if these computers are self aware,
then it would be a crime to unplug them.
And I’m like, oh, come on, that’s not,
I unplug myself every night, I go to sleep.
Is that a crime?
I plug myself in again in the morning and there I am.
So people get kind of bent out of shape about this.
I have very definite, very detailed understanding
or opinions about what it means to be conscious
and what it means to be self aware.
I don’t think it’s that interesting a problem.
You’ve talked to Christof Koch.
He thinks that’s the only problem.
I didn’t actually listen to your interview with him,
but I know him and I know that’s the thing he cares about.
He also thinks intelligence and consciousness are disjoint.
So I mean, it’s not, you don’t have to have one or the other.
So he is.
I disagree with that.
I just totally disagree with that.
So what are your thoughts on consciousness?
Where does it emerge from?
Because it does exist.
So then we have to break it down to the two parts, okay?
Because consciousness isn’t one thing.
That’s part of the problem with that term
is it means different things to different people
and there’s different components of it.
There is a concept of self awareness, okay?
That can be very easily explained.
You have a model of your own body.
The neocortex models things in the world
and it also models your own body.
And then it has a memory.
It can remember what you’ve done, okay?
So it can remember what you did this morning,
can remember what you had for breakfast and so on.
And so I can say to you, okay, Lex,
were you conscious this morning when you had your bagel?
And you’d say, yes, I was conscious.
Now what if I could take your brain
and revert all the synapses back
to the state they were this morning?
And then I said to you, Lex,
were you conscious when you ate the bagel?
And you said, no, I wasn’t conscious.
I said, here’s a video of eating the bagel.
And you said, I wasn’t there.
That’s not possible
because I must’ve been unconscious at that time.
So we can just make this one-to-one correlation:
the memory of your body’s trajectory through the world
over some period of time,
and the ability to recall that memory,
is what you would call consciousness.
I was conscious of that; it’s a self awareness.
And any system that can recall,
memorize what it’s done recently
and bring that back and invoke it again
would say, yeah, I’m aware.
I remember what I did.
All right, I got it.
That’s an easy one.
Although some people think that’s a hard one.
The more challenging part of consciousness
is this one that’s sometimes used
going by the word of qualia,
which is, why does an object seem red?
Or what is pain?
And why does pain feel like something?
Why do I feel redness?
Or why do I feel painness?
And then I could say, well,
why does sight seem different than hearing?
It’s the same problem.
It’s really, these are all just neurons.
And so how is it that,
why does looking at you feel different than hearing you?
It feels different, but there’s just neurons in my head.
They’re all doing the same thing.
So that’s an interesting question.
The best treatise I’ve read about this
is by a guy named O’Regan.
He wrote a book called
Why Red Doesn’t Sound Like a Bell.
It’s a little dense, it’s not an easy-to-read trade book,
but it’s an interesting question.
Take something like color.
Color really doesn’t exist in the world.
It’s not a property of the world.
The property of the world that exists is light frequency.
And that gets turned into,
we have certain cells in the retina
that respond to different frequencies
different than others.
And so when they enter the brain,
you just have a bunch of axons
that are firing at different rates.
And from that, we perceive color.
But there is no color in the brain.
I mean, there’s no color coming in on those synapses.
It’s just a correlation between some axons
and some property of frequency.
And that isn’t even color itself.
Frequency doesn’t have a color.
It’s just what it is.
So then the question is,
well, why does it even appear to have a color at all?
Just as you’re describing it,
there seems to be a connection to those ideas
of reference frames.
I mean, it just feels like consciousness,
having the subjective experience,
assigning the feeling of red to the actual color
or to the wavelength, is useful for intelligence.
Yeah, I think that’s a good way of putting it.
It’s useful as a predictive mechanism
or useful as a generalization idea.
It’s a way of grouping things together to say,
it’s useful to have a model like this.
So think about the well known syndrome
that people who’ve lost a limb experience
called phantom limbs.
And what they claim is, their arm has been removed,
but they feel their arm.
Not only do they feel it, they know it’s there.
It’s there, I know it’s there.
They’ll swear to you that it’s there.
And then they can feel pain in their arm
and they’ll feel pain in their finger.
And if they move their non existent arm behind their back,
then they feel the pain behind their back.
So this whole idea that your arm exists
is a model of your brain.
It may or may not really exist.
And just like, but it’s useful to have a model of something
that sort of correlates to things in the world.
So you can make predictions about what would happen
when those things occur.
It’s a little bit fuzzy,
but I think you’re getting quite towards the answer there.
It’s useful for the model to express things certain ways
that we can then map them into these reference frames
and make predictions about them.
I need to spend more time on this topic.
It doesn’t bother me.
Do you really need to spend more time?
Yeah, I know.
It does feel special that we have subjective experience,
but I’m yet to know why.
I’m just personally curious.
It’s not necessary for the work we’re doing here.
I don’t think I need to solve that problem
to build intelligent machines at all, not at all.
But there is sort of the silly notion
that you described briefly,
one that doesn’t seem so silly to us humans,
which is, if you’re successful building intelligent machines,
it feels wrong to then turn them off.
Because if you’re able to build a lot of them,
it feels wrong to then be able to turn off the…
Well, why?
Let’s break that down a bit.
As humans, why do we fear death?
There’s two reasons we fear death.
Well, first of all, I’ll say,
when you’re dead, it doesn’t matter at all.
Who cares?
You’re dead.
So why do we fear death?
We fear death for two reasons.
One is because we are programmed genetically to fear death.
That’s a survival and propagation of the genes thing.
And we also are programmed to feel sad
when people we know die.
We don’t feel sad when someone we don’t know dies.
There’s people dying right now,
and I’ll just say,
I don’t feel bad about them,
because I don’t know them.
But if I knew them, I’d feel really bad.
So again, these are old brain,
genetically embedded things that we fear death.
Outside of those uncomfortable feelings,
there’s nothing else to worry about.
Well, wait, hold on a second.
Do you know The Denial of Death by Becker?
No.
There’s a thought that death is,
our whole conception of our world model
kind of assumes immortality.
And then death is this terror that underlies it all.
So like…
Some people’s world model, not mine.
But, okay, so what Becker would say
is that you’re just living in an illusion.
You’ve constructed an illusion for yourself
because it’s such a terrible terror,
the fact that this…
What’s the illusion?
The illusion that death doesn’t matter.
You’re still not coming to grips with…
The illusion of what?
That death is…
Going to happen.
Oh, like it’s not gonna happen?
You’re actually operating as if it’s not, he would say.
Even though you said you’ve accepted it,
you haven’t really accepted the notion that you’re gonna die.
So it sounds like you disagree with that notion.
Yeah, yeah, totally.
I literally, every night I go to bed, it’s like dying.
Like little deaths.
It’s little deaths.
And if I didn’t wake up, it wouldn’t matter to me.
Only if I knew that was gonna happen would it be bothersome.
If I didn’t know it was gonna happen, how would I know?
Then I would worry about my wife.
So imagine I was a loner and I lived in Alaska
and I lived out there and there were no animals.
Nobody knew I existed.
I was just eating these roots all the time.
And nobody knew I was there.
And one day I didn’t wake up.
What pain in the world would there exist?
Well, so most people that think about this problem
would say that you’re just deeply enlightened
or are completely delusional.
One of the two.
But I would say that’s a very enlightened way
to see the world.
That’s the rational one as well.
It’s rational, that’s right.
But the fact is we don’t,
I mean, we really don’t have an understanding
of why the heck it is we’re born and why we die
and what happens after we die.
Well, maybe there isn’t a reason, maybe there is.
So I’m interested in those big problems too, right?
You interviewed Max Tegmark,
and there’s people like that, right?
I’m interested in those big problems as well.
And in fact, when I was young,
I made a list of the biggest problems I could think of.
First, why does anything exist?
Second, why do we have the laws of physics that we have?
Third, is life inevitable?
And why is it here?
Fourth, is intelligence inevitable?
And why is it here?
I stopped there because I figured
if you can make a truly intelligent system,
that will be the quickest way
to answer the first three questions.
I’m serious.
And so I said, my mission, you asked me earlier,
my first mission is to understand the brain,
but I felt that is the shortest way
to get to true machine intelligence.
And I wanna get to true machine intelligence
because even if it doesn’t occur in my lifetime,
other people will benefit from it.
I think it’ll occur in my lifetime,
but 20 years, you never know.
But that will be the quickest way for us to,
we can make super mathematicians,
we can make super space explorers,
we can make super physicist brains that do these things
and that can run experiments that we can’t run.
We don’t have the abilities to manipulate things and so on,
but we can build intelligent machines that do all those things
with the ultimate goal of finding out the answers
to the other questions.
Let me ask you another depressing and difficult question,
which is once we achieve that goal of creating,
no, of understanding intelligence,
do you think we would be happier,
more fulfilled as a species?
The understanding intelligence
or understanding the answers to the big questions?
Understanding intelligence.
Oh, totally, totally.
It would be a far more fun place to live.
You think so?
Oh yeah, why not?
I mean, just put aside this Terminator nonsense
and just think about it.
We can talk about the risks of AI if you want.
I’d love to, so let’s talk about it.
But I think the world would be far better knowing things.
We’re always better off knowing things.
Do you think it’s better, is it a better place to live in
that I know that our planet is one of many
in the solar system and the solar system’s one of many
in the galaxy?
I think it’s more fun. I sometimes dread, like,
God, what would it be like to live 300 years ago?
I’d be looking up at the sky
and I couldn’t understand anything.
Oh my God, I’d be like going to bed every night going,
what’s going on here?
Well, I mean, in some sense I agree with you,
but I’m not exactly sure.
So I’m also a scientist, so I share your views,
but I’m not, we’re like rolling down the hill together.
What’s down the hill?
I feel like we’re climbing a hill.
Whatever.
We’re getting closer to enlightenment
and you’re going down the hill.
We’re climbing, we’re getting pulled up a hill
by our curiosity.
Our curiosity is, we’re pulling ourselves up the hill
by our curiosity.
Yeah, Sisyphus was doing the same thing with the rock.
Yeah, yeah, yeah, yeah.
But okay, our happiness aside, do you have concerns,
like Sam Harris and Elon Musk talk about,
regarding existential threats of intelligent systems?
No, I’m not worried about existential threats at all.
There are some things we really do need to worry about.
Even today’s AI, we have things we have to worry about.
We have to worry about privacy
and about how it impacts false beliefs in the world.
And we have real problems and things to worry about
with today’s AI.
And that will continue as we create more intelligent systems.
There’s no question, the whole issue
about making intelligent armaments and weapons
is something that really we have to think about carefully.
I don’t think of those as existential threats.
I think those are the kind of threats we always face
and we’ll have to face them here
and we’ll have to deal with them.
We could talk about what people think
are the existential threats,
but when I hear people talking about them,
they all sound hollow to me.
They’re based on ideas, they’re based on people
who really have no idea what intelligence is.
And if they knew what intelligence was,
they wouldn’t say those things.
So those are not experts in the field.
Yeah, so there’s two, right?
So one is like super intelligence.
So a system that becomes far, far superior
in reasoning ability than us humans.
How is that an existential threat?
Then, so there’s a lot of ways in which it could be.
One way is, us humans are actually irrational, inefficient
and get in the way of, not happiness,
but maximizing whatever the objective function is.
Super intelligent.
The paperclip problem and things like that.
So the paperclip problem but with the super intelligent.
Yeah, yeah, yeah, yeah.
So we already face this threat in some sense.
They’re called bacteria.
These are organisms in the world
that would like to turn everything into bacteria.
And they’re constantly morphing,
they’re constantly changing to evade our protections.
And in the past, they have killed huge swaths
of populations of humans on this planet.
So if you wanna worry about something
that’s gonna multiply endlessly, we have it.
And I’m far more worried in that regard.
I’m far more worried that some scientists in the laboratory
will create a super virus or a super bacteria
that we cannot control.
That is more of an existential threat.
Putting an intelligence thing on top of it
actually seems to make it less existential to me.
It’s like, it limits its power.
It limits where it can go.
It limits the number of things it can do in many ways.
A bacteria is something you can’t even see.
So that’s only one of those problems.
Yes, exactly.
So the other one, just in your intuition about intelligence,
when you think about intelligence of us humans,
do you think of that as something,
if you look at intelligence on a spectrum
from zero to us humans,
do you think you can scale that to something far,
far superior to all the mechanisms we’ve been talking about?
I wanna make another point here, Lex, before I get there.
Intelligence is the neocortex.
It is not the entire brain.
The goal is not to make a human.
The goal is not to make an emotional system.
The goal is not to make a system
that wants to have sex and reproduce.
Why would I build that?
If I wanna have a system that wants to reproduce
and have sex, make bacteria, make computer viruses.
Those are bad things, don’t do that.
Those are really bad, don’t do those things.
Regulate those.
But if I just say I want an intelligent system,
why does it have to have any of the human like emotions?
Why does it even care if it lives?
Why does it even care if it has food?
It doesn’t care about those things.
It’s just, you know, it’s just in a trance
thinking about mathematics or it’s out there
just trying to build a spaceport on Mars.
That’s a choice we make.
Don’t make human like things,
don’t make replicating things,
don’t make things that have emotions,
just stick to the neocortex.
So that’s a view actually that I share
but not everybody shares, in the sense that
you have faith and optimism about us as engineers of systems,
humans as builders of systems,
to not put in stupid things.
So this is why I mentioned the bacteria one.
Because you might say, well, some person’s gonna do that.
Well, some person today could create a bacteria
that’s resistant to all the known antibacterial agents.
So we already have that threat.
We already know this is going on.
It’s not a new threat.
So just accept that and then we have to deal with it, right?
Yeah, so my point is nothing to do with intelligence.
Intelligence is a separate component
that you might apply to a system
that wants to reproduce and do stupid things.
Let’s not do that.
Yeah, in fact, it is a mystery
why people haven’t done that yet.
My dad is a physicist, and he has a belief about why,
for example, nuclear weapons haven’t proliferated
amongst evil people.
So one belief that I share is that
there’s not that many evil people in the world
who would use, whether it’s bacteria or nuclear weapons
or maybe the future AI systems, to do bad.
So the fraction is small.
And the second is that it’s actually really hard, technically,
so the intersection between evil
and competent is small.
And by the way, to really annihilate humanity,
you’d have to have sort of the nuclear winter phenomenon,
which is not one person setting off one or even 10 bombs.
You’d have to have some automated system
that detonates a million bombs
or whatever many thousands we have.
So extreme evil combined with extreme competence.
And you’d have to start with building some stupid system
that would do it automatically, a Dr. Strangelove type of thing,
you know, I mean, look, we could have
some nuclear bomb go off in some major city in the world.
I think that’s actually quite likely, even in my lifetime.
I don’t think that’s an unlikely thing.
And it’d be a tragedy.
But it won’t be an existential threat.
And it’s the same as, you know, the virus of 1918,
the influenza.
These bad things can happen and the plague and so on.
We can’t always prevent them.
We always try, but we can’t.
But they’re not existential threats
until we combine all those crazy things together.
So on the spectrum of intelligence from zero to human,
do you have a sense of whether it’s possible
to create several orders of magnitude
or at least double that of human intelligence?
Talking about the neocortex.
I think it’s the wrong thing to say double the intelligence.
Break it down into different components.
Can I make something that’s a million times faster
than a human brain?
Yes, I can do that.
Could I make something that has a lot more storage
than the human brain?
Yes, I could do that.
More columns, more copies of columns.
Can I make something that attaches
to different sensors than the human brain?
Yes, I can do that.
Could I make something that’s distributed?
So, yeah, we talked earlier
about the different parts of the neocortex voting.
They don’t have to be co located.
Like, you know, they can be all around the place.
I could do that too.
Those are the levers I have, but is it more intelligent?
Well, it depends what I train it on.
What is it doing?
If it’s.
Well, so here’s the thing.
So let’s say a larger neocortex,
or whatever size allows for higher
and higher hierarchies to form,
as we’re talking about reference frames and concepts.
Could I have something that’s a super physicist
or a super mathematician?
Yes.
And the question is, once you have a super physicist,
will we be able to understand it?
Do you have a sense that it will be
orders of magnitude beyond us, like us compared to ants?
Could we ever understand it?
Yeah.
Most people cannot understand general relativity.
It’s a really hard thing to get.
I mean, yeah, you can paint a fuzzy picture,
stretchy space, you know?
But the field equations to do that
and the deep intuitions are really, really hard.
And I’ve tried, I’m unable to do it.
Like it’s easy to get special relativity,
but general relativity, man, that’s too much.
And so we already live with this to some extent.
The vast majority of people can’t understand actually
what the vast majority of other people actually know.
We just either don’t make the effort to,
or we can’t, or we don’t have time,
or we’re just not smart enough, whatever.
But we have ways of communicating.
Einstein has spoken in a way that I can understand.
He’s given me analogies that are useful.
I can use those analogies from my own work
and think about concepts that are similar.
It’s not stupid.
It’s not like he’s existing on some other plane
and there’s no connection with my plane in the world here.
So that will occur.
It already has occurred.
That’s what my point of this story is.
It already has occurred.
We live it every day.
One could argue that when we create machine intelligences
that think a million times faster than us,
they’ll be so far beyond us we can’t make the connections.
But you know, at the moment,
everything that seems really, really hard
to figure out in the world,
when you actually figure it out, it’s not that hard.
You know, almost everyone can understand the multiverse.
Almost everyone can understand quantum physics.
Almost everyone can understand these basic things,
even though hardly any people could figure those things out.
Yeah, but really understand.
But you don’t need to really.
Only a few people really understand.
You only need to understand the projections,
the sprinkles of useful insight from it.
That was my example of Einstein, right?
His general theory of relativity is one thing
that very, very, very few people can get.
And what if we just said those other few people
are also artificial intelligences?
How bad is that?
In some sense they are, right?
Yeah, they are already.
I mean, Einstein wasn’t a really normal person.
He had a lot of weird quirks.
And so did the other people who worked with him.
So, you know, maybe they already were sort of
on this astral plane of intelligence,
and we live with it already.
It’s not a problem.
It’s still useful and, you know.
So do you think we are the only intelligent life
out there in the universe?
I would say that intelligent life
has and will exist elsewhere in the universe.
I’ll say that.
There was a question about
contemporaneous intelligent life,
which is hard to even answer
when we think about relativity and the nature of space time.
You can’t say what exactly is happening at this time
someplace else in the universe.
But I think it’s, you know,
I do worry a lot about the great filter idea,
which is that perhaps intelligent species
don’t last very long.
And so we haven’t been around very long.
And as a technological species,
we’ve been around for almost nothing, you know.
What, 200 years, something like that.
And we don’t have any good data points
on whether it’s likely
that we’ll survive or not.
So do I think that there have been intelligent life
elsewhere in the universe?
Almost certainly, of course.
In the past, in the future, yes.
Does it survive for a long time?
I don’t know.
This is another reason I’m excited about our work,
our work meaning the general world of AI.
I think we can build intelligent machines
that outlast us.
You know, they don’t have to be tied to Earth.
They don’t have to, you know,
I’m not saying they’re recreating, you know,
aliens, I’m just saying,
if I asked myself,
and this might be a good point to end on here.
If I asked myself, you know,
what’s special about our species?
We’re not particularly interesting physically.
We don’t fly, we’re not good swimmers,
we’re not very fast, we’re not very strong, you know.
It’s our brain, that’s the only thing.
And we are the only species on this planet
that’s built a model of the world
that extends beyond what we can actually sense.
We’re the only ones who know about
the far side of the moon, and other universes,
I mean, other galaxies, and other stars,
and about what happens in the atom.
There’s no, that knowledge doesn’t exist anywhere else.
It’s only in our heads.
Cats don’t do it, dogs don’t do it,
monkeys don’t do it, it’s just us.
And that is what we’ve created that’s unique.
Not our genes, it’s knowledge.
And if I ask myself, what is the legacy of humanity?
What should our legacy be?
It should be knowledge.
We should preserve our knowledge
in a way that it can exist beyond us.
And I think the best way of doing that,
in fact you have to do it,
is it has to go along with intelligent machines
that understand that knowledge.
It’s a very broad idea, but we should be thinking,
I call it estate planning for humanity.
We should be thinking about what we wanna leave behind
when as a species we’re no longer here.
And that’ll happen sometime.
Sooner or later it’s gonna happen.
And understanding intelligence and creating intelligence
gives us a better chance to prolong.
It does give us a better chance to prolong life, yes.
It gives us a chance to live on other planets.
But even beyond that, I mean our solar system
will disappear one day, just given enough time.
So I don’t know, I doubt we’ll ever be able to travel
to other solar systems, to the stars,
but we could send intelligent machines to do that.
So you have an optimistic, a hopeful view of our knowledge
of the echoes of human civilization
living through the intelligent systems we create?
Oh, totally.
Well, I think the intelligent systems we create
are in some sense the vessel for bringing them beyond Earth
or making them last beyond humans themselves.
How do you feel about that?
That they won’t be human, quote unquote?
Who cares?
Human, what is human?
Our species is changing all the time.
Human today is not the same as human just 50 years ago.
What is human?
Do we care about our genetics?
Why is that important?
As I point out, our genetics are no more interesting
than a bacterium’s genetics.
It’s no more interesting than a monkey’s genetics.
What we have, what’s unique and what’s valuable
is our knowledge, what we’ve learned about the world.
And that is the rare thing.
That’s the thing we wanna preserve.
Who cares about our genes?
That’s not it.
It’s the knowledge.
That’s a really good place to end.
Thank you so much for talking to me.
No, it was fun.