The following is a conversation with Karl Friston,
one of the greatest neuroscientists in history.
Cited over 245,000 times,
known for many influential ideas in brain imaging,
neuroscience, and theoretical neurobiology,
including especially the fascinating idea
of the free energy principle for action and perception.
Karl’s mix of humor, brilliance, and kindness,
to me, is inspiring and captivating.
This was a huge honor and a pleasure.
This is the Artificial Intelligence Podcast.
If you enjoy it, subscribe on YouTube,
review it with five stars on Apple Podcast,
support it on Patreon,
or simply connect with me on Twitter,
at Lex Fridman, spelled F R I D M A N.
As usual, I’ll do a few minutes of ads now,
and never any ads in the middle
that can break the flow of the conversation.
I hope that works for you,
and doesn’t hurt the listening experience.
This show is presented by Cash App,
the number one finance app in the App Store.
When you get it, use code LEXPODCAST.
Cash App lets you send money to friends, buy Bitcoin,
and invest in the stock market with as little as $1.
Since Cash App allows you to send
and receive money digitally,
let me mention a surprising fact related to physical money.
Of all the currency in the world,
roughly 8% of it is actual physical money.
The other 92% of money only exists digitally.
So again, if you get Cash App from the App Store
or Google Play, and use the code LEXPODCAST, you get $10,
and Cash App will also donate $10 to FIRST,
an organization that is helping to advance robotics
and STEM education for young people around the world.
And now, here’s my conversation with Karl Friston.
How much of the human brain do we understand
from the low level of neuronal communication
to the functional level to the highest level,
maybe the psychiatric disorder level?
Well, we’re certainly in a better position
than we were last century.
How far we’ve got to go, I think,
is almost an unanswerable question.
So you’d have to set the parameters,
you know, what constitutes understanding, what level
of understanding do you want?
I think we’ve made enormous progress
in terms of broad brush principles.
Whether that affords a detailed cartography
of the functional anatomy of the brain and what it does,
right down to the microcircuitry and the neurons,
that’s probably out of reach at the present time.
So the cartography, so mapping the brain,
do you think mapping of the brain,
the detailed, perfect imaging of it,
does that get us closer to understanding
of the mind, of the brain?
So how far does it get us if we have
that perfect cartography of the brain?
I think there are lower bounds on that.
It’s a really interesting question.
And it would determine the sort of scientific career
you’d pursue if you believe that knowing
every dendritic connection, every sort of microscopic,
synaptic structure right down to the molecular level
was gonna give you the right kind of information
to understand the computational anatomy,
then you’d choose to be a microscopist
and you would study little cubic millimeters of brain
for the rest of your life.
If on the other hand you were interested
in holistic functions and a sort of functional anatomy
of the sort that a neuropsychologist would understand,
you’d study brain lesions and strokes,
just looking at the whole person.
So again, it comes back to at what level
do you want understanding?
I think there are principled reasons not to go too far.
If you commit to a view of the brain
as a machine that’s performing a form of inference
and representing things, that level of understanding
is necessarily cast in terms of probability densities
and ensemble densities, distributions.
And what that tells you is that you don’t really want
to look at the atoms to understand the thermodynamics
of probabilistic descriptions of how the brain works.
So I personally wouldn’t look at the molecules
or indeed the single neurons in the same way
if I wanted to understand the thermodynamics
of some non equilibrium steady state of a gas
or an active material, I wouldn’t spend my life
looking at the individual molecules
that constitute that ensemble.
I’d look at their collective behavior.
On the other hand, if you go too coarse grain,
you’re gonna miss some basic canonical principles
of connectivity and architectures.
I’m thinking here, and this is a bit colloquial,
but this current excitement about high field
magnetic resonance imaging at seven Tesla, why?
Well, it gives us for the first time the opportunity
to look at the brain in action at the level
of a few millimeters that distinguish
between different layers of the cortex
that may be very important in terms of evincing
generic principles of canonical microcircuitry
that are replicated throughout the brain
that may tell us something fundamental
about message passing in the brain
and these density dynamics, or neuronal
ensemble population dynamics,
that underwrite our brain function.
So somewhere between a millimeter and a meter.
Lingering for a bit on the big questions if you allow me,
what to you is the most beautiful or surprising characteristic
of the human brain?
I think its hierarchical and recursive aspect.
Its recurrent aspect.
Of the structure or of the actual
representational power of the brain?
Well, I think one speaks to the other.
I was actually answering in a dull minded way
from the point of view of purely its anatomy
and its structural aspects.
I mean, there are many marvelous organs in the body.
Let’s take your liver for example.
Without it, you wouldn’t be around for very long
and it does some beautiful and delicate biochemistry
and homeostasis and evolved with a finesse
that would easily parallel the brain
but it doesn’t have a beautiful anatomy.
It has a simple anatomy which is attractive
in a minimalist sense but it doesn’t have
that crafted structure of sparse connectivity
and that recurrence and that specialization
that the brain has.
So you said a lot of interesting terms here.
So the recurrence, the sparsity,
but you also started by saying hierarchical.
So I’ve never thought of our brain as hierarchical.
Sort of I always thought it’s just like a giant mess,
interconnected mess where it’s very difficult
to figure anything out.
But in what sense do you see the brain as hierarchical?
Well, I see it, it’s not a magic soup.
Which of course is what I used to think
before I studied medicine and the like.
So a lot of those terms imply each other.
So hierarchies, if you just think about
the nature of a hierarchy,
how would you actually build one?
And what you would have to do is basically
carefully remove the right connections
that destroy the completely connected soups
that you might have in mind.
So a hierarchy is in and of itself defined
by a sparse and particular connectivity structure.
I’m not committing to any particular form of hierarchy.
But your sense is there is some.
Oh, absolutely, yeah.
In virtue of the fact that there is a sparsity
of connectivity, not necessarily of a qualitative sort,
but certainly of a quantitative sort.
So it is demonstrably so that the further apart
two parts of the brain are,
the less likely they are to be wired,
to possess axonal processes, neuronal processes
that directly communicate one message
or messages from one part of that brain
to the other part of the brain.
So we know there’s a sparse connectivity.
And furthermore, on the basis of anatomical connectivity
in tracer studies, we know that that sparsity
underwrites a hierarchical and very structured
sort of connectivity that might be best understood
as a little bit like an onion.
There is a concentric, sometimes referred to as centripetal
by people like Marsel Mesulam,
hierarchical organization to the brain.
So you can think of the brain as in a rough sense,
like an onion, and all the sensory information
and all the efferent outgoing messages
that supply commands to your muscles
or to your secretory organs come from the surface.
So there’s a massive exchange interface
with the world out there on the surface.
And then underneath, there’s a little layer
that sits and looks at the exchange on the surface.
And then underneath that, there’s a layer
right the way down to the very center,
to the deepest part of the onion.
That’s what I mean by a hierarchical organization.
There’s a discernible structure defined
by the sparsity of connections
that lends the architecture a hierarchical structure
that tells one a lot about the kinds of representations
and messages.
Coming back to your earlier question,
is this about the representational capacity
or is it about the anatomy?
Well, one underwrites the other.
If one simply thinks of the brain
as a message passing machine,
a process that is in the service of doing something,
then the circuitry and the connectivity
that shape that message passing also dictate its function.
So you’ve done a lot of amazing work
in a lot of directions.
So let’s look at one aspect of that,
of looking into the brain
and trying to study this onion structure.
What can we learn about the brain by imaging it?
Which is one way to sort of look at the anatomy of it.
Broadly speaking, what are the methods of imaging,
but even bigger, what can we learn about it?
Right, so well, most human neuroimaging
that you might see in science journals
that speaks to the way the brain works,
measures brain activity over time.
So that’s the first thing to say,
that we’re effectively looking at fluctuations
in neuronal responses,
usually in response to some sensory input
or some instruction, some task.
Not necessarily, there’s a lot of interest
in just looking at the brain
in terms of resting state, endogenous,
or intrinsic activity.
But crucially, at every point,
looking at these fluctuations,
either induced or intrinsic in the neural activity,
and understanding them at two levels.
So normally, people would have recourse
to two principles of brain organization
that are complementary.
One, functional specialization or segregation.
So what does that mean?
It simply means that there are certain parts of the brain
that may be specialized for certain kinds of processing.
For example, visual motion,
our ability to recognize or to perceive movement
in the visual world.
And furthermore, that specialized processing
may be spatially or anatomically segregated,
leading to functional segregation.
Which means that if I were to compare your brain activity
during a period of viewing a static image,
and then compare that to the responses or fluctuations
in the brain when you were exposed to a moving image,
say a flying bird,
we’d expect to see
restricted, segregated differences in activity.
And those are basically the hotspots
that you see in the statistical parametric maps
that test for the significance of the responses
that are circumscribed.
So now, basically, we’re talking about what
some people have perhaps unkindly called a neo-cartography.
This is a phrenology augmented by modern day neuroimaging,
basically finding blobs or bumps on the brain
that do this or do that,
and trying to understand the cartography
of that functional specialization.
So how much is there such,
this is such a beautiful sort of ideal to strive for.
We humans, scientists, would like this,
to hope that there’s a beautiful structure to this
where it’s, like you said, there’s segregated regions
that are responsible for the different function.
How much hope is there to find such regions
in terms of looking at the progress of studying the brain?
Oh, I think enormous progress has been made
in the past 20 or 30 years.
So this is beyond incremental.
At the advent of brain imaging,
the very notion of functional segregation
was just a hypothesis based upon a century,
if not more, of careful neuropsychology,
looking at people who had lost via insult
or traumatic brain injury particular parts of the brain,
and then saying, well, they can’t do this
or they can’t do that.
For example, losing the visual cortex
and not being able to see,
or losing particular parts of the visual cortex
or regions known as V5
or the middle temporal region, MT,
and noticing that they selectively
could not see moving things.
And so that created the hypothesis
that perhaps visual movement processing
was located in this functionally segregated area.
And you could then go and put invasive electrodes
in animal models and say, yes, indeed,
we can excite activity here.
We can form receptive fields that are sensitive to
or defined in terms of visual motion.
But at no point could you exclude the possibility
that everywhere else in the brain
was also very interested in visual motion.
By the way, I apologize to interrupt,
but a tiny little tangent.
You said animal models, just out of curiosity,
from your perspective, how different is the human brain
versus the other animals
in terms of our ability to study the brain?
Well, clearly, the further away you go from a human brain,
the greater the differences,
but not as remarkable as you might think.
So people will choose their level of approximation
to the human brain,
depending upon the kinds of questions
that they want to answer.
So if you’re talking about sort of canonical principles
of microcircuitry, it might be perfectly okay
to look at a mouse, indeed.
You could even look at flies, worms.
If, on the other hand, you wanted to look at
the finer details of organization of visual cortex
and V1, V2, these are designated patches of cortex
that may do different things, indeed, do.
You’d probably want to use a primate
that looked a little bit more like a human,
because there are lots of ethical issues
in terms of the use of nonhuman primates
to answer questions about human anatomy.
But I think most people assume
that most of the important principles are conserved
in a continuous way, right from, well, yes,
worms right through to you and me.
So now returning to, so that was the early sort of ideas
of studying the functional regions of the brain
by if there’s some damage to it,
to try to infer that that part of the brain
might be somewhat responsible for this type of function.
So where does that lead us?
What are the next steps beyond that?
Right, well, I’ll just actually just reverse a bit,
come back to your sort of notion
that the brain is a magic soup.
That was actually a very prominent idea at one point,
notions such as Lashley’s law of mass action
inherited from the observation that for certain animals,
if you just took out spoonfuls of the brain,
it didn’t matter where you took these spoonfuls out,
they always showed the same kinds of deficits.
So it was very difficult to infer functional specialization
purely on the basis of lesion deficit studies.
But once we had the opportunity
to look at the brain lighting up
and literally its sort of excitement, neuronal excitement,
when looking at this versus that,
one was able to say, yes, indeed,
these functionally specialized responses
are very restricted and they’re here or they’re over there.
If I do this, then this part of the brain lights up.
And that became doable in the early 90s.
In fact, shortly before with the advent
of positron emission tomography.
And then functional magnetic resonance imaging
came along in the early 90s.
And since that time, there has been an explosion
of discovery, refinement, confirmation.
There are people who believe that it’s all in the anatomy.
If you understand the anatomy,
then you understand the function at some level.
And many, many hypotheses were predicated
on a deep understanding of the anatomy and the connectivity,
but they were all confirmed
and taken much further with neuroimaging.
So that’s what I meant by we’ve made an enormous amount
of progress in this century indeed,
and in relation to the previous century,
by looking at these functionally selective responses.
But that wasn’t the whole story.
So there’s this sort of neo-phrenology,
of finding bumps and hot spots in the brain
that did this or that.
The bigger question was, of course,
the functional integration.
How all of these regionally specific responses
were orchestrated, how they were distributed,
how did they relate to distributed processing
and indeed representations in the brain.
So then you turn to the more challenging issue
of the integration, the connectivity.
And then we come back to this beautiful,
sparse, recurrent, hierarchical connectivity
that seems characteristic of the brain
and probably not many other organs.
But nevertheless, we come back to this challenge
of trying to figure out how everything is integrated.
But what’s your feeling?
What’s the general consensus?
Have we moved away from the magic soup view of the brain?
So there is a deep structure to it.
And then maybe a further question.
You said some people believe that the structure
is most of it, that you can really get
at the core of the function
by just deeply understanding the structure.
Where do you sit on that, do you?
I think it’s got some mileage to it, yes, yeah.
So it’s a worthy pursuit of going,
of studying through imaging and all the different methods
to actually study the structure.
No, absolutely, yeah, yeah.
Sorry, I’m just noting, you were accusing me
of using lots of long words
and then you introduced one there, which is deep,
which is interesting.
Because deep is the sort of millennial equivalent
of hierarchical.
So if you’ve put deep in front of anything,
not only are you very millennial and very trending,
but you’re also implying a hierarchical architecture.
So it is a depth, which is, for me, the beautiful thing.
That’s right, the word deep kind of,
yeah, exactly, it implies hierarchy.
I didn’t even think about that.
That indeed, the implicit meaning
of the word deep is hierarchy.
Yep. Yeah.
So deep inside the onion is the center of your soul.
Beautifully put.
Maybe briefly, if you could paint a picture
of the kind of methods of neuroimaging,
maybe the history which you were a part of,
from statistical parametric mapping.
I mean, just what’s out there that’s interesting
for people maybe outside the field
to understand of what are the actual methodologies
of looking inside the human brain?
Right, well, you can answer that question
from two perspectives.
Basically, it’s the modality.
What kind of signal are you measuring?
And they can range from,
and let’s limit ourselves
to sort of imaging based noninvasive techniques.
So you’ve essentially got brain scanners,
and brain scanners can either measure
the structural attributes, the amount of water,
the amount of fat, or the amount of iron
in different parts of the brain,
and you can make lots of inferences
about the structure of the organ of the sort
that you might have produced from an X ray,
but a very nuanced X ray that is looking
at this kind of property or that kind of property.
So looking at the anatomy noninvasively
would be the first sort of neuroimaging
that people might want to employ.
Then you move on to the kinds of measurements
that reflect dynamic function,
and the most prevalent of those fall into two camps.
You’ve got these metabolic, sometimes hemodynamic,
blood related signals.
So these metabolic and/or hemodynamic signals
are basically proxies for elevated activity
and message passing and neuronal dynamics
in particular parts of the brain.
Characteristically though, the time constants
of these hemodynamic or metabolic responses
to neural activity are much longer
than the neural activity itself.
And this is referring,
forgive me for the dumb questions,
but this would be referring to blood,
like the flow of blood.
Absolutely, absolutely.
So there’s a ton of,
it seems like there’s a ton of blood vessels in the brain.
Yeah.
So what’s the interaction between the flow of blood
and the function of the neurons?
Is there an interplay there or?
Yup, yup, and that interplay accounts for several careers
of world renowned scientists, yes, absolutely.
So this is known as neurovascular coupling,
is exactly what you said.
It’s how does the neural activity,
the neuronal infrastructure, the actual message passing
that we think underlies our capacity to perceive and act,
how is that coupled to the vascular responses
that supply the energy for that neural processing?
So there’s a delicate web of large vessels,
arteries and veins, that gets progressively finer
and finer in detail until it perfuses
at a microscopic level,
the machinery where little neurons lie.
So coming back to this sort of onion perspective,
we were talking before using the onion as a metaphor
for a deep hierarchical structure,
but also I think it’s just anatomically quite
a useful metaphor.
All the action, all the heavy lifting
in terms of neural computation is done
on the surface of the brain,
and then the interior of the brain is constituted
by fatty wires, essentially, axonal processes
that are enshrouded by myelin sheaths.
And these, when you dissect them, they look fatty and white,
and so it’s called white matter,
as opposed to the actual neuropil,
which does the computation constituted largely by neurons,
and that’s known as gray matter.
So the gray matter is a surface or a skin
that sits on top of this big ball,
now we are talking magic soup,
but a big ball of connections like spaghetti,
very carefully structured with sparse connectivity
that preserves this deep hierarchical structure,
but all the action takes place on the surface,
on the cortex of the onion, and that means
that you have to supply the right amount of blood flow,
the right amount of nutrient,
which is rapidly absorbed and used by neural cells
that don’t have the same capacity
that your leg muscles would have
to basically spend their energy budget
and then claim it back later.
So one peculiar thing about cerebral metabolism,
brain metabolism, is it really needs to be driven
in the moment, which means you basically
have to turn on the taps.
So if there’s lots of neural activity
in one part of the brain, a little patch
of a few millimeters, even less possibly,
you really do have to water that piece
of the garden now and quickly,
and by quickly I mean within a couple of seconds.
So that contains a lot of, hence the imaging
could tell you a story of what’s happening.
Absolutely, but it is slightly compromised
in terms of the resolution.
So the deployment of these little microvessels
that water the garden to enable the neural activity
to play out, the spatial resolution
is on the order of a few millimeters,
and crucially, the temporal resolution
is on the order of a few seconds.
So you can’t get right down and dirty
into the actual spatial and temporal scale
of neural activity in and of itself.
To do that, you’d have to turn
to the other big imaging modality,
which is the recording of electromagnetic signals
as they’re generated in real time.
So here, the temporal bandwidth, if you like,
or the lower limit on the temporal resolution
is incredibly small, talking about milliseconds.
And then you can get into the phasic fast responses
that is in and of itself the neural activity,
and start to see the succession or cascade
of hierarchical recurrent message passing
evoked by a particular stimulus.
But the problem is you’re looking
at electromagnetic signals that have passed
through an enormous amount of magic soup
or spaghetti of connectivity, and through the scalp
and the skull, and it’s become spatially very diffuse.
So it’s very difficult to know where you are.
So you’ve got this sort of catch 22.
You can either use an imaging modality
that tells you within millimeters
which part of the brain is activated,
but you don’t know when,
or you’ve got these electromagnetic EEG, MEG setups
that tell you to within a few milliseconds
when something has responded, but you don’t know where.
So you’ve got these two complementary measures,
either indirect via the blood flow,
or direct via the electromagnetic signals
caused by neural activity.
These are the two big imaging devices.
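[Editorial sketch, not part of the conversation.] The trade-off described here can be illustrated with a toy simulation. Two brief neural events half a second apart are passed through a slow, blood-flow-like response that peaks about five seconds after each event. The kernel shape, the sampling step, and all names below are assumptions chosen purely for illustration; this is not a calibrated hemodynamic model.

```python
import math

DT = 0.05   # seconds per sample (assumed for illustration)
N = 400     # 400 samples = 20 seconds of simulated time


def slow_kernel(t):
    """Toy blood-flow response: rises and decays over seconds, peaking at t = 5 s."""
    return t ** 5 * math.exp(-t)


def convolve(signal, kernel):
    """Plain causal convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out


def count_peaks(x):
    """Count strict local maxima in a sampled signal."""
    return sum(1 for i in range(1, len(x) - 1) if x[i] > x[i - 1] and x[i] > x[i + 1])


# Two brief neural events, at 1.0 s and 1.5 s.
neural = [0.0] * N
neural[20] = 1.0
neural[30] = 1.0

kernel = [slow_kernel(i * DT) for i in range(N)]
blood = convolve(neural, kernel)

print("peaks in neural signal:", count_peaks(neural))
print("peaks in blood-flow proxy:", count_peaks(blood))
```

In this toy setting the fast signal keeps the two events apart, while the slow proxy merges them into a single delayed bump, which is the sense in which the blood-based measure tells you where but not precisely when.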
And then the second level of responding to your question,
what are the, from the outside,
what are the big ways of using this technology?
So once you’ve chosen the kind of neural imaging
that you want to use to answer your set questions,
and sometimes it would have to be both,
then you’ve got a whole raft of analyses,
time series analyses usually, that you can bring to bear
in order to answer your questions
or address your hypothesis about those data.
And interestingly, they both fall
into the same two camps we were talking about before,
this dialectic between specialization and integration,
differentiation and integration.
So it’s the cartography, the blobology analyses.
I apologize, I probably shouldn’t interrupt so much,
but just heard a fun word, the blah.
Blobology.
It’s a neologism, which means the study of blobs.
So nothing bob.
Are you being witty and humorous,
or does the word blobology ever appear
in a textbook somewhere?
It would appear in a popular book.
It would not appear in a worthy specialist journal.
Yeah, I thought so.
It’s the fond word for the study of literally little blobs
on brain maps showing activations.
So the kind of thing that you’d see in the newspapers
on ABC or BBC reporting the latest finding
from brain imaging.
Interestingly though, the maths involved
in that stream of analysis does actually call upon
the mathematics of blobs.
So seriously, they’re actually called Euler characteristics
and they have a lot of fancy names in mathematics.
We’ll talk about it, about your ideas
in free energy principle.
I mean, there’s echoes of blobs there
when you consider sort of entities,
mathematically speaking.
Yes, absolutely.
Well-circumscribed, well-defined,
you know, entities of, well, from the free energy point of view,
entities of anything, but from the point of view
of the analysis, the cartography of the brain,
these are the entities that constitute the evidence
for this functional segregation.
You have segregated this function in this blob
and it is not outside of the blob.
And that’s basically the, if you were a map maker
of America and you did not know its structure,
the first thing you’d be doing in constituting
or creating a map would be to identify the cities,
for example, or the mountains or the rivers.
All of these uniquely spatially localizable features,
possibly topological features have to be placed somewhere
because that requires a mathematics of identifying
what does a city look like on a satellite image
or what does a river look like
or what does a mountain look like?
What would it, you know, what data features
would evidence that particular top,
you know, that particular thing
that you wanted to put on the map?
And they normally are characterized
in terms of literally these blobs
or these sort of, another way of looking at this
is that a certain statistical measure
of the degree of activation crosses a threshold
and in crossing that threshold
in the spatially restricted part of the brain,
it creates a blob.
And that’s basically what statistical parametric mapping does.
It’s basically mathematically finessed blobology.
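[Editorial sketch, not part of the conversation.] The thresholding idea just described can be shown on a toy grid: keep every cell whose statistic crosses a threshold, then group touching cells into blobs. The map values, the fixed threshold of 3.0, and the function name are all hypothetical; real statistical parametric mapping chooses its threshold with far more careful mathematics than this fixed cutoff.

```python
from collections import deque


def find_blobs(stat_map, threshold):
    """Group cells whose statistic exceeds `threshold` into edge-connected blobs.

    Returns a list of blobs, each a list of (row, col) positions.
    """
    rows, cols = len(stat_map), len(stat_map[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or stat_map[r][c] <= threshold:
                continue
            # Breadth-first flood fill from this supra-threshold cell.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and stat_map[ny][nx] > threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs


# Hypothetical 5x5 map of activation statistics (made-up numbers).
toy_map = [
    [0.1, 0.2, 0.1, 0.0, 0.3],
    [0.2, 3.5, 3.9, 0.1, 0.2],
    [0.1, 3.2, 0.4, 0.2, 0.1],
    [0.0, 0.3, 0.1, 4.1, 0.2],
    [0.2, 0.1, 0.2, 3.8, 0.1],
]

blobs = find_blobs(toy_map, threshold=3.0)
print("number of blobs:", len(blobs))
print("blob sizes:", [len(b) for b in blobs])
```

Here the grid yields two separate blobs, one of three cells and one of two, each standing in for a spatially restricted region where the statistic crossed the threshold.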
Okay, so those are the,
you kind of described these two methodologies for,
one is temporally noisy, one is spatially noisy
and you kind of have to play and figure out
what can be useful.
It’d be great if you can sort of comment.
I got a chance recently to spend a day
at a company called Neuralink
that uses brain computer interfaces
and their dream is to, well,
there’s a bunch of sort of dreams,
but one of them is to understand the brain
by sort of, you know, getting in there
past the so called sort of factory wall,
getting in there and be able to listen,
communicate both directions.
What are your thoughts about this,
the future of this kind of technology
of brain computer interfaces
to be able to now have a window
or direct contact within the brain
to be able to measure some of the signals,
to be able to sense signals,
to understand some of the functionality of the brain?
Ambivalent, my sense is ambivalent.
So it’s a mixture of good and bad
and I acknowledge that freely.
So the good bits, if you just look at the legacy
of that kind of reciprocal but invasive
brain stimulation,
I didn’t paint a complete picture
when I was talking about sort of the ways
we understand the brain prior to neuroimaging.
It wasn’t just lesion deficit studies.
Some of the early work, in fact,
literally 100 yards from where we’re sitting
at the Institute of Neurology,
was done by stimulating the brain of say dogs
and looking at how they responded
with their muscles or with their salivation
and imputing what that part of the brain must be doing.
If I stimulate it and I evoke this kind of response,
then that tells me quite a lot
about the functional specialization.
So there’s a long history of brain stimulation
which continues to enjoy a lot of attention nowadays.
Positive attention.
Oh yes, absolutely.
You know, deep brain stimulation for Parkinson’s disease
is now a standard treatment
and also a wonderful vehicle
to try and understand the neuronal dynamics
underlying movement disorders like Parkinson’s disease.
There’s even interest in magnetic stimulation,
stimulating with magnetic fields,
and will it work in people who are depressed, for example.
Quite a crude level of understanding what you’re doing,
but there is historical evidence
that these kinds of brute force interventions
do change things.
They, you know, it’s a little bit like banging the TV
when the valves aren’t working properly,
but it’s still, it works.
So, you know, there is a long history.
Brain computer interfacing or BCI,
I think is a beautiful example of that.
It’s sort of carved out its own niche
and its own aspirations
and there’ve been enormous advances within limits.
Advances in terms of our ability to understand
how the brain, the embodied brain,
engages with the world.
I’m thinking here of sensory substitution,
the augmenting our sensory capacities
by giving ourselves extra ways of sensing
and sampling the world,
ranging from sort of trying to replace lost visual signals
through to giving people completely new signals.
So, one of the, I think, most engaging examples of this
is equipping people with a sense of magnetic fields.
So you can actually give them magnetic sensors
that enable them to feel,
should we say, tactile pressure around their tummy,
where they are in relation to the magnetic field of the Earth.
And after a few weeks, they take it for granted.
They integrate it, they embody this,
assimilate this new sensory information
into the way that they literally feel their world,
but now equipped with this sense of magnetic direction.
So that tells you something
about the brain’s plastic potential
to remodel and its plastic capacity
to suddenly try to explain the sensory data at hand
by augmenting the sensory sphere
and the kinds of things that you can measure.
Clearly, that’s purely for entertainment
and understanding the nature and the power of our brains.
I would imagine that most BCI is pitched
at solving clinical and human problems
such as locked in syndrome, such as paraplegia,
or replacing lost sensory capacities
like blindness and deafness.
So then we come to the negative part of my ambivalence,
the other side of it.
So I don’t want to be deflationary
because many of my deflationary comments
are probably more out of ignorance than anything else.
But generally speaking, the bandwidth
and the bit rates that you get
from brain computer interfaces as we currently know them,
we’re talking about bits per second.
So that would be like me only being able to communicate
with the world or with you using very, very, very slow Morse code.
And it is not even within an order of magnitude
near what we actually need for an enactive realization
of what people aspire to when they think about
sort of curing people with paraplegia or replacing sight
despite heroic efforts.
So one has to ask, is there a lower bound
on the kinds of recurrent information exchange
between a brain and some augmented or artificial interface?
And then we come back to, interestingly,
what I was talking about before,
which is if you’re talking about function
in terms of inference, and I presume we’ll get to that
later on in terms of the free energy principle,
then at the moment, there may be fundamental reasons
to assume that is the case.
We’re talking about ensemble activity.
We’re talking about basically, for example,
let’s paint the challenge facing brain computer interfacing
in terms of controlling another system
that is highly and deeply structured,
very relevant to our lives, very nonlinear,
that rests upon the kind of nonequilibrium
steady states and dynamics that the brain does,
the weather, all right?
So imagine you had some very aggressive satellites
that could produce signals that could perturb
some little parts of the weather system.
And then what you’re asking now is,
can I meaningfully get into the weather
and change it meaningfully and make the weather respond
in a way that I want it to?
You’re talking about chaos control on a scale
which is almost unimaginable.
So there may be fundamental reasons
why BCI, as you might read about it in a science fiction novel,
aspirational BCI may never actually work
in the sense that to really be integrated
and be part of the system is a requirement
that requires you to have evolved with that system,
that you have to be part of a very delicately structured,
deeply structured, dynamic, ensemble activity
that is not like rewiring a broken computer
or plugging in a peripheral interface adapter.
It is much more like getting into the weather patterns
or, come back to your magic soup,
getting into the active matter
and meaningfully relate that to the outside world.
So I think there are enormous challenges there.
So I think the example of the weather is a brilliant one.
And I think you paint a really interesting picture
and it wasn’t as negative as I thought.
It’s essentially saying that it might be
incredibly challenging, including the lower bound
of the bandwidth and so on.
I kind of, so just to full disclosure,
I come from the machine learning world.
So my natural thought is the hardest part
is the engineering challenge of controlling the weather,
of getting those satellites up and running and so on.
And once they are, then the rest is fundamentally
the same approaches that allow you to win in the game of Go
will allow you to potentially play in this soup,
in this chaos.
So I have a hope that sort of machine learning methods
will help us play in this soup.
But perhaps you’re right that it is a biology
and the brain is just an incredible system
that may be almost impossible to get in.
But for me, what seems impossible
is the incredible mess of blood vessels
that you also described without,
we also value the brain.
You can’t make any mistakes, you can’t damage things.
So to me, that engineering challenge seems nearly impossible.
One of the things I was really impressed by at Neuralink
is just talking to brilliant neurosurgeons
and the roboticists that made me realize
that even though it seems impossible,
if anyone can do it, it’s some of these world class
engineers that are trying to take it on.
So I think the conclusion of our discussion here
of this part is basically that the problem is really hard
but hopefully not impossible.
Absolutely.
So if it’s okay, let’s start with the basics.
So you’ve also formulated a fascinating principle,
the free energy principle.
Could we maybe start at the basics
and what is the free energy principle?
Well, in fact, the free energy principle
inherits a lot from the building
of these data analytic approaches
to these very high dimensional time series
you get from the brain.
So I think it’s interesting to acknowledge that.
And in particular, the analysis tools
that try to address the other side,
which is a functional integration,
so the connectivity analyses.
So on the one hand, but I should also acknowledge
it inherits an awful lot from machine learning as well.
So the free energy principle is just a formal statement
that the existential imperatives for any system
that manages to survive in a changing world
can be cast as an inference problem
in the sense that you can interpret
the probability of existing as the evidence that you exist.
And if you can write down that problem of existence
as a statistical problem,
then you can use all the maths that has been developed
for inference to understand and characterize
the ensemble dynamics that must be in play
in the service of that inference.
So technically, what that means is
you can always interpret anything that exists
in virtue of being separate from the environment
in which it exists as trying to minimize
variational free energy.
And if you’re from the machine learning community,
you will know that as a negative evidence lower bound
or a negative ELBO, which is the same as saying
you’re trying to maximize or it will look as if
all your dynamics are trying to maximize
the complement of that which is the marginal likelihood
or the evidence for your own existence.
So that’s basically the free energy principle.
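For readers coming from machine learning, the relationship described
here can be written out as a standard identity. The notation is an
assumption added for illustration and does not come from the
conversation: o stands for sensory data, s for their hidden causes,
p for the generative model, and q for an approximate posterior.

```latex
\begin{aligned}
F[q] &= \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big] \\
     &= D_{\mathrm{KL}}\big[\,q(s)\,\|\,p(s \mid o)\,\big] - \ln p(o) \\
     &\ge -\ln p(o), \\
\mathrm{ELBO} &= -F[q] \;\le\; \ln p(o).
\end{aligned}
```

Because the KL term is never negative, minimizing F pushes the ELBO up
towards the log marginal likelihood, the log evidence.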
But to even take a sort of a small step backwards,
you said the existential imperative.
There’s a lot of beautiful poetic words here,
but to put it crudely, it’s a fascinating idea
of basically just of trying to describe
if you’re looking at a blob,
how do you know this thing is alive?
What does it mean to be alive?
What does it mean to exist?
And so you can look at the brain,
you can look at parts of the brain,
or this is just a general principle
that applies to almost any system.
That’s just a fascinating sort of philosophically
at every level question and a methodology
to try to answer that question.
What does it mean to be alive?
So that’s a huge endeavor and it’s nice
that there’s at least some,
from some perspective, a clean answer.
So maybe can you talk about that optimization view of it?
So what’s trying to be minimized, maximized?
A system that’s alive, what is it trying to minimize?
Right, you’ve made a big move there.
First of all, it’s good to make big moves.
But you’ve assumed that the thing exists
in a state that could be living or nonliving.
So I may ask you, what licenses you
to say that something exists?
That’s why I use the word existential.
It’s beyond living, it’s just existence.
So if you drill down onto the definition
of things that exist, then they have certain properties
if you borrow the maths
from nonequilibrium steady state physics
that enable you to interpret their existence
in terms of this optimization procedure.
So it’s good you introduced the word optimization.
So what the free energy principle
in its sort of most ambitious,
but also most deflationary and simplest, says
is that if something exists,
then it must, by the mathematics
of nonequilibrium steady state,
exhibit properties that make it look
as if it is optimizing a particular quantity.
And it turns out that particular quantity
happens to be exactly the same
as the evidence lower bound in machine learning
or Bayesian model evidence in Bayesian statistics.
Or, and then I can list a whole other list
of ways of understanding this key quantity,
which is a bound on surprise or self information
if you have information theory.
There are a number of different perspectives
on this quantity.
It’s just basically the log probability
of being in a particular state.
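A small numerical sketch can make the "bound on surprise" reading
concrete. It assumes a hypothetical two-state generative model with
made-up probabilities; none of the names or numbers come from the
conversation. It checks that the free energy of an arbitrary guess sits
above the surprise, and that the exact posterior closes the gap.

```python
import math

# Hypothetical two-state generative model (all numbers are illustrative).
prior = [0.7, 0.3]        # p(s): prior over two hidden states
likelihood = [0.2, 0.9]   # p(o | s): probability of the observed datum under each state


def free_energy(q):
    """Variational free energy F[q] = sum_s q(s) * (ln q(s) - ln p(o, s))."""
    total = 0.0
    for qs, ps, pos in zip(q, prior, likelihood):
        if qs > 0.0:
            total += qs * (math.log(qs) - math.log(ps * pos))
    return total


# Model evidence p(o) and its negative log, the surprise (self information).
evidence = sum(ps * pos for ps, pos in zip(prior, likelihood))
surprise = -math.log(evidence)

# Exact Bayesian posterior p(s | o), for comparison with an arbitrary guess.
posterior = [ps * pos / evidence for ps, pos in zip(prior, likelihood)]

f_guess = free_energy([0.5, 0.5])
f_exact = free_energy(posterior)

print(f"surprise     = {surprise:.4f}")
print(f"F(guess)     = {f_guess:.4f}")
print(f"F(posterior) = {f_exact:.4f}")
```

In this toy, any belief other than the exact posterior gives a free
energy strictly above the surprise, which is the sense in which the
quantity is a bound.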
I’m telling this story as an honest,
an attempt to answer your question.
And I’m answering it as if I was pretending
to be a physicist who was trying to understand
the fundaments of nonequilibrium steady state.
And I shouldn’t really be doing that
because the last time I was taught physics,
I was in my 20s.
What kind of systems,
when you think about the free energy principle,
what kind of systems are you imagining
as a sort of more specific kind of case study?
Yeah, I’m imagining a range of systems,
but at its simplest, a single celled organism
that can be identified from its eco niche
or its environment.
So at its simplest, that’s basically
what I always imagined in my head.
And you may ask, well, is there any,
how on earth can you even elaborate questions
about the existence of a single drop of oil, for example?
But there are deep questions there.
Why doesn’t the oil, why doesn’t the thing,
the interface between the drop of oil
that contains an interior
and the thing that is not the drop of oil,
which is the solvent in which it is immersed,
how does that interface persist over time?
Why doesn’t the oil just dissolve into solvent?
So what special properties of the exchange
between the surface of the oil drop
and the external states in which it’s immersed,
if you’re a physicist, say it would be the heat bath.
You’ve got a physical system, an ensemble again,
we’re talking about density dynamics, ensemble dynamics,
an ensemble of atoms or molecules immersed in the heat bath.
But the question is, how did the heat bath get there?
And why does it not dissolve?
How is it maintaining itself?
Exactly.
What actions is it?
I mean, it’s such a fascinating idea of a drop of oil
and I guess it would dissolve in water,
it wouldn’t dissolve in water.
So what?
Precisely, so why not?
So why not?
Why not?
And how do you mathematically describe,
I mean, it’s such a beautiful idea.
And also the idea of like, where does the thing,
where does the drop of oil end and where does it begin?
Right, so I mean, you’re asking deep questions,
deep in a nonmillennial sense here.
In a hierarchical sense.
But what you can do, so this is the deflationary part of it.
Can I just qualify my answer by saying that normally
when I’m asked this question,
I answer from the point of view of a psychologist,
we talk about predictive processing and predictive coding
and the brain as an inference machine,
but you haven’t asked me from that perspective,
I’m answering from the point of view of a physicist.
So the question is not so much why,
but if it exists, what properties must it display?
So that’s the deflationary part of the free energy principle.
The free energy principle does not supply an answer
as to why, it’s saying if something exists,
then it must display these properties.
That’s the sort of thing that’s on offer.
And it so happens that these properties it must display
are actually intriguing and have this inferential gloss,
this sort of self evidencing gloss that inherits from the fact
that the very preservation of the boundary
between the oil drop and the not oil drop
requires an optimization of a particular function
or a functional that defines the presence
or the existence of this oil drop,
which is why I started with existential imperatives.
It is a necessary condition for existence
that this must occur because the boundary
basically defines the thing that’s existing.
So it is that self assembly aspect
that you were hinting at, in biology
sometimes known as autopoiesis,
in computational chemistry as self assembly.
It’s the, what does it look like?
Sorry, how would you describe things
that configure themselves out of nothing?
The way they clearly demarcate themselves
from the states or the soup in which they are immersed.
So from the point of view of computational chemistry,
for example, you would just understand that
as a configuration of a macro molecule
to minimize its free energy, its thermodynamic free energy.
It’s exactly the same principle that we’ve been talking about
that thermodynamic free energy is just the negative ELBO.
It’s the same mathematical construct.
So the very emergence of existence, of structure, of form
that can be distinguished from the environment
or the thing that is not the thing
necessitates the existence of an objective function
that it looks as if it is minimizing.
It’s finding a free energy minimum.
And so just to clarify, I’m trying to wrap my head around.
So the free energy principle says that if something exists,
these are the properties it should display.
Yeah.
So what that means is we can’t just look,
we can’t just go into a soup and there’s no mechanism.
Free energy principle doesn’t give us a mechanism
to find the things that exist.
Is that what’s implying, is being implied
that you can kind of use it to reason,
to think about like, study a particular system
and say, does this exhibit these qualities?
That’s an excellent question.
But to answer that, I’d have to return
to your previous question about what’s the difference
between living and nonliving things.
Yes, well, actually, sorry.
So yeah, maybe we can go there.
Maybe we can go there, you kind of drew a line
and forgive me for the stupid questions,
but you kind of drew a line between living and existing.
Is there an interesting sort of distinction?
Yeah, I think there is.
So things do exist, grains of sand,
rocks on the moon, trees, you.
So all of these things can be separated from the environment
in which they are immersed.
And therefore, they must at some level
be optimizing their free energy,
taking this sort of model evidence interpretation
of this quantity that basically means
they’re self evidencing.
Another nice little twist of phrase here
is that you are your own existence proof,
statistically speaking, which I don’t think
I said that, somebody did, but I love that phrase.
You are your own existence proof.
Yeah, so it’s so existential, isn’t it?
I’m gonna have to think about that for a few days.
That’s a beautiful line.
So the step through to answer your question
about what’s it good for,
will go along the following lines.
First of all, you have to define what it means
to exist, which now, as you’ve rightly pointed out,
you have to define what probabilistic properties
must the states of something possess
so it knows where it finishes.
And then you write that down in terms
of statistical dependencies, again, sparsity.
Again, it’s not what’s connected or what’s correlated
or what depends upon, it’s what’s not correlated
and what doesn’t depend upon something.
Again, it comes down to the deep structures,
not in this instance, hierarchical,
but the structures that emerge
from removing connectivity and dependency.
And in this instance, basically being able
to identify the surface of the oil drop
from the water in which it is immersed.
And when you do that, you start to realize,
well, there are actually four kinds of states
in any given universe that contains anything.
The things that are internal to the surface,
the things that are external to the surface
and the surface in and of itself,
which is why I use a metaphor,
a little single celled organism
that has an interior and exterior
and then the surface of the cell.
And that’s mathematically a Markov blanket.
Just to pause, I’m in awe of this concept
that there’s the stuff outside the surface,
stuff inside the surface and the surface itself,
the Markov blanket.
It’s just the most beautiful kind of notion
about trying to explore what it means
to exist mathematically.
I apologize, it’s just a beautiful idea.
But it came out of California, so that’s.
I changed my mind.
I take it all back.
So anyway, so you were just talking
about the surface, about the Markov blanket.
So this surface or these blanket states
that are the, because they are now defined
in relation to these independencies
and what different states, internal, blanket,
or external states, can,
which ones can influence each other
and which cannot influence each other.
You can now apply standard results
that you would find in non equilibrium physics
or steady state or thermodynamics or hydrodynamics,
usually out of equilibrium solutions
and apply them to this partition.
And what it looks like is as if all the normal gradient flows
that you would associate with any non equilibrium system
apply in such a way that part of the Markov blanket
and the internal states seem to be hill climbing
or doing a gradient descent on the same quantity.
And that means that you can now describe
the very existence of this oil drop.
You can write down the existence of this oil drop
in terms of flows, dynamics, equations of motion,
where the blanket states or part of them,
we call them active states and the internal states
now seem to be and must be trying to look
as if they’re minimizing the same function,
which is the log probability of occupying these states.
Interesting thing is that what would they be called
if you were trying to describe these things?
So what we’re talking about are internal states,
external states and blanket states.
Now let’s carve the blanket states
into two: sensory states and active states.
Operationally, it has to be the case
that in order for this carving up
into different sets of states to exist,
the active states of the Markov blanket
cannot be influenced by the external states.
And we already know that the internal states
can’t be influenced by the external states
because the blanket separates them.
So what does that mean?
Well, it means the active states and the internal states
are now jointly not influenced by external states.
They only have autonomous dynamics.
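The partition just described can be sketched as a small table of who
directly influences whom. The particular pattern of arrows below is an
illustrative assumption, chosen only so that it respects the two
constraints stated here: external states do not directly influence
internal states, and they do not directly influence active states.

```python
# Which kinds of state directly influence which (an illustrative assumption).
INFLUENCES = {
    "external": {"sensory"},             # the world impresses itself on the sensory surface
    "sensory": {"internal", "active"},   # sensations drive the interior and the active surface
    "internal": {"active"},              # the interior drives the active surface
    "active": {"external"},              # action changes the world
}


def directly_influenced_by(target):
    """Return the set of state types with a direct arrow into `target`."""
    return {src for src, dsts in INFLUENCES.items() if target in dsts}


# Autonomous states: everything other than the external states themselves
# that receives no direct arrow from the external states.
autonomous = [
    name
    for name in INFLUENCES
    if name != "external" and "external" not in directly_influenced_by(name)
]

print("autonomous states:", autonomous)
```

Under these assumed arrows, the internal and active states come out as
the autonomous ones, and the only route between inside and outside runs
through the blanket.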
So now you’ve got a picture of an oil drop
that has autonomy, it has autonomous states,
it has autonomous states in the sense
that there must be some parts of the surface of the oil drop
that are not influenced by the external states
and all the interior.
And together, those two states endow
even a little oil drop with autonomous states
that look as if they are optimizing
their variational free energy or their negative ELBO,
their model evidence.
And that would be an interesting intellectual exercise.
And you could say, you could even go into the realms
of panpsychism, that everything that exists
is implicitly making inferences and self evidencing.
Now we make the next move, but what about living things?
I mean, so let me ask you,
what’s the difference between an oil drop
and a little tadpole or a little larva or a plankton?
The picture was just painted of an oil drop.
Just immediately in a matter of minutes
took me into the world of panpsychism,
where you’ve just convinced me,
made me feel like an oil drop is a living,
it’s certainly an autonomous system,
but almost a living system.
So it has sensory capabilities and acting capabilities
and it maintains something.
So what is the difference between that
and something that we traditionally
think of as a living system?
That it could die or it can’t,
I mean, yeah, mortality, I’m not exactly sure.
I’m not sure what the right answer there is
because they can move,
like movement seems like an essential element
to being able to act in the environment,
but the oil drop is doing that.
So I don’t know.
Is it?
The oil drop will be moved,
but does it in and of itself move autonomously?
Well, the surface is performing actions
that maintain its structure.
Yeah, you’re being too clever.
I was, I had in mind a passive little oil drop
that’s sitting there at the bottom
or the top of a glass of water.
Sure, I guess.
What I’m trying to say is you’re absolutely right.
You’ve nailed it.
It’s movement.
So where does that movement come from?
If it comes from the inside,
then you’ve got, I think, something that’s living.
What do you mean from the inside?
What I mean is that the internal states
that can influence the active states,
where the active states can influence,
but they’re not influenced by the external states,
can cause movement.
So there are two types of oil drops, if you like.
There are oil drops where the internal states
are so random that they average themselves away,
and the thing cannot, on average,
when you do the averaging, move.
So a nice example of that would be the Sun.
The Sun certainly has internal states.
There’s lots of intrinsic autonomous activity going on,
but because it’s not coordinated,
because it doesn’t have the deep, in the millennial sense,
the hierarchical structure that the brain does,
there is no overall mode or pattern or organization
that expresses itself on the surface
that allows it to actually swim.
It can certainly have a very active surface,
but en masse, at the scale of the actual surface of the Sun,
the average position of that surface cannot, in itself, move,
because the internal dynamics are more like a hot gas.
They are literally like a hot gas,
whereas your internal dynamics are much more structured
and deeply structured,
and now you can express on your active states
with your muscles and your secretory organs,
your autonomic nervous system and its effectors,
you can actually move, and that’s all you can do.
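The contrast between a hot gas and a structured interior can be put in
a toy calculation. This is a deliberately crude sketch, not a physical
model: it assumes each internal element contributes a push of plus or
minus one to the surface, and the element count and random seed are
arbitrary choices.

```python
import random


def mean_push(pushes):
    """Average push on the surface from all internal elements."""
    return sum(pushes) / len(pushes)


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 10_000               # number of internal elements (arbitrary)

# "Hot gas": every element pushes in a random direction, +1 or -1.
uncoordinated = [rng.choice([-1.0, 1.0]) for _ in range(n)]

# Structured interior: every element pushes the same way.
coordinated = [1.0 for _ in range(n)]

net_uncoordinated = mean_push(uncoordinated)
net_coordinated = mean_push(coordinated)

print(f"uncoordinated net push: {net_uncoordinated:+.3f}")
print(f"coordinated net push:   {net_coordinated:+.3f}")
```

The random pushes nearly cancel on average, while the coordinated ones
add up, which is the averaging argument in miniature.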
And that’s something which,
if you haven’t thought of it like this before,
I think it’s nice to just realize
there is no other way that you can change the universe
other than simply moving.
Whether that moving is articulating with my voice box
or walking around or squeezing juices
out of my secretory organs,
there’s only one way you can change the universe.
It’s moving.
And the fact that you do so nonrandomly makes you alive.
Yeah, so it’s that nonrandomness.
And that would be manifested,
realized, in terms of essentially swimming,
essentially moving, changing one’s shape,
a morphogenesis that is dynamic and possibly adaptive.
So that’s what I was trying to get at
between the difference between the oil drop
and the little tadpole.
The tadpole is moving around.
Its active states are actually changing the external states.
And there’s now a cycle,
an action perception cycle, if you like,
a recurrent dynamic that’s going on
that depends upon this deeply structured autonomous behavior
that rests upon internal dynamics
that are not only modeling
the data impressed upon their surface or the blanket states,
but they are actively resampling those data by moving.
They’re moving towards chemical gradients in chemotaxis.
So they’ve gone beyond just being good little models
of the kind of world they live in.
For example, an oil droplet could, in a panpsychic sense,
be construed as a little being
that has now perfectly inferred
it’s a passive, nonliving oil drop
living in a bowl of water.
No problem.
But to now equip that oil drop with the ability to go out
and test that hypothesis about different states of being,
so it can actually push its surface over there, over there,
and test for chemical gradients,
and then you start to move to a much more lifelike form.
This is all fun, theoretically interesting,
but it actually is quite important
in terms of reflecting what I have seen
since the turn of the millennium,
which is this move towards an enactive
and embodied understanding of intelligence.
And you say you’re from machine learning.
So what that means,
the central importance of movement,
I think has yet to really hit machine learning.
It certainly has now diffused itself throughout robotics.
And perhaps you could say certain problems in active vision
where you actually have to move the camera
to sample this and that.
But machine learning of the data mining deep learning sort
simply hasn’t contended with this issue.
What it’s done, instead of dealing with the movement problem
and the active sampling of data,
it’s just said, we don’t need to worry about,
we can see all the data because we’ve got big data.
So we can ignore movement.
So that for me is an important omission
in current machine learning.
The current machine learning is much more like the oil drop.
Yes.
But an oil drop that enjoys exposure
to nearly all the data that it will ever need to be exposed to,
as opposed to the tadpole swimming out
to find the right data.
For example, it likes food.
That’s a good hypothesis.
Let’s test it out.
Let’s go and move and ingest food, for example,
and see is that evidence that I’m the kind of thing
that likes this kind of food.
So the next natural question, and forgive this question,
but if we think of sort of even artificial intelligence
systems, for which you just painted a beautiful picture
of existence and life.
So do you ascribe, do you find within this framework
a possibility of defining consciousness
or exploring the idea of consciousness?
Like what, you know, self awareness
and expand it to consciousness?
Yeah.
How can we start to think about consciousness
within this framework?
Is it possible?
Well, yeah, I think it’s possible to think about it,
whether you’ll
get anywhere is another question.
And again, I’m not sure that I’m licensed
to answer that question.
I think you’d have to speak to a qualified philosopher
to get a definitive answer to that.
But certainly, there’s a lot of interest
in using not just these ideas, but related ideas
from information theory to try and tie down
the maths and the calculus and the geometry of consciousness,
either in terms of sort of a minimal consciousness,
even less than a minimal selfhood.
And what I’m talking about is the ability, effectively,
to plan, to have agency.
So you could argue that a virus does have a form of agency
in virtue of the way that it selectively
finds hosts and cells to live in and moves around.
But you wouldn’t endow it with the capacity
to think about planning and moving in a purposeful way
where it countenances the future.
Whereas you might an ant.
You might think an ant’s not quite as unconscious
as a virus.
It certainly seems to have a purpose.
It talks to its friends en route during its foraging.
It has a different kind of autonomy, which is biotic,
but beyond a virus.
So there’s something about, so there’s
some line that has to do with the complexity of planning
that may contain an answer.
I mean, it would be beautiful if we
can find a line beyond which we could say a being is conscious.
Yes, it will be.
These are wonderful lines that we’ve drawn with existence,
life, and consciousness.
Yes, it will be very nice.
One little wrinkle there, and this
is something I’ve only learned in the past few months,
is the philosophical notion of vagueness.
So you’re saying it would be wonderful to draw a line.
I had always assumed that that line at some point
would be drawn until about four months ago,
and a philosopher taught me about vagueness.
So I don’t know if you’ve come across this,
but it’s a technical concept.
And I think most revealingly illustrated with,
at what point does a pile of sand become a pile?
Is it one grain, two grains, three grains, or four grains?
So at what point would you draw the line
between being a pile of sand and a collection of grains of sand?
In the same way, is it right to ask,
where would I draw the line between conscious
and unconscious?
And it might be a vague concept.
Having said that, I agree with you entirely.
Systems that have the ability to plan.
So just technically, what that means
is your inferential self evidencing,
by which I simply mean the thermodynamics and gradient
flows that underwrite the preservation of your oil
droplet like form, can be described
as an optimization of log Bayesian model
evidence, your ELBO.
That self evidencing must be evidence
for a model of what’s causing the sensory impressions
on the sensory part of your surface or your Markov
blanket.
If that model is capable of planning,
it must include a model of the future consequences
of your active states or your action, just planning.
So we’re now in the game of planning as inference.
Now notice what we’ve made, though.
We’ve made quite a big move away from big data and machine
learning, because again, it’s the consequences of moving.
It’s the consequences of selecting those data or those
data or looking over there.
And that tells you immediately that even
to be a contender for a conscious artifact or a strong
AI or generalized, I don’t know what that’s called nowadays,
then you’ve got to have movement in the game.
And furthermore, you’ve got to have a generative model
of the sort you might find in, say, a variational
autoencoder that is thinking about the future conditioned
upon different courses of action.
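One way to picture a generative model that looks ahead under different
courses of action is the toy below. It is a simplified sketch under
stated assumptions, not the formulation used in the free energy
literature: the two policies, their predicted outcome distributions,
the preferred outcomes, and the scoring rule (a KL divergence between
predicted and preferred outcomes, turned into policy probabilities with
a softmax) are all hypothetical choices made for illustration.

```python
import math

# Hypothetical predicted outcome distributions over ("food", "no food")
# that the model expects under each course of action.
PREDICTED = {
    "swim_left": [0.8, 0.2],
    "swim_right": [0.3, 0.7],
}

# Hypothetical preferred outcomes: what the agent expects to sense
# if it is the kind of thing that likes food.
PREFERRED = [0.9, 0.1]


def kl(q, p):
    """KL divergence between two discrete distributions."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0.0)


def policy_posterior(predicted, preferred):
    """Softmax over negative divergence: policies whose consequences look preferred get more belief."""
    scores = {name: -kl(dist, preferred) for name, dist in predicted.items()}
    norm = sum(math.exp(s) for s in scores.values())
    return {name: math.exp(s) / norm for name, s in scores.items()}


beliefs = policy_posterior(PREDICTED, PREFERRED)
best = max(beliefs, key=beliefs.get)

for name, prob in beliefs.items():
    print(f"{name}: {prob:.3f}")
print("selected policy:", best)
```

The point of the sketch is only that selection among policies falls out
of inference over predicted futures, which is what planning as
inference refers to.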
Now that brings a number of things to the table, which
now you start to think, well, those
have got all the right ingredients
to talk about consciousness.
I’ve now got to select among a number of different courses
of action into the future as part of planning.
I’ve now got free will.
The act of selecting this course of action or that policy
or that policy or that action suddenly
makes me into an inference machine,
a self evidencing artifact that now
looks as if it’s selecting amongst different alternative
ways forward, as I actively swim here or swim there
or look over here, look over there.
So I think you’ve now got to a situation
if there is planning in the mix.
You’re now getting much closer to that line
if that line were ever to exist.
I don’t think it gets you quite as far as self aware, though.
And then you have to, I think, grapple with the question,
how would you formally write down a calculus or a maths
of self awareness?
I don’t think it’s impossible to do.
But I think there would be pressure on you
to actually commit to a formal definition of what
you mean by self awareness.
I think most people that I know would probably
say that a goldfish, a pet fish, was not self aware.
They would probably argue about their favorite cat,
but would be quite happy to say that their mom was self aware.
So.
I mean, but that might very well connect
to some level of complexity with planning.
It seems like self awareness is essential for complex planning.
Yeah.
Do you want to take that further?
Because I think you’re absolutely right.
Again, the line is unclear, but it
seems like integrating yourself into the world,
into your planning, is essential for constructing complex plans.
Yes.
Yeah.
So mathematically describing that in the same elegant way
as you have with the free energy principle might be difficult.
Well, yes and no.
I don’t think that, well, perhaps we should just,
can we just go back?
That’s a very important answer you gave.
And I think if I just unpacked it,
you’d see the truisms that you’ve just exposed for us.
But let me, sorry, I’m mindful that I didn’t answer
your question before.
Well, what’s the free energy principle good for?
Is it just a pretty theoretical exercise
to explain nonequilibrium steady states?
Yes, it is.
It does nothing more for you than that.
It can be regarded, it’s going to sound very arrogant,
but it is of the sort of theory of natural selection,
or a hypothesis of natural selection.
Beautiful, undeniably true, but tells you
absolutely nothing about why you have legs and eyes.
It tells you nothing about the actual phenotype,
and it wouldn’t allow you to build something.
So the free energy principle by itself
is as vacuous as most tautological theories.
And by tautological, of course,
I’m talking to the theory of natural,
the survival of the fittest.
What’s the fittest? Those that survive.
Why do they survive?
They’re the fittest.
It just goes around in circles.
In a sense, the free energy principle has that same
deflationary tautology under the hood.
It’s a characteristic of things that exist.
Why do they exist?
Because they minimize their free energy.
Why do they minimize their free energy?
Because they exist.
And you just keep on going round and round and round.
But the practical thing,
which you don’t get from natural selection,
but you could say has now manifest in things
like differential evolution or genetic algorithms
and MCMC, for example, in machine learning.
The practical thing you can get is,
if it looks as if things that exist
are trying to have density dynamics
and look as though they’re optimizing
a variational free energy,
and a variational free energy has to be
a functional of a generative model,
a probabilistic description of causes and consequences,
causes out there, consequences in the sensorium
on the sensory parts of the Markov blanket,
then it should, in theory, be possible
to write down the generative model,
work out the gradients,
and then cause it to autonomously self evidence.
So you should be able to write down oil droplets.
You should be able to create artifacts
where you have supplied the objective function
that supplies the gradients,
that supplies the self organizing dynamics
to non equilibrium steady state.
So there is actually a practical application
of the free energy principle
when you can write down your required evidence
in terms of, well, when you can write down
the generative model that is the thing
that has the evidence.
The probability of these sensory data
or this data, given that model,
is effectively the thing that the ELBO
or the variational free energy bounds or approximates.
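A worked form of the bound just described, as a sketch under assumed notation (none of these symbols appear in the conversation): o stands for the sensory data, s for the hidden causes, m for the generative model, and q(s) for an approximate belief over the causes.

```latex
\begin{aligned}
F[q] &= \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s \mid m)\big] \\
     &= D_{\mathrm{KL}}\big[q(s)\,\|\,p(s \mid o, m)\big] - \ln p(o \mid m) \\
     &\ge -\ln p(o \mid m)
\end{aligned}
```

Because the KL term is never negative, the variational free energy F is an upper bound on the negative log evidence. Equivalently, the evidence lower bound (ELBO), which is -F, is a lower bound on the log evidence ln p(o | m), with equality when q(s) matches the true posterior.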
That means that you can actually write down the model
and the kind of thing that you want to engineer,
the kind of AGI or artificial general intelligence
that you want to manifest probabilistically,
and then you engineer, a lot of hard work,
but you would engineer a robot and a computer
to perform a gradient descent on that objective function.
So it does have a practical implication.
Now, why am I wittering on about that?
It did seem relevant to, yes.
So what kinds of, so the answer to,
would it be easier or would it be hard?
Well, mathematically, it’s easy.
I’ve just told you all you need to do
is write down your perfect artifact,
probabilistically, in the form
of a probabilistic generative model,
a probability distribution over the causes
and consequences of the world
in which this thing is immersed.
And then you just engineer a computer and a robot
to perform a gradient descent on that objective function.
No problem.
But of course, the big problem
is writing down the generative model.
So that’s where the heavy lifting comes in.
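The recipe above (write down a generative model, work out the gradients, descend) can be illustrated with a deliberately tiny sketch. Everything here is an assumption chosen for illustration and not something from the conversation: a single hidden cause s with a Gaussian prior, an observation o equal to s plus Gaussian noise, a Gaussian belief q(s) with mean mu and variance v, and the function names and step size. For this toy model the free energy and its gradients can be written by hand.

```python
import math


def free_energy(mu, log_v, o, m0=0.0, prior_var=1.0, obs_var=1.0):
    """Variational free energy of q(s) = N(mu, v) under a toy Gaussian model.

    Assumed model: p(s) = N(m0, prior_var), p(o | s) = N(s, obs_var).
    F = E_q[-ln p(o | s)] + E_q[-ln p(s)] - H[q].
    """
    v = math.exp(log_v)
    exp_nll_obs = 0.5 * math.log(2 * math.pi * obs_var) + ((o - mu) ** 2 + v) / (2 * obs_var)
    exp_nll_prior = 0.5 * math.log(2 * math.pi * prior_var) + ((mu - m0) ** 2 + v) / (2 * prior_var)
    entropy = 0.5 * math.log(2 * math.pi * math.e * v)
    return exp_nll_obs + exp_nll_prior - entropy


def minimize_free_energy(o, m0=0.0, prior_var=1.0, obs_var=1.0, lr=0.1, steps=500):
    """Plain gradient descent on F with respect to mu and log(v)."""
    mu, log_v = 0.0, 0.0  # arbitrary starting belief
    for _ in range(steps):
        v = math.exp(log_v)
        # dF/dmu: prediction error on the data plus deviation from the prior
        grad_mu = (mu - o) / obs_var + (mu - m0) / prior_var
        # dF/dlog(v) = v * dF/dv; the log parameterization keeps v positive
        grad_log_v = 0.5 * v * (1 / obs_var + 1 / prior_var) - 0.5
        mu -= lr * grad_mu
        log_v -= lr * grad_log_v
    return mu, math.exp(log_v)


def negative_log_evidence(o, m0=0.0, prior_var=1.0, obs_var=1.0):
    """Exact -ln p(o) for the same model, where p(o) = N(m0, prior_var + obs_var)."""
    total_var = prior_var + obs_var
    return 0.5 * math.log(2 * math.pi * total_var) + (o - m0) ** 2 / (2 * total_var)


if __name__ == "__main__":
    o = 2.0
    mu, v = minimize_free_energy(o)
    print("belief mean and variance:", mu, v)
    print("free energy at the minimum:", free_energy(mu, math.log(v), o))
    print("negative log evidence:", negative_log_evidence(o))
```

In this toy case the belief is flexible enough to match the exact posterior, so the minimized free energy coincides with the negative log evidence. With a richer generative model the bound would generally stay loose, and writing that model down is the hard part.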
So it’s the form and the structure of that generative model
which basically defines the artifact that you will create
or, indeed, the kind of artifact that has self awareness.
So that’s where all the hard work comes,
very much like natural selection doesn’t tell you
in the slightest why you have eyes.
So you have to drill down on the actual phenotype,
the actual generative model.
So with that in mind, what did you tell me
that tells me immediately the kinds of generative models
I would have to write down in order to have self awareness?
What you said to me was I have to have a model
that is effectively fit for purpose
for this kind of world in which I operate.
And if I now make the observation
that this kind of world is effectively largely populated
by other things like me, i.e. you,
then it makes enormous sense
that if I can develop a hypothesis
that we are similar kinds of creatures,
in fact, the same kind of creature,
but I am me and you are you,
then it becomes, again, mandated to have a sense of self.
So if I live in a world
that is constituted by things like me,
basically a social world, a community,
then it becomes necessary now for me to infer
that it’s me talking and not you talking.
I wouldn’t need that if I was on Mars by myself
or if I was in the jungle as a feral child.
If there was nothing like me around,
there would be no need to have an inference
or a hypothesis, oh yes, it is me
that is experiencing or causing these sounds
and it is not you.
It’s only when there’s ambiguity in play
induced by the fact that there are others in that world.
So I think that the special thing about self aware artifacts
is that they have learned to, or they have acquired,
or at least are equipped with, possibly by evolution,
generative models that allow for the fact
there are lots of copies of things like them around,
and therefore they have to work out it’s you and not me.
That’s brilliant.
I’ve never thought of that.
I never thought of that, that the purpose,
or really the usefulness, of consciousness
or self awareness in the context of planning
existing in the world is so you can operate
with other things like you, and like you could,
it doesn’t have to necessarily be human.
It could be other kind of similar creatures.
Absolutely, well, we imbue a lot of our attributes
into our pets, don’t we?
Or we try to make our robots humanoid.
And I think there’s a deep reason for that,
that it’s just much easier to read the world
if you can make the simplifying assumption
that basically you’re me, and it’s just your turn to talk.
I mean, when we talk about planning,
when you talk specifically about planning,
the highest, if you like, manifestation or realization
of that planning is what we’re doing now.
I mean, the human condition doesn’t get any higher
than this talking about the philosophy of existence
and the conversation.
But in that conversation, there is a beautiful art
of turn taking and mutual inference, theory of mind.
I have to know when you wanna listen.
I have to know when you want to interrupt.
I have to make sure that you’re online.
I have to have a model in my head
of your model in your head.
That’s the highest, the most sophisticated form
of generative model, where the generative model
actually has a generative model
of somebody else’s generative model.
And I think that, and what we are doing now evinces
the kinds of generative models
that would support self awareness,
because without that, we’d both be talking over each other,
or we’d be singing together in a choir.
That’s not a brilliant analogy for what I’m trying to say,
but yeah, we wouldn’t have this discourse.
We wouldn’t have this.
Yeah, the dance of it.
Yeah, that’s right.
As I interrupt, I mean, that’s beautifully put.
I’ll re listen to this conversation many times.
There’s so much poetry in this, and mathematics.
Let me ask the silliest, or perhaps the biggest question
as a last kind of question.
We’ve talked about living in existence
and the objective function under which
these objects would operate.
What do you think is the objective function
of our existence?
What’s the meaning of life?
What do you think is the, for you, perhaps,
the purpose, the source of fulfillment,
the source of meaning for your existence,
as one blob in this soup?
I’m tempted to answer that, again, as a physicist,
and tell you it’s the free energy I expect
consequent upon my behavior.
So technically, we could get a really interesting
conversation about what that comprises
in terms of searching for information,
resolving uncertainty about the kind of thing that I am.
But I suspect that you want a slightly more personal
and fun answer, but which can be consistent with that.
And I think it’s reassuringly simple
and harks back to what you were taught as a child,
that you have certain beliefs about the kind of creature
and the kind of person you are.
And all that self evidencing means,
all that minimizing variational free energy
in an enactive and embodied way,
means is fulfilling the beliefs about
what kind of thing you are.
And of course, we’re all given those scripts,
those narratives, at a very early age,
usually in the form of bedtime stories or fairy stories
that I’m a princess and I’m gonna meet a beast
who’s gonna transform and he’s gonna be a prince.
And so the narratives are all around you
from your parents to your friends
to the society that feeds these stories.
And then your objective function is to fulfill.
Exactly, that narrative that has been encultured
by your immediate family, but as you say,
also the sort of the culture in which you grew up
and you create for yourself.
I mean, again, because of this active inference,
this enactive aspect of self evidencing,
not only am I modeling my environment,
my eco niche, my external states out there,
but I’m actively changing them all the time
and doing the same back, we’re doing it together.
So there’s a synchrony that means that I’m creating
my own culture over different timescales.
So the question now is for me being very selfish,
what scripts was I given?
It basically was a mixture between Einstein and Sherlock Holmes.
So I smoke as heavily as possible,
try to avoid too much interpersonal contact,
enjoy the fantasy that you’re a popular scientist
who’s gonna make a difference in a slightly quirky way.
So that’s what I grew up on.
My father was an engineer and loved science
and he loved sort of things like Sir Arthur Eddington’s
Space, Time and Gravitation, which was the first
understandable version of general relativity.
So all the fairy stories I was told as I was growing up
were all about these characters.
I’m keeping the Hobbit out of this
because that doesn’t quite fit my narrative.
There’s a journey of exploration, I suppose, of sorts.
So yeah, I’ve just grown up to be what I imagine
a mild mannered Sherlock Holmes slash Albert Einstein
would do in my shoes.
And you did it elegantly and beautifully.
Carl, it was a huge honor talking today, it was fun.
Thank you so much for your time.
No, thank you. Appreciate it.
Thank you for listening to this conversation
with Carl Friston and thank you
to our presenting sponsor, Cash App.
Please consider supporting the podcast
by downloading Cash App and using code LexPodcast.
If you enjoy this podcast, subscribe on YouTube,
review it with five stars on Apple Podcast,
support on Patreon, or simply connect with me on Twitter
at LexFridman.
And now let me leave you with some words from Carl Friston.
Your arm moves because you predict it will
and your motor system seeks to minimize prediction error.
Thank you for listening and hope to see you next time.