Lex Fridman Podcast - #99 - Karl Friston: Neuroscience and the Free Energy Principle

The following is a conversation with Karl Friston,

one of the greatest neuroscientists in history.

Cited over 245,000 times,

known for many influential ideas in brain imaging,

neuroscience, and theoretical neurobiology,

including especially the fascinating idea

of the free energy principle for action and perception.

Karl’s mix of humor, brilliance, and kindness,

to me, are inspiring and captivating.

This was a huge honor and a pleasure.

This is the Artificial Intelligence Podcast.

If you enjoy it, subscribe on YouTube,

review it with five stars on Apple Podcast,

support it on Patreon,

or simply connect with me on Twitter,

at Lex Fridman, spelled F R I D M A N.

As usual, I’ll do a few minutes of ads now,

and never any ads in the middle

that can break the flow of the conversation.

I hope that works for you,

and doesn’t hurt the listening experience.

This show is presented by Cash App,

the number one finance app in the App Store.

When you get it, use code LEXPODCAST.

Cash App lets you send money to friends, buy Bitcoin,

and invest in the stock market with as little as $1.

Since Cash App allows you to send

and receive money digitally,

let me mention a surprising fact related to physical money.

Of all the currency in the world,

roughly 8% of it is actual physical money.

The other 92% of money only exists digitally.

So again, if you get Cash App from the App Store,

or Google Play, and use the code LEXPODCAST, you get $10,

and Cash App will also donate $10 to FIRST,

an organization that is helping to advance robotics

and STEM education for young people around the world.

And now, here’s my conversation with Karl Friston.

How much of the human brain do we understand

from the low level of neuronal communication

to the functional level to the highest level,

maybe the psychiatric disorder level?

Well, we’re certainly in a better position

than we were last century.

How far we’ve got to go, I think,

is almost an unanswerable question.

So you’d have to set the parameters,

you know, what constitutes understanding, what level

of understanding do you want?

I think we’ve made enormous progress

in terms of broad brush principles.

Whether that affords a detailed cartography

of the functional anatomy of the brain and what it does,

right down to the microcircuitry and the neurons,

that’s probably out of reach at the present time.

So the cartography, so mapping the brain,

do you think mapping of the brain,

the detailed, perfect imaging of it,

does that get us closer to understanding

of the mind, of the brain?

So how far does it get us if we have

that perfect cartography of the brain?

I think there are lower bounds on that.

It’s a really interesting question.

And it would determine the sort of scientific career

you’d pursue if you believe that knowing

every dendritic connection, every sort of microscopic,

synaptic structure right down to the molecular level

was gonna give you the right kind of information

to understand the computational anatomy,

then you’d choose to be a microscopist

and you would study little cubic millimeters of brain

for the rest of your life.

If on the other hand you were interested

in holistic functions and a sort of functional anatomy

of the sort that a neuropsychologist would understand,

you’d study brain lesions and strokes,

just looking at the whole person.

So again, it comes back to at what level

do you want understanding?

I think there are principled reasons not to go too far.

If you commit to a view of the brain

as a machine that’s performing a form of inference

and representing things, that level of understanding

is necessarily cast in terms of probability densities

and ensemble densities, distributions.

And what that tells you is that you don’t really want

to look at the atoms to understand the thermodynamics

of probabilistic descriptions of how the brain works.

So I personally wouldn’t look at the molecules

or indeed the single neurons in the same way

if I wanted to understand the thermodynamics

of some non equilibrium steady state of a gas

or an active material, I wouldn’t spend my life

looking at the individual molecules

that constitute that ensemble.

I’d look at their collective behavior.

On the other hand, if you go too coarse grain,

you’re gonna miss some basic canonical principles

of connectivity and architectures.

I’m thinking here, and this is a bit colloquial,

but this current excitement about high field

magnetic resonance imaging at seven Tesla, why?

Well, it gives us for the first time the opportunity

to look at the brain in action at the level

of a few millimeters that distinguish

between different layers of the cortex

that may be very important in terms of evincing

generic principles of canonical microcircuitry

that are replicated throughout the brain

that may tell us something fundamental

about message passing in the brain

and these density dynamics or neuronal

ensemble or population dynamics

that underwrite our brain function.

So somewhere between a millimeter and a meter.

Lingering for a bit on the big questions if you allow me,

what to you is the most beautiful or surprising characteristic

of the human brain?

I think its hierarchical and recursive aspect.

Its recurrent aspect.

Of the structure or of the actual

representational power of the brain?

Well, I think one speaks to the other.

I was actually answering in a dull minded way

from the point of view of purely its anatomy

and its structural aspects.

I mean, there are many marvelous organs in the body.

Let’s take your liver for example.

Without it, you wouldn’t be around for very long

and it does some beautiful and delicate biochemistry

and homeostasis and evolved with a finesse

that would easily parallel the brain

but it doesn’t have a beautiful anatomy.

It has a simple anatomy which is attractive

in a minimalist sense but it doesn’t have

that crafted structure of sparse connectivity

and that recurrence and that specialization

that the brain has.

So you said a lot of interesting terms here.

So the recurrence, the sparsity,

but you also started by saying hierarchical.

So I’ve never thought of our brain as hierarchical.

Sort of I always thought it’s just like a giant mess,

interconnected mess where it’s very difficult

to figure anything out.

But in what sense do you see the brain as hierarchical?

Well, I see it, it’s not a magic soup.

Which of course is what I used to think

before I studied medicine and the like.

So a lot of those terms imply each other.

So hierarchies, if you just think about

the nature of a hierarchy,

how would you actually build one?

And what you would have to do is basically

carefully remove the right connections

that destroy the completely connected soups

that you might have in mind.

So a hierarchy is in and of itself defined

by a sparse and particular connectivity structure.

I’m not committing to any particular form of hierarchy.

But your sense is there is some.

Oh, absolutely, yeah.

In virtue of the fact that there is a sparsity

of connectivity, not necessarily of a qualitative sort,

but certainly of a quantitative sort.

So it is demonstrably so that the further apart

two parts of the brain are,

the less likely they are to be wired,

to possess axonal processes, neuronal processes

that directly communicate one message

or messages from one part of that brain

to the other part of the brain.

So we know there’s a sparse connectivity.

And furthermore, on the basis of anatomical connectivity

and tracer studies, we know that that sparsity

underwrites a hierarchical and very structured

sort of connectivity that might be best understood

as a little bit like an onion.

There is a concentric, sometimes referred to as centripetal

by people like Marsel Mesulam,

hierarchical organization to the brain.

So you can think of the brain as in a rough sense,

like an onion, and all the sensory information

and all the efferent outgoing messages

that supply commands to your muscles

or to your secretory organs come from the surface.

So there’s a massive exchange interface

with the world out there on the surface.

And then underneath, there’s a little layer

that sits and looks at the exchange on the surface.

And then underneath that, there’s a layer

right the way down to the very center,

to the deepest part of the onion.

That’s what I mean by a hierarchical organization.

There’s a discernible structure defined

by the sparsity of connections

that lends the architecture a hierarchical structure

that tells one a lot about the kinds of representations

and messages.

Coming back to your earlier question,

is this about the representational capacity

or is it about the anatomy?

Well, one underwrites the other.

If one simply thinks of the brain

as a message passing machine,

a process that is in the service of doing something,

then the circuitry and the connectivity

that shape that message passing also dictate its function.

So you’ve done a lot of amazing work

in a lot of directions.

So let’s look at one aspect of that,

of looking into the brain

and trying to study this onion structure.

What can we learn about the brain by imaging it?

Which is one way to sort of look at the anatomy of it.

Broadly speaking, what are the methods of imaging,

but even bigger, what can we learn about it?

Right, so well, most human neuroimaging

that you might see in science journals

that speaks to the way the brain works,

measures brain activity over time.

So that’s the first thing to say,

that we’re effectively looking at fluctuations

in neuronal responses,

usually in response to some sensory input

or some instruction, some task.

Not necessarily, there’s a lot of interest

in just looking at the brain

in terms of resting state, endogenous,

or intrinsic activity.

But crucially, at every point,

looking at these fluctuations,

either induced or intrinsic in the neural activity,

and understanding them at two levels.

So normally, people would have recourse

to two principles of brain organization

that are complementary.

One, functional specialization or segregation.

So what does that mean?

It simply means that there are certain parts of the brain

that may be specialized for certain kinds of processing.

For example, visual motion,

our ability to recognize or to perceive movement

in the visual world.

And furthermore, that specialized processing

may be spatially or anatomically segregated,

leading to functional segregation.

Which means that if I were to compare your brain activity

during a period of viewing a static image,

and then compare that to the responses or fluctuations

in the brain when you were exposed to a moving image,

say a flying bird,

we’d expect to see

restricted, segregated differences in activity.

And those are basically the hotspots

that you see in the statistical parametric maps

that test for the significance of the responses

that are circumscribed.

So now, basically, we’re talking about

what some people have perhaps unkindly called a neo-cartography.

This is a phrenology augmented by modern day neuroimaging,

basically finding blobs or bumps on the brain

that do this or do that,

and trying to understand the cartography

of that functional specialization.

So how much is there such,

this is such a beautiful sort of ideal to strive for.

We humans, scientists, would like this,

to hope that there’s a beautiful structure to this

where it’s, like you said, there’s segregated regions

that are responsible for the different function.

How much hope is there to find such regions

in terms of looking at the progress of studying the brain?

Oh, I think enormous progress has been made

in the past 20 or 30 years.

So this is beyond incremental.

At the advent of brain imaging,

the very notion of functional segregation

was just a hypothesis based upon a century,

if not more, of careful neuropsychology,

looking at people who had lost via insult

or traumatic brain injury particular parts of the brain,

and then saying, well, they can’t do this

or they can’t do that.

For example, losing the visual cortex

and not being able to see,

or losing particular parts of the visual cortex

or regions known as V5

or the middle temporal region, MT,

and noticing that they selectively

could not see moving things.

And so that created the hypothesis

that perhaps visual movement processing

was located in this functionally segregated area.

And you could then go and put invasive electrodes

in animal models and say, yes, indeed,

we can excite activity here.

We can form receptive fields that are sensitive to

or defined in terms of visual motion.

But at no point could you exclude the possibility

that everywhere else in the brain

was also very interested in visual motion.

By the way, I apologize to interrupt,

but a tiny little tangent.

You said animal models, just out of curiosity,

from your perspective, how different is the human brain

versus the other animals

in terms of our ability to study the brain?

Well, clearly, the further away you go from a human brain,

the greater the differences,

but not as remarkable as you might think.

So people will choose their level of approximation

to the human brain,

depending upon the kinds of questions

that they want to answer.

So if you’re talking about sort of canonical principles

of microcircuitry, it might be perfectly okay

to look at a mouse, indeed.

You could even look at flies, worms.

If, on the other hand, you wanted to look at

the finer details of organization of visual cortex

and V1, V2, these are designated patches of cortex

that may do different things, indeed, do.

You’d probably want to use a primate

that looked a little bit more like a human,

but there are lots of ethical issues

in terms of the use of nonhuman primates

to answer questions about human anatomy.

But I think most people assume

that most of the important principles are conserved

in a continuous way, right from, well, yes,

worms right through to you and me.

So now returning to, so that was the early sort of ideas

of studying the functional regions of the brain

by if there’s some damage to it,

to try to infer that that part of the brain

might be somewhat responsible for this type of function.

So where does that lead us?

What are the next steps beyond that?

Right, well, I’ll just actually just reverse a bit,

come back to your sort of notion

that the brain is a magic soup.

That was actually a very prominent idea at one point,

notions such as Lashley’s law of mass action

inherited from the observation that for certain animals,

if you just took out spoonfuls of the brain,

it didn’t matter where you took these spoonfuls out,

they always showed the same kinds of deficits.

So it was very difficult to infer functional specialization

purely on the basis of lesion deficit studies.

But once we had the opportunity

to look at the brain lighting up

and literally its sort of excitement, neuronal excitement

when looking at this versus that,

one was able to say, yes, indeed,

these functionally specialized responses

are very restricted and they’re here or they’re over there.

If I do this, then this part of the brain lights up.

And that became doable in the early 90s.

In fact, shortly before with the advent

of positron emission tomography.

And then functional magnetic resonance imaging

came along in the early 90s.

And since that time, there has been an explosion

of discovery, refinement, confirmation.

There are people who believe that it’s all in the anatomy.

If you understand the anatomy,

then you understand the function at some level.

And many, many hypotheses were predicated

on a deep understanding of the anatomy and the connectivity,

but they were all confirmed

and taken much further with neuroimaging.

So that’s what I meant by we’ve made an enormous amount

of progress in this century indeed,

and in relation to the previous century,

by looking at these functionally selective responses.

But that wasn’t the whole story.

So there’s this sort of neo-phrenology,

but finding bumps and hot spots in the brain

that did this or that.

The bigger question was, of course,

the functional integration.

How all of these regionally specific responses

were orchestrated, how they were distributed,

how did they relate to distributed processing

and indeed representations in the brain.

So then you turn to the more challenging issue

of the integration, the connectivity.

And then we come back to this beautiful,

sparse, recurrent, hierarchical connectivity

that seems characteristic of the brain

and probably not many other organs.

But nevertheless, we come back to this challenge

of trying to figure out how everything is integrated.

But what’s your feeling?

What’s the general consensus?

Have we moved away from the magic soup view of the brain?

So there is a deep structure to it.

And then maybe a further question.

You said some people believe that the structure

is most of it, that you can really get

at the core of the function

by just deeply understanding the structure.

Where do you sit on that, do you?

I think it’s got some mileage to it, yes, yeah.

So it’s a worthy pursuit of going,

of studying through imaging and all the different methods

to actually study the structure.

No, absolutely, yeah, yeah.

Sorry, I’m just noting, you were accusing me

of using lots of long words

and then you introduced one there, which is deep,

which is interesting.

Because deep is the sort of millennial equivalent

of hierarchical.

So if you’ve put deep in front of anything,

not only are you very millennial and very trending,

but you’re also implying a hierarchical architecture.

So it is a depth, which is, for me, the beautiful thing.

That’s right, the word deep kind of,

yeah, exactly, it implies hierarchy.

I didn’t even think about that.

That indeed, the implicit meaning

of the word deep is hierarchy.

Yep. Yeah.

So deep inside the onion is the center of your soul.

Beautifully put.

Maybe briefly, if you could paint a picture

of the kind of methods of neuroimaging,

maybe the history which you were a part of,

from statistical parametric mapping.

I mean, just what’s out there that’s interesting

for people maybe outside the field

to understand of what are the actual methodologies

of looking inside the human brain?

Right, well, you can answer that question

from two perspectives.

Basically, it’s the modality.

What kind of signal are you measuring?

And they can range from,

and let’s limit ourselves

to sort of imaging based noninvasive techniques.

So you’ve essentially got brain scanners,

and brain scanners can either measure

the structural attributes, the amount of water,

the amount of fat, or the amount of iron

in different parts of the brain,

and you can make lots of inferences

about the structure of the organ of the sort

that you might have produced from an X ray,

but a very nuanced X ray that is looking

at this kind of property or that kind of property.

So looking at the anatomy noninvasively

would be the first sort of neuroimaging

that people might want to employ.

Then you move on to the kinds of measurements

that reflect dynamic function,

and the most prevalent of those fall into two camps.

You’ve got these metabolic, sometimes hemodynamic,

blood related signals.

So these metabolic and or hemodynamic signals

are basically proxies for elevated activity

and message passing and neuronal dynamics

in particular parts of the brain.

Characteristically though, the time constants

of these hemodynamic or metabolic responses

to neural activity are much longer

than the neural activity itself.
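
One way to picture that lag is to smear a brief burst of neural activity with a slow response curve. The Python sketch below does this with a plain convolution. The curve shape and its 5 second time constant are illustrative assumptions chosen only to give a delay of a few seconds; they are not a validated model of the hemodynamic response, and all names are hypothetical.

```python
import math

def convolve(signal, kernel):
    """Discrete convolution of two lists (full length output)."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

dt = 0.1            # seconds per sample
burst_index = 10    # the neural burst happens 1 second into the recording

# Neural activity: a single brief burst, one sample long.
neural = [0.0] * 300
neural[burst_index] = 1.0

# Assumed slow response curve t * exp(-t / tau), which peaks at t = tau.
tau = 5.0           # hypothetical time constant in seconds
kernel = [(k * dt) * math.exp(-(k * dt) / tau) for k in range(250)]

# The measured blood related signal is the burst smeared by the slow curve.
measured = convolve(neural, kernel)
delay_samples = measured.index(max(measured)) - burst_index

print(f"peak lags the burst by about {delay_samples * dt:.1f} s")
```

The burst lasts a tenth of a second, but the simulated signal peaks seconds later, which is the sense in which the time constants of the measured response are much longer than the neural activity itself.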

And this is referring,

forgive me for the dumb questions,

but this would be referring to blood,

like the flow of blood.

Absolutely, absolutely.

So there’s a ton of,

it seems like there’s a ton of blood vessels in the brain.

Yeah.

So what’s the interaction between the flow of blood

and the function of the neurons?

Is there an interplay there or?

Yup, yup, and that interplay accounts for several careers

of world renowned scientists, yes, absolutely.

So this is known as neurovascular coupling,

is exactly what you said.

It’s how does the neural activity,

the neuronal infrastructure, the actual message passing

that we think underlies our capacity to perceive and act,

how is that coupled to the vascular responses

that supply the energy for that neural processing?

So there’s a delicate web of large vessels,

arteries and veins, that gets progressively finer

and finer in detail until it perfuses

at a microscopic level,

the machinery where little neurons lie.

So coming back to this sort of onion perspective,

we were talking before using the onion as a metaphor

for a deep hierarchical structure,

but also I think it’s just anatomically quite

a useful metaphor.

All the action, all the heavy lifting

in terms of neural computation is done

on the surface of the brain,

and then the interior of the brain is constituted

by fatty wires, essentially, axonal processes

that are enshrouded by myelin sheaths.

And these, when you dissect them, they look fatty and white,

and so it’s called white matter,

as opposed to the actual neuropil,

which does the computation constituted largely by neurons,

and that’s known as gray matter.

So the gray matter is a surface or a skin

that sits on top of this big ball,

now we are talking magic soup,

but a big ball of connections like spaghetti,

very carefully structured with sparse connectivity

that preserves this deep hierarchical structure,

but all the action takes place on the surface,

on the cortex of the onion, and that means

that you have to supply the right amount of blood flow,

the right amount of nutrient,

which is rapidly absorbed and used by neural cells

that don’t have the same capacity

that your leg muscles would have

to basically spend their energy budget

and then claim it back later.

So one peculiar thing about cerebral metabolism,

brain metabolism, is it really needs to be driven

in the moment, which means you basically

have to turn on the taps.

So if there’s lots of neural activity

in one part of the brain, a little patch

of a few millimeters, even less possibly,

you really do have to water that piece

of the garden now and quickly,

and by quickly I mean within a couple of seconds.

So that contains a lot of, hence the imaging

could tell you a story of what’s happening.

Absolutely, but it is slightly compromised

in terms of the resolution.

So the deployment of these little microvessels

that water the garden to enable the neural activity

to play out, the spatial resolution

is of the order of a few millimeters,

and crucially, the temporal resolution

is of the order of a few seconds.

So you can’t get right down and dirty

into the actual spatial and temporal scale

of neural activity in and of itself.

To do that, you’d have to turn

to the other big imaging modality,

which is the recording of electromagnetic signals

as they’re generated in real time.

So here, the temporal bandwidth, if you like,

or the lower limit on the temporal resolution

is incredibly small, talking about milliseconds.

And then you can get into the phasic fast responses

that is in and of itself the neural activity,

and start to see the succession or cascade

of hierarchical recurrent message passing

evoked by a particular stimulus.

But the problem is you’re looking

at electromagnetic signals that have passed

through an enormous amount of magic soup

or spaghetti of connectivity, and through the scalp

and the skull, and it’s become spatially very diffuse.

So it’s very difficult to know where you are.

So you’ve got this sort of catch 22.

You can either use an imaging modality

that tells you within millimeters

which part of the brain is activated,

but you don’t know when,

or you’ve got these electromagnetic EEG, MEG setups

that tell you to within a few milliseconds

when something has responded, but you don’t know where.

So you’ve got these two complementary measures,

either indirect via the blood flow,

or direct via the electromagnetic signals

caused by neural activity.

These are the two big imaging devices.

And then the second level of responding to your question,

what are the, from the outside,

what are the big ways of using this technology?

So once you’ve chosen the kind of neural imaging

that you want to use to answer your set questions,

and sometimes it would have to be both,

then you’ve got a whole raft of analyses,

time series analyses usually, that you can bring to bear

in order to answer your questions

or address your hypothesis about those data.

And interestingly, they both fall

into the same two camps we were talking about before,

this dialectic between specialization and integration,

differentiation and integration.

So it’s the cartography, the blobology analyses.

I apologize, I probably shouldn’t interrupt so much,

but just heard a fun word, the blah.

Blobology.

It’s a neologism, which means the study of blobs.

So nothing bob.

Are you being witty and humorous,

or does the word blobology ever appear

in a textbook somewhere?

It would appear in a popular book.

It would not appear in a worthy specialist journal.

Yeah, I thought so.

It’s the fond word for the study of literally little blobs

on brain maps showing activations.

So the kind of thing that you’d see in the newspapers

on ABC or BBC reporting the latest finding

from brain imaging.

Interestingly though, the maths involved

in that stream of analysis does actually call upon

the mathematics of blobs.

So seriously, they’re actually called Euler characteristics

and they have a lot of fancy names in mathematics.

We’ll talk about it, about your ideas

in free energy principle.

I mean, there’s echoes of blobs there

when you consider sort of entities,

mathematically speaking.

Yes, absolutely.

Well, circumscribed, well defined,

you know, entities of, well, from the free energy point of view,

entities of anything, but from the point of view

of the analysis, the cartography of the brain,

these are the entities that constitute the evidence

for this functional segregation.

You have segregated this function in this blob

and it is not outside of the blob.

And that’s basically the, if you were a map maker

of America and you did not know its structure,

the first thing you would be doing in constituting

or creating a map would be to identify the cities,

for example, or the mountains or the rivers.

All of these uniquely spatially localizable features,

possibly topological features have to be placed somewhere

because that requires a mathematics of identifying

what does a city look like on a satellite image

or what does a river look like

or what does a mountain look like?

What would it, you know, what data features

would evidence that particular top,

you know, that particular thing

that you wanted to put on the map?

And they normally are characterized

in terms of literally these blobs

or these sort of, another way of looking at this

is that a certain statistical measure

of the degree of activation crosses a threshold

and in crossing that threshold

in the spatially restricted part of the brain,

it creates a blob.

And that’s basically what statistical parametric mapping does.

It’s basically mathematically finessed blobology.
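
The thresholding idea described here can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: the map values are made up, the threshold of 3.0 is arbitrary, and the function name is hypothetical. Statistical parametric mapping itself chooses its thresholds with far more careful statistics, which this sketch does not attempt.

```python
from collections import deque

def find_blobs(stat_map, threshold):
    """Group cells whose statistic exceeds the threshold into blobs.

    Two cells belong to the same blob when they share an edge.
    Returns a list of blobs, each a list of (row, col) positions.
    """
    rows, cols = len(stat_map), len(stat_map[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or stat_map[r][c] <= threshold:
                continue
            # Breadth-first flood fill over edge-sharing neighbours.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and stat_map[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs

# Made-up statistic values on a tiny 2D slice (not real data).
toy_map = [
    [0.1, 0.2, 0.1, 0.0, 0.3],
    [0.2, 3.5, 3.9, 0.1, 0.2],
    [0.1, 3.2, 0.4, 0.2, 4.1],
    [0.0, 0.3, 0.2, 0.1, 3.8],
]
blobs = find_blobs(toy_map, threshold=3.0)
print(len(blobs), [len(b) for b in blobs])  # prints: 2 [3, 2]
```

Everything below the threshold is ignored, and what survives falls into two spatially separate clusters, which is all a blob is in this picture.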

Okay, so those are the,

you kind of described these two methodologies for,

one is temporally noisy, one is spatially noisy

and you kind of have to play and figure out

what can be useful.

It’d be great if you can sort of comment.

I got a chance recently to spend a day

at a company called Neuralink

that uses brain computer interfaces

and their dream is to, well,

there’s a bunch of sort of dreams,

but one of them is to understand the brain

by sort of, you know, getting in there

past the so called sort of factory wall,

getting in there and be able to listen,

communicate both directions.

What are your thoughts about this,

the future of this kind of technology

of brain computer interfaces

to be able to now have a window

or direct contact within the brain

to be able to measure some of the signals,

to be able to sense signals,

to understand some of the functionality of the brain?

Ambivalent, my sense is ambivalent.

So it’s a mixture of good and bad

and I acknowledge that freely.

So the good bits, if you just look at the legacy

of that kind of reciprocal but invasive

brain stimulation,

I didn’t paint a complete picture

when I was talking about sort of the ways

we understand the brain prior to neuroimaging.

It wasn’t just lesion deficit studies.

Some of the early work, in fact,

literally 100 years from where we’re sitting

at the Institute of Neurology,

was done by stimulating the brain of say dogs

and looking at how they responded

with their muscles or with their salivation

and imputing what that part of the brain must be doing.

If I stimulate it and I evoke this kind of response,

then that tells me quite a lot

about the functional specialization.

So there’s a long history of brain stimulation

which continues to enjoy a lot of attention nowadays.

Positive attention.

Oh yes, absolutely.

You know, deep brain stimulation for Parkinson’s disease

is now a standard treatment

and also a wonderful vehicle

to try and understand the neuronal dynamics

underlying movement disorders like Parkinson’s disease.

Even interest in magnetic stimulation,

stimulating with magnetic fields,

and will it work in people who are depressed, for example.

Quite a crude level of understanding what you’re doing,

but there is historical evidence

that these kinds of brute force interventions

do change things.

They, you know, it’s a little bit like banging the TV

when the valves aren’t working properly,

but it’s still, it works.

So, you know, there is a long history.

Brain computer interfacing or BCI,

I think is a beautiful example of that.

It’s sort of carved out its own niche

and its own aspirations

and there’ve been enormous advances within limits.

Advances in terms of our ability to understand

how the brain, the embodied brain,

engages with the world.

I’m thinking here of sensory substitution,

augmenting our sensory capacities

by giving ourselves extra ways of sensing

and sampling the world,

ranging from sort of trying to replace lost visual signals

through to giving people completely new signals.

So, one of the, I think, most engaging examples of this

is equipping people with a sense of magnetic fields.

So you can actually give them magnetic sensors

that enable them to feel,

should we say, tactile pressure around their tummy,

where they are in relation to the magnetic field of the Earth.

And after a few weeks, they take it for granted.

They integrate it, they embody this,

assimilate this new sensory information

into the way that they literally feel their world,

but now equipped with this sense of magnetic direction.

So that tells you something

about the brain’s plastic potential

to remodel and its plastic capacity

to suddenly try to explain the sensory data at hand

by augmenting the sensory sphere

and the kinds of things that you can measure.

Clearly, that’s purely for entertainment

and understanding the nature and the power of our brains.

I would imagine that most BCI is pitched

at solving clinical and human problems

such as locked in syndrome, such as paraplegia,

or replacing lost sensory capacities

like blindness and deafness.

So then we come to the negative part of my ambivalence,

the other side of it.

So I don’t want to be deflationary

because many of my deflationary comments

are probably more out of ignorance than anything else.

But generally speaking, the bandwidth

and the bit rates that you get

from brain computer interfaces as we currently know them,

we’re talking about bits per second.

So that would be like me only being able to communicate

with the world or with you using very, very, very slow Morse code.

And it is not even within an order of magnitude

near what we actually need for an enactive realization

of what people aspire to when they think about

sort of curing people with paraplegia or replacing sight

despite heroic efforts.

So one has to ask, is there a lower bound

on the kinds of recurrent information exchange

between a brain and some augmented or artificial interface?

And then we come back to, interestingly,

what I was talking about before,

which is if you’re talking about function

in terms of inference, and I presume we’ll get to that

later on in terms of the free energy principle,

then at the moment, there may be fundamental reasons

to assume that is the case.

We’re talking about ensemble activity.

We’re talking about basically, for example,

let’s paint the challenge facing brain computer interfacing

in terms of controlling another system

that is highly and deeply structured,

very relevant to our lives, very nonlinear,

that rests upon the kind of nonequilibrium

steady states and dynamics that the brain does,

the weather, all right?

So imagine you had some very aggressive satellites

that could produce signals that could perturb

some little parts of the weather system.

And then what you’re asking now is,

can I meaningfully get into the weather

and change it meaningfully and make the weather respond

in a way that I want it to?

You’re talking about chaos control on a scale

which is almost unimaginable.

So there may be fundamental reasons

why BCI, as you might read about it in a science fiction novel,

aspirational BCI may never actually work

in the sense that to really be integrated

and be part of the system is a requirement

that requires you to have evolved with that system,

that you have to be part of a very delicately structured,

deeply structured, dynamic, ensemble activity

that is not like rewiring a broken computer

or plugging in a peripheral interface adapter.

It is much more like getting into the weather patterns

or, come back to your magic soup,

getting into the active matter

and meaningfully relate that to the outside world.

So I think there are enormous challenges there.

So I think the example of the weather is a brilliant one.

And I think you paint a really interesting picture

and it wasn’t as negative as I thought.

It’s essentially saying that it might be

incredibly challenging, including the lower bound

of the bandwidth and so on.

I kind of, so just to full disclosure,

I come from the machine learning world.

So my natural thought is the hardest part

is the engineering challenge of controlling the weather,

of getting those satellites up and running and so on.

And once they are, then the rest is fundamentally

the same approaches that allow you to win in the game of Go

will allow you to potentially play in this soup,

in this chaos.

So I have a hope that sort of machine learning methods

will help us play in this soup.

But perhaps you’re right that it is a biology

and the brain is just an incredible system

that may be almost impossible to get in.

But for me, what seems impossible

is the incredible mess of blood vessels

that you also described without,

we also value the brain.

You can’t make any mistakes, you can’t damage things.

So to me, that engineering challenge seems nearly impossible.

One of the things I was really impressed by at Neuralink

is just talking to brilliant neurosurgeons

and the roboticists that made me realize

that even though it seems impossible,

if anyone can do it, it’s some of these world class

engineers that are trying to take it on.

So I think the conclusion of our discussion here

of this part is basically that the problem is really hard

but hopefully not impossible.

Absolutely.

So if it’s okay, let’s start with the basics.

So you’ve also formulated a fascinating principle,

the free energy principle.

Could we maybe start at the basics

and what is the free energy principle?

Well, in fact, the free energy principle

inherits a lot from the building

of these data analytic approaches

to these very high dimensional time series

you get from the brain.

So I think it’s interesting to acknowledge that.

And in particular, the analysis tools

that try to address the other side,

which is a functional integration,

so the connectivity analyses.

So on the one hand, but I should also acknowledge

it inherits an awful lot from machine learning as well.

So the free energy principle is just a formal statement

that the existential imperatives for any system

that manages to survive in a changing world

can be cast as an inference problem

in the sense that you can interpret

the probability of existing as the evidence that you exist.

And if you can write down that problem of existence

as a statistical problem,

then you can use all the maths that has been developed

for inference to understand and characterize

the ensemble dynamics that must be in play

in the service of that inference.

So technically, what that means is

you can always interpret anything that exists

in virtue of being separate from the environment

in which it exists as trying to minimize

variational free energy.

And if you’re from the machine learning community,

you will know that as a negative evidence lower bound

or a negative ELBO, which is the same as saying

you’re trying to maximize or it will look as if

all your dynamics are trying to maximize

the complement of that which is the marginal likelihood

or the evidence for your own existence.

So that’s basically the free energy principle.
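The relationship described here between variational free energy, the negative ELBO, and the marginal likelihood can be checked numerically. The following is only a minimal sketch under stated assumptions: the two-state world, the probabilities, and the names `prior`, `likelihood`, and `free_energy` are hypothetical and chosen purely for illustration.

```python
import numpy as np

# Hypothetical two-state world: hidden state s in {0, 1}, one observed outcome o.
prior = np.array([0.5, 0.5])        # p(s)
likelihood = np.array([0.9, 0.2])   # p(o | s) for the outcome actually observed

joint = prior * likelihood          # p(o, s)
evidence = joint.sum()              # p(o), the marginal likelihood (model evidence)
posterior = joint / evidence        # p(s | o), the exact Bayesian posterior


def free_energy(q):
    """Variational free energy F[q] = E_q[log q(s) - log p(o, s)], the negative ELBO."""
    q = np.asarray(q, dtype=float)
    return float(np.sum(q * (np.log(q) - np.log(joint))))


surprise = -np.log(evidence)        # negative log evidence
f_guess = free_energy([0.5, 0.5])   # free energy of an arbitrary belief
f_exact = free_energy(posterior)    # free energy of the exact posterior

print(f"surprise={surprise:.3f}  F[guess]={f_guess:.3f}  F[posterior]={f_exact:.3f}")
```

In this toy case the free energy of any belief sits above the surprise, and the gap closes when the belief equals the exact posterior, which is the sense in which minimizing free energy maximizes a bound on the evidence.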

But to even take a sort of a small step backwards,

you said the existential imperative.

There’s a lot of beautiful poetic words here,

but to put it crudely, it’s a fascinating idea

of basically just of trying to describe

if you’re looking at a blob,

how do you know this thing is alive?

What does it mean to be alive?

What does it mean to exist?

And so you can look at the brain,

you can look at parts of the brain,

or this is just a general principle

that applies to almost any system.

That’s just a fascinating sort of philosophically

at every level question and a methodology

to try to answer that question.

What does it mean to be alive?

So that’s a huge endeavor and it’s nice

that there’s at least some,

from some perspective, a clean answer.

So maybe can you talk about that optimization view of it?

So what’s trying to be minimized, maximized?

A system that’s alive, what is it trying to minimize?

Right, you’ve made a big move there.

First of all, it’s good to make big moves.

But you’ve assumed that the thing exists

in a state that could be living or nonliving.

So I may ask you, what licenses you

to say that something exists?

That’s why I use the word existential.

It’s beyond living, it’s just existence.

So if you drill down onto the definition

of things that exist, then they have certain properties

if you borrow the maths

from nonequilibrium steady state physics

that enable you to interpret their existence

in terms of this optimization procedure.

So it’s good you introduced the word optimization.

So what the free energy principle

in its sort of most ambitious,

but also most deflationary and simplest, says

is that if something exists,

then it must, by the mathematics

of nonequilibrium steady state,

exhibit properties that make it look

as if it is optimizing a particular quantity.

And it turns out that particular quantity

happens to be exactly the same

as the evidence lower bound in machine learning

or Bayesian model evidence in Bayesian statistics.

Or, and then I can list a whole other list

of ways of understanding this key quantity,

which is a bound on surprise or self information

if you have information theory.

There are a number of different perspectives

on this quantity.

It’s just basically the log probability

of being in a particular state.

I’m telling this story as an honest,

an attempt to answer your question.

And I’m answering it as if I was pretending

to be a physicist who was trying to understand

the fundaments of nonequilibrium steady state.

And I shouldn’t really be doing that

because the last time I was taught physics,

I was in my 20s.

What kind of systems,

when you think about the free energy principle,

what kind of systems are you imagining

as a sort of more specific kind of case study?

Yeah, I’m imagining a range of systems,

but at its simplest, a single celled organism

that can be identified from its eco niche

or its environment.

So at its simplest, that’s basically

what I always imagined in my head.

And you may ask, well, is there any,

how on earth can you even elaborate questions

about the existence of a single drop of oil, for example?

But there are deep questions there.

Why doesn’t the oil, why doesn’t the thing,

the interface between the drop of oil

that contains an interior

and the thing that is not the drop of oil,

which is the solvent in which it is immersed,

how does that interface persist over time?

Why doesn’t the oil just dissolve into solvent?

So what special properties of the exchange

between the surface of the oil drop

and the external states in which it’s immersed,

if you’re a physicist, say it would be the heat bath.

You’ve got a physical system, an ensemble again,

we’re talking about density dynamics, ensemble dynamics,

an ensemble of atoms or molecules immersed in the heat bath.

But the question is, how did the heat bath get there?

And why does it not dissolve?

How is it maintaining itself?

Exactly.

What actions is it?

I mean, it’s such a fascinating idea of a drop of oil

and I guess it would dissolve in water,

it wouldn’t dissolve in water.

So what?

Precisely, so why not?

So why not?

Why not?

And how do you mathematically describe,

I mean, it’s such a beautiful idea.

And also the idea of like, where does the thing,

where does the drop of oil end and where does it begin?

Right, so I mean, you’re asking deep questions,

deep in a nonmillennial sense here.

In a hierarchical sense.

But what you can do, so this is the deflationary part of it.

Can I just qualify my answer by saying that normally

when I’m asked this question,

I answer from the point of view of a psychologist,

we talk about predictive processing and predictive coding

and the brain as an inference machine,

but you haven’t asked me from that perspective,

I’m answering from the point of view of a physicist.

So the question is not so much why,

but if it exists, what properties must it display?

So that’s the deflationary part of the free energy principle.

The free energy principle does not supply an answer

as to why, it’s saying if something exists,

then it must display these properties.

That’s the sort of thing that’s on offer.

And it so happens that these properties it must display

are actually intriguing and have this inferential gloss,

this sort of self evidencing gloss that inheres in the fact

that the very preservation of the boundary

between the oil drop and the not oil drop

requires an optimization of a particular function

or a functional that defines the presence

or the existence of this oil drop,

which is why I started with existential imperatives.

It is a necessary condition for existence

that this must occur because the boundary

basically defines the thing that’s existing.

So it is that self assembly aspect

that you were hinting at, in biology

sometimes known as autopoiesis,

in computational chemistry as self assembly.

It’s the, what does it look like?

Sorry, how would you describe things

that configure themselves out of nothing?

The way they clearly demarcate themselves

from the states or the soup in which they are immersed.

So from the point of view of computational chemistry,

for example, you would just understand that

as a configuration of a macro molecule

to minimize its free energy, its thermodynamic free energy.

It’s exactly the same principle that we’ve been talking about

that thermodynamic free energy is just the negative ELBO.

It’s the same mathematical construct.

So the very emergence of existence, of structure, of form

that can be distinguished from the environment

or the thing that is not the thing

necessitates the existence of an objective function

that it looks as if it is minimizing.

It’s finding a free energy minimum.

And so just to clarify, I’m trying to wrap my head around.

So the free energy principle says that if something exists,

these are the properties it should display.

Yeah.

So what that means is we can’t just look,

we can’t just go into a soup and there’s no mechanism.

Free energy principle doesn’t give us a mechanism

to find the things that exist.

Is that what’s implying, is being implied

that you can kind of use it to reason,

to think about like, study a particular system

and say, does this exhibit these qualities?

That’s an excellent question.

But to answer that, I’d have to return

to your previous question about what’s the difference

between living and nonliving things.

Yes, well, actually, sorry.

So yeah, maybe we can go there.

Maybe we can go there, you kind of drew a line

and forgive me for the stupid questions,

but you kind of drew a line between living and existing.

Is there an interesting sort of distinction?

Yeah, I think there is.

So things do exist, grains of sand,

rocks on the moon, trees, you.

So all of these things can be separated from the environment

in which they are immersed.

And therefore, they must at some level

be optimizing their free energy,

taking this sort of model evidence interpretation

of this quantity that basically means

they’re self evidencing.

Another nice little twist of phrase here

is that you are your own existence proof,

statistically speaking, which I don’t think

I said that, somebody did, but I love that phrase.

You are your own existence proof.

Yeah, so it’s so existential, isn’t it?

I’m gonna have to think about that for a few days.

That’s a beautiful line.

So the step through to answer your question

about what’s it good for,

we’ll go along the following lines.

First of all, you have to define what it means

to exist, which now, as you’ve rightly pointed out,

you have to define what probabilistic properties

must the states of something possess

so it knows where it finishes.

And then you write that down in terms

of statistical dependencies, again, sparsity.

Again, it’s not what’s connected or what’s correlated

or what depends upon, it’s what’s not correlated

and what doesn’t depend upon something.

Again, it comes down to the deep structures,

not in this instance, hierarchical,

but the structures that emerge

from removing connectivity and dependency.

And in this instance, basically being able

to identify the surface of the oil drop

from the water in which it is immersed.

And when you do that, you start to realize,

well, there are actually four kinds of states

in any given universe that contains anything.

The things that are internal to the surface,

the things that are external to the surface

and the surface in and of itself,

which is why I use a metaphor,

a little single celled organism

that has an interior and exterior

and then the surface of the cell.

And that’s mathematically a Markov blanket.
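One way to picture this partition is as a sparsity pattern over who can directly influence whom. The sketch below is an illustration, not a definitive formalization: the `influenced_by` table is an assumption that follows the commonly stated form of the partition, in which the surface splits into sensory and active parts, and all names are hypothetical.

```python
# Hypothetical dependency table: influenced_by[x] lists the kinds of state
# that can directly influence x. The missing entries are what matter.
states = ["external", "sensory", "active", "internal"]

influenced_by = {
    "external": {"external", "sensory", "active"},
    "sensory": {"external", "sensory", "active"},
    "active": {"sensory", "active", "internal"},
    "internal": {"sensory", "active", "internal"},
}

# The blanket (sensory + active) separates inside from outside:
# no direct influence between internal and external states in either direction.
blanket_separates = (
    "internal" not in influenced_by["external"]
    and "external" not in influenced_by["internal"]
)

# States with no direct dependence on external states.
autonomous = [s for s in states if "external" not in influenced_by[s]]

print(blanket_separates)  # True
print(autonomous)         # ['active', 'internal']
```

The structure is defined by the absent dependencies: internal and external states only reach each other through the sensory and active states of the surface.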

Just to pause, I’m in awe of this concept

that there’s the stuff outside the surface,

stuff inside the surface and the surface itself,

the Markov blanket.

It’s just the most beautiful kind of notion

about trying to explore what it means

to exist mathematically.

I apologize, it’s just a beautiful idea.

But it came out of California, so that’s.

I changed my mind.

I take it all back.

So anyway, so you were just talking

about the surface, about the Markov blanket.

So this surface or these blanket states

that are the, because they are now defined

in relation to these independencies

and what different states internal blanket

or external states can,

which ones can influence each other

and which cannot influence each other.

You can now apply standard results

that you would find in non equilibrium physics

or steady state or thermodynamics or hydrodynamics,

usually out of equilibrium solutions

and apply them to this partition.

And what it looks like is if all the normal gradient flows

that you would associate with any non equilibrium system

apply in such a way that part of the Markov blanket

and the internal states seem to be hill climbing

or doing a gradient descent on the same quantity.

And that means that you can now describe

the very existence of this oil drop.

You can write down the existence of this oil drop

in terms of flows, dynamics, equations of motion,

where the blanket states or part of them,

we call them active states and the internal states

now seem to be and must be trying to look

as if they’re minimizing the same function,

which is a log probability of occupying these states.

Interesting thing is that what would they be called

if you were trying to describe these things?

So what we’re talking about are internal states,

external states and blanket states.

Now let’s carve the blanket states

into two: sensory states and active states.

Operationally, it has to be the case

that in order for this carving up

into different sets of states to exist,

the active states of the Markov blanket

cannot be influenced by the external states.

And we already know that the internal states

can’t be influenced by the external states

because the blanket separates them.

So what does that mean?

Well, it means the active states and the internal states

are now jointly not influenced by external states.

They only have autonomous dynamics.

So now you’ve got a picture of an oil drop

that has autonomy, it has autonomous states,

it has autonomous states in the sense

that there must be some parts of the surface of the oil drop

that are not influenced by the external states

and all the interior.

And together, those two states endow

even a little oil drop with autonomous states

that look as if they are optimizing

their variational free energy or their negative ELBO,

their model evidence.

And that would be an interesting intellectual exercise.

And you could say, you could even go into the realms

of panpsychism, that everything that exists

is implicitly making inferences on self evidencing.

Now we make the next move, but what about living things?

I mean, so let me ask you,

what’s the difference between an oil drop

and a little tadpole or a little larva or a plankton?

The picture was just painted of an oil drop.

Just immediately in a matter of minutes

took me into the world of panpsychism,

where you’ve just convinced me,

made me feel like an oil drop is a living,

it’s certainly an autonomous system,

but almost a living system.

So it has sensory capabilities and acting capabilities

and it maintains something.

So what is the difference between that

and something that we traditionally

think of as a living system?

That it could die or it can’t,

I mean, yeah, mortality, I’m not exactly sure.

I’m not sure what the right answer there is

because they can move,

like movement seems like an essential element

to being able to act in the environment,

but the oil drop is doing that.

So I don’t know.

Is it?

The oil drop will be moved,

but does it in and of itself move autonomously?

Well, the surface is performing actions

that maintain its structure.

Yeah, you’re being too clever.

I was, I had in mind a passive little oil drop

that’s sitting there at the bottom

on the top of a glass of water.

Sure, I guess.

What I’m trying to say is you’re absolutely right.

You’ve nailed it.

It’s movement.

So where does that movement come from?

If it comes from the inside,

then you’ve got, I think, something that’s living.

What do you mean from the inside?

What I mean is that the internal states

that can influence the active states,

where the active states can influence,

but they’re not influenced by the external states,

can cause movement.

So there are two types of oil drops, if you like.

There are oil drops where the internal states

are so random that they average themselves away,

and the thing cannot, on average,

when you do the averaging, move.

So a nice example of that would be the Sun.

The Sun certainly has internal states.

There’s lots of intrinsic autonomous activity going on,

but because it’s not coordinated,

because it doesn’t have the deep, in the millennial sense,

the hierarchical structure that the brain does,

there is no overall mode or pattern or organization

that expresses itself on the surface

that allows it to actually swim.

It can certainly have a very active surface,

but en masse, at the scale of the actual surface of the Sun,

the average position of that surface cannot, in itself, move,

because the internal dynamics are more like a hot gas.

They are literally like a hot gas,

whereas your internal dynamics are much more structured

and deeply structured,

and now you can express on your active states

with your muscles and your secretory organs,

your autonomic nervous system and its effectors,

you can actually move, and that’s all you can do.

And that’s something which,

if you haven’t thought of it like this before,

I think it’s nice to just realize

there is no other way that you can change the universe

other than simply moving.

Whether that moving is articulating with my voice box

or walking around or squeezing juices

out of my secretory organs,

there’s only one way you can change the universe.

It’s moving.

And the fact that you do so nonrandomly makes you alive.

Yeah, so it’s that nonrandomness.

And that would be manifested,

or realized, in terms of essentially swimming,

essentially moving, changing one’s shape,

a morphogenesis that is dynamic and possibly adaptive.

So that’s what I was trying to get at

with the difference between the oil drop

and the little tadpole.

The tadpole is moving around.

Its active states are actually changing the external states.

And there’s now a cycle,

an action perception cycle, if you like,

a recurrent dynamic that’s going on

that depends upon this deeply structured autonomous behavior

that rests upon internal dynamics

that are not only modeling

the data impressed upon their surface or the blanket states,

but they are actively resampling those data by moving.

They’re moving towards chemical gradients in chemotaxis.

So they’ve gone beyond just being good little models

of the kind of world they live in.

For example, an oil droplet could, in a panpsychic sense,

be construed as a little being

that has now perfectly inferred.

It’s a passive, nonliving oil drop

living in a bowl of water.

No problem.

But to now equip that oil drop with the ability to go out

and test that hypothesis about different states of beings.

So it can actually push its surface over there, over there,

and test for chemical gradients,

or then you start to move to a much more lifelike form.

This is all fun, theoretically interesting,

but it actually is quite important

in terms of reflecting what I have seen

since the turn of the millennium,

which is this move towards an enactive

and embodied understanding of intelligence.

And you say you’re from machine learning.

So what that means,

the central importance of movement,

I think has yet to really hit machine learning.

It certainly has now diffused itself throughout robotics.

And perhaps you could say certain problems in active vision

where you actually have to move the camera

to sample this and that.

But machine learning of the data mining deep learning sort

simply hasn’t contended with this issue.

What it’s done, instead of dealing with the movement problem

and the active sampling of data,

it’s just said, we don’t need to worry about,

we can see all the data because we’ve got big data.

So we can ignore movement.

So that for me is an important omission

in current machine learning.

The current machine learning is much more like the oil drop.

Yes.

But an oil drop that enjoys exposure

to nearly all the data that it will ever need to be exposed to,

as opposed to the tadpoles swimming out

to find the right data.

For example, it likes food.

That’s a good hypothesis.

Let’s test it out.

Let’s go and move and ingest food, for example,

and see is that evidence that I’m the kind of thing

that likes this kind of food.

So the next natural question, and forgive this question,

but if we think of sort of even artificial intelligence

systems, for which you just painted a beautiful picture

of existence and life.

So do you ascribe, do you find within this framework

a possibility of defining consciousness

or exploring the idea of consciousness?

Like what, you know, self awareness

and expand it to consciousness?

Yeah.

How can we start to think about consciousness

within this framework?

Is it possible?

Well, yeah, I think it’s possible to think about it,

whether you’ll get

anywhere is another question.

And again, I’m not sure that I’m licensed

to answer that question.

I think you’d have to speak to a qualified philosopher

to get a definitive answer to that.

But certainly, there’s a lot of interest

in using not just these ideas, but related ideas

from information theory to try and tie down

the maths and the calculus and the geometry of consciousness,

either in terms of sort of a minimal consciousness,

even less than a minimal selfhood.

And what I’m talking about is the ability, effectively,

to plan, to have agency.

So you could argue that a virus does have a form of agency

in virtue of the way that it selectively

finds hosts and cells to live in and moves around.

But you wouldn’t endow it with the capacity

to think about planning and moving in a purposeful way

where it countenances the future.

Whereas you might an ant.

You might think an ant’s not quite as unconscious

as a virus.

It certainly seems to have a purpose.

It talks to its friends en route during its foraging.

It has a different kind of autonomy, which is biotic,

but beyond a virus.

So there’s something about, so there’s

some line that has to do with the complexity of planning

that may contain an answer.

I mean, it would be beautiful if we

can find a line beyond which we could say a being is conscious.

Yes, it will be.

These are wonderful lines that we’ve drawn with existence,

life, and consciousness.

Yes, it will be very nice.

One little wrinkle there, and this

is something I’ve only learned in the past few months,

is the philosophical notion of vagueness.

So you’re saying it would be wonderful to draw a line.

I had always assumed that that line at some point

would be drawn until about four months ago,

when a philosopher taught me about vagueness.

So I don’t know if you’ve come across this,

but it’s a technical concept.

And I think most revealingly illustrated with,

at what point does a pile of sand become a pile?

Is it one grain, two grains, three grains, or four grains?

So at what point would you draw the line

between being a pile of sand and a collection of grains of sand?

In the same way, is it right to ask,

where would I draw the line between conscious

and unconscious?

And it might be a vague concept.

Having said that, I agree with you entirely.

Systems that have the ability to plan.

So just technically, what that means

is your inferential self evidencing,

by which I simply mean the thermodynamics and gradient

flows that underwrite the preservation of your oil

droplet like form, can be described

as an optimization of log Bayesian model

evidence, your ELBO.

That self evidencing must be evidence

for a model of what’s causing the sensory impressions

on the sensory part of your surface or your Markov

blanket.

If that model is capable of planning,

it must include a model of the future consequences

of your active states or your action, just planning.

So we’re now in the game of planning as inference.

Now notice what we’ve made, though.

We’ve made quite a big move away from big data and machine

learning, because again, it’s the consequences of moving.

It’s the consequences of selecting those data or those

data or looking over there.

And that tells you immediately that even

to be a contender for a conscious artifact or a strong

AI or generalized, I don’t know what that’s called nowadays,

then you’ve got to have movement in the game.

And furthermore, you’ve got to have a generative model

of the sort you might find in, say, a variational

autoencoder that is thinking about the future conditioned

upon different courses of action.
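Planning as inference, in the sense of scoring the future conditioned upon different courses of action, might be sketched as follows. This is a deliberately reduced illustration and not the full expected free energy used in active inference: it keeps only a risk term, the divergence between predicted and preferred outcomes. The two policies, the outcome probabilities, and every name in the block are hypothetical.

```python
import numpy as np

# Hypothetical setup: two outcomes ("no food", "food") and two policies.
preferred = np.array([0.1, 0.9])  # outcomes the agent expects to encounter
predicted = {
    "stay": np.array([0.8, 0.2]),  # p(outcome | policy), assumed known
    "swim": np.array([0.3, 0.7]),
}


def risk(q, p):
    """KL divergence between predicted outcomes q and preferred outcomes p."""
    return float(np.sum(q * np.log(q / p)))


# Score each course of action by how far its predicted future is from the preferred one.
scores = {name: risk(q, preferred) for name, q in predicted.items()}

# Turn scores into a belief over policies with a softmax of the negative risk.
names = list(scores)
g = np.array([scores[n] for n in names])
weights = np.exp(-g) / np.exp(-g).sum()
policy_posterior = dict(zip(names, weights))

best = min(scores, key=scores.get)
print(best, policy_posterior)
```

Selecting among policies then becomes an inference problem in its own right: the agent holds a belief over courses of action and favors the one whose predicted consequences best match what it expects to sense.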

Now that brings a number of things to the table, which

now you start to think, well, those

have got all the right ingredients

to talk about consciousness.

I’ve now got to select among a number of different courses

of action into the future as part of planning.

I’ve now got free will.

The act of selecting this course of action or that policy

or that policy or that action suddenly

makes me into an inference machine,

a self evidencing artifact that now

looks as if it’s selecting amongst different alternative

ways forward, as I actively swim here or swim there

or look over here, look over there.

So I think you’ve now got to a situation

if there is planning in the mix.

You’re now getting much closer to that line

if that line were ever to exist.

I don’t think it gets you quite as far as self aware, though.

And then you have to, I think, grapple with the question,

how would you formally write down a calculus or a maths

of self awareness?

I don’t think it’s impossible to do.

But I think there would be pressure on you

to actually commit to a formal definition of what

you mean by self awareness.

I think most people that I know would probably

say that a goldfish, a pet fish, was not self aware.

They would probably argue about their favorite cat,

but would be quite happy to say that their mom was self aware.

So.

I mean, but that might very well connect

to some level of complexity with planning.

It seems like self awareness is essential for complex planning.

Yeah.

Do you want to take that further?

Because I think you’re absolutely right.

Again, the line is unclear, but it

seems like integrating yourself into the world,

into your planning, is essential for constructing complex plans.

Yes.

Yeah.

So mathematically describing that in the same elegant way

as you have with the free energy principle might be difficult.

Well, yes and no.

I don’t think that, well, perhaps we should just,

can we just go back?

That’s a very important answer you gave.

And I think if I just unpacked it,

you’d see the truisms that you’ve just exposed for us.

But let me, sorry, I’m mindful that I didn’t answer

your question before.

Well, what’s the free energy principle good for?

Is it just a pretty theoretical exercise

to explain nonequilibrium steady states?

Yes, it is.

It does nothing more for you than that.

It can be regarded, it’s going to sound very arrogant,

but it is of the same sort as the theory of natural selection,

or a hypothesis of natural selection.

Beautiful, undeniably true, but tells you

absolutely nothing about why you have legs and eyes.

It tells you nothing about the actual phenotype,

and it wouldn’t allow you to build something.

So the free energy principle by itself

is as vacuous as most tautological theories.

And by tautological, of course,

I’m talking about the theory of natural,

the survival of the fittest.

What’s the fittest of those that survive?

Why do they cycle?

It’s the fitter.

It just goes around in circles.

In a sense, the free energy principle has that same

deflationary tautology under the hood.

It’s a characteristic of things that exist.

Why do they exist?

Because they minimize their free energy.

Why do they minimize their free energy?

Because they exist.

And you just keep on going round and round and round.

But the practical thing,

which you don’t get from natural selection,

but you could say has now manifest in things

like differential evolution or genetic algorithms

and MCMC, for example, in machine learning.

The practical thing you can get is,

if it looks as if things that exist

are trying to have density dynamics

and look as though they’re optimizing

a variational free energy,

and a variational free energy has to be

a functional of a generative model,

a probabilistic description of causes and consequences,

causes out there, consequences in the sensorium

on the sensory parts of the Markov blanket,

then it should, in theory, be possible

to write down the generative model,

work out the gradients,

and then cause it to autonomously self evidence.

So you should be able to write down oil droplets.

You should be able to create artifacts

where you have supplied the objective function

that supplies the gradients,

that supplies the self organizing dynamics

to non equilibrium steady state.

So there is actually a practical application

of the free energy principle

when you can write down your required evidence

in terms of, well, when you can write down

the generative model that is the thing

that has the evidence.

The probability of these sensory data

or this data, given that model,

is effectively the thing that the ELBO

or the variational free energy bounds or approximates.
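
[Editor's note: the bound described here can be written out in standard variational notation. The symbols are assumptions chosen for illustration and were not spoken in the conversation: o is the sensory data, s the hidden causes, m the generative model, q(s) an approximate posterior over causes, and F the variational free energy, which is the negative of the ELBO.]

```latex
\begin{aligned}
F[q] &= \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s \mid m)\big] \\
     &= D_{\mathrm{KL}}\big[q(s)\,\|\,p(s \mid o, m)\big] - \ln p(o \mid m) \\
     &\ge -\ln p(o \mid m)
\end{aligned}
```

[Because the KL divergence is never negative, F is an upper bound on the negative log evidence, so minimizing F maximizes a lower bound on the evidence ln p(o | m). The bound is tight when q(s) equals the true posterior.]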

That means that you can actually write down the model

and the kind of thing that you want to engineer,

the kind of AGI or artificial general intelligence

that you want to manifest probabilistically,

and then you engineer, a lot of hard work,

but you would engineer a robot and a computer

to perform a gradient descent on that objective function.

So it does have a practical implication.
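
[Editor's note: a minimal sketch of the recipe just described, under strong simplifying assumptions. The generative model is a toy one with a single hidden cause and a single observation, both Gaussian, and the belief about the cause is a point estimate, so the free energy reduces to a sum of precision-weighted squared prediction errors. All function and parameter names are hypothetical. It covers only the perception half, inferring the cause, and not action.]

```python
def free_energy(m, o, prior_mean=0.0, prior_var=1.0, obs_var=1.0):
    """Free energy (up to an additive constant) of a point estimate m.

    Assumed toy generative model: cause s ~ N(prior_mean, prior_var),
    observation o ~ N(s, obs_var).
    """
    sensory_error = (o - m) ** 2 / (2.0 * obs_var)
    prior_error = (m - prior_mean) ** 2 / (2.0 * prior_var)
    return sensory_error + prior_error


def free_energy_gradient(m, o, prior_mean=0.0, prior_var=1.0, obs_var=1.0):
    """Derivative of free_energy with respect to the estimate m."""
    return -(o - m) / obs_var + (m - prior_mean) / prior_var


def infer(o, steps=200, lr=0.1, **model):
    """Gradient descent on free energy, starting from the prior mean."""
    m = model.get("prior_mean", 0.0)
    for _ in range(steps):
        m -= lr * free_energy_gradient(m, o, **model)
    return m


def exact_posterior_mean(o, prior_mean=0.0, prior_var=1.0, obs_var=1.0):
    """Closed-form posterior mean for the same Gaussian model, for comparison."""
    precision = 1.0 / prior_var + 1.0 / obs_var
    return (prior_mean / prior_var + o / obs_var) / precision


if __name__ == "__main__":
    observation = 2.0
    print("gradient descent estimate:", infer(observation))
    print("exact posterior mean:     ", exact_posterior_mean(observation))
```

[In this toy case the descent settles on the exact posterior mean, which is the sense in which following the gradients amounts to inference. The hard part, as the conversation goes on to say, is choosing the generative model, not running the descent.]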

Now, why am I wittering on about that?

It did seem relevant to, yes.

So what kinds of, so the answer to,

would it be easier or would it be hard?

Well, mathematically, it’s easy.

I’ve just told you all you need to do

is write down your perfect artifact,

probabilistically, in the form

of a probabilistic generative model,

a probability distribution over the causes

and consequences of the world

in which this thing is immersed.

And then you just engineer a computer and a robot

to perform a gradient descent on that objective function.

No problem.

But of course, the big problem

is writing down the generative model.

So that’s where the heavy lifting comes in.

So it’s the form and the structure of that generative model

which basically defines the artifact that you will create

or, indeed, the kind of artifact that has self awareness.

So that’s where all the hard work comes,

very much like natural selection doesn’t tell you

in the slightest why you have eyes.

So you have to drill down on the actual phenotype,

the actual generative model.

So with that in mind, what did you tell me

that tells me immediately the kinds of generative models

I would have to write down in order to have self awareness?

What you said to me was I have to have a model

that is effectively fit for purpose

for this kind of world in which I operate.

And if I now make the observation

that this kind of world is effectively largely populated

by other things like me, i.e. you,

then it makes enormous sense

that if I can develop a hypothesis

that we are similar kinds of creatures,

in fact, the same kind of creature,

but I am me and you are you,

then it becomes, again, mandated to have a sense of self.

So if I live in a world

that is constituted by things like me,

basically a social world, a community,

then it becomes necessary now for me to infer

that it’s me talking and not you talking.

I wouldn’t need that if I was on Mars by myself

or if I was in the jungle as a feral child.

If there was nothing like me around,

there would be no need to have an inference

or a hypothesis, oh yes, it is me

that is experiencing or causing these sounds

and it is not you.

It’s only when there’s ambiguity in play

induced by the fact that there are others in that world.

So I think that the special thing about self aware artifacts

is that they have learned to, or they have acquired,

or at least are equipped with, possibly by evolution,

generative models that allow for the fact

there are lots of copies of things like them around,

and therefore they have to work out it’s you and not me.

That’s brilliant.

I’ve never thought of that.

I never thought of that, that the purpose

of, really, the usefulness of consciousness

or self awareness in the context of planning

existing in the world is so you can operate

with other things like you, and like you could,

it doesn’t have to necessarily be human.

It could be other kind of similar creatures.

Absolutely, well, we imbue a lot of our attributes

into our pets, don’t we?

Or we try to make our robots humanoid.

And I think there’s a deep reason for that,

that it’s just much easier to read the world

if you can make the simplifying assumption

that basically you’re me, and it’s just your turn to talk.

I mean, when we talk about planning,

when you talk specifically about planning,

the highest, if you like, manifestation or realization

of that planning is what we’re doing now.

I mean, the human condition doesn’t get any higher

than this talking about the philosophy of existence

and the conversation.

But in that conversation, there is a beautiful art

of turn taking and mutual inference, theory of mind.

I have to know when you wanna listen.

I have to know when you want to interrupt.

I have to make sure that you’re online.

I have to have a model in my head

of your model in your head.

That’s the highest, the most sophisticated form

of generative model, where the generative model

actually has a generative model

of somebody else’s generative model.

And I think that, and what we are doing now evinces

the kinds of generative models

that would support self awareness,

because without that, we’d both be talking over each other,

or we’d be singing together in a choir.

That’s not a brilliant analogy for what I’m trying to say,

but yeah, we wouldn’t have this discourse.

We wouldn’t have this.

Yeah, the dance of it.

Yeah, that’s right.

As I interrupt, I mean, that’s beautifully put.

I’ll re listen to this conversation many times.

There’s so much poetry in this, and mathematics.

Let me ask the silliest, or perhaps the biggest question

as a last kind of question.

We’ve talked about living in existence

and the objective function under which

these objects would operate.

What do you think is the objective function

of our existence?

What’s the meaning of life?

What do you think is the, for you, perhaps,

the purpose, the source of fulfillment,

the source of meaning for your existence,

as one blob in this soup?

I’m tempted to answer that, again, as a physicist,

and tell you it’s the free energy I expect

consequent upon my behavior.

So technically, we could get a really interesting

conversation about what that comprises

in terms of searching for information,

resolving uncertainty about the kind of thing that I am.

But I suspect that you want a slightly more personal

and fun answer, but which can be consistent with that.

And I think it’s reassuringly simple

and hops back to what you were taught as a child,

that you have certain beliefs about the kind of creature

and the kind of person you are.

And all that self evidencing means,

all that minimizing variational free energy

in an enactive and embodied way,

means is fulfilling the beliefs about

what kind of thing you are.

And of course, we’re all given those scripts,

those narratives, at a very early age,

usually in the form of bedtime stories or fairy stories

that I’m a princess and I’m gonna meet a beast

who’s gonna transform and he’s gonna be a prince.

And so the narratives are all around you

from your parents to the friends

to the society feeds these stories.

And then your objective function is to fulfill.

Exactly, that narrative that has been encultured

by your immediate family, but as you say,

also the sort of the culture in which you grew up

and you create for yourself.

I mean, again, because of this active inference,

this enactive aspect of self evidencing,

not only am I modeling my environment,

my eco niche, my external states out there,

but I’m actively changing them all the time

and doing the same back, we’re doing it together.

So there’s a synchrony that means that I’m creating

my own culture over different timescales.

So the question now is for me being very selfish,

what scripts was I given?

It basically was a mixture between Einstein and Sherlock Holmes.

So I smoke as heavily as possible,

try to avoid too much interpersonal contact,

enjoy the fantasy that you’re a popular scientist

who’s gonna make a difference in a slightly quirky way.

So that’s what I grew up on.

My father was an engineer and loved science

and he loved sort of things like Sir Arthur Eddington’s

Space, Time and Gravitation, which was the first

understandable version of general relativity.

So all the fairy stories I was told as I was growing up

were all about these characters.

I’m keeping the Hobbit out of this

because that doesn’t quite fit my narrative.

There’s a journey of exploration, I suppose, of sorts.

So yeah, I’ve just grown up to be what I imagine

a mild mannered Sherlock Holmes slash Albert Einstein

would do in my shoes.

And you did it elegantly and beautifully.

Karl, it was a huge honor talking today, it was fun.

Thank you so much for your time.

No, thank you. Appreciate it.

Thank you for listening to this conversation

with Karl Friston and thank you

to our presenting sponsor, Cash App.

Please consider supporting the podcast

by downloading Cash App and using code LexPodcast.

If you enjoy this podcast, subscribe on YouTube,

review it with five stars on Apple Podcast,

support on Patreon, or simply connect with me on Twitter

at LexFridman.

And now let me leave you with some words from Karl Friston.

Your arm moves because you predict it will

and your motor system seeks to minimize prediction error.

Thank you for listening and hope to see you next time.
