Lex Fridman Podcast - #225 - Jeffrey Shainline: Neuromorphic Computing and Optoelectronic Intelligence

The following is a conversation with Jeff Shainline,

a scientist at NIST

interested in optoelectronic intelligence.

We have a deep technical dive into computing hardware

that will make Jim Keller proud.

I urge you to hop onto this rollercoaster ride

through neuromorphic computing

and superconducting electronics

and hold on for dear life.

Jeff is a great communicator of technical information

and so it was truly a pleasure to talk to him

about some physics and engineering.

To support this podcast,

please check out our sponsors in the description.

This is the Lex Fridman Podcast

and here is my conversation with Jeff Shainline.

I got a chance to read a fascinating paper you authored

called Optoelectronic Intelligence.

So maybe we can start by talking about this paper

and start with the basic questions.

What is optoelectronic intelligence?

Yeah, so in that paper,

the concept I was trying to describe

is sort of an architecture

for building brain inspired computing

that leverages light for communication

in conjunction with electronic circuits for computation.

In that particular paper,

a lot of the work we’re doing right now

in our project at NIST

is focused on superconducting electronics for computation.

I won’t go into why that is,

but that might make a little more sense in context

if we first describe what that is in contrast to,

which is semiconducting electronics.

So is it worth taking a couple minutes

to describe semiconducting electronics?

It might even be worthwhile to step back

and talk about electricity and circuits

and how circuits work

before we talk about superconductivity.

Right, okay.

How does a computer work, Jeff?

Well, I won’t go into everything

that makes a computer work,

but let’s talk about the basic building blocks,

a transistor, and even more basic than that,

a semiconductor material, silicon, say.

So in silicon, silicon is a semiconductor,

and what that means is at low temperature,

there are no free charges,

no free electrons that can move around.

So when you talk about electricity,

you’re talking about predominantly electrons

moving to establish electrical currents,

and they move under the influence of voltages.

So you apply voltages, electrons move around,

those can be measured as currents,

and you can represent information in that way.

So semiconductors are special

in the sense that they are really malleable.

So if you have a semiconductor material,

you can change the number of free electrons

that can move around by putting different elements,

different atoms in lattice sites.

So what is a lattice site?

Well, a semiconductor is a crystal,

which means all the atoms that comprise the material

are at exact locations

that are perfectly periodic in space.

So if you start at any one atom

and you go along what are called the lattice vectors,

you get to another atom and another atom and another atom,

and for high quality devices,

it’s important that it’s a perfect crystal

with very few defects,

but you can intentionally replace a silicon atom

with say a phosphorus atom,

and then you can change the number of free electrons

that are in a region of space

that has that excess of what are called dopants.

So picture a device that has a left terminal

and a right terminal,

and if you apply a voltage between those two,

you can cause electrical current to flow between them.

Now we add a third terminal up on top there,

and depending on the voltage

between the left and right terminal and that third voltage,

you can change that current.

So what’s commonly done in digital electronic circuits

is to leave a fixed voltage from left to right,

and then change that voltage

that’s applied at what’s called the gate,

the gate of the transistor.

So what you do is you make it to where

there’s an excess of electrons on the left,

excess of electrons on the right,

and very few electrons in the middle,

and you do this by changing the concentration

of different dopants in the lattice spatially.

And then when you apply a voltage to that gate,

you can either cause current to flow or turn it off,

and so that’s sort of your zero and one.

If you apply voltage, current can flow,

that current is representing a digital one,

and from that, from that basic element,

you can build up all the complexity

of digital electronic circuits

that have really had a profound influence on our society.
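
To make that on-off picture concrete, here is a minimal sketch, in Python, that treats the transistor as nothing more than a voltage-controlled switch and composes those switches into logic gates. The threshold value and the gate constructions are illustrative assumptions, not a real device model.

```python
# Cartoon model: a MOSFET as a voltage-controlled switch.
# Threshold and voltage levels are illustrative, not a real device model.

V_THRESHOLD = 0.5  # assumed switching threshold (arbitrary volt-like units)

def nmos_conducts(v_gate: float) -> bool:
    """n-channel style switch: conducts when the gate voltage is high."""
    return v_gate > V_THRESHOLD

def pmos_conducts(v_gate: float) -> bool:
    """p-channel style switch: conducts when the gate voltage is low."""
    return v_gate <= V_THRESHOLD

def inverter(v_in: float) -> float:
    """Pull-up conducts for a low input (output 1), pull-down for a high input (output 0)."""
    return 1.0 if pmos_conducts(v_in) else 0.0

def nand(v_a: float, v_b: float) -> float:
    """Output is pulled low only when both series n-channel switches conduct."""
    return 0.0 if (nmos_conducts(v_a) and nmos_conducts(v_b)) else 1.0

# NAND is functionally complete: the rest of digital logic composes from it.
def and_gate(a, b): return inverter(nand(a, b))
def or_gate(a, b):  return nand(inverter(a), inverter(b))

if __name__ == "__main__":
    for a in (0.0, 1.0):
        for b in (0.0, 1.0):
            print(f"a={a} b={b}  NAND={nand(a, b)}  AND={and_gate(a, b)}  OR={or_gate(a, b)}")
```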

Now you’re talking about electrons.

Can you give a sense of what scale we’re talking about

when we’re talking about in silicon

being able to mass manufacture these kinds of gates?

Yeah, so scale in a number of different senses.

Well, at the scale of the silicon lattice,

the distance between two atoms there is half a nanometer.

So people often like to compare these things

to the width of a human hair.

I think it’s some six orders of magnitude smaller

than the width of a human hair, something on that order.

So remarkably small,

we’re talking about individual atoms here,

and electrons are of that length scale

when they’re in that environment.

But there’s another sense

that scale matters in digital electronics.

This is perhaps the more important sense,

although they’re related.

Scale refers to a number of things.

It refers to the size of that transistor.

So for example, I said you have a left contact,

a right contact, and some space between them

where the gate electrode sits.

That’s called the channel width or the channel length.

And what has enabled what we think of as Moore’s law

or the continued increased performance

in silicon microelectronic circuits

is the ability to make that size, that feature size,

ever smaller, ever smaller at a really remarkable pace.

I mean, that feature size has decreased consistently

every couple of years since the 1960s.

And that was what Moore predicted in the 1960s.

He thought it would continue for at least two more decades,

and it’s been much longer than that.

And so that is why we’ve been able to fit ever more devices,

ever more transistors, ever more computational power

on essentially the same size of chip.

So a user sits back and does essentially nothing.

You’re running the same computer program,

but those devices are getting smaller, so they get faster,

they get more energy efficient,

and all of our computing performance

just continues to improve.

And we don’t have to think too hard

about what we’re doing as, say,

a software designer or something like that.

I absolutely don’t mean to say

that there’s no innovation in software or the user side

of things, of course there is.

But from the hardware perspective,

we just have been given this gift

of continued performance improvement

through this scaling that is ever smaller feature sizes

with very similar, say, power consumption.

That power consumption has not continued to scale

in the most recent decades, but nevertheless,

we had a really good run there for a while.

And now we’re down to gates that are seven nanometers,

which is state of the art right now.

Maybe GlobalFoundries is trying to push it

even lower than that.

I can’t keep up with where the predictions are

that it’s gonna end.

But a seven nanometer transistor has just a few tens of atoms

along the length of the conduction pathway.

So a naive semiconductor device physicist

would think you can’t go much further than that

without some kind of revolution in the way we think

about the physics of our devices.

Is there something to be said

about the mass manufacture of these devices?

Right, right, so that’s another thing.

So how have we been able

to make those transistors smaller and smaller?

Well, companies like Intel, GlobalFoundries,

they invest a lot of money in the lithography.

So how are these chips actually made?

Well, one of the most important steps

is this what’s called ion implantation.

So you start with sort of a pristine silicon crystal

and then using photolithography,

which is a technique where you can pattern

different shapes using light,

you can define which regions of space

you’re going to implant with different species of ions

that are going to change

the local electrical properties right there.

So by using ever shorter wavelengths of light

and different kinds of optical techniques

and different kinds of lithographic techniques,

things that go far beyond my knowledge base,

you can just simply shrink that feature size down.

And you say you’re at seven nanometers.

Well, the wavelength of light that’s being used

is over a hundred nanometers.

That’s already deep in the UV.

So how are those minute features patterned?

Well, there’s an extraordinary amount of innovation

that has gone into that,

but nevertheless, it stayed very consistent

in this ever shrinking feature size.

And now the question is, can you make it smaller?

And even if you do, do you still continue

to get performance improvements?

But that’s another kind of scaling

where these companies have been able to…

So, okay, you picture a chip that has a processor on it.

Well, that chip is not made as a chip.

It’s made on a wafer.

And using photolithography,

you basically print the same pattern on different dies

all across the wafer, multiple layers,

tens, probably a hundred some layers

in a mature foundry process.

And you do this on ever bigger wafers too.

That’s another aspect of scaling

that’s occurred in the last several decades.

So now you have this 300 millimeter wafer.

It’s like as big as a pizza

and it has maybe a thousand processors on it.

And then you dice that up using a saw.

And now you can sell these things so cheap

because the manufacturing process was so streamlined.

I think a technology as revolutionary

as silicon microelectronics has to have

that kind of manufacturing scalability,

which I will just emphasize,

I believe is enabled by physics.

It’s not, I mean, of course there’s human ingenuity

that goes into it, but at least from my side where I sit,

it sure looks like the physics of our universe

allows us to produce that.

And we’ve discovered how more so than we’ve invented it,

although of course we have invented it,

humans have invented it,

but it’s almost as if it was there

waiting for us to discover it.

You mean the entirety of it

or are you specifically talking about

the techniques of photolithography,

like the optics involved?

I mean, the entirety of the scaling down

to the seven nanometers,

you’re able to have electrons not interfere with each other

in such a way that you could still have gates.

Like that’s enabled.

To achieve that scale, spatial and temporal,

it seems to be very special

and is enabled by the physics of our world.

All of the things you just said.

So starting with the silicon material itself,

silicon is a unique semiconductor.

It has essentially ideal properties

for making a specific kind of transistor

that’s extraordinarily useful.

So I mentioned that silicon has,

well, when you make a transistor,

you have this gate contact

that sits on top of the conduction channel.

And depending on the voltage you apply there,

you pull more carriers into the conduction channel

or push them away so it becomes more or less conductive.

In order to have that work

without just sucking those carriers right into that contact,

you need a very thin insulator.

And part of scaling has been to gradually decrease

the thickness of that gate insulator

so that you can use a roughly similar voltage

and still have the same current voltage characteristics.
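
To see roughly why the insulator has to thin as the device shrinks: in the textbook long-channel picture, the gate acts as a parallel-plate capacitor and the drive current scales with the oxide capacitance per unit area,

\[ C_{ox}' = \frac{\varepsilon_{ox}}{t_{ox}}, \qquad I_D \propto \frac{W}{L}\, C_{ox}' \,(V_{GS} - V_{th})^2, \]

so as the channel width W and length L shrink together, keeping a similar supply voltage and similar current-voltage behavior means raising C_{ox}', which means thinning t_{ox}. This is a simplified scaling argument, not the full short-channel story.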

So the material that’s used to do that,

or I should say was initially used to do that

was a silicon dioxide,

which just naturally grows on the silicon surface.

So you expose silicon to the atmosphere that we breathe

and well, if you’re manufacturing,

you’re gonna purify these gases,

but nevertheless,

that what’s called a native oxide will grow there.

There are essentially no other materials

on the entire periodic table

that have as good of a gate insulator

as that silicon dioxide.

And that has to do with nothing but the physics

of the interaction between silicon and oxygen.

And if it wasn’t that way,

transistors could not perform

in nearly the degree of capability that they have.

And that has to do with the way that the oxide grows,

the reduced density of defects there,

it’s insulation, meaning essentially it’s energy gaps.

You can apply a very large voltage there

without having current leak through it.

So that’s physics right there.

There are other things too.

Silicon is a semiconductor in an elemental sense.

You only need silicon atoms.

A lot of other semiconductors,

you need two different kinds of atoms,

like an element from group three

and an element from group five.

That opens you up to lots of defects that can occur

where one atom’s not sitting quite at the lattice site,

it is and it’s switched with another one

that degrades performance.

But then also on the side that you mentioned

with the manufacturing,

we have access to light sources

that can produce these very short wavelengths of light.

How does photolithography occur?

Well, you actually put this polymer on top of your wafer

and you expose it to light,

and then you use an aqueous chemical process

to dissolve away the regions that were exposed to light

and leave the regions that were not.

And we are blessed with these polymers

that have the right property

where they can cause scission events

where the polymer splits where a photon hits.

I mean, maybe that’s not too surprising,

but I don’t know, it all comes together

to have this really complex,

manufacturable ecosystem

where very sophisticated technologies can be devised

and it works quite well.

And amazingly, like you said,

with a wavelength at like 100 nanometers

or something like that,

you’re still able to achieve on this polymer

precision of whatever we said, seven nanometers.

I think I’ve heard like four nanometers

being talked about, something like that.

If we could just pause on this

and we’ll return to superconductivity,

but in this whole journey from a history perspective,

what do you think is the most beautiful

at the intersection of engineering and physics

to you in this whole process

that we talked about with silicon and photolithography,

things that people were able to achieve

in order to push Moore’s law forward?

Is it the early days,

the invention of the transistor itself?

Is it some particular cool little thing

that maybe not many people know about?

Like, what do you think is most beautiful

in this whole process, journey?

The most beautiful is a little difficult to answer.

Let me try and sidestep it a little bit

and just say what strikes me about looking

at the history of silicon microelectronics is that,

so when quantum mechanics was developed,

people quickly began applying it to semiconductors

and it was broadly understood

that these are fascinating systems

and people cared about them for their basic physics,

but also their utility as devices.

And then the transistor was invented in the late forties

in a relatively crude experimental setup

where you just crammed a metal electrode

into the semiconductor and that was ingenious.

These people were able to make it work.

But so what I wanna get to that really strikes me

is that in those early days,

there were a number of different semiconductors

that were being considered.

They had different properties, different strengths,

different weaknesses.

Most people thought germanium was the way to go.

It had some nice properties related to things

about how the electrons move inside the lattice.

But other people thought that compound semiconductors

with group three and group five also had

really, really extraordinary properties

that might be conducive to making the best devices.

So there were different groups exploring each of these

and that’s great, that’s how science works.

You have to cast a broad net.

But then what I find striking is why is it that silicon won?

Because it’s not that germanium is a useless material

and it’s not present in technology

or compound semiconductors.

They’re both doing exciting and important things,

slightly more niche applications

whereas silicon is the semiconductor material

for microelectronics which is the platform

for digital computing which has transformed our world.

Why did silicon win?

It’s because of a remarkable assemblage of qualities

that no one of them was the clear winner

but it made these sort of compromises

between a number of different influences.

It had that really excellent gate oxide

that allowed us to make MOSFETs,

these high performance transistors,

so quickly and cheaply and easily

without having to do a lot of materials development.

The band gap of silicon is actually,

so in a semiconductor there’s an important parameter

which is called the band gap

which tells you there are sort of electrons

that fill up to one level in the energy diagram

and then there’s a gap where electrons aren’t allowed

to have an energy in a certain range

and then there’s another energy level above that.

And that difference between the lower sort of filled level

and the unoccupied level,

that tells you how much voltage you have to apply

in order to induce a current to flow.

So with germanium, that’s about 0.75 electron volts.

That means you have to apply 0.75 volts

to get a current moving.

And it turns out that if you compare that

to the thermal excitations that are induced

just by the temperature of our environment,

that gap’s not quite big enough.

You start to use it to perform computations,

it gets a little hot and you get all these accidental

carriers that are excited into the conduction band

and it causes errors in your computation.

Silicon’s band gap is just a little higher,

1.1 electron volts,

but you have an exponential dependence

on the number of carriers that are present

that can induce those errors.

It decays exponentially with that voltage.

So just that slight extra energy in that band gap

really puts it in an ideal position to be operated

in the conditions of our ambient environment.
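
To put a rough number on that exponential sensitivity: the density of thermally excited carriers goes roughly as

\[ n_i \propto e^{-E_g / 2 k_B T}, \]

so at room temperature, where 2 k_B T is about 0.05 eV, going from a gap of roughly 0.75 eV to 1.1 eV suppresses the unwanted thermal carriers by a factor on the order of e^{0.35/0.05}, roughly a thousand, ignoring material-dependent prefactors. This is a back-of-the-envelope illustration, not a precise device calculation.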

It’s kind of fascinating that, like you mentioned,

errors decrease exponentially with the voltage.

So it’s funny because this error thing comes up

when you start talking about quantum computing.

And it’s kind of amazing that everything

we’ve been talking about, the errors,

as we scale down, seems to be extremely low.

Yes.

And like all of our computation is based

on the assumption that it’s extremely low.

Yes, well it’s digital computation.

Digital, sorry, digital computation.

So as opposed to our biological computation in our brain,

is like the assumption is stuff is gonna fail

all over the place and we somehow

have to still be robust to that.

That’s exactly right.

So this also, this is gonna be the most controversial part

of our conversation where you’re gonna make some enemies.

So let me ask,

because we’ve been talking about physics and engineering.

Which group of people is smarter

and more important for this one?

Let me ask the question in a better way.

Some of the big innovations,

some of the beautiful things that we’ve been talking about,

how much of it is physics?

How much of it is engineering?

My dad is a physicist and he talks down

to all the amazing engineering that we’re doing

in the artificial intelligence and the computer science

and the robotics and all that space.

So we argue about this all the time.

So what do you think?

Who gets more credit?

I’m genuinely not trying to just be politically correct here.

I don’t see how you would have any of the,

what we consider sort of the great accomplishments

of society without both.

You absolutely need both of those things.

Physics tends to play a key role earlier in the development

and then engineering optimization, these things take over.

And I mean, the invention of the transistor

or actually even before that,

the understanding of semiconductor physics

that allowed the invention of the transistor,

that’s all physics.

So if you didn’t have that physics,

you don’t even get to get on the field.

But once you have understood and demonstrated

that this is in principle possible,

more so as engineering.

Why we have computers more powerful

than old supercomputers in each of our phones,

that’s all engineering.

And I think I would be quite foolish to say

that that’s not valuable, that’s not a great contribution.

It’s a beautiful dance.

Would you put, like, silicon,

the understanding of the material properties

in the space of engineering?

Like how does that whole process work?

To understand that it has all these nice properties

or even the development of photolithography,

is that basically,

would you put that in a category of engineering?

No, I would say that it is basic physics,

it is applied physics, it’s material science,

it’s X ray crystallography, it’s polymer chemistry,

it’s everything.

Chemistry even is thrown in there?

Absolutely, yes, absolutely.

Just no biology.

We can get to biology.

Or the biologies and the humans

that are engineering the system,

so it’s all integrated deeply.

Okay, so let’s return,

you mentioned this word superconductivity.

So what does that have to do with what we’re talking about?

Right, okay, so in a semiconductor,

as I tried to describe a second ago,

you can sort of induce currents by applying voltages

and those have sort of typical properties

that you would expect from some kind of a conductor.

Those electrons, they don’t just flow

perfectly without dissipation.

If an electron collides with an imperfection in the lattice

or another electron, it’s gonna slow down,

it’s gonna lose its momentum.

So you have to keep applying that voltage

in order to keep the current flowing.

In a superconductor, something different happens.

If you get a current to start flowing,

it will continue to flow indefinitely.

There’s no dissipation.

So that’s crazy.

How does that happen?

Well, it happens at low temperature and this is crucial.

It has to be a quite low temperature

and what I’m talking about there,

for essentially all of our conversation,

I’m gonna be talking about conventional superconductors,

sometimes called low TC superconductors,

low critical temperature superconductors.

And so those materials have to be at a temperature around,

say around four Kelvin.

I mean, their critical temperature might be 10 Kelvin,

something like that,

but you wanna operate them at around four Kelvin,

four degrees above absolute zero.

And what happens at that temperature,

at very low temperatures in certain materials

is that the noise of atoms moving around,

the lattice vibrating, electrons colliding with each other,

that becomes sufficiently low

that the electrons can settle into this very special state.

It’s sometimes referred to as a macroscopic quantum state

because if I had a piece of superconducting material here,

let’s say niobium is a very typical superconductor.

If I had a block of niobium here

and we cooled it below its critical temperature,

all of the electrons in that superconducting state

would be in one coherent quantum state.

The wave function of that state is described

in terms of all of the particles simultaneously,

but it extends across macroscopic dimensions,

the size of whatever block of that material

I have sitting here.

And the way this occurs is that,

let’s try to be a little bit light on the technical details,

but essentially the electrons coordinate with each other.

They are able to, in this macroscopic quantum state,

they’re able to sort of,

one can quickly take the place of the other.

You can’t tell electrons apart.

They’re what’s known as identical particles.

So if this electron runs into a defect

that would otherwise cause it to scatter,

it can just sort of almost miraculously avoid that defect

because it’s not really in that location.

It’s part of a macroscopic quantum state

and the entire quantum state

was not scattered by that defect.

So you can get a current that flows without dissipation

and that’s called a supercurrent.

That’s sort of just very much scratching the surface

of superconductivity.

There’s very deep and rich physics there,

just probably not the main subject

we need to go into right now.

But it turns out that when you have this material,

you can do usual things like make wires out of it

so you can get current to flow in a straight line on a chip,

but you can also make other devices

that perform different kinds of operations.

Some of them are kind of logic operations

like you’d get in a transistor.

The most common or the most,

I would say, diverse in its utility component

is a Josephson junction.

It’s not analogous to a transistor

in the sense that if you apply a voltage here,

it changes how much current flows from left to right,

but it is analogous in sort of a sense

of it’s the go to component

that a circuit engineer is going to use

to start to build up more complexity.

So these junctions serve as gates.

They can serve as gates.

So I’m not sure how concerned to be with semantics,

but let me just briefly say what a Josephson junction is

and we can talk about different ways that they can be used.

Basically, if you have a superconducting wire

and then a small gap of a different material

that’s not superconducting, an insulator or normal metal,

and then another superconducting wire on the other side,

that’s a Josephson junction.

So it’s sometimes referred to

as a superconducting weak link.

So you have this superconducting state on one side

and on the other side, and the superconducting wave function

actually tunnels across that gap.

And when you create such a physical entity,

it has very unusual current voltage characteristics.

In that gap, like weird stuff happens.

Through the entire circuit.

So you can imagine, suppose you had a loop set up

that had one of those weak links in the loop.

Current would flow in that loop independent,

even if you hadn’t applied a voltage to it,

and that’s called the Josephson effect.

So the fact that there’s this phase difference

in the quantum wave function from one side

of the tunneling barrier to the other

induces current to flow.

So how do you change state?

Right, exactly.

So how do you change state?

Now picture if I have a current bias coming down

this line of my circuit and there’s a Josephson junction

right in the middle of it.

And now I make another wire

that goes around the Josephson junction.

So I have a loop here, a superconducting loop.

I can add current to that loop by exceeding

the critical current of that Josephson junction.

So like any superconducting material,

it can carry this supercurrent that I’ve described,

this current that can propagate without dissipation

up to a certain level.

And if you try and pass more current than that

through the material, it’s going to become

a resistive material, normal material.

So in the Josephson junction, the same thing happens.

I can bias it above its critical current.

And then what it’s going to do,

it’s going to add a quantized amount of current

into that loop.

And what I mean by quantized is it’s going to come

in discrete packets with a well defined value of current.

So in the vernacular of some people working

in this community, you would say you pop a fluxon

into the loop.

So a fluxon.

You pop a fluxon into the loop.

Yeah, so a fluxon.

Sounds like skateboarder talk, I love it.

Okay, sorry, go ahead.

A fluxon is one of these quantized sort of amounts

of current that you can add to a loop.

And this is a cartoon picture,

but I think it’s sufficient for our purposes.

So which, maybe it’s useful to say,

what is the speed at which these discrete packets

of current travel?

Because we’ll be talking about light a little bit.

It seems like the speed is important.

The speed is important, that’s an excellent question.

Sometimes I wonder how you became so astute.

But so this.

Matrix 4 is coming out, so maybe that’s related.

I’m not sure.

I’m dressed for the job.

I was trying to get to become an extra on Matrix 4,

didn’t work out.

Anyway, so what’s the speed of these packets?

You’ll have to find another gig.

I know, I’m sorry.

So the speed of these packets, these fluxons,

these sort of pulses of current

that are generated by Josephson junctions,

they can actually propagate very close

to the speed of light,

maybe something like a third of the speed of light.

That’s quite fast.

So one of the reasons why Joseph’s injunctions are appealing

is because their signals can propagate quite fast

and they can also switch very fast.

What I mean by switch is perform that operation

that I described where you add current to the loop.

That can happen within a few tens of picoseconds.

So you can get devices that operate

in the hundreds of gigahertz range.

And by comparison, most processors

in our conventional computers operate closer

to the one gigahertz range, maybe three gigahertz

seems to be kind of where those speeds have leveled out.
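
The speed contrast is essentially the reciprocal of the switching time. As a rough illustration,

\[ f \sim \frac{1}{\tau}: \qquad \tau \approx 10\,\mathrm{ps} \Rightarrow f \approx 100\,\mathrm{GHz}, \qquad \tau \approx 300\,\mathrm{ps} \Rightarrow f \approx 3\,\mathrm{GHz}, \]

though real clock rates depend on much more than a single device’s switching time.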

The gamers listening to this are getting really excited

to overclock their system to like, what is it?

Like four gigahertz or something,

a hundred sounds incredible.

Can I just as a tiny tangent,

is the physics of this understood well

how to do this stably?

Oh yes, the physics is understood well.

The physics of Joseph’s injunctions is understood well.

The technology is understood quite well too.

The reasons why it hasn’t displaced

silicon microelectronics in conventional digital computing

I think are more related to what I was alluding to before

about the myriad practical, almost mundane aspects

of silicon that make it so useful.

You can make a transistor ever smaller and smaller

and it will still perform its digital function quite well.

The same is not true of a Joseph’s injunction.

You really, they don’t, they just,

it’s not the same thing that there’s this feature

that you can keep making smaller and smaller

and it’ll keep performing the same operations.

This loop I described, any Joseph’s in circuit,

well, I wanna be careful, I shouldn’t say

any Joseph’s in circuit, but many Joseph’s in circuits,

the way they process information

or the way they perform whatever function it is

they’re trying to do,

maybe it’s sensing a weak magnetic field,

it depends on an interplay between the junction

and that loop.

And you can’t make that loop much smaller.

And it’s not for practical reasons

that have to do with lithography.

It’s for fundamental physical reasons

about the way the magnetic field interacts

with that superconducting material.

There are physical limits that no matter how good

our technology got, those circuits would,

I think would never be able to be scaled down

to the densities that silicon microelectronics can.

I don’t know if we mentioned,

is there something interesting

about the various superconducting materials involved

or is it all?

There’s a lot of stuff that’s interesting.

And it’s not silicon.

It’s not silicon, no.

So like it’s some materials that also required

to be super cold, four Kelvin and so on.

So let’s dissect a couple of those different things.

The super cold part,

let me just mention for your gamers out there

that are trying to clock it at four gigahertz

and would love to go to 400.

What kind of cooling system can achieve four Kelvin?

Four Kelvin, you need liquid helium.

And so liquid helium is expensive.

It’s inconvenient.

You need a cryostat that sits there

and the energy consumption of that cryostat

is impractical for consumer use; it’s not going in your cell phone.

So you can picture holding your cell phone like this

and then something the size of a keg of beer or something

on your back to cool it.

Like that makes no sense.

So if you’re trying to make this in consumer devices,

electronics that are ubiquitous across society,

superconductors are not in the race for that.

For now, but you’re saying,

so just to frame the conversation,

maybe the thing we’re focused on

is computing systems that serve as servers, like large.

Yes, large systems.

So then you can contrast what’s going on in your cell phone

with what’s going on at one of the supercomputers.

A colleague, Katie Schuman, invited us out to Oak Ridge

a few years ago, so we got to see Titan

and that was when they were building Summit.

So these are some high performance supercomputers

out in Tennessee and those are filling entire rooms

the size of warehouses.

So once you’re at that level, okay,

there you’re already putting a lot of power into cooling.

Cooling is part of your engineering task

that you have to deal with.

So there it’s not entirely obvious

that cooling to four Kelvin is out of the question.

It has not happened yet and I can speak to why that is

in the digital domain if you’re interested.

I think it’s not going to happen.

I don’t think superconductors are gonna replace

semiconductors for digital computation.

There are a lot of reasons for that,

but I think ultimately what it comes down to

is all things considered cooling errors,

scaling down to feature sizes, all that stuff,

semiconductors work better at the system level.

Is there some aspect of just curious

about the historical momentum of this?

Is there some power to the momentum of an industry

that’s mass manufacturing using a certain material?

Is this like a Titanic shifting?

Like what’s your sense when a good idea comes along,

how good does that idea need to be

for the Titanic to start shifting?

That’s an excellent question.

That’s an excellent way to frame it.

And you know, I don’t know the answer to that,

but what I think is, okay,

so the history of the superconducting logic

goes back to the 70s.

IBM made a big push to do

superconducting digital computing in the 70s.

And they made some choices about their devices

and their architectures and things that in hindsight,

were kind of doomed to fail.

And I don’t mean any disrespect for the people that did it,

it was hard to see at the time.

But then another generation of superconducting logic

was introduced, I wanna say the 90s,

two researchers named Likharev and Semenov,

they proposed an entire family of circuits

based on Joseph’s injunctions

that are doing digital computing based on logic gates

and or not these kinds of things.

And they showed how it could go hundreds of times faster

than silicon microelectronics.

And it’s extremely exciting.

I wasn’t working in the field at that time,

but later when I went back and read the literature,

I was just like, wow, this is so awesome.

And so you might think, well,

the reason why it didn’t display silicon

is because silicon already had so much momentum

at that time.

But that was the 90s.

Silicon kept that momentum

because it had the simple way to keep getting better.

You just make features smaller and smaller.

So it would have to be,

I don’t think it would have to be that much better

than silicon to displace it.

But the problem is it’s just not better than silicon.

It might be better than silicon in one metric,

speed of a switching operation

or power consumption of a switching operation.

But building a digital computer is a lot more

than just that elemental operation.

It’s everything that goes into it,

including the manufacturing, including the packaging,

including the various materials aspects of things.

So the reason why,

and even in some of those early papers,

I can’t remember which one it was,

Likharev said something along the lines of,

you can see how we could build an entire family

of digital electronic circuits based on these components.

They could go a hundred or more times faster

than semiconductor logic gates.

But I don’t think that’s the right way

to use superconducting electronic circuits.

He didn’t say what the right way was,

but he basically said digital logic,

trying to steal the show from silicon

is probably not what these circuits

are most suited to accomplish.

So if we can just linger and use the word computation.

When you talk about computation, how do you think about it?

Do you think purely on just the switching,

or do you think something a little bit larger scale,

a circuit taken together,

performing the basic arithmetic operations

that are then required to do the kind of computation

that makes up a computer?

Because when we talk about the speed of computation,

is it boiled down to the basic switching,

or is there some bigger picture

that you’re thinking about?

Well, all right, so maybe we should disambiguate.

There are a variety of different kinds of computation.

I don’t pretend to be an expert

in the theory of computation or anything like that.

I guess it’s important to differentiate though

between digital logic,

which represents information as a series of bits,

binary digits, which you can think of them

as zeros and ones or whatever.

Usually they correspond to a physical system

that has two very well separated states.

And then other kinds of computation,

like we’ll get into more the way your brain works,

which it is, I think,

indisputably processing information,

but where the computation begins and ends

is not anywhere near as well defined.

It doesn’t depend on these two levels.

Here’s a zero, here’s a one.

There’s a lot of gray area

that’s usually referred to as analog computing.

Also in conventional digital computers

or digital computers in general,

you have a concept of what’s called arithmetic depth,

which is jargon that basically means

how many sequential operations are performed

to turn an input into an output.

And those kinds of computations in digital systems

are highly serial, meaning that data streams,

they don’t branch off too far to the side.

You do, you have to pull some information over there

and access memory from here and stuff like that.

But by and large, the computation proceeds

in a serial manner.

It’s not that way in the brain.

In the brain, you’re always drawing information

from different places.

It’s much more network based computing.

Neurons don’t wait for their turn.

They fire when they’re ready to fire.

And so it’s asynchronous.

So one of the other things about a digital system

is you’re performing these operations on a clock.

And that’s a crucial aspect of it.

Get rid of a clock in a digital system,

nothing makes sense anymore.

The brain has no clock.

It builds its own timescales based on its internal activity.

So you can think of the brain as kind of like this,

like network computation,

where it’s actually really trivial, simple computers,

just a huge number of them and they’re networked.

I would say it is complex, sophisticated little processors

and there’s a huge number of them.

Neurons are not, are not simple.

I don’t mean to offend neurons.

They’re very complicated and beautiful and yeah,

but we often oversimplify them.

Yes, they’re actually like there’s computation happening

within a neuron.

Right, so I would say to think of a transistor

as the building block of a digital computer is accurate.

You use a few transistors to make your logic gates.

You build up more, you build up processors

from logic gates and things like that.

So you can think of a transistor

as a fundamental building block,

or you can think of,

as we get into more highly parallelized architectures,

you can think of a processor

as a fundamental building block.

To make the analogy to the neuro side of things,

a neuron is not a transistor.

A neuron is a processor.

It has synapses, even synapses are not transistors,

but they are more,

they’re lower on the information processing hierarchy

in a sense.

They do a bulk of the computation,

but neurons are entire processors in and of themselves

that can take in many different kinds of inputs

on many different spatial and temporal scales

and produce many different kinds of outputs

so that they can perform different computations

in different contexts.

So this is where enters this distinction

between computation and communication.

So you can think of neurons performing computation

and the inter, the networking,

the interconnectivity of neurons

is communication between neurons.

And you see this with very large server systems.

I’ve been, I mentioned offline,

we’ve been talking to Jim Keller,

whose dream is to build giant computers

that, you know, the bottom like there

is often the communication

between the different pieces of computing.

So in this paper that we mentioned,

Optoelectronic Intelligence,

you say electrons excel at computation

while light is excellent for communication.

Maybe you can linger and say in this context,

what do you mean by computation and communication?

What are electrons, what is light

and why do they excel at those two tasks?

Yeah, just to first speak to computation

versus communication,

I would say computation is essentially taking in

some information, performing operations

on that information and producing new,

hopefully more useful information.

So for example, imagine you have a picture in front of you

and there is a key in it

and that’s what you’re looking for,

for whatever reason, you wanna find the key,

we all wanna find the key.

So the input is that entire picture

and the output might be the coordinates where the key is.

So you’ve reduced the total amount of information you have

but you found the useful information

for you in that present moment,

that’s the useful information.

And you think about this computation

as controlled, synchronous, and sequential?

Not necessarily, it could be,

that could be how your system is performing the computation

or it could be asynchronous,

there are lots of ways to find the key.

It depends on the nature of the data,

it depends on, that’s a very simplified example,

a picture with a key in it,

what about if you’re in the world

and you’re trying to decide the best way

to live your life?

It might be interactive,

it might be there might be some recurrence

or some weird asynchrony, I got it.

But there’s an input and there’s an output

and you do some stuff in the middle

that actually goes from the input to the output.

You’ve taken in information

and output different information,

hopefully reducing the total amount of information

and extracting what’s useful.

Communication is then getting that information

from the location at which it’s stored

because information is physical as Landauer emphasized

and so it is in one place

and you need to get that information to another place

so that something else can use it

for whatever computation it’s working on.

Maybe it’s part of the same network

and you’re all trying to solve the same problem

but neuron A over here just deduced something

based on its inputs

and it’s now sending that information across the network

to another location

so that would be the act of communication.

Can you linger on Landauer

and saying information is physical?

Rolf Landauer, not to be confused with Lev Landau.

Yeah, and he made huge contributions

to our understanding of the reversibility of information

and this concept that energy has to be dissipated

in computing when the computation is irreversible

but if you can manage to make it reversible

then you don’t need to expend energy

but if you do expend energy to perform a computation

there’s sort of a minimal amount that you have to do

and it’s KT log two.

And it’s all somehow related

to the second law of thermodynamics

and that the universe is an information process

and then we’re living in a simulation.

So okay, sorry, sorry for that tangent.

So that’s the defining the distinction

between computation and communication.

Let me say one more thing just to clarify.

Communication ideally does not change the information.

It moves it from one place to another

but it is preserved.

Got it, okay.

All right, that’s beautiful.

So then the electron versus light distinction

and why are electrons good at computation

and light good at communication?

Yes, there’s a lot that goes into it I guess

but just try to speak to the simplest part of it.

Electrons interact strongly with one another.

They’re charged particles.

So if I pile a bunch of them over here

they’re feeling a certain amount of force

and they wanna move somewhere else.

They’re strongly interactive.

You can also get them to sit still.

You can, an electron has a mass

so you can cause it to be spatially localized.

So for computation that’s useful

because now I can make these little devices

that put a bunch of electrons over here

and then I change the state of a gate

like I’ve been describing,

put a different voltage on this gate

and now I move the electrons over here.

Now they’re sitting somewhere else.

I have a physical mechanism

with which I can represent information.

It’s spatially localized and I have knobs

that I can adjust to change where those electrons are

or what they’re doing.

Light by contrast, photons of light

which are the discrete packets of energy

that were identified by Einstein,

they do not interact with each other

especially at low light levels.

If you’re in a medium and you have a bright high light level

you can get them to interact with each other

through the interaction with that medium that they’re in

but that’s a little bit more exotic.

And for the purposes of this conversation

we can assume that photons don’t interact with each other.

So if you have a bunch of them

all propagating in the same direction

they don’t interfere with each other.

If I wanna send, if I have a communication channel

and I put one more photon on it,

it doesn’t screw up with those other ones.

It doesn’t change what those other ones were doing at all.

So that’s really useful for communication

because that means you can sort of allow

a lot of these photons to flow

without disruption of each other

and they can branch really easily and things like that.

But it’s not good for computation

because it’s very hard for this packet of light

to change what this packet of light is doing.

They pass right through each other.

So in computation you want to change information

and if photons don’t interact with each other

it’s difficult to get them to change the information

represented by the others.

So that’s the fundamental difference.

Is there also something about the way they travel

through different materials

or is that just a particular engineering?

No, it’s not, that’s deep physics I think.

So this gets back to electrons interact with each other

and photons don’t.

So say I’m trying to get a packet of information

from me to you and we have a wire going between us.

In order for me to send electrons across that wire

I first have to raise the voltage on my end of the wire

and that means putting a bunch of charges on it

and then that charge packet has to propagate along the wire

and it has to get all the way over to you.

That wire is gonna have something that’s called capacitance

which basically tells you how much charge

you need to put on the wire

in order to raise the voltage on it

and the capacitance is gonna be proportional

to the length of the wire.

So the longer the length of the wire is

the more charge I have to put on it

and the energy required to charge up that line

and move those electrons to you

is also proportional to the capacitance

and goes as the voltage squared.

So you get this huge penalty if you wanna send electrons

across a wire over appreciable distances.
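
As a rough sketch of that penalty, with assumed representative numbers: on-chip wiring has a capacitance on the order of 0.2 picofarads per millimeter, so

\[ E \approx \tfrac{1}{2} C V^2 \approx \tfrac{1}{2}\,(0.2\,\mathrm{pF})\,(1\,\mathrm{V})^2 \approx 0.1\,\mathrm{pJ} \]

per millimeter of wire per transition, and the cost multiplies with the length of each wire and with the number of wires being driven.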

So distance is an important thing here

when you’re doing communication.

Distance is an important thing.

So is the number of connections I’m trying to make.

Me to you, okay one, that’s not so bad.

If I want to now send it to 10,000 other friends

then all of those wires are adding tons

of extra capacitance.

Now not only does it take forever

to put the charge on that wire

and raise the voltage on all those lines

but it takes a ton of power

and the number 10,000 is not randomly chosen.

That’s roughly how many connections

each neuron in your brain makes.

So a neuron in your brain needs to send 10,000 messages

every time it has something to say.

You can’t do that if you’re trying to drive electrons

from here to 10,000 different places.

The brain does it in a slightly different way

which we can discuss.

How can light achieve the 10,000 connections

and why is it better?

In terms of like the energy use required

to use light for the communication of the 10,000 connections.

Right, right.

So now instead of trying to send electrons

from me to you, I’m trying to send photons.

So I can make what’s called a wave guide

which is just a simple piece of a material.

It could be glass like an optical fiber

or silicon on a chip.

And I just have to inject photons into that wave guide

and independent of how long it is,

independent of how many different connections I’m making,

it doesn’t change the voltage or anything like that

that I have to raise up on the wire.

So if I have one more connection,

if I add additional connections,

I need to add more light to the wave guide

because those photons need to split

and go to different paths.

That makes sense but I don’t have a capacitive penalty.

Sometimes these are called wiring parasitics.

There are no parasitics associated with light

in that same sense.
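
For a sense of scale, assuming near-infrared light around 1550 nanometers, a common choice in integrated photonics though not necessarily the exact platform here, a single photon carries

\[ E_\gamma = \frac{h c}{\lambda} \approx 1.3\times 10^{-19}\,\mathrm{J} \approx 0.8\,\mathrm{eV}, \]

so delivering one photon to each of 10,000 destinations costs on the order of a femtojoule of optical energy, before accounting for source and detector efficiencies.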

So this might be a dumb question

but how do I catch a photon on the other end?

Is it material?

Is it the polymer stuff you were talking about

for a different application for photolithography?

Like how do you catch a photon?

There’s a lot of ways to catch a photon.

It’s not a dumb question.

It’s a deep and important question

that basically defines a lot of the work

that goes on in our group at NIST.

One of my group leaders, Sae Woo Nam,

has built his career around

these superconducting single photon detectors.

So if you’re going to try to sort of reach a lower limit

and detect just one particle of light,

superconductors come back into our conversation

and just picture a simple device

where you have current flowing

through a superconducting wire and…

A loop again or no?

Let’s say yes, you have a loop.

So you have a superconducting wire

that goes straight down like this

and on your loop branch, you have a little ammeter,

something that measures current.

There’s a resistor up there too.

Go with me here.

So your current biasing this,

so there’s current flowing

through that superconducting branch.

Since there’s a resistor over here,

all the current goes through the superconducting branch.

Now a photon comes in, strikes that superconductor.

We talked about this superconducting

macroscopic quantum state.

That’s going to be destroyed by the energy of that photon.

So now that branch of the circuit is resistive too.

And you’ve properly designed your circuit

so that the resistance on that superconducting branch

is much greater than the other resistance.

Now all of your current’s going to go that way.

Your ammeter says, oh, I just got a pulse of current.

That must mean I detected a photon.

Then where you broke that superconductivity

in a matter of a few nanoseconds,

it cools back off, dissipates that energy

and the current flows back

through that superconducting branch.
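
A minimal sketch of that readout idea, with assumed component values; this is a cartoon current divider, not a real superconducting nanowire detector design:

```python
# Cartoon model of the photon-detection readout described above:
# a superconducting branch in parallel with a resistive readout branch,
# biased with a constant current. A photon momentarily makes the
# superconducting branch resistive, diverting the bias current into the
# readout branch and producing a measurable pulse.
# All component values are illustrative assumptions.

I_BIAS = 10e-6        # bias current, 10 microamps (assumed)
R_READOUT = 50.0      # readout branch resistance, ohms (assumed)
R_HOTSPOT = 5000.0    # resistance of the broken superconducting branch (assumed)

def readout_current(photon_absorbed: bool) -> float:
    """Current through the readout branch, from a simple current divider."""
    r_sc = R_HOTSPOT if photon_absorbed else 0.0  # superconductor: zero resistance until a photon hits
    if r_sc == 0.0:
        return 0.0                                # all current stays in the lossless branch
    return I_BIAS * r_sc / (r_sc + R_READOUT)     # nearly all current diverts to the readout

for event in (False, True, True, False):
    i = readout_current(event)
    print(f"photon={event}: readout current = {i*1e6:.1f} uA, "
          f"pulse amplitude = {i*R_READOUT*1e3:.2f} mV")
```

The design point mentioned above is visible in the numbers: because the hotspot resistance is much larger than the readout resistance, essentially all of the bias current diverts when a photon is absorbed.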

This is a very powerful superconducting device

that allows us to understand quantum states of light.

I didn’t realize a loop like that

could be sensitive to a single photon.

I mean, that seems strange to me because,

I mean, so what happens when you just barrage it

with photons?

If you put a bunch of photons in there,

essentially the same thing happens.

You just drive it into the normal state,

it becomes resistive and it’s not particularly interesting.

So you have to be careful how many photons you send.

Like you have to be very precise with your communication.

Well, it depends.

So I would say that that’s actually in the application

that we’re trying to use these detectors for.

That’s a feature because what we want is for,

if a neuron sends one photon to a synaptic connection

and one of these superconducting detectors is sitting there,

you get this pulse of current.

And that synapse says event,

then I’m gonna do what I do when there’s a synapse event,

I’m gonna perform computations, that kind of thing.

But if accidentally you send two there or three or five,

it does the exact same.

Got it.

And so this is how in the system that we’re devising here,

communication is entirely binary.

And that’s what I tried to emphasize a second ago.

Communication should not change the information.

You’re not saying, oh, I got this kind of communication

event for photons.

No, we’re not keeping track of that.

This neuron fired, this synapse says that neuron fired,

that’s it.

So that’s a noise filtering property of those detectors.

However, there are other applications

where you’d rather know the exact number of photons

that can be very useful in quantum computing with light.

And our group does a lot of work

around another kind of superconducting sensor

called a transition edge sensor that Adriana Lita

in our group does a lot of work on.

And that can tell you based on the amplitude

of the current pulse you divert exactly how many photons

were in that pulse.

What’s that useful for?

One way that you can encode information

in quantum states of light is in the number of photons.

You can have what are called number states

and a number state will have a well defined number

of photons and maybe the output of your quantum computation

encodes its information in the number of photons

that are generated.

So if you have a detector that is sensitive to that,

it’s extremely useful.

Can you achieve like a clock with photons

or is that not important?

Is there a synchronicity here?

In general, it can be important.

Clock distribution is a big challenge

in especially large computational systems.

And so yes, optical clocks, optical clock distribution

is a very powerful technology.

I don’t know the state of that field right now,

but I imagine that if you’re trying to distribute a clock

across any appreciable size computational system,

you wanna use light.

Yeah, I wonder how these giant systems work,

especially like supercomputers.

Do they need to do clock distribution

or are they doing more ad hoc parallel

like concurrent programming?

Like there’s some kind of locking mechanisms or something.

That’s a fascinating question,

but let’s zoom in at this very particular question

of computation on a processor

and communication between processors.

So what does this system look like

that you’re envisioning?

One of the places you’re envisioning it

is in the paper on optoelectronic intelligence.

So what are we talking about?

Are we talking about something

that starts to look a lot like the human brain

or does it still look a lot like a computer?

What are the size of this thing?

Is it going inside a smartphone or as you said,

does it go inside something that’s more like a house?

Like what should we be imagining?

What are you thinking about

when you’re thinking about these fundamental systems?

Let me introduce the word neuromorphic.

There’s this concept of neuromorphic computing

where what that broadly refers to

is computing based on the information processing principles

of the brain.

And as digital computing seems to be pushing

towards some fundamental performance limits,

people are considering architectural advances,

drawing inspiration from the brain,

more distributed parallel network kind of architectures

and stuff.

And so there’s this continuum of neuromorphic

from things that are pretty similar to digital computers,

but maybe there are more cores

and the way they send messages is a little bit more

like the way brain neurons send spikes.

But for the most part, it’s still digital electronics.

And then you have some things in between

where maybe you’re using transistors,

but now you’re starting to use them

instead of in a digital way, in an analog way.

And so you’re trying to get those circuits

to behave more like neurons.

And then that’s a little bit,

quite a bit more on the neuromorphic side of things.

You’re trying to get your circuits,

although they’re still based on silicon,

you’re trying to get them to perform operations

that are highly analogous to the operations in the brain.

And that’s where a great deal of work is

in neuromorphic computing,

people like Giacomo Indiveri and Gert Cauwenberghs,

Jennifer Hasler, countless others.

It’s a rich and exciting field going back to Carver Mead

in the late 1980s.

And then all the way on the other extreme of the continuum

is where you say, I’ll give up anything related

to transistors or semiconductors or anything like that.

I’m not starting with the assumption

that I’m gonna use any kind

of conventional computing hardware.

And instead, what I wanna do is try and understand

what makes the brain powerful

at the kind of information processing it does.

And I wanna think from first principles

about what hardware is best going to enable us

to capture those information processing principles

in an artificial system.

And that’s where I live.

That’s where I’m doing my exploration these days.

So what are the first principles

of brain like computation communication?

Right, yeah, this is so important

and I’m glad we booked 14 hours for this because.

I only have 13, I’m sorry.

Okay, so the brain is notoriously complicated.

And I think that’s an important part

of why it can do what it does.

But okay, let me try to break it down.

Starting with the devices, neurons, as I said before,

they’re sophisticated devices in and of themselves

and synapses are too.

They can change their state based on the activity.

So they adapt over time.

That’s crucial to the way the brain works.

They don’t just adapt on one timescale,

they can adapt on myriad timescales

from the spacing between pulses,

the spacing between spikes that come from neurons

all the way to the age of the organism.

Also relevant, perhaps I think the most important thing

that’s guided my thinking is the network structure

of the brain, so.

Which can also be adjusted on different scales.

Absolutely, yes, so you’re making new,

you’re changing the strength of contacts,

you’re changing the spatial distribution of them,

although spatial distribution doesn’t change that much

once you’re a mature organism.

But that network structure is really crucial.

So let me dwell on that for a second.

You can’t talk about the brain without emphasizing

that most of the neurons in the neocortex

or the prefrontal cortex, the part of the brain

that we think is most responsible for high level reasoning

and things like that,

those neurons make thousands of connections.

So you have this network that is highly interconnected.

And I think it’s safe to say that one of the primary reasons

that they make so many different connections

is that allows information to be communicated very rapidly

from any spot in the network

to any other spot in the network.

So that’s a sort of spatial aspect of it.

You can quantify this in terms of concepts

that are related to fractals and scale invariance,

which I think is a very beautiful concept.

So what I mean by that is kind of,

no matter what spatial scale you’re looking at in the brain

within certain bounds, you see the same

general statistical pattern.

So if I draw a box around some region of my cortex,

most of the connections that those neurons

within that box make are gonna be within the box

to each other in their local neighborhood.

And that’s sort of called clustering, loosely speaking.

But a non-negligible fraction

is gonna go outside of that box.

And then if I draw a bigger box,

the pattern is gonna be exactly the same.

So you have this scale invariance,

and you also have a non-vanishing probability

of a neuron making a connection very far away.

So suppose you wanna plot the probability

of a neuron making a connection as a function of distance.

If that were an exponential function,

it would go as e to the minus the radius

over some characteristic radius R zero.

Up to that characteristic radius,

the probability would be reasonably close to one,

and then beyond that characteristic length R zero,

it would drop off sharply.

And so that would mean that the neurons in your brain

are really localized, and that’s not what we observe.

Instead, what you see is that the probability

of making a longer distance connection, it does drop off,

but it drops off as a power law.

So the probability that you’re gonna have a connection

at some radius R goes as R to the minus some power.

And that’s what we see with forces in nature,

like the electromagnetic force

between two particles or gravity

goes as one over the radius squared.

So you can see this in fractals.
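
To make the contrast concrete, here is a small sketch (with arbitrary parameters, not fit to any neural data) comparing an exponential fall-off to a power-law fall-off in connection probability:

```python
# Illustrative comparison: how an exponential fall-off in connection
# probability differs from a power-law fall-off at long distances.
import math

r0 = 1.0       # characteristic radius (arbitrary units)
alpha = 2.0    # power-law exponent, e.g. ~2 as with inverse-square forces

def p_exponential(r):
    return math.exp(-r / r0)

def p_power_law(r):
    return min(1.0, (r / r0) ** (-alpha))   # capped at 1 inside r0

for r in [1, 2, 5, 10, 50]:
    print(f"r={r:3d}  exp: {p_exponential(r):.2e}   power law: {p_power_law(r):.2e}")
# The exponential is vanishingly small by r = 50, while the power law still
# leaves a non-negligible chance of a long-range connection.
```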

I love that there’s like a fractal dynamics of the brain

that if you zoom out, you draw the box

and you increase that box by certain step sizes,

you’re gonna see the same statistics.

I think that’s probably very important

to the way the brain processes information.

It’s not just in the spatial domain,

it’s also in the temporal domain.

And what I mean by that is…

That’s incredible that this emerged

through the evolutionary process

that potentially somehow connected

to the way the physics of the universe works.

Yeah, I couldn’t agree more that it’s a deep

and fascinating subject that I hope to be able

to spend the rest of my life studying.

You think you need to solve, understand this,

this fractal nature in order to understand intelligence

and communication. I do think so.

I think they’re deeply intertwined.

Yes, I think power laws are right at the heart of it.

So just to push that one through,

the same thing happens in the temporal domain.

So suppose your neurons in your brain

were always oscillating at the same frequency,

then the probability of finding a neuron oscillating

as a function of frequency

would be this narrowly peaked function

around that certain characteristic frequency.

That’s not at all what we see.

The probability of finding neurons oscillating

or producing spikes at a certain frequency

is again a power law,

which means there’s no defined scale

of the temporal activity in the brain.

At what speed do your thoughts occur?

Well, there’s a fastest speed they can occur

and that is limited by communication and other things,

but there’s not a characteristic scale.

We have thoughts on all temporal scales

from a few tens of milliseconds,

which is physiologically limited by our devices,

compare that to tens of picoseconds

that I talked about in superconductors,

all the way up to the lifetime of the organism.

You can still think about things

that happened to you when you were a kid.

Or if you wanna be really trippy

then across multiple organisms

in the entirety of human civilization,

you have thoughts that span organisms, right?

Yes, taking it to that level, yes.

If you’re willing to see the entirety of the human species

as a single organism with a collective intelligence

and that too on a spatial and temporal scale,

there’s thoughts occurring.

And then if you look at not just the human species,

but the entirety of life on earth

as an organism with thoughts that are occurring,

that are greater and greater sophisticated thoughts,

there’s a different spatial and temporal scale there.

This is getting very suspicious.

Well, hold on though, before we’re done,

I just wanna just tie the bow

and say that the spatial and temporal aspects

are intimately interrelated with each other.

So activity between neurons that are very close to each other

is more likely to happen on this faster timescale,

and as information propagates

and encompasses more of the brain,

more of your cortices, different modules in the brain

are gonna be engaged in information processing

on longer timescales.

So there’s this concept of information integration

where neurons are specialized.

Any given neuron or any cluster of neurons

has its specific purpose,

but they’re also very much integrated.

So you have neurons that specialize,

but share their information.

And so that happens through these fractal nested oscillations

that occur across spatial and temporal scales.

I think capturing those dynamics in hardware,

to me, that’s the goal of neuromorphic computing.

So does it need to look,

so first of all, that’s fascinating.

We stated some clear principles here.

Now, does it have to look like the brain

outside of those principles as well?

Like what other characteristics

have to look like the human brain?

Or can it be something very different?

Well, it depends on what you’re trying to use it for.

And so I think a lot of the community

asks that question a lot.

What are you gonna do with it?

And I completely get it.

I think that’s a very important question.

And it’s also sometimes not the most helpful question.

What if what you wanna do with it is study it?

What if you just wanna see,

what do you have to build into your hardware

in order to observe these dynamical principles?

And also, I ask myself that question every day

and I’m not sure I’m able to answer that.

So like, what are you gonna do

with this particular neuromorphic machine?

So suppose what we’re trying to do with it

is build something that thinks.

We’re not trying to get it to make us any money

or drive a car.

Maybe we’ll be able to do that, but that’s not our goal.

Our goal is to see if we can get the same types of behaviors

that we observe in our own brain.

And by behaviors in this sense,

what I mean the behaviors of the components,

the neurons, the network, that kind of stuff.

I think there’s another element that I didn’t really hit on

that you also have to build into this.

And those are architectural principles.

They have to do with the hierarchical modular construction

of the network.

And without getting too lost in jargon,

the main point that I think is relevant there,

let me try and illustrate it with a cartoon picture

of the architecture of the brain.

So in the brain, you have the cortex,

which is sort of this outer sheet.

It’s actually, it’s a layered structure.

You can, if you could take it out of your brain,

you could unroll it on the table

and it would be about the size of a pizza sitting there.

And that’s a module.

It does certain things.

It processes, as György Buzsáki would say,

the what of what’s going on around you.

But you have another really crucial module

that’s called the hippocampus.

And that network is structured entirely differently.

First of all, this cortex that I had described

has about 10 billion neurons in there.

So numbers matter here.

And they’re organized in that sort of power law distribution

where the probability of making a connection drops off

as a power law in space.

The hippocampus is another module that’s important

for understanding how, where you are,

when you are keeping track of your position

in space and time.

And that network is very much random.

So the probability of making a connection,

it almost doesn’t even drop off as a function of distance.

It’s the same probability that you’ll make it here

to over there, but there are only about 100 million neurons

there, so you can have that huge densely connected module

because it’s not so big.

And the neocortex or the cortex and the hippocampus,

they talk to each other constantly.

And that communication is largely facilitated

by what’s called the thalamus.

I’m not a neuroscientist here.

I’m trying to do my best to recite things.

Cartoon picture of the brain, I gotcha.

Yeah, something like that.

So this thalamus is coordinating the activity

between the neocortex and the hippocampus

and making sure that they talk to each other

at the right time and send messages

that will be useful to one another.

So this all taken together is called

the thalamocortical complex.

And it seems like building something like that

is going to be crucial to capturing the types of activity

we’re looking for because those responsibilities,

those separate modules, they do different things,

that’s gotta be central to achieving these states

of efficient information integration across space and time.

By the way, I am able to achieve this state

by watching simulations, visualizations

of the thalamocortical complex.

There’s a few people I forget from where.

They’ve created these incredible visual illustrations

of visual stimulation from the eye or something like that.

And this image flowing through the brain.

Wow, I haven’t seen that, I gotta check that out.

So it’s one of those things,

you find this stuff in the world,

and you see on YouTube, it has 1,000 views,

these visualizations of the human brain

processing information.

And because there’s chemistry there,

because this is from actual human brains,

I don’t know how they’re doing the coloring,

but they’re able to actually trace

the different, the chemical and the electrical signals

throughout the brain, and the visual thing,

it’s like, whoa, because it looks kinda like the universe,

I mean, the whole thing is just incredible.

I recommend it highly, I’ll probably post a link to it.

But you can just look for, one of the things they simulate

is the thalamocortical complex and just visualization.

You can find that yourself on YouTube, but it’s beautiful.

The other question I have for you is,

how does memory play into all of this?

Because all the signals sending back and forth,

that’s computation and communication,

but that’s kinda like processing of inputs and outputs,

to produce outputs in the system,

that’s kinda like maybe reasoning,

maybe there’s some kind of recurrence.

But is there a storage mechanism that you think about

in the context of neuromorphic computing?

Yeah, absolutely, so that’s gotta be central.

You have to have a way that you can store memories.

And there are a lot of different kinds

of memory in the brain.

That’s yet another example of how it’s not a simple system.

So there’s one kind of memory,

one way of talking about memory,

usually starts in the context of Hopfield networks.

You were lucky to talk to John Hopfield on this program.

But the basic idea there is working memory

is stored in the dynamical patterns

of activity between neurons.

And you can think of a certain pattern of activity

as an attractor, meaning if you put in some signal

that’s similar enough to other

previously experienced signals like that,

then you’re going to converge to the same network dynamics

and you will see these neurons

participate in the same network patterns of activity

that they have in the past.

So you can talk about the probability

that different inputs will allow you to converge

to different basins of attraction

and you might think of that as,

oh, I saw this face and then I excited

this network pattern of activity

because last time I saw that face,

I was at some movie and that’s a famous person

that’s on the screen or something like that.

So that’s one memory storage mechanism.
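
As a toy illustration of that attractor idea, and not a model of the hardware discussed here, a minimal Hopfield-style network in a few lines of Python stores one pattern in its weights and recovers it from a corrupted input:

```python
# Minimal Hopfield-style attractor sketch (illustrative only).
# A pattern is stored in the weights; a corrupted input converges back to it.
import numpy as np

rng = np.random.default_rng(0)
pattern = rng.choice([-1, 1], size=16)          # one stored "memory"
W = np.outer(pattern, pattern).astype(float)    # Hebbian outer-product weights
np.fill_diagonal(W, 0.0)

state = pattern.copy()
flip = rng.choice(16, size=4, replace=False)    # corrupt a few "neurons"
state[flip] *= -1

for _ in range(5):                              # synchronous updates
    state = np.sign(W @ state)
    state[state == 0] = 1

print(np.array_equal(state, pattern))           # True: fell into the attractor
```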

But crucial to the ability to imprint those memories

in your brain is the ability to change

the strength of connection between one neuron and another,

that synaptic connection between them.

So synaptic weight update is a massive field of neuroscience

and neuromorphic computing as well.

So there are two poles on that spectrum.

Okay, so more in the language of machine learning,

we would talk about supervised and unsupervised learning.

And when I’m trying to tie that down

to neuromorphic computing,

I will use a definition of supervised learning,

which basically means the external user,

the person who’s controlling this hardware

has some knob that they can tune

to change each of the synaptic weights,

depending on whether or not the network

is doing what you want it to do.

Whereas what I mean in this conversation

when I say unsupervised learning

is that those synaptic weights

are dynamically changing in your network

based on nothing that the user is doing,

nothing that there’s no wire from the outside

going into any of those synapses.

The network itself is reconfiguring those synaptic weights

based on physical properties

that you’ve built into the devices.

So if the synapse receives a pulse from here

and that causes the neuron to spike,

some circuit built in there with no help from me

or anybody else adjusts the weight

in a way that makes it more likely

to store the useful information

and excite the useful network patterns

and makes it less likely that random noise,

useless communication events

will have an important effect on the network activity.
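
Here is a cartoon of that kind of local, unsupervised weight update, with made-up parameter names and values; it only shows the logic of a weight that adjusts itself based on the activity it sees, not the behavior of the actual superconducting circuit:

```python
# Cartoon of an unsupervised, local weight update: no external "knob" touches
# the weight; it changes only based on activity the synapse itself sees.

def update_weight(w, presynaptic_pulse, postsynaptic_spike,
                  potentiation=0.05, decay=0.01, w_max=1.0):
    if presynaptic_pulse and postsynaptic_spike:
        w += potentiation * (w_max - w)   # input helped drive a spike: strengthen
    else:
        w -= decay * w                    # uncorrelated activity slowly fades
    return max(0.0, min(w_max, w))

w = 0.5
w = update_weight(w, presynaptic_pulse=True, postsynaptic_spike=True)
print(w)   # slightly larger than 0.5
```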

So there’s memory encoded in the weights,

the synaptic weights.

What about the formation of something

that’s not often done in machine learning,

the formation of new synaptic connections?

Right, well, that seems to,

so again, not a neuroscientist here,

but my reading of the literature

is that that’s particularly crucial

in early stages of brain development

where a newborn is born

with tons of extra synaptic connections

and they’re actually pruned over time.

So the number of synapses decreases

as opposed to growing new long distance connections.

It is possible in the brain to grow new neurons

and form new synaptic connections

but it doesn’t seem to be the primary mechanism

by which the brain is learning.

So for example, like right now,

sitting here talking to you,

you say lots of interesting things

and I learn from you

and I can remember things that you just said

and I didn’t grow new axonal connections

down to new synapses to enable those.

It’s plasticity mechanisms

in the synaptic connections between neurons

that enable me to learn on that timescale.

So at the very least,

you can sufficiently approximate that

with just weight updates.

You don’t need to form new connections.

I would say weight updates are a big part of it.

I also think there’s more

because broadly speaking,

when we’re doing machine learning,

our networks, say we’re talking about feed forward,

deep neural networks,

the temporal domain is not really part of it.

Okay, you’re gonna put in an image

and you’re gonna get out a classification

and you’re gonna do that as fast as possible.

So you care about time

but time is not part of the essence of this thing really.

Whereas in spiking neural networks,

what we see in the brain,

time is as crucial as space

and they’re intimately intertwined

as I’ve tried to say.

And so adaptation on different timescales

is important not just in memory formation,

although it plays a key role there,

but also in just keeping the activity

in a useful dynamic range.

So you have other plasticity mechanisms,

not just weight update,

or at least not on the timescale

of many action potentials,

but even on the shorter timescale.

So a synapse can become much less efficacious.

It can transmit a weaker signal

after the second, third, or fourth action potential

to occur in a sequence.

So that’s what’s called short term synaptic plasticity,

which is a form of learning.

You’re learning that I’m getting too much stimulus

from looking at something bright right now.

So I need to tone that down.

There’s also another really important mechanism

in learning that’s called metaplasticity.

What that seems to be is a way

that you change not the weights themselves,

but the rate at which the weights change.

So when I am in say a lecture hall and my,

this is a potentially terrible cartoon example,

but let’s say I’m in a lecture hall

and it’s time to learn, right?

So my brain will release more,

perhaps dopamine or some neuromodulator

that’s gonna change the rate

at which synaptic plasticity occurs.

So that can make me more sensitive

to learning at certain times,

more sensitive to overriding previous information

and less sensitive at other times.

And finally, as long as I’m rattling off the list,

I think another concept that falls in the category

of learning or memory adaptation is homeostasis

or homeostatic adaptation,

where neurons have the ability

to control their firing rate.

So if one neuron is just like blasting way too much,

it will naturally tone itself down.

Its threshold will adjust

so that it stays in a useful dynamical range.

And we see that that’s captured in deep neural networks

where you don’t just change the synaptic weights,

but you can also move the thresholds of simple neurons

in those models.
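
A minimal sketch of that homeostatic idea, with entirely arbitrary numbers, might look like a neuron nudging its own threshold toward a target firing rate:

```python
# Sketch of homeostatic adaptation: a neuron adjusts its own threshold so its
# firing rate drifts back toward a target. Values are placeholders.

target_rate = 5.0     # desired spikes per unit time
eta = 0.1             # adaptation rate

def adapt_threshold(threshold, observed_rate):
    # Firing too much -> raise threshold; too little -> lower it.
    return threshold + eta * (observed_rate - target_rate)

threshold = 1.0
for observed in [20.0, 12.0, 7.0, 5.5]:   # an overactive neuron calming down
    threshold = adapt_threshold(threshold, observed)
    print(round(threshold, 2))
```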

And so to achieve the spiking neural networks,

you want to use,

you want to implement the first principles

that you mentioned of the temporal

and the spatial fractal dynamics here.

So you can communicate locally,

you can communicate across much greater distances

and do the same thing in space

and do the same thing in time.

Now, you have like a chapter called

Superconducting Hardware for Neuromorphic Computing.

So what are some ideas that integrate

some of the things we’ve been talking about

in terms of the first principles of neuromorphic computing

and the ideas that you outline

in optoelectronic intelligence?

Yeah, so let me start, I guess,

on the communication side of things,

because that’s what led us down this track

in the first place.

By us, I’m talking about my team of colleagues at NIST,

Saeed Khan, Bryce Primavera, Sonia Buckley,

Jeff Chiles, Adam McCaughan,

Alex Tait, to name a few,

and our group leaders, Sae Woo Nam and Rich Mirin.

We’ve all contributed to this.

So this is not me saying necessarily

just the things that I’ve proposed,

but sort of where our team’s thinking

has evolved over the years.

Can I quickly ask, what is NIST

and where is this amazing group of people located?

NIST is the National Institute of Standards and Technology.

The larger facility is out in Gaithersburg, Maryland.

Our team is located in Boulder, Colorado.

NIST is a federal agency under the Department of Commerce.

We do a lot with standards,

and by we, I mean other people at NIST,

making sure that we understand the system of units,

the international system of units, precision measurements.

There’s a lot going on in electrical engineering,

material science.

And it’s historic.

I mean, it’s one of those, it’s like MIT

or something like that.

It has a reputation over many decades

of just being this really a place

where there’s a lot of brilliant people have done

a lot of amazing things.

But in terms of the people in your team,

in this team of people involved

in the concept we’re talking about now,

I’m just curious,

what kind of disciplines are we talking about?

What is it?

Mostly physicists and electrical engineers,

some material scientists,

but I would say,

yeah, I think physicists and electrical engineers,

my background is in photonics,

the use of light for technology.

So coming from there, I tend to have found colleagues

that are more from that background.

Although Adam McCaughan comes from

more of a superconducting electronics background,

we need a diversity of folks.

This project is sort of cross disciplinary.

I would love to be working more

with neuroscientists and things,

but we haven’t reached that scale yet.

But yeah.

You’re focused on the hardware side,

which requires all the disciplines that you mentioned.

And then of course,

neuroscientists may be a source of inspiration

for some of the longterm vision.

I would actually call it more than inspiration.

I would call it sort of a roadmap.

We’re not trying to build exactly the brain,

but I don’t think it’s enough to just say,

oh, neurons kind of work like that.

Let’s kind of do that thing.

I mean, we’re very much following the concepts

that the cognitive sciences have laid out for us,

which I believe is a really robust roadmap.

I mean, just on a little bit of a tangent,

it’s often stated that we just don’t understand the brain.

And so it’s really hard to replicate it

because we just don’t know what’s going on there.

And maybe five or seven years ago,

I would have said that,

but as I got more interested in the subject,

I read more of the neuroscience literature

and I was just taken by the exact opposite sense.

I can’t believe how much they know about this.

I can’t believe how mathematically rigorous

and sort of theoretically complete

a lot of the concepts are.

That’s not to say we understand consciousness

or we understand the self or anything like that,

but what is the brain doing

and why is it doing those things?

Neuroscientists have a lot of answers to those questions.

So if you’re a hardware designer

that just wants to get going,

whoa, it’s pretty clear which direction to go in, I think.

Okay, so I love the optimism behind that,

but in the implementation of these systems

that uses superconductivity, how do you make it happen?

So to me, it starts with thinking

about the communication network.

You know for sure that the ability of each neuron

to communicate to many thousands of colleagues

across the network is indispensable.

I take that as a core principle of my architecture,

my thinking on the subject.

So coming from a background in photonics,

it was very natural to say,

okay, we’re gonna use light for communication.

Just in case listeners may not know,

light is often used in communication.

I mean, if you think about radio, that’s light,

it’s long wavelengths, but it’s electromagnetic radiation.

It’s the same physical phenomenon

obeying exactly the same Maxwell’s equations.

And then all the way down to fiber, fiber optics.

Now you’re using visible

or near infrared wavelengths of light,

but the way you send messages across the ocean

is now contemporary over optical fibers.

So using light for communication is not a stretch.

It makes perfect sense.

So you might ask, well, why don’t you use light

for communication in a conventional microchip?

And the answer to that is, I believe, physical.

If we had a light source on a silicon chip

that was as simple as a transistor,

there would not be a processor in the world

that didn’t use light for communication,

at least above some distance.

How many light sources are needed?

Oh, you need a light source at every single point.

A light source per neuron.

Per neuron, per little,

but then if you could have a really small

and nice light source,

your definition of neuron could be flexible.

Could be, yes, yes.

Sometimes it’s helpful to me to say,

in this hardware, a neuron is that entity

which has a light source.

That, and I can explain.

And then there was light.

I mean, I can explain more about that, but.

Somehow this like rhymes with consciousness

because people will often say the light of consciousness.

So that consciousness is that which is conscious.

I got it.

That’s not my quote.

That’s me, that’s my quote.

You see, that quote comes from my background.

Yours is in optics, in light; mine’s in darkness.

So go ahead.

So the point I was making there is that

if it was easy to manufacture light sources

along with transistors on a silicon chip,

they would be everywhere.

And it’s not easy.

People have been trying for decades

and it’s actually extremely difficult.

I think an important part of our research

is dwelling right at that spot there.

So.

Is it physics or engineering?

It’s physics.

So, okay, so it’s physics, I think.

So what I mean by that is, as we discussed,

silicon is the material of choice for transistors

and it’s very difficult to imagine

that that’s gonna change anytime soon.

Silicon is notoriously bad at emitting light.

And that has to do with the immutable properties

of silicon itself.

The way that the energy bands are structured in silicon,

you’re never going to make silicon efficient

as a light source at room temperature

without doing very exotic things

that degrade its ability to interface nicely

with those transistors in the first place.

So that’s like one of these things where it’s,

why is nature dealing us that blow?

You give us these beautiful transistors

and you give us all the motivation

to use light for communication,

but then you don’t give us a light source.

So, well, okay, you do give us a light source.

Compound semiconductors,

like we talked about back at the beginning,

an element from group three and an element from group five

form an alloy where every other lattice site

switches which element it is.

Those have much better properties for generating light.

You put electrons in, light comes out.

Almost 100% of the electron-hole recombination events can produce photons;

it can be made efficient.

I’ll take your word for it, okay.

However, I say it’s physics, not engineering,

because it’s very difficult

to get those compound semiconductor light sources

situated with your silicon.

In order to do that ion implantation

that I talked about at the beginning,

high temperatures are required.

So you gotta make all of your transistors first

and then put the compound semiconductors on top of there.

You can’t grow them afterwards

because that requires high temperature.

It screws up all your transistors.

You try and stick them on there.

They don’t have the same lattice constant.

The spacing between atoms is different enough

that it just doesn’t work.

So nature does not seem to be telling us that,

hey, go ahead and combine light sources

with your digital switches

for conventional digital computing.

And conventional digital computing

will often require smaller scale, I guess,

in terms of like smartphone.

So in which kind of systems does nature hint

that we can use light and photons for communication?

Well, so let me just try and be clear.

You can use light for communication in digital systems,

just the light sources are not intimately integrated

with the silicon.

You manufacture all the silicon,

you have your microchip, plunk it down.

And then you manufacture your light sources,

separate chip, completely different process

made in a different foundry.

And then you put those together at the package level.

So now you have some,

I would say a great deal of architectural limitations

that are introduced by that sort of

package level integration

as opposed to monolithic on the same chip integration,

but it’s still a very useful thing to do.

And that’s where I had done some work previously

before I came to NIST.

There’s a project led by Vladimir Stojanović

that now spun out into a company called Ayar Labs

led by Mark Wade and Chen Sun

where they’re doing exactly that.

So you have your light source chip,

your silicon chip, whatever it may be doing,

maybe it’s digital electronics,

maybe it’s some other control purpose, something.

And the silicon chip drives the light source chip

and modulates the intensity of the lights.

You can get data out of the package on an optical fiber.

And that still gives you tremendous advantages in bandwidth

as opposed to sending those signals out

over electrical lines.

But it is somewhat peculiar to my eye

that they have to be integrated at this package level.

And those people, I mean, they’re so smart.

Those are my colleagues that I respect a great deal.

So it’s very clear that it’s not just

they’re making a bad choice.

This is what physics is telling us.

It just wouldn’t make any sense

to try to stick them together.

Yeah, so even if it’s difficult,

it’s easier than the alternative, unfortunately.

I think so, yes.

And again, I need to go back

and make sure that I’m not taking the wrong way.

I’m not saying that the pursuit

of integrating compound semiconductors with silicon

is fruitless and shouldn’t be pursued.

It should, and people are doing great work.

Kei May Lau and John Bowers, and others,

they’re doing it and they’re making progress.

But to my eye, it doesn’t look like that’s ever going to be

just the standard monolithic light source

on silicon process.

I just don’t see it.

Yeah, so nature kind of points the way usually.

And if you resist nature,

you’re gonna have to do a lot more work.

And it’s gonna be expensive and not scalable.

Got it.

But okay, so let’s go far into the future.

Let’s imagine this gigantic neuromorphic computing system

that simulates all of our realities.

Kind of like The Matrix, Matrix 4.

So this thing, this powerful computer,

how does it operate?

So what are the neurons?

What is the communication?

What’s your sense?

All right, so let me now,

after spending 45 minutes trashing

light source integration with silicon,

let me now say why I’m basing my entire life,

professional life, on integrating light sources

with electronics.

I think the game is completely different

when you’re talking about superconducting electronics.

For several reasons, let me try to go through them.

One is that, as I mentioned,

it’s difficult to integrate

those compound semiconductor light sources with silicon.

That with silicon requirement is introduced

by the fact that you’re using semiconducting electronics.

In superconducting electronics,

you’re still gonna start with a silicon wafer,

but it’s just the bread for your sandwich in a lot of ways.

You’re not using that silicon

in precisely the same way for the electronics.

You’re now depositing superconducting materials

on top of that.

The prospects for integrating light sources

with that kind of an electronic process

are certainly less explored,

but I think much more promising

because you don’t need those light sources

to be intimately integrated with the transistors.

That’s where the problems come up.

They don’t need to be lattice matched to the silicon,

all that kind of stuff.

Instead, it seems possible

that you can take those compound semiconductor light sources,

stick them on the silicon wafer,

and then grow your superconducting electronics

on the top of that.

It’s at least not obviously going to fail.

So the computation would be done

on the superconductive material as well?

Yes, the computation is done

in the superconducting electronics,

and the light sources receive signals

that say, hey, a neuron reached threshold,

produce a pulse of light,

send it out to all your downstream synaptic connections.

Those are, again, superconducting electronics.

Perform your computation,

and you’re off to the races.

Your network works.

So then if we can rewind real quick,

so what are the limitations of the challenges

of superconducting electronics

when we think about constructing these kinds of systems?

So actually, let me say one other thing

about the light sources,

and then I’ll move on, I promise,

because this is probably tedious for some.

This is super exciting.

Okay, one other thing about the light sources.

I said that silicon is terrible at emitting photons.

It’s just not what it’s meant to do.

However, the game is different

when you’re at low temperature.

If you’re working with superconductors,

you have to be at low temperature

because they don’t work otherwise.

When you’re at four Kelvin,

silicon is not obviously a terrible light source.

It’s still not as efficient as compound semiconductors,

but it might be good enough for this application.

The final thing that I’ll mention about that is, again,

leveraging superconductors, as I said,

in a different context,

superconducting detectors can receive one single photon.

In that conversation, I failed to mention

that semiconductors can also receive photons.

That’s the primary mechanism by which it’s done.

A camera in your phone that’s receptive to visible light

is receiving photons.

It’s based on silicon,

or you can make it in different semiconductors

for different wavelengths,

but it requires on the order of a thousand,

a few thousand photons to receive a pulse.

Now, when you’re using a superconducting detector,

you need one photon, exactly one.

I mean, one or more.

So the fact that your synapses can now be based

on superconducting detectors

instead of semiconducting detectors

brings the light levels that are required

down by some three orders of magnitude.

So now you don’t need good light sources.

You can have the world’s worst light sources.

As long as they spit out maybe a few thousand photons

every time a neuron fires,

you have the hardware principles in place

that you might be able to perform

this optoelectronic integration.

To me optoelectronic integration is, it’s just so enticing.

We want to be able to leverage electronics for computation,

light for communication,

working with silicon microelectronics at room temperature

that has been exceedingly difficult.

And I hope that when we move to the superconducting domain,

target a different application space

that is neuromorphic instead of digital

and use superconducting detectors,

maybe optoelectronic integration comes to us.

Okay, so there’s a bunch of questions.

So one is temperature.

So in these kinds of hybrid heterogeneous systems,

what’s the temperature?

What are some of the constraints to the operation here?

Does it all have to be a four Kelvin as well?

Four Kelvin.

Everything has to be at four Kelvin.

Okay, so what are the other engineering challenges

of making this kind of optoelectronic systems?

Let me just dwell on that four Kelvin for a second

because some people hear four Kelvin

and they just get up and leave.

They just say, I’m not doing it, you know?

And to me, that’s very earth centric, species centric.

We live in 300 Kelvin.

So we want our technologies to operate there too.

I totally get it.

Yeah, what’s zero Celsius?

Zero Celsius is 273 Kelvin.

So we’re talking very, very cold here.

This is…

Not even Boston cold.

No.

This is real cold.

Yeah.

Siberia cold, no.

Okay, so just for reference,

the temperature of the cosmic microwave background

is about 2.7 Kelvin.

So we’re still warmer than deep space.

Yeah, good.

So that when the universe dies out,

it’ll be colder than four K.

It’s already colder than four K.

In the expanses, you know,

you don’t have to get that far away from the earth

in order to drop down to not far from four Kelvin.

So what you’re saying is the aliens that live at the edge

of the observable universe

are using superconductive material for their computation.

They don’t have to live at the edge of the universe.

The aliens that are more advanced than us

in their solar system are doing this

in their asteroid belt.

We can get to that.

Oh, because they can get that

to that temperature easier there?

Sure, yeah.

All you have to do is reflect the sunlight away

and you have a huge headstart.

Oh, so the sun is the problem here.

Like it’s warm here on earth.

Got it. Yeah.

Okay, so can you…

So how do we get to four K?

What’s…

Well, okay, so what I want to say about temperature…

Yeah.

What I want to say about temperature is that

if you can swallow that,

if you can say, all right, I give up applications

that have to do with my cell phone

and the convenience of a laptop on a train

and you instead…

For me, I’m very much in the scientific head space.

I’m not looking at products.

I’m not looking at what this will be useful

to sell to consumers.

Instead, I’m thinking about scientific questions.

Well, it’s just not that bad to have to work at four Kelvin.

We do it all the time in our labs at NIST.

And so does…

I mean, for reference,

the entire quantum computing sector

usually has to work at something like 100 millikelvin,

50 millikelvin.

So now you’re talking of another factor of 100

even colder than that, a fraction of a degree.

And everybody seems to think quantum computing

is going to take over the world.

It’s so much more expensive

to have to get that extra factor of 10 or whatever colder.

And yet it’s not stopping people from investing in that area.

And by investing, I mean putting their research into it

as well as venture capital or whatever.

So…

Oh, so based on the energy of what you’re commenting on,

I’m getting a sense that one of the criticisms

of this approach is 4K, that 4 Kelvin is a big negative.

It is the showstopper for a lot of people.

They just, I mean, and understandably,

I’m not saying that that’s not a consideration.

Of course it is.

For some…

Okay, so different motivations for different people.

In the academic world,

suppose you spent your whole life

learning about silicon microelectronic circuits.

You send a design to a foundry,

they send you back a chip

and you go test it at your tabletop.

And now I’m saying,

here now learn how to use all these cryogenics

so you can do that at 4 Kelvin.

No, come on, man.

I don’t wanna do that.

That sounds bad.

It’s the old momentum, the turning of the Titanic.

Yeah, kind of.

But you’re saying that’s not too much of a…

When we’re looking at large systems

and the gain you can potentially get from them,

that’s not that much of a cost.

And when you wanna answer the scientific question

about what are the physical limits of cognition?

Well, the physical limits,

they don’t care if you’re at 4 Kelvin.

If you can perform cognition at a scale

orders of magnitude beyond any room temperature technology,

but you gotta get cold to do it,

you’re gonna do it.

And to me, that’s the interesting application space.

It’s not even an application space,

that’s the interesting scientific paradigm.

So I personally am not going to let low temperature

stop me from realizing a technological domain or realm

that is achieving in most ways everything else

that I’m looking for in my hardware.

So that, okay, that’s a big one.

Is there other kind of engineering challenges

that you envision?

Yeah, yeah, yeah.

So let me take a moment here

because I haven’t really described what I mean

by a neuron or a network in this particular hardware.

Yeah, do you wanna talk about loop neurons

and there’s so many fascinating…

But you just have so many amazing papers

that people should definitely check out

and the titles alone are just killer.

So anyway, go ahead.

Right, so let me say big picture,

based on optics, photonics for communication,

superconducting electronics for computation,

how does this all work?

So a neuron in this hardware platform

can be thought of as circuits

that are based on Josephson junctions,

like we talked about before,

where every time a photon comes in…

So let’s start by talking about a synapse.

A synapse receives a photon, one or more,

from a different neuron

and it converts that optical signal

to an electrical signal.

The amount of current that that adds to a loop

is controlled by the synaptic weight.

So as I said before,

you’re popping fluxons into a loop, right?

So a photon comes in,

it hits a superconducting single photon detector,

one photon, the absolute physical minimum

that you can communicate

from one place to another with light.

And that detector then converts that

into an electrical signal

and the amount of signal

is correlated with some kind of weight.

Yeah, so the synaptic weight will tell you

how many fluxons you pop into the loop.

It’s an analog number.

We’re doing analog computation now.

Well, can you just linger on that?

What the heck is a fluxon?

Are we supposed to know this?

Or is this a funny,

is this like the big bang?

Is this a funny word for something deeply technical?

No, let’s try to avoid using the word fluxon

because it’s not actually necessary.

When a photon…

It’s fun to say though.

So it’s very necessary, I would say.

When a photon hits

that superconducting single photon detector,

current is added to a superconducting loop.

And the amount of current that you add

is an analog value,

can have eight bit equivalent resolution,

something like that.

10 bits, maybe.

That’s amazing, by the way.

This is starting to make a lot more sense.

When you’re using superconductors for this,

the energy of that circulating current

is less than the energy of that photon.

So your energy budget is not destroyed

by doing this analog computation.

So now in the language of a neuroscientist,

you would say that’s your postsynaptic signal.

You have this current being stored in a loop.

You can decide what you wanna do with it.

Most likely you’re gonna have it decay exponentially.

So every single synapse

is gonna have some given time constant.

And that’s determined by putting some resistor

in that superconducting loop.

So a synapse event occurs when a photon strikes a detector,

adds current to that loop, it decays over time.

That’s the postsynaptic signal.

Then you can process that in a dendritic tree.

Bryce Primavera and I have a paper

that we’ve submitted about that.

For the more neuroscience oriented people,

there’s a lot of dendritic processing,

a lot of plasticity mechanisms you can implement

with essentially exactly the same circuits.

You have this one simple building block circuit

that you can use for a synapse, for a dendrite,

for the neuron cell body, for all the plasticity functions.

It’s all based on the same building block,

just tweaking a couple parameters.
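
A software caricature of that building block (illustrative values only, not device parameters) is just a leaky integrator: a photon adds a weight-controlled amount of current to a loop, and that current decays exponentially:

```python
# Toy model of the building block described above: a synapse event adds a
# weight-controlled amount of current to a loop, and a resistor makes that
# current decay exponentially. Values are illustrative, not device parameters.
import math

class LeakyLoop:
    def __init__(self, weight=1.0, tau=50e-9):    # tau: decay time constant (s)
        self.weight = weight
        self.tau = tau
        self.current = 0.0                        # circulating current (arb. units)

    def photon_arrives(self):
        self.current += self.weight               # single photon -> add current

    def evolve(self, dt):
        self.current *= math.exp(-dt / self.tau)  # exponential leak

syn = LeakyLoop(weight=0.3)
syn.photon_arrives()
syn.evolve(50e-9)
print(syn.current)   # ~0.3 / e after one time constant
```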

So this basic building block

has both an optical and an electrical component,

and then you just build arbitrary large systems with that?

Close, you’re not at fault

for thinking that that’s what I meant.

What I should say is that if you want it to be a synapse,

you tack a superconducting detector onto the front of it.

And if you want it to be anything else,

there’s no optical component.

Got it, so at the front,

optics in the front, electrical stuff in the back.

Electrical, yeah, in the processing

and in the output signal that it sends

to the next stage of processing further.

So the dendritic trees is electrical.

It’s all electrical.

It’s all electrical in the superconducting domain.

For anybody who’s up on their superconducting circuits,

it’s just based on a DC SQUID, the most ubiquitous superconducting circuit,

which is composed of two Josephson junctions.

So it’s a very bread and butter kind of thing.

And then the only place where you go beyond that

is the neuron cell body itself.

It’s receiving all these electrical inputs

from the synapses or dendrites

or however you’ve structured that particular unique neuron.

And when it reaches its threshold,

which occurs by driving a Josephson junction

above its critical current,

it produces a pulse of current,

which starts an amplification sequence,

voltage amplification,

that produces light out of a transmitter.

So one of our colleagues, Adam McCaughan,

and Sonia Buckley as well,

did a lot of work on the light sources

and the amplifiers that drive the current

and produce sufficient voltage to drive current

through that now semiconducting part.

So that light source is the semiconducting part of a neuron.

And that, so the neuron has reached threshold.

It produces a pulse of light.

That light then fans out across a network of wave guides

to reach all the downstream synaptic terminals

that perform this process themselves.

So it’s probably worth explaining

what a network of wave guides is,

because a lot of listeners aren’t gonna know that.

Look up the papers by Jeff Chiles on this one.

But basically, light can be guided in a simple,

basically a wire of usually an insulating material.

So silicon, silicon nitride,

different kinds of glass,

just like in a fiber optic, it’s glass, silicon dioxide.

That makes it a little bit big.

We wanna bring these down.

So we use different materials like silicon nitride,

but basically just imagine a rectangle of some material

that just goes and branches,

forms different branch points

that target different subregions of the network.

You can transition between layers of these.

So now we’re talking about building in the third dimension,

which is absolutely crucial.

So that’s what wave guides are.

Yeah, that’s great.

Why the third dimension is crucial?

Okay, so yes, you were talking about

what are some of the technical limitations.

One of the things that I believe we have to grapple with

is that our brains are miraculously compact.

For the number of neurons that are in our brain,

it sure does fit in a small volume,

as it would have to if we’re gonna be biological organisms

that are resource limited and things like that.

Any kind of hardware neuron

is almost certainly gonna be much bigger than that

if it is of comparable complexity,

whether it’s based on silicon transistors.

Okay, a transistor, seven nanometers,

that doesn’t mean a semiconductor based neuron

is seven nanometers.

They’re big.

They require many transistors,

different other things like capacitors and things

that store charge.

They end up being on the order of 100 microns

by 100 microns,

and it’s difficult to get them down any smaller than that.

The same is true for superconducting neurons,

and the same is true

if we’re trying to use light for communication.

Even if you’re using electrons for communication,

you have these wires where, okay,

the size of an electron might be angstroms,

but the size of a wire is not angstroms,

and if you try and make it narrower,

the resistance just goes up,

so you don’t actually win.

To communicate over long distances,

you need your wires to be microns wide,

and it’s the same thing for wave guides.

Wave guides are essentially limited

by the wavelength of light,

and that’s gonna be about a micron,

so whereas compare that to an axon,

the analogous component in the brain,

which is maybe a hundred nanometers in diameter, something like that,

they’re bigger when they need to communicate

over long distances,

but grappling with the size of these structures

is inevitable and crucial,

and so in order to make systems of comparable scale

to the human brain, by scale here,

I mean number of interconnected neurons,

you absolutely have to be using

the third spatial dimension,

and that means on the wafer,

you need multiple layers

of both active and passive components.

By active, I mean superconducting electronic circuits

that are performing computations,

and by passive, I mean these waveguides

that are routing the optical signals to different places,

you have to be able to stack those.

If you can get to something like 10 planes

of each of those, or maybe not even 10,

maybe five, six, something like that,

then you’re in business.

Now you can get millions of neurons on a wafer,

but that’s not anywhere close to the brain scale.

In order to get to the scale of the human brain,

you’re gonna have to also use the third dimension

in the sense that entire wafers

need to be stacked on top of each other

with fiber optic communication between them,

and we need to be able to fill a space

the size of this table with stacked wafers,

and that’s when you can get to some 10 billion neurons

like your human brain,

and I don’t think that’s specific

to the optoelectronic approach that we’re taking.

I think that applies to any hardware

where you’re trying to reach commensurate scale

and complexity as the human brain.
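
Taking the rough numbers used in this conversation at face value, the arithmetic behind that wafer-stacking argument is simple:

```python
# Back-of-the-envelope scaling, using the rough figures from the conversation:
# "millions of neurons on a wafer" and ~10 billion neurons for brain scale.
neurons_per_wafer = 1e6          # assumed from "millions of neurons on a wafer"
brain_scale_neurons = 10e9       # ~10 billion, the figure used above

wafers_needed = brain_scale_neurons / neurons_per_wafer
print(f"{wafers_needed:.0f} stacked wafers")   # -> 10000 wafers, hence the need
# to fill a table-sized volume with stacked wafers and fiber links between them.
```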

So you need that fractal stacking,

so stacking on the wafer,

and stacking of the wafers,

and then whatever the system that combines,

this stacking of the tables with the wafers.

And it has to be fractal all the way,

you’re exactly right,

because that’s the only way

that you can efficiently get information

from a small point to across that whole network.

It has to have that power law connectivity.

And it’s photons, optics, throughout.

Yeah, absolutely.

Once you’re at this scale, to me it’s just obvious.

Of course you’re using light for communication.

You have fiber optics given to us from nature, so simple.

The thought of even trying to do

any kind of electrical communication

just doesn’t make sense to me.

I’m not saying it’s wrong, I don’t know,

but that’s where I’m coming from.

So let’s return to loop neurons.

Why are they called loop neurons?

Yeah, the term loop neurons comes from the fact,

like we’ve been talking about,

that they rely heavily on these superconducting loops.

So even in a lot of forms of digital computing

with superconductors,

storing a signal in a superconducting loop

is a primary technique.

In this particular case,

it’s just loops everywhere you look.

So the strength of a synaptic weight

is gonna be set by the amount of current circulating

in a loop that is coupled to the synapse.

So memory is implemented as current circulating

in a superconducting loop.

The coupling between, say, a synapse and a dendrite

or a synapse in the neuron cell body

occurs through loop coupling through transformers.

So current circulating in a synapse

is gonna induce current in a different loop,

a receiving loop in the neuron cell body.

So since all of the computation is happening

in these flux storage loops

and they play such a central role

in how the information is processed,

how memories are formed, all that stuff,

I didn’t think too much about it,

I just called them loop neurons

because it rolls off the tongue a little bit better

than superconducting optoelectronic neurons.

Okay, so how do you design circuits for these loop neurons?

That’s a great question.

There’s a lot of different scales of design.

So at the level of just one synapse,

you can use conventional methods.

They’re not that complicated

as far as superconducting electronics goes.

It’s just four Josephson junctions or something like that

depending on how much complexity you wanna add.

So you can just directly simulate each component in SPICE.

What’s SPICE?

It’s standard electrical circuit simulation software, basically,

Simulation Program with Integrated Circuit Emphasis.

So you’re just explicitly solving the differential equations

that describe the circuit elements.

And then you can stack these things together

in that simulation software to then build circuits.

You can, but that becomes computationally expensive.

So one of the things when COVID hit,

we knew we had to turn some attention

to more things you can do at home in your basement

or whatever, and one of them was computational modeling.

So we started working on adapting,

abstracting out the circuit performance

so that you don’t have to explicitly solve

the circuit equations, which for Josephson junctions

usually needs to be done on like a picosecond timescale

and you have a lot of nodes in your circuit.

So it results in a lot of differential equations

that need to be solved simultaneously.

We were looking for a way to simulate these circuits

that is scalable up to networks of millions or so neurons,

which is sort of where we’re targeting right now.

So we were able to analyze the behavior of these circuits.

And as I said, it’s based on these simple building blocks.

So you really only need to understand

this one building block.

And if you get a good model of that, boom, it tiles.

And you can change the parameters in there

to get different behaviors and stuff,

but it’s all based on now it’s one differential equation

that you need to solve.

So one differential equation for every synapse,

dendrite or neuron in your system.

And for the neuroscientists out there,

it’s just a simple leaky integrate and fire model,

leaky integrator, basically.

A synapse is a leaky integrator,

a dendrite is a leaky integrator.
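
A minimal leaky integrate-and-fire sketch in that spirit, one simple update per element stepped on a coarse time grid (all parameters here are arbitrary placeholders):

```python
# Minimal leaky integrate-and-fire sketch: one simple update per element,
# stepped on a coarse time grid rather than solving the underlying
# Josephson-junction circuit equations at picosecond resolution.
tau = 20e-9          # leak time constant (seconds), arbitrary
dt = 1e-9            # coarse time step (seconds)
threshold = 0.8      # firing threshold, arbitrary units

signal = 0.0
for step in range(100):
    drive = 0.05 if step < 60 else 0.0        # input from upstream synapses
    signal = signal * (1 - dt / tau) + drive  # leak toward zero, then add input
    if signal >= threshold:
        print(f"spike at t = {step} ns")
        signal = 0.0                          # reset after firing
```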

So I’m really fascinated by how this one simple component

can be used to achieve lots of different types

of dynamical activity.

And to me, that’s where scalability comes from.

And also complexity as well.

Complexity is often characterized

by relatively simple building blocks

connected in potentially simple

or sometimes complicated ways,

and then emergent new behavior that was hard to predict

from those simple elements.

And that’s exactly what we’re working with here.

So it’s a very exciting platform,

both from a modeling perspective

and from a hardware manifestation perspective

where we can hopefully start to have this test bed

where we can explore things,

not just related to neuroscience,

but also related to other things

that connect to other physics, like critical phenomena,

Ising models, things like that.

So you were asking how we simulate these circuits.

It’s at different levels

and we’ve got the simple spice circuit stuff.

That’s no problem.

And now we’re building these network models

based on this more efficient leaky integrator.

So we can actually reduce every element

to one differential equation.

And then we can also step through it

on a much coarser time grid.

So it ends up being something like a factor

of a thousand to 10,000 speed improvement,

which allows us to simulate,

but hopefully up to millions of neurons.

Whereas before we would have been limited to tens,

a hundred, something like that.

And just like simulating quantum mechanical systems

with a quantum computer.

So the goal here is to understand such systems.

For me, the goal is to study this

as a scientific physical system.

I’m not drawn towards turning this

into an enterprise at this point.

I feel short term applications

that obviously make a lot of money

are not necessarily a curiosity driver for you at the moment.

Absolutely not.

If you’re interested in short term making money,

go with deep learning, use silicon microelectronics.

If you wanna understand things like the physics

of a fascinating system,

or if you wanna understand something more

along the lines of the physical limits

of what can be achieved,

then I think single photon communication,

superconducting electronics is extremely exciting.

What if I wanna use superconducting hardware

at four Kelvin to mine Bitcoin?

That’s my main interest.

The reason I wanted to talk to you today,

I wanna say, no, I don’t know.

What’s Bitcoin?

Look it up on the internet.

Somebody told me about it.

I’m not sure exactly what it is.

But let me ask nevertheless

about applications to machine learning.

Okay, so if you look at the scale of five, 10, 20 years,

is it possible to, before we understand the nature

of human intelligence and general intelligence,

do you think we’ll start to see, falling out of this exploration

of neuromorphic systems, the ability to solve some

of the problems that the machine learning systems

of today can’t solve?

Well, I’m really hesitant to over promise.

So I really don’t know.

Also, I don’t really understand machine learning

in a lot of senses.

I mean, machine learning from my perspective appears

to require that you know precisely what your input is

and also what your goal is.

You usually have some objective function

or something like that.

And that’s very limiting.

I mean, of course, a lot of times that’s the case.

There’s a picture and there’s a horse in it, so you’re done.

But that’s not a very interesting problem.

I think when I think about intelligence,

it’s almost defined by the ability to handle problems

where you don’t know what your inputs are going to be

and you don’t even necessarily know

what you’re trying to accomplish.

I mean, I’m not sure what I’m trying to accomplish

in this world.

Yeah, at all scales.

Yeah, at all scales, right.

I mean, so I’m more drawn to the underlying phenomena,

the critical dynamics of this system,

trying to understand how elements that you build

into your hardware result in emergent fascinating activity

that was very difficult to predict, things like that.

So, but I gotta be really careful

because I think a lot of other people who,

if they found themselves working on this project

in my shoes, they would say, all right,

what are all the different ways we can use this

for machine learning?

Actually, let me just definitely mention my colleague

at NIST, Mike Schneider.

He’s also very much interested,

particularly in the superconducting side of things,

using the incredible speed, power efficiency,

also Ken Segall at Colgate,

other people working on specifically

the superconducting side of this for machine learning

and deep feed forward neural networks.

There, the advantages are obvious.

It’s extremely fast.

Yeah, so that’s less on the nature of intelligences

and more on various characteristics of this hardware

that you can use for the basic computation

as we know it today and communication.

One of the things that Mike Schneider’s working on right now

is an image classifier at a relatively small scale.

I think he’s targeting that nine pixel problem

where you can have three different characters

and you put in a nine pixel image

and you classify it as one of these three categories.

And that’s gonna be really interesting

to see what happens there,

because if you can show that even at that scale,

you just put these images in and you get it out

and he thinks he can do it,

I forgot if it’s a nanosecond

or some extremely fast classification time,

it’s probably less,

it’s probably a hundred picoseconds or something.
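
The mathematical task in that nine-pixel demonstration is small enough to write out: nine inputs, a 3-by-9 weight matrix, and an argmax. The patterns and weights below are made up purely to show the scale of the problem; the actual circuit would evaluate something like this in superconducting hardware rather than software.

    import numpy as np

    # Three made-up 3x3 reference characters, flattened to nine pixels each.
    templates = {
        "A": np.array([1, 1, 1, 0, 1, 0, 1, 1, 1]),
        "B": np.array([1, 0, 1, 1, 0, 1, 0, 1, 0]),
        "C": np.array([0, 1, 0, 0, 1, 0, 0, 1, 0]),
    }
    labels = list(templates)
    W = np.stack([templates[l] for l in labels]).astype(float)   # 3x9 weights

    def classify(image9):
        """Nine pixels in, one of three categories out."""
        return labels[int(np.argmax(W @ image9))]

    noisy = np.array([1, 1, 1, 0, 1, 0, 1, 1, 0])   # a corrupted "A"
    print(classify(noisy))                          # -> "A"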

There you have challenges though,

because the Josephson junctions themselves,

the electronic circuit is extremely power efficient.

Some orders of magnitude more efficient

than a transistor doing the same thing,

but when you have to cool it down to four Kelvin,

you pay a huge overhead just for keeping it cold,

even if it’s not doing anything.

So it has to work at large scale

in order to overcome that power penalty,

but that’s possible.

It’s just, it’s gonna have to get that performance.
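
A hedged back-of-the-envelope for that trade-off; every number below is an assumption chosen only to illustrate the shape of the argument (roughly a thousand watts of wall power per watt removed at 4 Kelvin, attojoule-scale switching for a junction versus femtojoule-scale for a CMOS gate, and a fixed cryocooler draw even at idle).

    # Illustrative, assumed values only.
    jj_energy_per_op   = 2e-19   # J, order of a single-flux-quantum switching event
    cmos_energy_per_op = 1e-15   # J, order of a CMOS logic operation
    cooling_overhead   = 1000    # W of wall power per W removed at 4 K
    cryo_idle_power    = 1e3     # W drawn by the cryocooler even when idle

    def wall_power(ops_per_second):
        cold = cryo_idle_power + cooling_overhead * jj_energy_per_op * ops_per_second
        cmos = cmos_energy_per_op * ops_per_second
        return cold, cmos

    for ops in (1e12, 1e16, 1e19):
        cold, cmos = wall_power(ops)
        print(f"{ops:.0e} ops/s: superconducting ~{cold:.1e} W, CMOS ~{cmos:.1e} W")
    # At small scale the fixed cryogenic cost dominates and CMOS wins easily;
    # only at very large scale does the lower per-operation energy pay for the cooling.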

And this is sort of what you were asking about before

is like how much better than silicon would it need to be?

And the answer is, I don’t know.

I think if it’s just overall better than silicon

at a problem that a lot of people care about,

maybe it’s image classification,

maybe it’s facial recognition,

maybe it’s monitoring credit transactions, I don’t know,

then I think it will have a place.

It’s not gonna be in your cell phone,

but it could be in your data center.

So what about in terms of the data center,

I don’t know if you’re paying attention

to the various systems,

like Tesla recently announced DOJO,

which is a large scale machine learning training system,

that again, the bottleneck there

is probably going to be communication

between those systems.

Is there something from your work

on everything we’ve been talking about

in terms of superconductive hardware

that could be useful there?

Oh, I mean, okay, tomorrow, no.

In the long term, it could be the whole thing.

It could be nothing.

I don’t know, but definitely, definitely.

When you look at the,

so I don’t know that much about DOJO.

My understanding is that that’s new, right?

That’s just coming online.

Well, I don’t even know whether it has come online.

And when you announce big, sexy,

so let me explain to you the way things work

in the world of business and marketing.

It’s not always clear where you are

on the coming online part of that.

So I don’t know where they are exactly,

but the vision is from a ground up

to build a very, very large scale,

modular machine learning, ASIC,

basically hardware that’s optimized

for training neural networks.

And of course, there’s a lot of companies

that are small and big working on this kind of problem.

The question is how to do it in a modular way

that has very fast communication.

The interesting aspect of Tesla is you have a company

that at least at this time is so singularly focused

on solving a particular machine learning problem

and is making obviously a lot of money doing so

because the machine learning problem

happens to be involved with autonomous driving.

So you have a system that’s driven by an application.

And that’s really interesting because you have maybe Google

working on TPUs and so on.

You have all these other companies with ASICs.

They’re usually more kind of always thinking general.

So I like it when it’s driven by a particular application

because then you can really get to the,

it’s somehow if you just talk broadly about intelligence,

you may not always get to the right solutions.

It’s nice to couple that sometimes

with specific clear illustration

of something that requires general intelligence,

which for me driving is one such case.

I think you’re exactly right.

Sometimes just having that focus on that application

brings a lot of people together and focuses their energy and attention.

I think that, so one of the things that’s appealing

about what you’re saying is not just

that the application is specific,

but also that the scale is big

and that the benefit is also huge.

Financial and to humanity.

Right, right, right.

Yeah, so I guess let me just try to understand

is the point of this dojo system

to figure out the parameters

that then plug into neural networks

and then you don’t need to retrain,

you just make copies of a certain chip

that has all the other parameters established or?

No, it’s straight up retraining a large neural network

over and over and over.

So you have to do it once for every new car?

No, no, you have to, so they do this interesting process,

which I think is a process that,

for supervised machine learning systems,

you’re going to have to do, which is you have a system,

you train your network once, it takes a long time.

I don’t know how long, but maybe a week.

Okay. To train.

And then you deploy it on, let’s say about a million cars.

I don’t know what the number is.

But that part, you just write software

that updates some weights in a table and yeah, okay.

But there’s a loop back.

Yeah, yeah, okay.

Each of those cars run into trouble, rarely,

but they catch the edge cases

of the performance of that particular system

and then send that data back

and either automatically or by humans,

that weird edge case data is annotated

and then the network has to become smart enough

to now be able to perform in those edge cases,

so it has to get retrained.

There’s clever ways of retraining different parts

of that network, but for the most part,

I think they prefer to retrain the entire thing.

So you have this giant monster

that kind of has to be retrained regularly.

I think the vision with Dojo is to have

a very large machine learning focused,

driving focused supercomputer

that then is sufficiently modular

that can be scaled to other machine learning applications.

So they’re not limiting themselves completely

to this particular application,

but this application is the way they kind of test

this iterative process of machine learning

is you make a system that’s very dumb,

deploy it, get the edge cases where it fails,

make it a little smarter, it becomes a little less dumb

and that iterative process achieves something

that you can call intelligent or is smart enough

to be able to solve this particular application.
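
A toy version of that loop, with stand-in functions for every stage (nothing here reflects any real training pipeline or API); it is just the train, deploy, collect edge cases, annotate, retrain cycle made concrete.

    import random

    def train(dataset):
        return {"weights": len(dataset)}        # stand-in for ~a week of training

    def drive_fleet(model, n_cars=1_000_000, failure_rate=1e-5):
        # Each car rarely hits an edge case and reports it back.
        return [f"edge_case_{random.random():.6f}"
                for _ in range(int(n_cars * failure_rate))]

    def annotate(edge_cases):
        return [(case, "label") for case in edge_cases]   # human or automatic labeling

    dataset = [("ordinary_scene", "label")] * 1000
    model = train(dataset)
    for iteration in range(3):
        edge_cases = drive_fleet(model)         # deploy, collect rare failures
        dataset += annotate(edge_cases)         # fold the hard cases back in
        model = train(dataset)                  # retrain, often the whole network
        print(f"iteration {iteration}: dataset size {len(dataset)}")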

So it has to do with training neural networks fast

and training neural networks that are large.

But also based on an extraordinary amount of diverse input.

Data, yeah.

And that’s one of the things,

so this does seem like one of those spaces

where the scale of superconducting optoelectronics,

the way that, so when you talk about the weaknesses,

like I said, okay, well, you have to cool it down.

At this scale, that’s fine.

Because that’s not too much of an added cost.

Most of your power is being dissipated

by the circuits themselves, not the cooling.

And also you have one centralized kind of cognitive hub,

if you will.

And so if we’re talking about putting

a superconducting system in a car, that’s questionable.

Do you really want a cryostat

in the trunk of everyone’s car?

It’ll fit, it’s not that big of a deal,

but hopefully there’s a better way, right?

But since this is sort of a central supreme intelligence

or something like that,

and it needs to really have this massive data acquisition,

massive data integration,

I would think that that’s where large scale

spiking neural networks with vast communication

and all these things would have something

pretty tremendous to offer.

It’s not gonna happen tomorrow.

There’s a lot of development that needs to be done.

But we have to be patient with self driving cars

for a lot of reasons.

We were all optimistic that they would be here by now.

And okay, they are to some extent,

but if we’re thinking five or 10 years down the line,

it’s not unreasonable.

One other thing, let me just mention,

getting into self driving cars and technologies

that are using AI out in the world,

this is something NIST cares a lot about.

Elham Tabassi is leading up a much larger effort in AI

at NIST than my little project.

And really central to that mission

is this concept of trustworthiness.

So when you’re going to deploy this neural network

in every single automobile with so much on the line,

you have to be able to trust that.

So now how do we know that we can trust that?

How do we know that we can trust the self driving car

or the supercomputer that trained it?

There’s a lot of work there

and there’s a lot of that going on at NIST.

And it’s still early days.

I mean, you’re familiar with the problem and all that.

But there’s a fascinating dance in engineering

with safety critical systems.

There’s a desire in computer science,

just recently talked to Don Knuth,

for algorithms and for systems,

for them to be provably correct or provably safe.

And this is one other difference

between engineered systems and humans as biological systems:

we’re not provably anything.

And so there’s some aspect of imperfection

that we need to have built in,

like making robustness to imperfection part of our systems,

which is a difficult thing for engineers to contend with.

They’re very uncomfortable with the idea

that you have to be okay with failure

and almost engineer failure into the system.

Mathematicians hate it too.

But I think it was Turing who said something

along the lines of,

I can give you an intelligent system

or I can give you a flawless system,

but I can’t give you both.

And creativity and abstract thinking

seem to rely somewhat on stochasticity

and not having components

that perform exactly the same way every time.

This is where like the disagreement I have with,

not disagreement, but a different view on the world.

I’m with Turing,

but when I talk to robotic, robot colleagues,

that sounds like I’m talking to robots,

colleagues that are roboticists,

the goal is perfection.

And to me it’s like, no,

I think the goal should be imperfection

that’s communicated.

And through the interaction between humans and robots,

that imperfection becomes a feature, not a bug.

Like together, seen as a system,

the human and the robot together

are better than either of them individually,

but the robot itself is not perfect in any way.

Of course, there’s a bunch of disagreements,

including with Mr. Elon about,

to me, autonomous driving is fundamentally

a human robot interaction problem,

not a robotics problem.

To Elon, it’s a robotics problem.

That’s actually an open and fascinating question,

whether humans can be removed from the loop completely.

We’ve talked about a lot of fascinating chemistry

and physics and engineering,

and we’re always running up against this issue

that nature seems to dictate what’s easy and what’s hard.

So you have this cool little paper

that I’d love to just ask you about.

It’s titled,

Does Cosmological Evolution Select for Technology?

So in physics, there’s parameters

that seem to define the way our universe works,

that physics works, that if it worked any differently,

we would get a very different world.

So it seems like the parameters are very fine tuned

to the kind of physics that we see.

All the beautiful E equals MC squared,

we get these nice, beautiful laws.

It seems like very fine tuned for that.

So what you argue in this article

is it may be that the universe has also fine tuned

its parameters that enable the kind of technological

innovation that we see, the technology that we see.

Can you explain this idea?

Yeah, I think you’ve introduced it nicely.

Let me just try to say a few things in my own language and lay it out.

What is this fine tuning problem?

So physicists have spent centuries trying to understand

the system of equations that govern the way nature behaves,

the way particles move and interact with each other.

And as that understanding has become more clear over time,

it became sort of evident that it’s all well adjusted

to allow a universe like we see, very complex,

this large, long lived universe.

And so one answer to that is, well, of course it is

because we wouldn’t be here otherwise.

But I don’t know, that’s not very satisfying.

That’s sort of, that’s what’s known

as the weak anthropic principle.

It’s a statement of selection bias.

We can only observe a universe that is fit for us to live in.

So what does it mean for a universe

to be fit for us to live in?

Well, the pursuit of physics,

it is based partially on coming up with equations

that describe how things behave

and interact with each other.

But in all those equations you have,

so there’s the form of the equation,

sort of how different fields or particles

move in space and time.

But then there are also the parameters

that just tell you sort of the strength

of different couplings.

How strongly does a charged particle

couple to the electromagnetic field or masses?

How strongly does a particle couple

to the Higgs field or something like that?

And those parameters that define,

not the general structure of the equations,

but the relative importance of different terms,

they seem to be every bit as important

as the structure of the equations themselves.

And so I forget who it was.

Somebody, when they were working through this

and trying to see, okay, if I adjust the parameter,

this parameter over here,

call it the, say the fine structure constant,

which tells us the strength

of the electromagnetic interaction.

Oh boy, I can’t change it very much.

Otherwise nothing works.

The universe sort of doesn’t,

it just pops into existence and goes away

in a nanosecond or something like that.

And somebody had the phrase,

this looks like a put up job,

meaning every one of these parameters was dialed in.

It’s arguable how precisely they have to be dialed in,

but dialed in to some extent,

not just in order to enable our existence,

that’s a very anthropocentric view,

but to enable a universe like this one.

So, okay, maybe I think the majority position

of working physicists in the field is,

it has to be that way in order for us to exist.

We’re here, we shouldn’t be surprised

that that’s the way the universe is.

And I don’t know, for a while,

that never sat well with me,

but I just kind of moved on

because there are things to do

and a lot of exciting work.

It doesn’t depend on resolving this puzzle,

but as I started working more with technology,

getting into the more recent years of my career,

particularly when I started,

after having worked with silicon for a long time,

which was kind of eerie on its own,

but then when I switched over to superconductors,

I was just like, this is crazy.

It’s just absolutely astonishing

that our universe gives us superconductivity.

It’s one of the most beautiful physical phenomena

and it’s also extraordinarily useful for technology.

So you can argue that the universe

has to have the parameters it does for us to exist

because we couldn’t be here otherwise,

but why does it give us technology?

Why does it give us silicon that has this ideal oxide

that allows us to make a transistor

without trying that hard?

That can’t be explained by the same anthropic reasoning.

Yeah, so it’s asking the why question.

I mean, a slight natural extension of that question is,

I wonder if the parameters were different

if we would simply have just another set of paint brushes

to create totally other things

that wouldn’t look like anything

like the technology of today,

but would nevertheless have incredible complexity,

which is if you sort of zoom out and start defining things,

not by like how many batteries it needs

and whether it can make toast,

but more like how much complexity is within the system

or something like that.

Well, yeah, you can start to quantify things.

You’re exactly right.

So nowhere am I arguing that

in all of the vast parameter space

of everything that could conceivably exist

in the multiverse of nature,

there’s this one point in parameter space

where complexity arises.

I doubt it.

That would be a shameful waste of resources, it seems.

But it might be that we reside

at one place in parameter space

that has been adapted through an evolutionary process

to allow us to make certain technologies

that allow our particular kind of universe to arise

and sort of achieve the things it does.

See, I wonder if nature in this kind of discussion,

if nature is a catalyst for innovation

or if it’s a ceiling for innovation.

So like, is it going to always limit us?

Like you’re talking about silicon.

Is it just make it super easy to do awesome stuff

in a certain dimension,

but we could still do awesome stuff in other ways,

it’ll just be harder?

Or does it really set like the maximum we can do?

That’s a good thing to,

that’s a good subject to discuss.

I guess I feel like we need to lay

a little bit more groundwork.

So I want to make sure that

I introduce this in the context

of Lee Smolin’s previous idea.

So who’s Lee Smolin and what kind of ideas does he have?

Okay, Lee Smolin is a theoretical physicist

who back in the late 1980s, early 1990s

published a paper that introduced this idea

of cosmological natural selection,

which argues that the universe did evolve.

So his paper was called, did the universe evolve?

And I gave myself the liberty of titling my paper

does cosmological selection

or does cosmological evolution select for technology

in reference to that.

So he introduced that idea decades ago.

Now he primarily works on quantum gravity,

loop quantum gravity, other approaches to

unifying quantum mechanics with general relativity,

as you can read about in his most recent book, I believe,

and he’s been on your show as well.

So, but I want to introduce this idea

of cosmological natural selection,

because I think that is one of the core ideas

that could change our understanding

of how the universe got here, our role in it,

what technology is doing here.

But there’s a couple more pieces

that need to be set up first.

So the beginning of our universe is largely accepted

to be the big bang.

And what that means is if you look back in time

by looking far away in space,

you see that everything used to be at one point

and it expanded away from there.

There was an era in the evolutionary process of our universe

that was called inflation.

And this idea was developed primarily by Alan Guth

and others, Andrei Linde and others in the 80s.

And this idea of inflation is basically that

when a singularity begins this process of growth,

there can be a temporary stage

where it just accelerates incredibly rapidly.

And based on quantum field theory,

this tells us that this should produce matter

in precisely the proportions that we find

of hydrogen and helium in the big bang,

lithium also, and other things too.

So the predictions that come out of big bang

inflationary cosmology have stood up extremely well

to empirical verification,

the cosmic microwave background, things like this.

So most scientists working in the field

think that the origin of our universe is the big bang.

And I base all my thinking on that as well.

I’m just laying this out there so that people understand

that where I’m coming from is an extension,

not a replacement of existing well founded ideas.

In a paper, I believe it was 1986 with Alan Guth

and another author Farhi,

they wrote that a big bang,

I don’t remember the exact quote,

a big bang is inextricably linked with a black hole.

The singularity that we call our origin

is mathematically indistinguishable from a black hole.

They’re the same thing.

And Lee Smolin based his thinking on that idea,

I believe, I don’t mean to speak for him,

but this is my reading of it.

So what Lee Smolin will say is that

a black hole in one universe is a big bang

in another universe.

And this allows us to have progeny, offspring.

So a universe can be said to have come

before another universe.

And very crucially, Smolin argues,

I think this is potentially one of the great ideas

of all time, that’s my opinion,

that when a black hole forms, it’s not a classical entity,

it’s a quantum gravitational entity.

So it is subject to the fluctuations

that are inherent in quantum mechanics, the properties,

what we’re calling the parameters

that describe the physics of that system

are subject to slight mutations

so that the offspring universe

does not have the exact same parameters

defining its physics as its parent universe.

They’re close, but they’re a little bit different.

And so now you have a mechanism for evolution,

for natural selection.

So there’s mutation, so there’s,

and then if you think about the DNA of the universe

are the basic parameters that govern its laws.

Exactly, so what Smolin said is our universe results

from an evolutionary process that can be traced back

some, he estimated, 200 million generations.

Initially, there was something like a vacuum fluctuation

that produced through random chance a universe

that was able to reproduce just one.

So now it had one offspring.

And then over time, it was able to make more and more

until it evolved into a highly structured universe

with a very long lifetime, with a great deal of complexity

and importantly, especially importantly for Lee Smolin,

stars, stars make black holes.

Therefore, we should expect our universe

to be optimized, have its physical parameters optimized

to make very large numbers of stars

because that’s how you make black holes

and black holes make offspring.

So we expect the physics of our universe to have evolved

to maximize fecundity, the number of offspring.

And the way Lee Smolin argues you do that

is through stars that the biggest ones die

in these core collapse supernova

that make a black hole and a child.
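
A toy numerical illustration of that selection dynamic, not a physical simulation in any sense: each universe carries one abstract parameter, its number of offspring peaks when the parameter favors black hole production, and offspring inherit the parameter with a small mutation. Over generations the population drifts toward the fecundity-maximizing value, which is the core of the argument.

    import math
    import random

    def offspring_count(param):
        """Toy fecundity: black hole production peaks at param = 1.0."""
        return int(round(3 * math.exp(-((param - 1.0) ** 2) / 0.02)))

    def evolve(generations=30, mutation=0.03, max_pop=500):
        universes = [random.uniform(0.0, 2.0) for _ in range(50)]
        for _ in range(generations):
            children = []
            for p in universes:
                for _ in range(offspring_count(p)):
                    children.append(p + random.gauss(0.0, mutation))  # mutated laws
            if children:
                universes = random.sample(children, min(max_pop, len(children)))
        return universes

    final = evolve()
    print("mean parameter after selection:", sum(final) / len(final))  # drifts toward ~1.0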

Okay, first of all, I agree with you

that this is back to our fractal view of everything

from intelligence to our universe.

That is very compelling and a very powerful idea

that unites the origin of life

and perhaps the origin of ideas and intelligence.

So from a Dawkins perspective here on earth,

the evolution of those and then the evolution

of the laws of physics that led to us.

I mean, it’s beautiful.

And then you stacking on top of that,

that maybe we are one of the offspring.

Right, okay, so before getting into where I’d like

to take that idea, let me just a little bit more groundwork.

There is this concept of the multiverse

and it can be confusing.

Different people use the word multiverse in different ways.

In the multiverse that I think is relevant to picture

when trying to grasp Lee Smolin’s idea,

essentially every vacuum fluctuation

can be referred to as a universe.

It occurs, it borrows energy from the vacuum

for some finite amount of time

and it evanesces back into the quantum vacuum.

And ideas of Guth before that and Andrei Linde

with eternal inflation aren’t that different

that you would expect nature

due to the quantum properties of the vacuum,

which we know exist, they’re measurable

through things like the Casimir effect and others.

You know that there are these fluctuations

that are occurring.

What Smolin is arguing is that there is

this extensive multiverse, that this universe,

what we can measure and interact with

is not unique in nature.

It’s just our residents, it’s where we reside.

And there are countless, potentially infinity

other universes, other entire evolutionary trajectories

that have evolved into things like

what you were mentioning a second ago

with different parameters and different ways

of achieving complexity and reproduction

and all that stuff.

So it’s not that the evolutionary process

is a funnel towards this end point, not at all.

Just like the biological evolutionary process

that has occurred within our universe

is not a unique route toward achieving

one specific chosen kind of species.

No, we have extraordinary diversity around us.

That’s what evolution does.

And for any one species like us,

you might feel like we’re at the center of this process.

We’re the destination of this process,

but we’re just one of the many

nearly infinite branches of this process.

And I suspect it is exactly infinite.

I mean, I just can’t understand how with this idea,

you can ever draw a boundary around it and say,

no, the universe, I mean, the multiverse

has 10 to the one quadrillion components,

but not infinity.

I don’t know that.

Well, yeah, I am cognitively incapable,

as I think all of us are,

of truly understanding the concept of infinity.

And the concept of nothing as well.

And nothing, but also the concept of a lot

is pretty difficult.

I can just, I can count.

I run out of fingers at a certain point

and then you’re screwed.

And when you’re wearing shoes

and you can’t even get down to your toes, it’s like.

It’s like, all right, a thousand fine, a million.

Is that what?

And then it gets crazier and crazier.

Right, right.

So this particular, so when we say technology, by the way,

I mean, there’s some, not to over romanticize the thing,

but there is some aspect about this branch of ours

that allows us to, for the universe to know itself.

Yes, yes.

So to be, to have like little conscious cognitive fingers

that are able to feel like to scratch the head.

Right, right, right.

To be able to construct E equals MC squared

and to introspect, to start to gain some understanding

of the laws that govern it.

Isn’t that, isn’t that kind of amazing?

Okay, I’m just human, but it feels like that,

if I were to build a system that does this kind of thing,

that evolves laws of physics, that evolves life,

that evolves intelligence, that my goal would be

to come up with things that are able to think about themselves.

Right, aren’t we kind of close to the design specs,

the destination?

We’re pretty close, I don’t know.

I mean, I’m spending my career designing things

that I hope will think about themselves,

so you and I aren’t too far apart on that one.

But then maybe that problem is a lot harder

than we imagine.

Maybe we need to.

Let’s not get, let’s not get too far

because I want to emphasize something that,

what you’re saying is, isn’t it fascinating

that the universe evolved something

that can be conscious, reflect on itself?

But Lee Smolin’s idea didn’t take us there, remember?

It took us to stars.

Lee Smolin has argued, I think,

right on almost every single way

that cosmological natural selection

could lead to a universe with rich structure.

And he argued that the structure,

the physics of our universe is designed

to make a lot of stars so that they can make black holes.

But that doesn’t explain what we’re doing here.

In order for that to be an explanation of us,

what you have to assume is that once you made that universe

that was capable of producing stars,

life, planets, all these other things,

we’re along for the ride.

They got lucky.

We’re kind of arising, growing up in the cracks,

but the universe isn’t here for us.

We’re still kind of a fluke in that picture.

And I can’t, I don’t necessarily have

like a philosophical opposition to that stance.

It’s just not, okay, so I don’t think it’s complete.

So it seems like whatever we got going on here to you,

it seems like whatever we have here on earth

seems like a thing you might want to select for

in this whole big process.

Exactly.

So if what you are truly,

if your entire evolutionary process

only cares about fecundity,

it only cares about making offspring universes

because then there’s gonna be the most of them

in that local region of hyperspace,

which is the set of all possible universes, let’s say.

You don’t care how those universes are made.

You know they have to be made by black holes.

This is what inflationary theory tells us.

The big bang tells us that black holes make universes.

But what if there was a technological means

to make universes?

Stars require a ton of matter

because they’re not thinking very carefully

about how you make a black hole.

They’re just using gravity, you know?

But if we devise technologies

that can efficiently compress matter into a singularity,

it turns out that if you can compress about 10 kilograms

into a very small volume,

that will make a black hole

that is highly probable to inflate

into its own offspring universe.

This is according to calculations done by other people

who are professional quantum theorists,

quantum field theorists,

and I hope I am grasping what they’re telling me correctly.

I am somewhat of a translator here.
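
For a sense of how small "a very small volume" is for 10 kilograms, the Schwarzschild radius r = 2GM/c^2 comes out around 1.5 x 10^-26 meters, far smaller than a proton; a quick check:

    G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
    c = 2.998e8        # speed of light, m/s
    M = 10.0           # kg, the mass mentioned above

    r_s = 2 * G * M / c**2
    print(f"Schwarzschild radius of 10 kg: {r_s:.2e} m")   # ~1.5e-26 m
    # For comparison, a proton is ~1e-15 m across and the Planck length is ~1.6e-35 m.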

But so that’s the position

that is particularly intriguing to me,

which is that what might have happened is that,

okay, this particular branch on the vast tree of evolution,

cosmological evolution that we’re talking about,

not biological evolution within our universe,

but cosmological evolution,

went through exactly the process

that Lee Smolin described,

got to the stage where stars were making lots of black holes

but then continued to evolve and somehow bridged that gap

and made intelligence and intelligence

capable of devising technologies

because technologies, intelligent species

working in conjunction with technologies

could then produce even more.

Yeah, more efficiently, faster and better

and more different.

Then you start to have different kind of mechanisms

and mutation perhaps, all that kind of stuff.

And so if you do a simple calculation that says,

all right, if I want to,

we know roughly how many core collapse supernovae

have resulted in black holes in our galaxy

since the beginning of the universe

and it’s something like a billion.

So then you would have to estimate

that it would be possible for a technological civilization

to produce more than a billion black holes

with the energy and matter at their disposal.

And so one of the calculations in that paper,

back of the envelope,

but I think revealing nonetheless is that

if you take a relatively common asteroid,

something that’s about a kilometer in diameter,

what I’m thinking of is just scrap material

laying around in our solar system

and break it up into 10 kilogram chunks

and turn each of those into a universe,

then you would have made at least a trillion black holes

outpacing the star production rate

by some three orders of magnitude.

That’s one asteroid.
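
That back-of-the-envelope can be reproduced with assumed values for the asteroid’s size and density; the conversation only specifies a roughly kilometer-scale body and 10-kilogram chunks, so the numbers below are illustrative.

    import math

    diameter = 1_000.0     # m, kilometer-scale asteroid (assumed)
    density  = 2.5e3       # kg/m^3, typical rocky material (assumed)
    chunk    = 10.0        # kg per would-be black hole, as discussed above

    volume = (4.0 / 3.0) * math.pi * (diameter / 2) ** 3
    mass   = density * volume
    chunks = mass / chunk
    stellar_black_holes = 1e9   # rough count quoted for our galaxy's history

    print(f"asteroid mass ~{mass:.1e} kg -> ~{chunks:.1e} ten-kilogram chunks")
    print(f"ratio to stellar black holes: ~{chunks / stellar_black_holes:.0f}x")
    # With these assumptions one asteroid yields on the order of 1e11 chunks,
    # two to three orders of magnitude beyond the ~1e9 stellar black holes;
    # a somewhat larger or denser body pushes this toward the trillion figure mentioned.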

So now if you envision an intelligent species

that would potentially have been devised initially

by humans, but then based on superconducting

optoelectronic networks, no doubt,

and they go out and populate,

they don’t have to fill the galaxy.

They just have to get out to the asteroid belt.

They could potentially dramatically outpace

the rate at which stars are producing offspring universes.

And then wouldn’t you expect that

that’s where we came from instead of a star?

Yeah, so you have to somehow become masters of gravity,

so like, or generate.

It doesn’t necessarily have to be gravity.

So stars make black holes with gravity,

but any force that can compactify matter

to produce a great enough energy density

can form a singularity.

It would not likely be gravity.

It’s the weakest force.

You’re more likely to use something like the technologies

that we’re developing for fusion, for example.

So I don’t know, the National Ignition Facility

recently blasted a pellet with 100 really bright lasers

and caused that to get dense enough

to engage in nuclear fusion.

So something more like that,

or a tokamak with a really hot plasma, I’m not sure.

Something, I don’t know exactly how it would be done.

I do like the idea of that,

especially just been reading a lot about gravitational waves

and the fact that us humans with our technological

capabilities, one of the most impressive

technological accomplishments of human history is LIGO,

being able to precisely detect gravitational waves.

I particularly find appealing the idea

that other alien civilizations from very far distances

communicate with gravity, with gravitational waves,

because as you become greater and greater master of gravity,

which seems way out of reach for us right now,

maybe that seems like a effective way of sending signals,

especially if your job is to manufacture black holes.

Right.

So that, so let me ask there,

whatever, I mean, broadly thinking,

because we tend to think other alien civilizations

would be very human like,

but if we think of alien civilizations out there

as basically generators of black holes,

however they do it, because they got stars,

do you think there’s a lot of them

in our particular universe out there?

In our universe?

Well, okay, let me ask, okay, this is great.

Let me ask a very generic question

and then let’s see how you answer it,

which is how many alien civilizations are out there?

If the hypothesis that I just described

is on the right track,

it would mean that the parameters of our universe

have been selected so that intelligent civilizations

will occur in sufficient numbers

so that if they reach something

like supreme technological maturity,

let’s define that as the ability to produce black holes,

then that’s not a highly improbable event.

It doesn’t need to happen often

because as I just described,

if you get one of them in a galaxy,

you’re gonna make more black holes

than the stars in that galaxy.

But there’s also not a super strong motivation,

well, it’s not obvious that you need them

to be ubiquitous throughout the galaxy.

Right.

One of the things that I try to emphasize in that paper

is that given this idea

of how our parameters might’ve been selected,

it’s clear that it’s a series of trade offs, right?

If you make, I mean, in order for intelligent life

of our variety or anything resembling us to occur,

you need a bunch of stuff, you need stars.

So that’s right back to Smolin’s roots of this idea,

but you also need water to have certain properties.

You need things like the rocky planets,

like the Earth to be within the habitable zone,

all these things that you start talking about

in the field of astrobiology,

trying to understand life in the universe,

but you can’t over emphasize,

you can’t tune the parameters so precisely

to maximize the number of stars

or to give water exactly the properties

or to make rocky planets like Earth the most numerous.

You have to compromise on all these things.

And so I think the way to test this idea

is to look at what parameters are necessary

for each of these different subsystems,

and I’ve laid out a few that I think are promising,

there could be countless others,

and see how changing the parameters

makes it more or less likely that stars would form

and have long lifetimes or that rocky planets

in the habitable zone are likely to form,

all these different things.

So we can test how much these things are in a tug of war

with each other, and the prediction would be

that we kind of sit at this central point

where if you move the parameters too much,

stars aren’t stable, or life doesn’t form,

or technology’s infeasible,

because life alone, at least the kind of life

that we know of, cannot make black holes.

We don’t have this, well, I’m speaking for myself,

you’re a very fit and strong person,

but it might be possible for you,

but not for me to compress matter.

So we need these technologies, but we don’t know,

we have not been able to quantify yet

how finely adjusted the parameters would need to be

in order for silicon to have the properties it does.

Okay, this is not directly speaking to what you’re saying,

you’re getting to the Fermi paradox,

which is where are they, where are the life forms out there,

how numerous are they, that sort of thing.

What I’m trying to argue is that

if this framework is on the right track,

a potentially correct explanation for our existence,

we, it doesn’t necessarily predict

that intelligent civilizations are just everywhere,

because even if you just get one of them in a galaxy,

which is quite rare, it could be enough

to dramatically increase the fecundity

of the universe as a whole.

Yeah, and I wonder, once you start generating

the offspring for universes, black holes,

how that has effect on the,

what kind of effect does it have

on the other candidate’s civilizations

within that universe?

Maybe it has a destructive aspect,

or there could be some arguments

about once you have a lot of offspring,

that that just quickly accelerates

to where the other ones can’t even catch up.

It could, but I guess if you want me

to put my chips on the table or whatever,

I think I come down more on the side

that intelligent life civilizations are rare.

And I guess I follow Max Tegmark here.

And also there’s a lot of papers coming out recently

in the field of astrobiology that are seeming to say,

all right, you just work through the numbers

on some modified Drake equation or something like that.

And it looks like it’s not improbable.

You shouldn’t be surprised that an intelligent species

has arisen in our galaxy,

but if you think there’s one the next solar system over,

it’s highly improbable.
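
The kind of modified Drake-equation arithmetic those papers run through can be sketched in a few lines; every factor below is an illustrative placeholder, chosen only to show how the product can land near one technological civilization per galaxy at any given time.

    # Classic Drake-equation form; all values are illustrative assumptions.
    R_star = 2.0       # star formation rate, stars per year
    f_p    = 0.9       # fraction of stars with planets
    n_e    = 0.2       # habitable planets per such system
    f_l    = 0.5       # fraction that develop life (microbes seem to come easily)
    f_i    = 1e-3      # fraction that develop intelligence (the rare step)
    f_c    = 0.5       # fraction that produce detectable technology
    L      = 1e4       # years a technological civilization remains detectable

    N = R_star * f_p * n_e * f_l * f_i * f_c * L
    print(f"civilizations coexisting in the galaxy right now: ~{N:.2f}")
    # With these placeholders N is of order one: our existence is not surprising,
    # but a neighbor in the next solar system over would be.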

So I can see that the number,

the probability of finding a civilization in a galaxy,

maybe it’s most likely that you’re gonna find

one to a hundred or something.

But okay, now it’s really important

to put a time window on that, I think,

because does that mean in the entire lifetime of the galaxy

before it, so in our case, before we run into Andromeda,

I think it’s highly probable, I shouldn’t say I think,

it’s tempting to believe that it’s highly probable

that in that entire lifetime of your galaxy,

you’re gonna get at least one intelligent species,

maybe thousands or something like that.

But it’s also, I think, a little bit naive to think

that they’re going to coincide in time

and we’ll be able to observe them.

And also, if you look at the span of life on Earth,

the Earth history, it was surprising to me

to kind of look at the amount of time,

first of all, the short amount of time,

there’s no life, it’s surprising.

Life sprang up pretty quickly.

It’s single cell.

But that’s the point I’m trying to make

is like so much of life on Earth

was just like single cell organisms, like most of it.

Most of it was like boring bacteria type of stuff.

Well, bacteria are fascinating, but I take your point.

No, I get it.

I mean, no offense to them.

But this kind of speaking from the perspective

of your paper of something that’s able

to generate technology as we kind of understand it,

that’s a very short moment in time

relative to that full history of life on Earth.

And maybe our universe is just saturated

with bacteria like humans.

Right.

But not the special extra AGI super humans,

that those are very rare.

And once those spring up, everything just goes to like,

it accelerates very quickly.

Yeah, we just don’t have enough data to really say,

but I find this whole subject extremely engaging.

I mean, there’s this concept,

I think it’s called the Rare Earth Hypothesis,

which is that basically stating that,

okay, microbes were here right away

after the Hadean era where we were being bombarded.

Well, after, yeah, bombarded by comets, asteroids,

things like that, and also after the moon formed.

So once things settled down a little bit,

in a few hundred million years,

you have microbes everywhere.

And it could have been, we don’t know exactly

when it could have been remarkably brief that that took.

So it does indicate that, okay,

life forms relatively easily.

I think that alone is sort of a checker on the scale

for the argument that the parameters that allow

even microbial life to form are not just a fluke.

But anyway, that aside, yes,

then there was this long, seemingly dormant period,

not truly dormant, things were happening,

important things were happening

for some two and a half billion years or something

after the metabolic process

that releases oxygen was developed.

Then basically the planet’s just sitting there,

getting more and more oxygenated,

more and more oxygenated until it’s enough

that you can build these large, complex organisms.

And so the Rare Earth Hypothesis would argue

that the microbes are common everywhere

in any planet that’s roughly in the habitable zone

and has some water on it, it’s probably gonna have those.

But then getting to this Cambrian explosion

that happened some between 500 and 600 million years ago,

that’s rare, you know?

And I buy that, I think that is rare.

So if you say how much life is in our galaxy,

I think that’s probably the right answer

is that microbes are everywhere.

Cambrian explosion is extremely rare.

And then, but the Cambrian explosion kind of went like that

where within a couple of tens or a hundred million years,

all of these body plans came into existence.

And basically all of the body plans

that are now in existence on the planet

were formed in that brief window

and we’ve just been shuffling around since then.

So then what caused humans to pop out of that?

I mean, that could be another extremely rare threshold

that a planet roughly in the habitable zone with water

is not guaranteed to cross, you know?

To me, it’s fascinating as an exercise in being humble:

if you look at the entirety of the system

that Lee Smolin and you paint,

humans cannot possibly be the most amazing thing

that process generates.

So like, if you look at the evolution,

what’s the equivalent in the cosmological evolution

and its selection for technology,

the equivalent of the human eye or the human brain?

Universes that are able to do some like,

they don’t need the damn stars.

They’re able to just do some incredible generation

of complexity fast, like much more than,

if you think about it,

it’s like most of our universe is pretty freaking boring.

There’s not much going on, there’s a few rocks flying around

and there’s some like apes

that are just like doing podcasts on some weird planet.

It just seems very inefficient.

If you think about like the amazing thing in the human eye,

the visual cortex can do, the brain, the nervous,

everything that makes us more powerful

than single cell organisms.

Like if there’s an equivalent of that for universes,

like the richness of physics

that could be expressed

through a particular set of parameters.

Like, I mean, like for me,

I’m a sort of from a computer science perspective,

huge fan of cellular automata,

which is a nice sort of pretty visual way

to illustrate how different laws

can result in drastically different levels of complexity.
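
A minimal elementary cellular automaton makes the point concrete: the same update machinery run with rule 110, which is known to be Turing complete, versus a simpler rule produces wildly different levels of structure from one seed cell.

    def step(cells, rule):
        """One update of a 1D, two-state, three-neighbor cellular automaton."""
        n = len(cells)
        return [
            (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
            for i in range(n)
        ]

    def run(rule, width=64, steps=24):
        cells = [0] * width
        cells[width // 2] = 1              # single seed cell
        for _ in range(steps):
            print("".join("#" if c else "." for c in cells))
            cells = step(cells, rule)

    run(254)   # a simple rule: a solid expanding triangle, no interesting structure
    run(110)   # rule 110: intricate interacting structures, Turing complete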

So like, it’s like, yeah, okay.

So we’re all like celebrating,

look, our little cellular automata

is able to generate pretty triangles and squares

and therefore we achieve general intelligence.

And then there’ll be like some badass Chuck Norris type,

like universal Turing machine type of cellular automata.

They’re able to generate other cellular automata

that does any arbitrary level of computation off the bat.

Like those have to then exist.

And then we’re just like, we’ll be forgotten.

This story, this podcast just entertains

a few other apes for a few months.

Well, I’m kind of surprised to hear your cynicism.

No, I’m very up.

I usually think of you as like one who celebrates humanity

and all its forms and things like that.

And I guess I just, I don’t,

I see it the way you just described.

I mean, okay, we’ve been here for 13.7 billion years

and you’re saying, gosh, that’s a long time.

Let’s get on with the show already.

Some other universe could have kicked our butt by now,

but that’s putting a characteristic time.

I mean, why is 13.7 billion a long time?

I mean, compared to what?

I guess, so when I look at our universe,

I see this extraordinary hierarchy

that has developed over that time.

So at the beginning, it was a chaotic mess of some plasma

and nothing interesting going on there.

And even for the first stars to form,

that a lot of really interesting evolutionary processes

had to occur, by evolutionary in that sense,

I just mean taking place over extended periods of time

and structures are forming then.

And then it took that first generation of stars

in order to produce the metals

that then can more efficiently produce

another generation of stars.

We’re only the third generation of stars.

So we might still be pretty quick to the game here.

So, but I don’t think, I don’t, okay.

So then you have these stars

and then you have solar systems on those solar systems.

You have rocky worlds, you have gas giants,

like all this complexity.

And then you start getting life

and the complexity that’s evolved

through the evolutionary process in life forms

is just, it’s not a let down to me.

Just seeing that.

Some of it is like some of the planets is like icy,

it’s like different flavors of ice cream.

They’re icy, but there might be water underneath.

All kinds of life forms with some volcanoes,

all kinds of weird stuff.

No, no, I don’t, I think it’s beautiful.

I think our life is beautiful.

And I think it was designed that by design,

the scarcity of the whole thing.

I think mortality, as terrifying as it is,

is fundamental to the whole reason we enjoy everything.

No, I think it’s beautiful.

I just think that all of us conscious beings

in the grand scheme of basically every scale

will be completely forgotten.

Well, that’s true.

I think everything is transient

and that would go back to maybe something more like Lao Tzu,

the Tao Te Ching or something where it’s like,

yes, there is nothing but change.

There is nothing but emergence and dissolve and that’s it.

But I just, in this picture,

this hierarchy that’s developed,

I don’t mean to say that now it gets to us

and that’s the pinnacle.

In fact, I think at a high level,

the story I’m trying to tease out in my research is about,

okay, well, so then what’s the next level of hierarchy?

And if it’s, okay, we’re kind of pretty smart.

I mean, talking about people like Lee Smolin

and Alan Guth, Max Tegmark, okay, we’re really smart.

Talking about me, okay, we’re kind of,

we can find our way to the grocery store or whatever,

but what’s next?

I mean, what if there’s another level of hierarchy

that grows on top of us

that is even more profoundly capable?

And I mean, we’ve talked a lot

about superconducting sensors.

Imagine these cognitive systems far more capable than us

residing somewhere else in the solar system

off of the surface of the earth,

where it’s much darker, much colder,

much more naturally suited to them.

And they have these sensors that can detect single photons

of light from radio waves out to all across the spectrum

of the gamma rays and just see the whole universe.

And they just live in space

with these massive collection optics so that they,

what do they do?

They just look out and experience that vast array

of what’s being developed.

And if you’re such a system,

presumably you would do some things for fun.

And the kind of fun thing I would do

as somebody who likes video games

is I would create and maintain

and observe something like earth.

So in some sense, we’re all like players on a stage

for this superconducting cold computing system out there.

I mean, all of this is fascinating to think.

The fact that you’re actually designing systems

here on earth that are trying to push this technology

to the very cutting edge, and also thinking about

how the evolution of physical laws

leads us to the way we are, is fascinating.

That coupling is fascinating.

It’s like the ultimate rigorous application of philosophy

to the rigorous application of engineering.

So Jeff, you’re one of the most fascinating people.

I’m so glad I did not know much about you

except through your work.

And I’m so glad we got this chance to talk.

You’re one of the best explainers

of exceptionally difficult concepts.

And you’re also, speaking of like fractal,

you’re able to function intellectually

at all levels of the stack, which I deeply appreciate.

This was really fun.

You’re a great educator, a great scientist.

It’s an honor that you would spend

your valuable time with me.

It’s an honor that you would spend your time with me as well.

Thanks, Jeff.

Thanks for listening to this conversation

with Jeff Shainline.

To support this podcast,

please check out our sponsors in the description.

And now let me leave you with some words

from the great John Carmack,

who surely will be a guest on this podcast soon.

Because of the nature of Moore’s Law,

anything that an extremely clever graphics programmer

can do at one point can be replicated

by a merely competent programmer

some number of years later.

Thank you for listening and hope to see you next time.
