Lex Fridman Podcast - #162 - Jim Keller: The Future of Computing, AI, Life, and Consciousness

The following is a conversation with Jim Keller,

his second time on the podcast.

Jim is a legendary microprocessor architect

and is widely seen as one of the greatest

engineering minds of the computing age.

In a peculiar twist of space time in our simulation,

Jim is also a brother-in-law of Jordan Peterson.

We talk about this and about computing,

artificial intelligence, consciousness, and life.

Quick mention of our sponsors.

Athletic Greens All In One Nutrition Drink,

Brooklinen Sheets, ExpressVPN,

and Belcampo Grass Fed Meat.

Click the sponsor links to get a discount

and to support this podcast.

As a side note, let me say that Jim is someone who,

on a personal level, inspired me to be myself.

There was something in his words, on and off the mic,

or perhaps that he even paid attention to me at all,

that almost told me, you’re all right, kid.

A kind of pat on the back that can make the difference

between a mind that flourishes

and a mind that is broken down

by the cynicism of the world.

So I guess that’s just my brief few words

of thank you to Jim, and in general,

gratitude for the people who have given me a chance

on this podcast, in my work, and in life.

If you enjoy this thing, subscribe on YouTube,

review on Apple Podcast, follow on Spotify,

support on Patreon, or connect with me

on Twitter, @lexfridman.

And now, here’s my conversation with Jim Keller.

What’s the value and effectiveness

of theory versus engineering, this dichotomy,

in building good software or hardware systems?

Well, good design is both.

I guess that’s pretty obvious.

By engineering, do you mean reduction to practice

of known methods?

And then science is the pursuit of discovering things

that people don’t understand.

Or solving unknown problems.

Definitions are interesting here,

but I was thinking more in theory,

constructing models that kind of generalize

about how things work.

And engineering is actually building stuff.

The pragmatic, like, okay, we have these nice models,

but how do we actually get things to work?

Maybe economics is a nice example.

Like, economists have all these models

of how the economy works,

and how different policies will have an effect,

but then there’s the actual, okay,

let’s call it engineering,

of like, actually deploying the policies.

So computer design is almost all engineering.

And reduction to practice of known methods.

Now, because of the complexity of the computers we built,

you know, you could think you’re,

well, we’ll just go write some code,

and then we’ll verify it, and then we’ll put it together,

and then you find out that the combination

of all that stuff is complicated.

And then you have to be inventive

to figure out how to do it, right?

So that definitely happens a lot.

And then, every so often, some big idea happens.

But it might be one person.

And that idea is in the space of engineering,

or is it in the space of…

Well, I’ll give you an example.

So one of the limits of computer performance

is branch prediction.

So, and there’s a whole bunch of ideas

about how well you could predict a branch.

And people said, there’s a limit to it,

it’s an asymptotic curve.

And somebody came up with a better way

to do branch prediction, it was a lot better.

And he published a paper on it,

and every computer in the world now uses it.

And it was one idea.

So the engineers who build branch prediction hardware

were happy to drop the one kind of training array

and put it in another one.

So it was a real idea.
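
The conversation does not name the technique Jim has in mind, so the following is only a minimal sketch of the general idea of branch prediction, assuming the textbook two-bit saturating counter scheme. The class name, table size, and branch address are hypothetical and chosen for illustration. A loop branch that is taken nine times and then falls through is predicted correctly about 90% of the time, while a strictly alternating branch defeats this simple scheme entirely, which is the kind of ceiling a better predictor is meant to break through.

```python
# Illustrative sketch only: a table of two-bit saturating counters.
# This is NOT the specific predictor discussed in the conversation.


class TwoBitPredictor:
    def __init__(self, table_size=1024):
        self.table_size = table_size
        # Counter values 0..3: 0-1 predict "not taken", 2-3 predict "taken".
        self.counters = [1] * table_size

    def _index(self, branch_address):
        # Hypothetical indexing: low bits of the branch address.
        return branch_address % self.table_size

    def predict(self, branch_address):
        return self.counters[self._index(branch_address)] >= 2

    def update(self, branch_address, taken):
        i = self._index(branch_address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(predictor, branch_address, outcomes):
    """Fraction of outcomes predicted correctly, training as we go."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(branch_address) == taken:
            correct += 1
        predictor.update(branch_address, taken)
    return correct / len(outcomes)


# A loop branch: taken 9 times, then not taken once, repeated 100 times.
loop_outcomes = ([True] * 9 + [False]) * 100
loop_accuracy = accuracy(TwoBitPredictor(), 0x400, loop_outcomes)

# A branch that strictly alternates taken / not taken.
alternating_outcomes = [True, False] * 500
alternating_accuracy = accuracy(TwoBitPredictor(), 0x400, alternating_outcomes)

print(f"loop branch accuracy:        {loop_accuracy:.3f}")
print(f"alternating branch accuracy: {alternating_accuracy:.3f}")
```

Swapping in a different `predict`/`update` pair while keeping the same interface mirrors the point made above: the surrounding hardware stays, and one training array is replaced by another.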

And branch prediction is one of the key problems

underlying all of sort of the lowest level of software.

It boils down to branch prediction.

Boils down to uncertainty.

Computers are limited by…

Single thread computer is limited by two things.

The predictability of the path of the branches

and the predictability of the locality of data.

So we have predictors that now predict

both of those pretty well.

So memory is a couple hundred cycles away,

local cache is a couple cycles away.

When you’re executing fast,

virtually all the data has to be in the local cache.

So a simple program says,

add one to every element in an array,

it’s really easy to see what the stream of data will be.

But you might have a more complicated program

that says, get an element of this array,

look at something, make a decision,

go get another element, it’s kind of random.

And you can think, that’s really unpredictable.

And then you make this big predictor

that looks at this kind of pattern and you realize,

well, if you get this data and this data,

then you probably want that one.

And if you get this one and this one and this one,

you probably want that one.
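
A rough sketch of the "if you get this data and this data, then you probably want that one" idea, assuming a simple history table keyed on the last two addresses. This is an illustrative toy, not a description of any real prefetcher; the class name, the two-address history length, and the address trace are all hypothetical.

```python
# Illustrative sketch only: predict the next address from the last two seen.


class PairHistoryPrefetcher:
    """Remembers which address followed each pair of consecutive addresses."""

    def __init__(self):
        self.table = {}    # (addr_a, addr_b) -> address that came next
        self.history = ()  # up to the two most recent addresses

    def predict(self):
        # Returns None when this pair of addresses has not been seen before.
        return self.table.get(self.history)

    def observe(self, address):
        if len(self.history) == 2:
            self.table[self.history] = address
        self.history = (self.history + (address,))[-2:]


def hit_rate(prefetcher, addresses):
    """Fraction of accesses that the prefetcher predicted in advance."""
    hits = 0
    for address in addresses:
        if prefetcher.predict() == address:
            hits += 1
        prefetcher.observe(address)
    return hits / len(addresses)


# A sequence that looks irregular (no fixed stride) but repeats.
pattern = [0x10, 0x98, 0x24, 0x70, 0x3C]
trace = pattern * 50
rate = hit_rate(PairHistoryPrefetcher(), trace)

print(f"predicted {rate:.1%} of accesses")
```

Once the repeating pattern has been seen, nearly every access is predicted, even though no single address follows the previous one by a constant offset.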

And is that theory or is that engineering?

Like the paper that was written,

was it asymptotic kind of discussion

or is it more like, here’s a hack that works well?

It’s a little bit of both.

Like there’s information theory in it, I think somewhere.

Okay, so it’s actually trying to prove some kind of stuff.

But once you know the method,

implementing it is an engineering problem.

Now there’s a flip side of this,

which is in a big design team,

what percentage of people think

their plan or their life’s work is engineering

versus inventing things?

So lots of companies will reward you for filing patents.

Some, many big companies get stuck

because to get promoted,

you have to come up with something new.

And then what happens is everybody’s trying

to do some random new thing,

99% of which doesn’t matter.

And the basics get neglected.

Or there’s a dichotomy, they think like the cell library

and the basic CAD tools or basic software validation methods,

that’s simple stuff.

They wanna work on the exciting stuff.

And then they spend lots of time

trying to figure out how to patent something.

And that’s mostly useless.

But the breakthrough is on simple stuff.

No, no, you have to do the simple stuff really well.

If you’re building a building out of bricks,

you want great bricks.

So you go to two places that sell bricks.

So one guy says, yeah, they’re over there in an ugly pile.

And the other guy is like lovingly tells you

about the 50 kinds of bricks and how hard they are

and how beautiful they are and how square they are.

Which one are you gonna buy bricks from?

Which is gonna make a better house?

So you’re talking about the craftsman,

the person who understands bricks,

who loves bricks, who loves the varieties.

That’s a good word.

Good engineering is great craftsmanship.

And when you start thinking engineering is about invention

and you set up a system that rewards invention,

the craftsmanship gets neglected.

Okay, so maybe one perspective is the theory,

the science overemphasizes invention

and engineering emphasizes craftsmanship.

And therefore, so it doesn’t matter what you do,

theory, engineering. Well, everybody does.

Like, read the tech rags; they’re always talking

about some breakthrough or innovation

and everybody thinks that’s the most important thing.

But the number of innovative ideas

is actually relatively low.

We need them, right?

And innovation creates a whole new opportunity.

Like when some guy invented the internet, right?

Like that was a big thing.

The million people that wrote software against that

were mostly doing engineering software writing.

So the elaboration of that idea was huge.

I don’t know if you know Brendan Eich,

he wrote JavaScript in 10 days.

That’s an interesting story.

It makes me wonder, and it was famously for many years

considered to be a pretty crappy programming language.

Still is perhaps.

It’s been improving sort of consistently.

But the interesting thing about that guy is,

you know, he doesn’t get any awards.

You don’t get a Nobel Prize or a Fields Medal or.

For inventing a crappy piece of, you know, software code.

That is currently the number one programming language

in the world and runs,

now is increasingly running the backend of the internet.

Well, does he know why everybody uses it?

Like that would be an interesting thing.

Was it the right thing at the right time?

Cause like when stuff like JavaScript came out,

like there was a move from, you know,

writing C programs and C++ to what they call

managed code frameworks,

where you write simple code, it might be interpreted,

it has lots of libraries, productivity is high,

and you don’t have to be an expert.

So, you know, Java was supposed to solve

all the world’s problems.

It was complicated.

JavaScript came out, you know,

after a bunch of other scripting languages.

I’m not an expert on it.

But was it the right thing at the right time?

Or was there something, you know, clever?

Cause he wasn’t the only one.

There’s a few elements.

And maybe if he figured out what it was,

then he’d get a prize.

Like that.

Yeah, you know, maybe his problem is he hasn’t defined this.

Or he just needs a good promoter.

Well, I think there was a bunch of blog posts

written about it, which is like,

wrong is right, which is like doing the crappy thing fast.

Just like hacking together the thing

that answers some of the needs.

And then iterating over time, listening to developers.

Like listening to people who actually use the thing.

This is something you can do more in software.

But the right time, like you have to sense,

you have to have a good instinct

of when is the right time for the right tool.

And make it super simple.

And just get it out there.

The problem is, this is true with hardware.

This is less true with software.

Is there’s backward compatibility

that just drags behind you as, you know,

as you try to fix all the mistakes of the past.

But the timing.

It was good.

There’s something about that.

And it wasn’t accidental.

You have to like give yourself over to the,

you have to have this like broad sense

of what’s needed now.

Both scientifically and like the community.

And just like this, it was obvious that there was no,

the interesting thing about JavaScript

is everything that ran in the browser at the time,

like Java and I think other like Scheme,

other programming languages,

they were all in a separate external container.

And then JavaScript was literally

just injected into the webpage.

It was the dumbest possible thing

running in the same thread as everything else.

And like it was inserted as a comment.

So JavaScript code is inserted as a comment in the HTML code.

And it was, I mean, there’s,

it’s either genius or super dumb, but it’s like.

Right, so it had no apparatus for like a virtual machine

and container, it just executed in the framework

of the program that’s already running.

Yeah, that’s cool.

And then because something about that accessibility,

the ease of its use resulted in then developers innovating

of how to actually use it.

I mean, I don’t even know what to make of that,

but it does seem to echo across different software,

like stories of different software.

PHP has the same story, really crappy language.

They just took over the world.

I always have a joke that the random length instructions,

variable length instructions, they always won,

even though they’re obviously worse.

Like nobody knows why.

x86 is arguably the worst architecture on the planet.

It’s one of the most popular ones.

Well, I mean, isn’t that also the story of RISC versus,

I mean, is that simplicity?

There’s something about simplicity that us

in this evolutionary process is valued.

If it’s simple, it spreads faster, it seems like.

Or is that not always true?

Not always true.

Yeah, it could be simple is good, but too simple is bad.

So why did RISC win, you think, so far?

Did RISC win?

In the long arc of history.

We don’t know.

So who’s gonna win?

What’s RISC, what’s CISC, and who’s gonna win in that space

in these instruction sets?

AI software’s gonna win, but there’ll be little computers

that run little programs like normal all over the place.

But we’re going through another transformation, so.

But you think instruction sets underneath it all will change?

Yeah, they evolve slowly.

They don’t matter very much.

They don’t matter very much, okay.

I mean, the limits of performance are predictability

of instructions and data.

I mean, that’s the big thing.

And then the usability of it is some quality of design,

quality of tools, availability.

Like right now, x86 is proprietary with Intel and AMD,

but they can change it any way they want independently.

ARM is proprietary to ARM,

and they won’t let anybody else change it.

So it’s like a sole point.

And RISC-V is open source, so anybody can change it,

which is super cool.

But that also might mean it gets changed

too many random ways that there’s no common subset of it

that people can use.

Do you like open or do you like closed?

Like if you were to bet all your money on one

or the other, RISC-V versus it?

No idea.

It’s case dependent?

Well, x86, oddly enough, when Intel first started

developing it, they licensed like seven people.

So it was the open architecture.

And then they moved faster than others

and also bought one or two of them.

But there was seven different people making x86

because at the time there was 6502 and Z80s and 8086.

And you could argue everybody thought Z80

was the better instruction set,

but that was proprietary to one place.

Oh, and the 6800.

So there’s like four or five different microprocessors.

Intel went open, got the market share

because people felt like they had multiple sources from it,

and then over time it narrowed down to two players.

So why, you as a historian, why did Intel win for so long

with their processors?

I mean, I mean.

They were great.

Their process development was great.

Oh, so it’s just looking back to JavaScript

and what I like is Microsoft and Netscape

and all these internet browsers.

Microsoft won the browser game

because they aggressively stole other people’s ideas

like right after they did it.

You know, I don’t know

if Intel was stealing other people’s ideas.

They started making.

In a good way, stealing in a good way just to clarify.

They started making RAMs, random access memories.

And then at the time

when the Japanese manufacturers came up,

you know, they were getting out competed on that

and they pivoted to microprocessors

and they made the first, you know,

integrated microprocessor that ran programs.

It was the 4004 or something.

Who was behind that pivot?

That’s a hell of a pivot.

Andy Grove and he was great.

That’s a hell of a pivot.

And then they led semiconductor industry.

Like they were just a little company, IBM,

all kinds of big companies had boatloads of money

and they out innovated everybody.

Out innovated, okay.

Yeah, yeah.

So it’s not like marketing, it’s not any of that stuff.

Their processor designs were pretty good.

I think the, you know, Core 2 was probably the first one

I thought was great.

It was a really fast processor and then Haswell was great.

What makes a great processor in that?

Oh, if you just look at it,

it’s performance versus everybody else.

It’s, you know, the size of it, the usability of it.

So it’s not specific,

some kind of element that makes you beautiful.

It’s just like literally just raw performance.

Is that how you think about processors?

It’s just like raw performance?

Of course.

It’s like a horse race.

The fastest one wins.


You don’t care how.

Just as long as it wins.

Well, there’s the fastest in the environment.

Like, you know, for years you made the fastest one you could

and then people started to have power limits.

So then you made the fastest at the right power point.

And then when we started doing multi processors,

like if you could scale your processors

more than the other guy,

you could be 10% faster on like a single thread,

but you have more threads.

So there’s lots of variability.

And then ARM really explored,

like, you know, they have the A series

and the R series and the M series,

like a family of processors

for all these different design points

from like unbelievably small and simple.

And so then when you’re doing the design,

it’s sort of like this big palette of CPUs.

Like they’re the only ones with a credible,

you know, top to bottom palette.

What do you mean a credible top to bottom?

Well, there’s people who make microcontrollers

that are small, but they don’t have a fast one.

There’s people who make fast processors,

but don’t have a medium one or a small one.

Is that hard to do that full palette?

That seems like a…

Yeah, it’s a lot of different.

So what’s the difference in the ARM folks and Intel

in terms of the way they’re approaching this problem?

Well, Intel, almost all their processor designs

were, you know, very custom high end,

you know, for the last 15, 20 years.

So the fastest horse possible.


In one horse race.

Yeah, and then architecturally they’re really good,

but the company itself was fairly insular

to what’s going on in the industry with CAD tools and stuff.

And there’s this debate about custom design

versus synthesis and how do you approach that?

I’d say Intel was slow on getting to synthesized processors.

ARM came in from the bottom and they generated IP,

which went to all kinds of customers.

So they had very little say

on how the customer implemented their IP.

So ARM is super friendly to the synthesis IP environment.

Whereas Intel said,

we’re gonna make this great client chip or server chip

with our own CAD tools, with our own process,

with our own, you know, other supporting IP

and everything only works with our stuff.

So is that, is ARM winning the mobile platform space

in terms of process?


And so in that, what you’re describing

is why they’re winning.

Well, they had lots of people doing lots

of different experiments.

So they controlled the processor architecture and IP,

but they let people put in lots of different chips.

And there was a lot of variability in what happened there.

Whereas Intel, when they made their mobile,

their foray into mobile,

they had one team doing one part, right?

So it wasn’t 10 experiments.

And then their mindset was PC mindset,

Microsoft software mindset.

And that brought a whole bunch of things along

that the mobile world and the embedded world don’t do.

Do you think it was possible for Intel to pivot hard

and win the mobile market?

That’s a hell of a difficult thing to do, right?

For a huge company to just pivot.

I mean, it’s so interesting to,

because we’ll talk about your current work.

It’s like, it’s clear that PCs were dominating

for several decades, like desktop computers.

And then mobile, it’s unclear.

It’s a leadership question.

Like Apple under Steve Jobs, when he came back,

they pivoted multiple times.

You know, they built iPads and iTunes and phones

and tablets and great Macs.

Like who knew computers should be made out of aluminum?

Nobody knew that.

But they’re great.

It’s super fun.

That was Steve?

Yeah, Steve Jobs.

Like they pivoted multiple times.

And you know, the old Intel, they did that multiple times.

They made DRAMs and processors and processes

and I gotta ask this,

what was it like working with Steve Jobs?

I didn’t work with him.

Did you interact with him?


I said hi to him twice in the cafeteria.

What did he say?


He said, hey fellas.

He was friendly.

He was wandering around and with somebody,

he couldn’t find a table because the cafeteria was packed

and I gave him my table.

But I worked for Mike Culbert who talked to,

like Mike was the unofficial CTO of Apple

and a brilliant guy and he worked for Steve for 25 years,

maybe more and he talked to Steve multiple times a day

and he was one of the people who could put up with Steve’s,

let’s say, brilliance and intensity

and Steve really liked him and Steve trusted Mike

to translate the shit he thought up

into engineering products that work

and then Mike ran a group called Platform Architecture

and I was in that group.

So many times I’d be sitting with Mike

and the phone would ring and it’d be Steve

and Mike would hold the phone like this

because Steve would be yelling about something or other.

And then he would translate.

And he’d translate and then he would say,

Steve wants us to do this.


Was Steve a good engineer or no?

I don’t know.

He was a great idea guy.

Idea person.

And he’s a really good selector for talent.

Yeah, that seems to be one of the key elements

of leadership, right?

And then he was a really good first principles guy.

Like somebody would say something couldn’t be done

and he would just think, that’s obviously wrong, right?

But you know, maybe it’s hard to do.

Maybe it’s expensive to do.

Maybe we need different people.

You know, there’s like a whole bunch of,

if you want to do something hard,

you know, maybe it takes time.

Maybe you have to iterate.

There’s a whole bunch of things you could think about

but saying it can’t be done is stupid.

How would you compare?

So it seems like Elon Musk is more engineering centric

but is also, I think he considers himself a designer too.

He has a design mind.

Steve Jobs feels like he’s much more idea space,

design space versus engineering.

Just make it happen.

Like the world should be this way.

Just figure it out.

But he used computers.

You know, he had computer people talk to him all the time.

Like Mike was a really good computer guy.

He knew what computers could do.

Computer meaning computer hardware?

Like hardware, software, all the pieces.

And then he would have an idea about

what could we do with this next.

That was grounded in reality.

It wasn’t like he was just finger painting on the wall

and wishing somebody would interpret it.

So he had this interesting connection

because he wasn’t a computer architect or designer

but he had an intuition from the computers we had

to what could happen.

And it’s interesting you say intuition

because it seems like he was pissing off a lot of engineers

in his intuition about what can and can’t be done.

Those, like the, what is all these stories

about like floppy disks and all that kind of stuff.

Yeah, so in Steve, the first round,

like he’d go into a lab and look at what’s going on

and hate it and fire people or ask somebody

in the elevator what they’re doing for Apple.

And not be happy.

When he came back, my impression was

that he surrounded himself

with a relatively small group of people

and didn’t really interact outside of that as much.

And then the joke was you’d see like somebody moving

a prototype through the quad with a black blanket over it.

And that was because it was secret, partly from Steve

because they didn’t want Steve to see it until it was ready.

Yeah, the dynamic with Jony Ive and Steve is interesting.

It’s like you don’t wanna,

he ruins as many ideas as he generates.

Yeah, yeah.

It’s a dangerous kind of line to walk.

If you have a lot of ideas,

like Gordon Bell was famous for ideas, right?

And it wasn’t that the percentage of good ideas

was way higher than anybody else.

It was, he had so many ideas

and he was also good at talking to people about it

and getting the filters right.

And seeing through stuff.

Whereas Elon was like, hey, I wanna build rockets.

So Steve would hire a bunch of rocket guys

and Elon would go read rocket manuals.

So Elon is a better engineer, in a sense like,

or like more like a love and passion for the manuals.

And the details.

The details, the craftsmanship too, right?

Well, I guess Steve had craftsmanship too,

but of a different kind.

What do you make of the,

just to stay in there for just a little longer,

what do you make of like the anger

and the passion and all of that?

The firing and the mood swings and the madness,

the being emotional and all of that, that’s Steve.

And I guess Elon too.

So what, is that a bug or a feature?

It’s a feature.

So there’s a graph, which is Y axis productivity,

X axis at zero is chaos,

and infinity is complete order, right?

So as you go from the origin,

as you improve order, you improve productivity.

And at some point, productivity peaks,

and then it goes back down again.

Too much order, nothing can happen.


But the question is, how close to the chaos is that?

No, no, no, here’s the thing,

is once you start moving in the direction of order,

the force vector to drive you towards order is unstoppable.

Oh, so it’s a slippery slope.

And every organization will move to the place

where their productivity is stymied by order.

So you need a…

So the question is, who’s the counter force?

Because it also feels really good.

As you get more organized, the productivity goes up.

The organization feels it, they orient towards it, right?

They hired more people.

They got more guys who could run process,

you get bigger, right?

And then inevitably, the organization gets captured

by the bureaucracy that manages all the processes.


All right, and then humans really like that.

And so if you just walk into a room and say,

guys, love what you’re doing,

but I need you to have less order.

If you don’t have some force behind that,

nothing will happen.

I can’t tell you on how many levels that’s profound, so.

So that’s why I’d say it’s a feature.

Now, could you be nicer about it?

I don’t know, I don’t know any good examples

of being nicer about it.

Well, the funny thing is to get stuff done,

you need people who can manage stuff and manage people,

because humans are complicated.

They need lots of care and feeding that you need

to tell them they look nice and they’re doing good stuff

and pat them on the back, right?

I don’t know, you tell me, is that needed?

Oh yeah.

Do humans need that?

I had a friend, he started to manage a group and he said,

I figured it out.

You have to praise them before they do anything.

I was waiting until they were done.

And they were always mad at me.

Now I tell them what a great job they’re doing

while they’re doing it.

But then you get stuck in that trap,

because then when they’re not doing something,

how do you confront these people?

I think a lot of people that had trauma

in their childhood would disagree with you,

successful people, that you need to first do the rough stuff

and then be nice later.

I don’t know.

Okay, but engineering companies are full of adults

who had all kinds of range of childhoods.

You know, most people had okay childhoods.

Well, I don’t know if…

Lots of people only work for praise, which is weird.

You mean like everybody.

I’m not that interested in it, but…

Well, you’re probably looking for somebody’s approval.

Even still.

Yeah, maybe.

I should think about that.

Maybe somebody who’s no longer with us kind of thing.

I don’t know.

I used to call up my dad and tell him what I was doing.

He was very excited about engineering and stuff.

You got his approval?

Uh, yeah, a lot.

I was lucky.

Like, he decided I was smart and unusual as a kid

and that was okay when I was really young.

So when I did poorly in school, I was dyslexic.

I didn’t read until I was third or fourth grade.

They didn’t care.

My parents were like, oh, he’ll be fine.

So I was lucky.

That was cool.

Is he still with us?

You miss him?

Sure, yeah.

He had Parkinson’s and then cancer.

His last 10 years were tough and it killed him.

Killing a man like that’s hard.

The mind?

Well, it’s pretty good.

Parkinson’s causes slow dementia

and the chemotherapy, I think, accelerated it.

But it was like hallucinogenic dementia.

So he was clever and funny and interesting

and it was pretty unusual.

Do you remember conversations?

From that time?

Like, do you have fond memories of the guy?

Yeah, oh yeah.

Anything come to mind?

A friend told me one time I could draw a computer

on the whiteboard faster than anybody he’d ever met.

I said, you should meet my dad.

Like, when I was a kid, he’d come home and say,

I was driving by this bridge and I was thinking about it

and he pulled out a piece of paper

and he’d draw the whole bridge.

He was a mechanical engineer.

And he would just draw the whole thing

and then he would tell me about it

and then tell me how he would have changed it.

And he had this idea that he could understand

and conceive anything.

And I just grew up with that, so that was natural.

So when I interview people, I ask them to draw a picture

of something they did on a whiteboard

and it’s really interesting.

Like, some people draw a little box

and then they’ll say, and then this talks to this

and I’ll be like, oh, this is frustrating.

I had this other guy come in one time, he says,

well, I designed a floating point in this chip

but I’d really like to tell you how the whole thing works

and then tell you how the floating point works inside of it.

Do you mind if I do that?

And he covered two whiteboards in like 30 minutes

and I hired him.

Like, he was great.

This is the craftsman.

I mean, that’s the craftsmanship to that.

Yeah, but also the mental agility

to understand the whole thing,

put the pieces in context,

real view of the balance of how the design worked.

Because if you don’t understand it properly,

when you start to draw it,

you’ll fill up half the whiteboard

with like a little piece of it

and like your ability to lay it out in an understandable way

takes a lot of understanding, so.

And be able to, so zoom into the detail

and then zoom out to the big picture.

Zoom out really fast.

What about the impossible thing?

You see, your dad believed that you can do anything.

That’s a weird feature for a craftsman.


It seems that that echoes in your own behavior.

Like that’s the.

Well, it’s not that anybody can do anything right now, right?

It’s that if you work at it, you can get better at it

and there might not be a limit.

And he did funny things like,

like he always wanted to play piano.

So at the end of his life, he started playing the piano

when he had Parkinson’s and he was terrible.

But he thought if he really worked at it in this life,

maybe the next life he’d be better at it.

He might be onto something.

Yeah, he enjoyed doing it.


It’s pretty funny.

Do you think the perfect is the enemy of the good

in hardware and software engineering?

It’s like we were talking about JavaScript a little bit

and the messiness of the 10 day building process.

Yeah, you know, creative tension, right?

So creative tension is you have two different ideas

that you can’t do both, right?

And, but the fact that you wanna do both

causes you to go try to solve that problem.

That’s the creative part.

So if you’re building computers,

like some people say we have the schedule

and anything that doesn’t fit in the schedule we can’t do.


And so they throw out the perfect

because they have a schedule.

I hate that.

Then there’s other people who say

we need to get this perfectly right.

And no matter what, you know, more people, more money,


And there’s a really clear idea about what you want.

Some people are really good at articulating it, right?

So let’s call that the perfect, yeah.


All right, but that’s also terrible

because they never ship anything.

You never hit any goals.

So now you have your framework.


You can’t throw out stuff

because you can’t get it done today

because maybe you’ll get it done tomorrow

or the next project, right?

You can’t, so you have to,

I work with a guy that I really like working with,

but he over filters his ideas.

Over filters?

He’d start thinking about something

and as soon as he figured out what was wrong with it,

he’d throw it out.

And then I start thinking about it

and you come up with an idea

and then you find out what’s wrong with it.

And then you give it a little time to set

because sometimes you figure out how to tweak it

or maybe that idea helps some other idea.

So idea generation is really funny.

So you have to give your ideas space.

Like spaciousness of mind is key.

But you also have to execute programs and get shit done.

And then it turns out computer engineering is fun

because it takes 100 people to build a computer,

200 or 300, whatever the number is.

And people are so variable about temperament

and skill sets and stuff.

That in a big organization,

you find the people who love the perfect ideas

and the people that want to get stuff done yesterday

and people who like to come up with ideas

and people who like to, let’s say, shoot down ideas.

And it takes the whole, it takes a large group of people.

Some are good at generating ideas, some are good at filtering ideas.

And then all in that giant mess, you’re somehow,

I guess the goal is for that giant mess of people

to find the perfect path through the tension,

the creative tension.

But like, how do you know when you said

there’s some people good at articulating

what perfect looks like, what a good design is?

Like if you’re sitting in a room

and you have a set of ideas

about like how to design a better processor,

how do you know this is something special here?

This is a good idea, let’s try this.

Have you ever brainstormed an idea

with a couple of people that were really smart?

And you kind of go into it and you don’t quite understand it

and you’re working on it.

And then you start talking about it,

putting it on the whiteboard, maybe it takes days or weeks.

And then your brain starts to kind of synchronize.

It’s really weird.

Like you start to see what each other is thinking.

And it starts to work.

Like you can see it work.

Like my talent in computer design

is I can see how computers work in my head, like really well.

And I know other people can do that too.

And when you’re working with people that can do that,

like it is kind of an amazing experience.

And then every once in a while you get to that place

and then you find the flaw, which is kind of funny

because you can fool yourself.

The two of you kind of drifted along

in the direction that was useless.

That happens too.

Like you have to, because the nice thing

about computer design is there’s always reduction to practice.

Like you come up with your good ideas

and I know some architects who really love ideas

and then they work on them and they put it on the shelf

and they go work on the next idea and put it on the shelf

and they never reduce it to practice.

So they never find out what’s good and bad.

Because almost every time I’ve done something really new,

by the time it’s done, like the good parts are good,

but I know all the flaws, like.


Would you say your career, just your own experience,

is your career defined mostly by flaws or by successes?

Like if…

Again, there’s great tension between those.

If you haven’t tried hard, right?

And done something new, right?

Then you’re not gonna be facing the challenges

when you build it.

Then you find out all the problems with it.


But when you look back, do you see problems?


Oh, when I look back?

What do you remember?

I think earlier in my career,

like EV5 was the second Alpha chip.

I was so embarrassed about the mistakes,

I could barely talk about it.

And it was in the Guinness Book of World Records

and it was the fastest processor on the planet.


So it was, and at some point I realized

that was really a bad mental framework

to deal with doing something new.

We did a bunch of new things

and some worked out great and some were bad.

And we learned a lot from it.

And then the next one, we learned a lot.

That EV6 also had some really cool things in it.

I think the proportion of good stuff went up,

but it had a couple of fatal flaws in it that were painful.

And then, yeah.

You learned to channel the pain into like pride.

Not pride, really.

You know, just a realization about how the world works

or how that kind of idea set works.

Life is suffering.

That’s the reality.

No, it’s not.

Well, I know the Buddha said that

and a couple other people are stuck on it.

No, it’s, you know, there’s this kind of weird combination

of good and bad, you know, light and darkness

that you have to tolerate and, you know, deal with.

Yeah, there’s definitely lots of suffering in the world.

Depends on the perspective.

It seems like there’s way more darkness,

but that makes the light part really nice.

What computing hardware or just any kind,

even software design, do you find beautiful

from your own work, from other people’s work?

You’re just, we were just talking about the battleground

of flaws and mistakes and errors,

but things that were just beautifully done.

Is there something that pops to mind?

Well, when things are beautifully done,

usually there’s a well thought out set of abstraction layers.

So the whole thing works in unison nicely.


And when I say abstraction layer,

that means two different components

when they work together, they work independently.

They don’t have to know what the other one is doing.

So that decoupling.


So the famous one was the network stack.

Like there’s a seven layer network stack,

you know, data transport and protocol and all the layers.

And the innovation was

when they really got that right.

Cause networks before that didn’t define those very well.

The layers could innovate independently.

And occasionally the layer boundary would,

the interface would be upgraded.

And that let the design space breathe.

And you could do something new in layer seven

without having to worry about how layer four worked.
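
A minimal sketch of that decoupling, not taken from any real network stack: the upper layer depends only on an interface, so the lower layer can be swapped without the upper layer changing. The names `Transport`, `LoopbackTransport`, `ChecksummedTransport`, and `echo_upper` are hypothetical, chosen only for illustration.

```python
from typing import Protocol


class Transport(Protocol):
    """The layer boundary: the only thing the upper layer is allowed to rely on."""

    def send(self, payload: bytes) -> bytes: ...


class LoopbackTransport:
    """Hypothetical lower layer that hands the payload straight back."""

    def send(self, payload: bytes) -> bytes:
        return payload


class ChecksummedTransport:
    """A different lower layer: frames the payload with a one-byte checksum,
    then verifies and strips it on receipt."""

    def send(self, payload: bytes) -> bytes:
        framed = payload + bytes([sum(payload) % 256])
        body, check = framed[:-1], framed[-1]
        if sum(body) % 256 != check:
            raise ValueError("corrupted frame")
        return body


def echo_upper(text: str, transport: Transport) -> str:
    """Upper layer: encodes text and uses the transport without knowing how it works."""
    return transport.send(text.encode("utf-8")).decode("utf-8")


# The upper layer is unchanged while the lower layer is replaced.
print(echo_upper("hello", LoopbackTransport()))
print(echo_upper("hello", ChecksummedTransport()))
```

Either transport can be improved or replaced on its own, as long as the `send` interface stays the same.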

And so good design does that.

And you see it in processor designs.

When we did the Zen design at AMD,

we made several components very modular.

And, you know, my insistence at the top was

I wanted all the interfaces defined

before we wrote the RTL for the pieces.

One of the verification leads said,

if we do this right,

I can test the pieces so well independently

when we put it together,

we won’t find all these interaction bugs

cause the floating point knows how the cache works.

And I was a little skeptical,

but he was mostly right.

That the modularity of the design

greatly improved the quality.

Is that universally true in general?

Would you say about good designs,

that they’re usually modular?

Well, we talked about this before.

Humans are only so smart.

Like, and we’re not getting any smarter, right?

But the complexity of things is going up.

So, you know, a beautiful design can’t be bigger

than the person doing it.

It’s just, you know, their piece of it.

Like the odds of you doing a really beautiful design

of something that’s way too hard for you is low, right?

If it’s way too simple for you,

it’s not that interesting.

It’s like, well, anybody could do that.

But when you get the right match of your expertise

and, you know, mental power to the right design size,

that’s cool, but that’s not big enough

to make a meaningful impact in the world.

So now you have to have some framework

to design the pieces so that the whole thing

is big and harmonious.

But, you know, when you put it together,

it’s, you know, sufficiently interesting to be used.

And, you know, so that’s what a beautiful design is.

Matching the limits of that human cognitive capacity

to the module that you can create

and creating a nice interface between those modules

and thereby, do you think there’s a limit

to the kind of beautiful complex systems

we can build with this kind of modular design?

It’s like, you know, if we build increasingly

more complicated, you can think of like the internet.

Okay, let’s scale it down.

Or you can think of like social network,

like Twitter as one computing system.

But those are little modules, right?

But it’s built on so many components

nobody at Twitter even understands.


So if an alien showed up and looked at Twitter,

he wouldn’t just see Twitter as a beautiful,

simple thing that everybody uses, which is really big.

You would see the network, it runs on the fiber optics,

the data is transported to the computers.

The whole thing is so bloody complicated,

nobody at Twitter understands it.

And so that’s what the alien would see.

So yeah, if an alien showed up and looked at Twitter

or looked at the various different network systems

that you could see on Earth.

So imagine they were really smart

and they could comprehend the whole thing.

And then they sort of evaluated the human

and thought, this is really interesting.

No human on this planet comprehends the system they built.

No individual, well, would they even see individual humans?

Like we humans are very human centric, entity centric.

And so we think of us as the central organism

and the networks as just the connection of organisms.

But from a perspective of an alien,

from an outside perspective, it seems like.

Yeah, I get it.

We’re the ants and they’d see the ant colony.

The ant colony, yeah.

Or the result of production of the ant colony,

which is like cities and it’s,

in that sense, humans are pretty impressive.

The modularity that we’re able to,

and how robust we are to noise and mutation

and all that kind of stuff.

Well, that’s because it’s stress tested all the time.


You know, you build all these cities with buildings

and you get earthquakes occasionally

and, you know, wars, earthquakes.

Viruses every once in a while.

You know, changes in business plans

or, you know, like shipping or something.

Like as long as it’s all stress tested,

then it keeps adapting to the situation.

So that’s a curious phenomenon.

Well, let’s go, let’s talk about Moore’s Law a little bit.

So the broad view of Moore’s Law

was just exponential improvement of computing capability.

Like OpenAI, for example, recently published

these kinds of papers looking at the exponential improvement

in the training efficiency of neural networks

for like ImageNet and all that kind of stuff.

We just got better on this purely software side,

just figuring out better tricks and algorithms

for training neural networks.

And that seems to be improving significantly faster

than the Moore’s Law prediction, you know.

So that’s in the software space.

What do you think if Moore’s Law continues

or if the general version of Moore’s Law continues,

do you think that comes mostly from the hardware,

from the software, some mix of the two,

some interesting, totally,

so not the reduction of the size of the transistor

kind of thing, but more in the,

in the totally interesting kinds of innovations

in the hardware space, all that kind of stuff.

Well, there’s like a half a dozen things

going on in that graph.

So one is there’s initial innovations

that had a lot of headroom to be exploited.

So, you know, the efficiency of the networks

has improved dramatically.

And then the decomposability of those and the use going,

you know, they started running on one computer,

then multiple computers, then multiple GPUs,

and then arrays of GPUs, and they’re up to thousands.

And at some point, so it’s sort of like

they were consumed, they were going from

like a single computer application

to a thousand computer application.

So that’s not really a Moore’s Law thing.

That’s an independent vector.

How many computers can I put on this problem?

Because the computers themselves are getting better

on like a Moore’s Law rate,

but their ability to go from one to 10

to 100 to a thousand, you know, was something.

And then multiplied by, you know, the amount of compute

it took to resolve like AlexNet to ResNet to transformers.

It’s been quite, you know, steady improvements.

But those are like S curves, aren’t they?

That’s exactly the kind of S curves

that are underlying Moore’s Law from the very beginning.

So what’s the biggest, what’s the most productive,

rich source of S curves in the future, do you think?

Is it hardware, is it software, or is it?

So hardware is going to move along relatively slowly.

Like, you know, double performance every two years.

There’s still…

I like how you call that slowly.

Yeah, that’s the slow version.

The snail’s pace of Moore’s Law.

Maybe we should trademark that one.

Whereas the scaling by number of computers, you know,

can go much faster, you know.

I’m sure at some point Google had a, you know,

their initial search engine was running on a laptop,

you know, like.

And at some point they really worked on scaling that.

And then they factored the indexer from, you know,

this piece and this piece and this piece,

and they spread the data on more and more things.

And, you know, they did a dozen innovations.

But as they scaled up the number of computers on that,

it kept breaking, finding new bottlenecks

in their software and their schedulers,

and made them rethink.

Like, it seems insane to do a scheduler

across 1,000 computers to schedule parts of it

and then send the results to one computer.

But if you want to schedule a million searches,

that makes perfect sense.

So there’s the scaling by just quantity

is probably the richest thing.

But then as you scale quantity,

like a network that was great on 100 computers

may be completely the wrong one.

You may pick a network that’s 10 times slower

on 10,000 computers, like per computer.

But if you go from 100 to 10,000, it’s 100 times.

So that’s one of the things that happened

when we did internet scaling.

This efficiency went down, not up.

The future of computing is inefficiency, not efficiency.

But scale, inefficient scale.

It’s scaling faster than inefficiency bites you.
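
The arithmetic in that example can be sketched as follows. This assumes an idealized model where throughput is simply the number of computers times per-computer efficiency, ignoring communication and scheduling overhead; `net_speedup` is a hypothetical helper, not something from the conversation.

```python
def net_speedup(old_nodes: int, new_nodes: int, per_node_slowdown: float) -> float:
    """Overall throughput gain when node count grows but each node gets less efficient.

    Idealized assumption: total throughput = nodes * per-node throughput.
    """
    return (new_nodes / old_nodes) / per_node_slowdown


# 100 -> 10,000 computers is 100x the machines; a network that is
# 10x slower per computer still comes out 10x ahead overall.
print(net_speedup(old_nodes=100, new_nodes=10_000, per_node_slowdown=10))  # 10.0
```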

And as long as there’s, you know, dollar value there,

like scaling costs lots of money.

But Google showed, Facebook showed, everybody showed

that the scale was where the money was at.

It was, and so it was worth the financial.

Do you think, is it possible that like basically

the entirety of Earth will be like a computing surface?

Like this table will be doing computing.

This hedgehog will be doing computing.

Like everything really inefficient,

dumb computing will be leveraged.

The science fiction books, they call it computronium.


We turn everything into computing.

Well, most of the elements aren’t very good for anything.

Like you’re not gonna make a computer out of iron.

Like, you know, silicon and carbon have like nice structures.

You know, we’ll see what you can do with the rest of it.

Like people talk about, well, maybe we can turn the sun

into computer, but it’s hydrogen and a little bit of helium.


What I mean is more like actually just adding computers

to everything.

Oh, okay.

So you’re just converting all the mass of the universe

into computer.

No, no, no.

So not using.

It’d be ironic from the simulation point of view.

It’s like the simulator built mass to simulate.

Yeah, I mean, yeah.

So, I mean, ultimately this is all heading

towards a simulation.

Yeah, well, I think I might’ve told you this story.

At Tesla, they were deciding,

so they wanna measure the current coming out of the battery

and they decided between putting the resistor in there

and putting a computer with a sensor in there.

And the computer was faster than the computer

I worked on in 1982.

And we chose the computer

because it was cheaper than the resistor.

So, sure, this hedgehog costs $13

and we can put an AI that’s as smart as you

in there for five bucks.

It’ll have one.

So computers will be everywhere.

I was hoping it wouldn’t be smarter than me because.

Well, everything’s gonna be smarter than you.

But you were saying it’s inefficient.

I thought it was better to have a lot of dumb things.

Well, Moore’s law will slowly compact that stuff.

So even the dumb things will be smarter than us.

The dumb things are gonna be smart

or they’re gonna be smart enough to talk to something

that’s really smart.

You know, it’s like.

Well, just remember, like a big computer chip.


You know, it’s like an inch by an inch

and, you know, 40 microns thick.

It doesn’t take very much, very many atoms

to make a high power computer.


And 10,000 of them can fit in a shoebox.

But, you know, you have the cooling and power problems,

but, you know, people are working on that.

But they still can’t write compelling poetry or music

or understand what love is or have a fear of mortality.

So we’re still winning.

Neither can most of humanity, so.

Well, they can write books about it.

So, but speaking about this,

this walk along the path of innovation

towards the dumb things being smarter than humans,

you are now the CTO of Tenstorrent as of two months ago.

They build hardware for deep learning.

How do you build scalable and efficient deep learning?

This is such a fascinating space.

Yeah, yeah, so it’s interesting.

So up until recently,

I thought there was two kinds of computers.

There are serial computers that run like C programs,

and then there’s parallel computers.

So the way I think about it is, you know,

parallel computers have given parallelism.

Like, GPUs are great because you have a million pixels,

and modern GPUs run a program on every pixel.

They call it the shader program, right?

So, or like finite element analysis.

You build something, you know,

you make this into little tiny chunks,

you give each chunk to a computer,

so you’re given all these chunks,

you have parallelism like that.

But most C programs, you write this linear narrative,

and you have to make it go fast.

To make it go fast, you predict all the branches,

all the data fetches, and you run that.

More parallel, but that’s found parallelism.

AI is, I’m still trying to decide how fundamental this is.

It’s a given parallelism problem.

But the way people describe the neural networks,

and then how they write them in PyTorch, it makes graphs.

Yeah, that might be fundamentally different

than the GPU kind of.

Parallelism, yeah, it might be.

Because when you run the GPU program on all the pixels,

you’re running, you know, it depends,

this group of pixels say it’s background blue,

and it runs a really simple program.

This pixel is, you know, some patch of your face,

so you have some really interesting shader program

to give you the impression of translucency.

But the pixels themselves don’t talk to each other.

There’s no graph, right?

So you do the image, and then you do the next image,

and you do the next image,

and you run eight million pixels,

eight million programs every time,

and modern GPUs have like 6,000 thread engines in them.

So, you know, to get eight million pixels,

each one runs a program on, you know, 10 or 20 pixels.

And that’s how they work, but there’s no graph.

But you think graph might be a totally new way

to think about hardware.

So Raja Koduri and I have been having this conversation

about given versus found parallelism.

And then the kind of walk,

because we got more transistors,

like, you know, computers way back when

did stuff on scalar data.

Then we did it on vector data, the famous vector machines.

Now we’re making computers that operate on matrices, right?

And then the category we said that was next was spatial.

Like, imagine you have so much data

that, you know, you want to do the compute on this data,

and then when it’s done, it says,

send the result to this pile of data and run some software on that.

And it’s better to think about it spatially

than to move all the data to a central processor

and do all the work.

So spatially, you mean moving in the space of data

as opposed to moving the data.

Yeah, you have a petabyte data space

spread across some huge array of computers.

And when you do a computation somewhere,

you send the result of that computation

or maybe a pointer to the next program

to some other piece of data and do it.

But I think a better word might be graph.

And all the AI neural networks are graphs.

Do some computations, send the result here,

do another computation, do a data transformation,

do a merging, do a pooling, do another computation.
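
A toy version of such a graph, assuming nothing about how any real AI framework or chip represents it: each node names a function and the nodes whose results it consumes, and the graph runs in dependency order. `run_graph` and the node names are hypothetical; `graphlib` is part of the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_graph(nodes, inputs):
    """Run a dataflow graph.

    nodes:  name -> (function, [names of the nodes or inputs it consumes])
    inputs: name -> value for the graph's external inputs
    """
    deps = {name: set(srcs) for name, (_, srcs) in nodes.items()}
    values = dict(inputs)
    for name in TopologicalSorter(deps).static_order():
        if name in values:  # external input, nothing to compute
            continue
        fn, srcs = nodes[name]
        values[name] = fn(*(values[s] for s in srcs))
    return values


graph = {
    "scale": (lambda x: [2 * v for v in x], ["x"]),                      # a computation
    "shift": (lambda x: [v + 1 for v in x], ["x"]),                      # another computation
    "merge": (lambda a, b: [p + q for p, q in zip(a, b)], ["scale", "shift"]),  # a merge
    "pool": (lambda m: max(m), ["merge"]),                               # a pooling step
}

result = run_graph(graph, {"x": [1, 2, 3]})
print(result["merge"], result["pool"])  # [4, 7, 10] 10
```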

Is it possible to compress and say

how we make this thing efficient,

this whole process efficient, this different?

So first, the fundamental elements in the graphs

are things like matrix multiplies, convolutions,

data manipulations, and data movements.

So GPUs emulate those things with their little singles,

you know, basically running a single threaded program.

And then there’s, you know, and NVIDIA calls it a warp

where they group a bunch of programs

that are similar together.

So for efficiency and instruction use.

And then at a higher level, you kind of,

you take this graph and you say this part of the graph

is a matrix multiply, which runs on these 32 threads.

But the model at the bottom was built

for running programs on pixels, not executing graphs.

So it’s emulation, ultimately.

So is it possible to build something

that natively runs graphs?

Yes, so that’s what Tenstorrent did.


Where are we on that?

How, like, in the history of that effort,

are we in the early days?

Yeah, I think so.

Tenstorrent was started by a friend of mine,

Ljubisa Bajic, and I was his first investor.

So I’ve been, you know, kind of following him

and talking to him about it for years.

And in the fall when I was considering things to do,

I decided, you know, we held a conference last year

with a friend, organized it,

and we wanted to bring in thinkers.

And two of the people were Andrej Karpathy and Chris Lattner.

And Andrej gave this talk, it’s on YouTube,

called Software 2.0, which I think is great.

Which is, we went from programmed computers,

where you write programs, to data-programmed computers.

You know, like the future of software is data programs,

the networks.

And I think that’s true.

And then Chris has been working,

he worked on LLVM, the low level virtual machine,

which became the intermediate representation

for all compilers.

And now he’s working on another project called MLIR,

which is multi-level intermediate representation,

which is essentially under the graph

about how do you represent that kind of computation

and then coordinate large numbers

of potentially heterogeneous computers.

And I would say technically, Tenstorrent’s,

you know, two pillars are those two ideas,

Software 2.0 and multi-level representation.

But it’s in service of executing graph programs.

The hardware is designed to do that.

So it’s including the hardware piece.


And then the other cool thing is,

for a relatively small amount of money,

they did a test chip and two production chips.

So it’s like a super effective team.

And unlike some AI startups,

where if you don’t build the hardware

to run the software that they really want to do,

then you have to fix it by writing lots more software.

So the hardware naturally does matrix multiply,

convolution, the data manipulations,

and the data movement between processing elements

that you can see in the graph,

which I think is all pretty clever.

And that’s what I’m working on now.

So the, I think it’s called the Grayskull processor,

introduced last year.

It’s, you know, there’s a bunch of measures of performance.

We’re talking about horses.

It seems to do 368 trillion operations per second.

It seems to outperform NVIDIA’s Tesla T4 system.

So these are just numbers.

What do they actually mean in real world performance?

Like what are the metrics for you

that you’re chasing in your horse race?

Like what do you care about?

Well, first, so the native language of,

you know, people who write AI network programs

is PyTorch now, PyTorch, TensorFlow.

There’s a couple others.

Do you think PyTorch has won over TensorFlow?

Or is it just?

I’m not an expert on that.

I know many people who have switched

from TensorFlow to PyTorch.

And there’s technical reasons for it.

I use both.

Both are still awesome.

But the deepest love is for PyTorch currently.

Yeah, there’s more love for that.

And that may change.

So the first thing is when they write their programs,

can the hardware execute it pretty much as it was written?

Right, so PyTorch turns into a graph.

We have a graph compiler that makes that graph.

Then it fractures the graph down.

So if you have big matrix multiply,

we turn it into right size chunks

to run on the processing elements.

It hooks all the graph up.

It lays out all the data.

There’s a couple of mid level representations of it

that are also simulatable.

So that if you’re writing the code,

you can see how it’s gonna go through the machine,

which is pretty cool.

And then at the bottom, it schedules kernels,

like math, data manipulation, data movement kernels,

which do this stuff.
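
The step of turning a big matrix multiply into right-size chunks can be sketched roughly like this. It is a plain-Python illustration of tiling only, not Tenstorrent’s graph compiler; `matmul_tiles`, `tiled_matmul`, and the tile size are hypothetical.

```python
def matmul_tiles(m: int, n: int, tile: int):
    """Yield (row_start, row_end, col_start, col_end) for each chunk of an m x n output."""
    for r in range(0, m, tile):
        for c in range(0, n, tile):
            yield r, min(r + tile, m), c, min(c + tile, n)


def tiled_matmul(a, b, tile: int = 2):
    """Multiply a (m x k) by b (k x n), one output chunk at a time."""
    m, k, n = len(a), len(b), len(b[0])
    out = [[0] * n for _ in range(m)]
    for r0, r1, c0, c1 in matmul_tiles(m, n, tile):
        # Each chunk is independent work; in hardware it could be handed
        # to its own processing element.
        for i in range(r0, r1):
            for j in range(c0, c1):
                out[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
    return out


a = [[1, 2], [3, 4], [5, 6]]
b = [[1, 0, 2], [0, 1, 3]]
print(list(matmul_tiles(3, 3, 2)))  # four chunks covering the 3 x 3 output
print(tiled_matmul(a, b))           # [[1, 2, 8], [3, 4, 18], [5, 6, 28]]
```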

So we don’t have to write a little program

to do matrix multiply,

because we have a big matrix multiplier.

There’s no SIMD program for that.

But there is scheduling for that, right?

So one of the goals is,

if you write a piece of PyTorch code

that looks pretty reasonable,

you should be able to compile it, run it on the hardware

without having to tweak it

and do all kinds of crazy things to get performance.

There’s not a lot of intermediate steps.

It’s running directly as written.

Like on a GPU, if you write a large matrix multiply naively,

you’ll get five to 10% of the peak performance of the GPU.

Right, and then there’s a bunch of people

who’ve published papers on this,

and I read them about what steps do you have to do.

And it goes from pretty reasonable,

well, transpose one of the matrices.

So you do row ordered, not column ordered,

block it so that you can put a block of the matrix

on different SMs, groups of threads.
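
The first of those steps, transposing one matrix so both operands are read along rows, looks like this in outline. In plain Python it only shows the access pattern and will not produce the speedups described for GPUs; `matmul_transposed` is a hypothetical name.

```python
def matmul_transposed(a, b):
    """Multiply a by b after transposing b once, so every dot product
    walks along a row of a and a row of the transposed b."""
    bt = [list(col) for col in zip(*b)]  # transpose: columns of b become rows
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


print(matmul_transposed([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```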

But some of it gets into little details,

like you have to schedule it just so,

so you don’t have register conflicts.

So they call them CUDA ninjas.

CUDA ninjas, I love it.

To get to the optimal point,

you either use a prewritten library,

which is a good strategy for some things,

or you have to be an expert

in micro architecture to program it.

Right, so the optimization step

is way more complicated with the GPU.

So our goal is if you write PyTorch,

that’s good PyTorch, you can do it.

Now there’s, as the networks are evolving,

they’ve changed from convolutional to matrix multiply.

People are talking about conditional graphs,

they’re talking about very large matrices,

they’re talking about sparsity,

they’re talking about problems

that scale across many, many chips.

So the native data item is a packet.

So you send a packet to a processor, it gets processed,

it does a bunch of work,

and then it may send packets to other processors,

and they execute in like a data flow graph

kind of methodology.

Got it.

We have a big network on chip,

and then the second chip has 16 ethernet ports

to hook lots of them together,

and it’s the same graph compiler across multiple chips.

So that’s where the scale comes in.

So it’s built to scale naturally.

Now, my experience with scaling is as you scale,

you run into lots of interesting problems.

So scaling is the mountain to climb.


So the hardware is built to do this,

and then we’re in the process of.

Is there a software part to this

with ethernet and all that?

Well, the protocol at the bottom,

it’s an Ethernet PHY,

but the protocol basically says,

send the packet from here to there.

It’s all point to point.

The header bit says which processor to send it to,

and we basically take a packet off our on chip network,

put an ethernet header on it,

send it to the other end, strip the header off,

and send it to the local thing.

It’s pretty straightforward.
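
In outline, that wrap, send, and strip sequence could look like the sketch below. The two-byte header holding a destination processor id is an assumption made for illustration; it is neither a real Ethernet header nor Tenstorrent’s actual packet format, and `wrap`, `unwrap`, and `deliver` are hypothetical names.

```python
import struct

# Hypothetical header: a 2-byte, big-endian destination processor id.
HEADER = struct.Struct("!H")


def wrap(dest_processor: int, packet: bytes) -> bytes:
    """Put a header on an on-chip packet before it leaves the chip."""
    return HEADER.pack(dest_processor) + packet


def unwrap(frame: bytes):
    """At the other end: read the destination and strip the header off."""
    (dest,) = HEADER.unpack_from(frame)
    return dest, frame[HEADER.size:]


def deliver(frame: bytes, local_queues: dict) -> None:
    """Hand the original packet to the local processor named in the header."""
    dest, packet = unwrap(frame)
    local_queues.setdefault(dest, []).append(packet)


queues = {}
deliver(wrap(7, b"tile-data"), queues)
print(queues)  # {7: [b'tile-data']}
```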

Human to human interaction is pretty straightforward too,

but when you get a million of us,

we could do some crazy stuff together.

Yeah, it’s gonna be fun.

So is that the goal is scale?

So like, for example, I’ve been recently

doing a bunch of robots at home

for my own personal pleasure.

Am I going to ever use Tenstorrent, or is this more for?

There’s all kinds of problems.

Like, there’s small inference problems,

or small training problems, or big training problems.

What’s the big goal?

Is it the big training problems,

or the small training problems?

Well, one of the goals is to scale

from 100 milliwatts to a megawatt, you know?

So like, really have some range on the problems,

and the same kind of AI programs

work at all different levels.

So that’s the goal.

The natural, since the natural data item

is a packet that we can move around,

it’s built to scale, but so many people have small problems.

Right, right.

But the, you know.

Like, inside that phone is a small problem to solve.

So do you see Tenstorrent potentially being inside a phone?

Well, the power efficiency of local memory,

local computation, and the way we built it is pretty good.

And then there’s a lot of efficiency

on being able to do conditional graphs and sparsity.

I think it’s, for complicated networks

that wanna go in a small form factor, it’s gonna be quite good.

But we have to prove that, that’s all.

It’s a fun problem.

And that’s the early days of the company, right?

It’s a couple years, you said?

But you think, you invested, you think they’re legit.


And so you joined.

Yeah, I joined.

Well, that’s.

That’s a really interesting place to be.

Like, the AI world is exploding, you know.

And I looked at some other opportunities

like build a faster processor, which people want.

But that’s more on an incremental path

than what’s gonna happen in AI in the next 10 years.


So this is kind of, you know,

an exciting place to be part of.

Yeah, the revolutions will be happening

in the very space that Tenstorrent is.

And then lots of people are working on it,

but there’s lots of technical reasons why some of them,

you know, aren’t gonna work out that well.

And, you know, that’s interesting.

And there’s also the same problem

about getting the basics right.

Like, we’ve talked to customers about exciting features.

And at some point we realized that,

Ljubisa and I were realizing they want to hear first

about memory bandwidth, local bandwidth,

compute intensity, programmability.

They want to know the basics, power management,

how the network ports work, what are the basics,

do all the basics work.

Because it’s easy to say, we’ve got this great idea,

you know, to crack GPT-3, but the people we talked to

want to say, if I buy the, so we have a PCI Express card

with our chip on it, if you buy the card,

you plug it in your machine, you download the driver,

how long does it take me to get my network to run?

Right, right.

You know, that’s a real question.

It’s a very basic question.

So, yeah.

Is there an answer to that yet,

or is it trying to get to that?

Our goal is like an hour.


When can I buy a Tenstorrent?

Pretty soon.

Or my, for the small case training.

Yeah, pretty soon.



I love the idea of you inside the room

with Karpathy, Andrej Karpathy and Chris Lattner.

Very, very interesting, very brilliant people,

very out of the box thinkers,

but also like first principles thinkers.

Well, they both get stuff done.

They not only get their own projects done.

They talk about it clearly.

They educate large numbers of people,

and they’ve created platforms for other people

to go do their stuff on.

Yeah, the clear thinking that’s able to be communicated

is kind of impressive.

It’s kind of remarkable to, yeah, I’m a fan.

Well, let me ask,

because I talk to Chris actually a lot these days.

He’s been one of the, just to give him a shout out,

he’s been so supportive as a human being.

So everybody’s quite different.

Like great engineers are different,

but he’s been like sensitive to the human element

in a way that’s been fascinating.

Like he was one of the early people

on this stupid podcast that I do to say like,

don’t quit this thing,

and also talk to whoever the hell you want to talk to.

That kind of from a legit engineer to get like props

and be like, you can do this.

That was, I mean, that’s what a good leader does, right?

To just kind of let a little kid do his thing,

like go do it, let’s see what turns out.

That’s a pretty powerful thing.

But what do you, what’s your sense about,

he used to be, now I think he stepped away from Google, right?

He’s at SiFive, I think.

What’s really impressive to you

about the things that Chris has worked on?

Because we mentioned the optimization,

the compiler design stuff, the LLVM,

then there’s, he also at Google worked on the TPU stuff.

He’s obviously worked on Swift,

so the programming language side.

Talking about people that work in the entirety of the stack.

What, from your time interacting with Chris

and knowing the guy, what’s really impressive to you

that just inspires you?

Well, like LLVM became the de facto platform

for compilers.

It’s amazing.

And it was good code quality, good design choices.

He hit the right level of abstraction.

There’s a little bit of the right time, the right place.

And then he built a new programming language called Swift,

which after, let’s say some adoption resistance

became very successful.

I don’t know that much about his work at Google,

although I know that that was a typical,

they started TensorFlow stuff and it was new.

They wrote a lot of code and then at some point

it needed to be refactored to be,

because its development slowed down,

which is why PyTorch started a little later and then passed it.

So he did a lot of work on that.

And then his idea about MLIR,

which is what people started to realize

is the complexity of the software stack above

the low level IR was getting so high

that forcing the features of that into the low level

was putting too much of a burden on it.

So he’s splitting that into multiple pieces.

And that was one of the inspirations for our software stack

where we have several intermediate representations

that are all executable and you can look at them

and do transformations on them before you lower the level.

So that was, I think we started before MLIR

really got far enough along to use,

but we’re interested in that.
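
To make the idea of several executable intermediate representations concrete, here is a minimal Python sketch. It is not MLIR and not Tenstorrent's software stack; the op names, the tuple-based IR format, and the functions `run_high`, `lower`, and `run_low` are all hypothetical. It only shows that a program can be run at the matrix level, lowered to element-level operations, and run again with the same result.

```python
def run_high(program, env):
    """Interpret the high-level IR, where each op works on whole matrices."""
    env = dict(env)
    for op, out, a, b in program:
        A, B = env[a], env[b]
        if op == "matmul":
            env[out] = [[sum(A[i][k] * B[k][j] for k in range(len(B)))
                         for j in range(len(B[0]))] for i in range(len(A))]
        elif op == "add":
            env[out] = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]
        else:
            raise ValueError(f"unknown op {op}")
    return env


def lower(program, shapes):
    """Lower matrix ops to element ops; shapes maps input name -> (rows, cols)."""
    shapes = dict(shapes)
    low = []
    for op, out, a, b in program:
        if op not in ("matmul", "add"):
            raise ValueError(f"unknown op {op}")
        rows, inner = shapes[a]
        cols = shapes[b][1]
        for i in range(rows):
            for j in range(cols):
                if op == "matmul":
                    low.append(("set", (out, i, j), 0))
                    for k in range(inner):
                        low.append(("fma", (out, i, j), (a, i, k), (b, k, j)))
                else:
                    low.append(("add", (out, i, j), (a, i, j), (b, i, j)))
        shapes[out] = (rows, cols)
    return low


def run_low(low, env):
    """Interpret the low-level IR, where each op touches single elements."""
    cells = {(name, i, j): value
             for name, matrix in env.items()
             for i, row in enumerate(matrix)
             for j, value in enumerate(row)}
    for instr in low:
        if instr[0] == "set":
            cells[instr[1]] = instr[2]
        elif instr[0] == "fma":
            _, dst, x, y = instr
            cells[dst] += cells[x] * cells[y]
        else:  # "add"
            _, dst, x, y = instr
            cells[dst] = cells[x] + cells[y]
    return cells


program = [("matmul", "T", "A", "B"), ("add", "C", "T", "bias")]
inputs = {"A": [[1, 2], [3, 4]], "B": [[5, 6], [7, 8]], "bias": [[1, 1], [1, 1]]}
high_result = run_high(program, inputs)
low_result = run_low(lower(program, {n: (2, 2) for n in inputs}), inputs)
```

Because both levels can be executed, a transformation applied at the higher level can be checked by comparing outputs before and after lowering.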

He’s really excited about MLIR.

That’s his like little baby.

So he, and there seems to be some profound ideas on that

that are really useful.

So each one of those things has been,

as the world of software gets more and more complicated,

how do we create the right abstraction levels

to simplify it in a way that people can now work independently

on different levels of it?

So I would say all three of those projects,

LLVM, Swift, and MLIR did that successfully.

So I’m interested in what he’s gonna do next

in the same kind of way.


On either the TPU or maybe the Nvidia GPU side,

how does Tenstorrent, you think, or the ideas underlying it,

does it have to be Tenstorrent?

Just this kind of graph focused,

graph centric hardware, deep learning centric hardware,

beat Nvidia’s? Do you think it’s possible

for it to basically overtake NVIDIA?


What’s that process look like?

What’s that journey look like, you think?

Well, GPUs were built to run shader programs

on millions of pixels, not to run graphs.


So there’s a hypothesis that says

the way the graphs are built

is going to be really interesting

to be inefficient on computing this.

And then the primitive is not a SIMD program,

it’s matrix multiply, convolution.

And then the data manipulations are fairly extensive about,

like, how do you do a fast transpose with a program?

I don’t know if you’ve ever written a transpose program.

They’re ugly and slow, but in hardware,

you can do really well.

Like, I’ll give you an example.

So when GPU accelerators first started doing triangles,

like, so you have a triangle

which maps on a set of pixels.

So you build, it’s very easy,

straightforward to build a hardware engine

that’ll find all those pixels.

And it’s kind of weird

because you walk along the triangle to get to the edge,

and then you have to go back down to the next row

and walk along, and then you have to decide on the edge

if the line of the triangle is like half on the pixel,

what’s the pixel color?

Because it’s half of this pixel and half the next one.

That’s called rasterization.

And you’re saying that could be done in hardware?

No, that’s an example of that operation

as a software program is really bad.

I’ve written a program that did rasterization.

The hardware that does it has actually less code

than the software program that does it,

and it’s way faster.
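
For readers who have not written one, a minimal software rasterizer in Python might look like the sketch below. This is an illustration only, not how GPU hardware does it: it brute-forces every pixel, uses edge functions to test whether a point is inside the triangle, and estimates partial coverage on edge pixels by testing a small grid of sample points per pixel. The function names and the `samples` parameter are assumptions made for this example.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: the sign tells which side of edge a->b the point is on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize(tri, width, height, samples=4):
    """Return a height x width grid of coverage values in [0, 1].

    tri is a sequence of three (x, y) vertices in pixel coordinates.
    Each pixel is tested at samples x samples points, so pixels that the
    triangle edge cuts through get a fractional coverage value.
    """
    (x0, y0), (x1, y1), (x2, y2) = tri
    grid = [[0.0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            hits = 0
            for sy in range(samples):
                for sx in range(samples):
                    px = col + (sx + 0.5) / samples
                    py = row + (sy + 0.5) / samples
                    w0 = edge(x1, y1, x2, y2, px, py)
                    w1 = edge(x2, y2, x0, y0, px, py)
                    w2 = edge(x0, y0, x1, y1, px, py)
                    # Inside when all three tests agree in sign (either winding).
                    if (w0 >= 0 and w1 >= 0 and w2 >= 0) or \
                       (w0 <= 0 and w1 <= 0 and w2 <= 0):
                        hits += 1
            grid[row][col] = hits / (samples * samples)
    return grid


coverage = rasterize([(0, 0), (4, 0), (0, 4)], width=4, height=4)
```

Even this toy version has four nested loops and three multiply-subtract tests per sample, which hints at why a dedicated hardware engine is attractive for the job.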

Right, so there are certain times

when the abstraction you have, rasterize a triangle,

you know, execute a graph, you know, components of a graph.

But the right thing to do in the hardware software boundary

is for the hardware to naturally do it.

And so the GPU is really optimized

for the rasterization of triangles.

Well, you know, that’s just, well, like in a modern,

you know, that’s a small piece of modern GPUs.

What they did is that they still rasterize triangles

when you’re running in a game, but for the most part,

most of the computation in the area of the GPU

is running shader programs.

But they’re single threaded programs on pixels, not graphs.

I have to be honest, I’d say I don’t actually know

the math behind shader, shading and lighting

and all that kind of stuff.

I don’t know what.

They look like little simple floating point programs

or complicated ones.

You can have 8,000 instructions in a shader program.

But I don’t have a good intuition

why it could be parallelized so easily.

No, it’s because you have 8 million pixels in every single.

So when you have a light, right, that comes down,

the angle, you know, the amount of light,

like say this is a line of pixels across this table, right?

The amount of light on each pixel is subtly different.

And each pixel is responsible for figuring out what.

Figuring it out.

So that pixel says, I’m this pixel.

I know the angle of the light.

I know the occlusion.

I know the color I am.

Like every single pixel here is a different color.

Every single pixel gets a different amount of light.

Every single pixel has a subtly different translucency.

So to make it look realistic,

the solution was you run a separate program on every pixel.
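
The "separate program on every pixel" idea can be sketched in a few lines of Python. This is a simplified assumption, not a real shader: it uses basic diffuse (Lambertian) lighting, and the names `shade_pixel` and `shade_image` and the per-pixel tuple layout are made up for the example. The point is that each pixel's result depends only on that pixel's own inputs, which is why the work parallelizes so easily.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def shade_pixel(base_color, normal, light_dir, occlusion):
    """Tiny 'shader' for one pixel.

    base_color: (r, g, b) of the surface at this pixel
    normal:     surface normal at this pixel
    light_dir:  direction from the surface toward the light
    occlusion:  1.0 means fully lit, 0.0 means fully shadowed
    """
    n = normalize(normal)
    l = normalize(light_dir)
    intensity = max(0.0, sum(a * b for a, b in zip(n, l))) * occlusion
    return tuple(c * intensity for c in base_color)


def shade_image(pixels):
    """Run the same program independently on every pixel.

    No pixel reads another pixel's result, so these calls could run
    in any order, or all at once on parallel hardware.
    """
    return [[shade_pixel(*p) for p in row] for row in pixels]


image = shade_image([
    [((1.0, 0.5, 0.0), (0, 0, 1), (0, 0, 1), 1.0),   # facing the light
     ((1.0, 0.5, 0.0), (0, 0, 1), (1, 0, 0), 1.0)],  # light grazing the surface
])
```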

See, but I thought there’s like reflection

from all over the place.

Every pixel. Yeah, but there is.

So you build a reflection map,

which also has some pixelated thing.

And then when the pixel is looking at the reflection map,

it has to calculate what the normal of the surface is.

And it does it per pixel.

By the way, there’s boatloads of hacks on that.

You know, like you may have a lower resolution light map,

your reflection map.

There’s all these, you know, hacks they do.

But at the end of the day, it’s per pixel computation.

And it so happens that you can map

graph-like computation onto this pixel-centric computation.

You can do floating point programs

on convolutions and the matrices.

And Nvidia invested for years in CUDA.

First for HPC, and then they got lucky with the AI trend.

But do you think they’re going to essentially

not be able to hardcore pivot out of their?

We’ll see.

That’s always interesting.

How often do big companies hardcore pivot?


How much do you know about Nvidia, folks?

Some. Some?

Well, I’m curious as well.

Who’s ultimately, as a…

Well, they’ve innovated several times.

But they’ve also worked really hard on mobile.

They’ve worked really hard on radios.

You know, they’re fundamentally a GPU company.

Well, they tried to pivot.

There’s an interesting little game and play

in autonomous vehicles, right?

With, or semi autonomous, like playing with Tesla

and so on and seeing that’s dipping a toe

into that kind of pivot.

They came out with this platform,

which is interesting technically.

But it was like a 3000 watt, you know,

3000 watt, $3,000 GPU platform.

I don’t know if it’s interesting technically.

It’s interesting philosophically.

Technically, I don’t know if it’s the execution

of the craftsmanship is there.

I’m not sure.

But I didn’t get a sense.

I think they were repurposing GPUs

for an automotive solution.

Right, it’s not a real pivot.

They didn’t build a ground up solution.


Like the chips inside Tesla are pretty cheap.

Like Mobileye has been doing this.

They’re doing the classic work from the simplest thing.


I mean, 40 square millimeter chips.

And Nvidia, their solution had 800 millimeter chips

and two 200 millimeter chips.

And, you know, like boatloads of really expensive DRAMs.

And, you know, it’s a really different approach.

And Mobileye fit the, let’s say,

automotive cost and form factor.

And then they added features as it was economically viable.

And Nvidia said, take the biggest thing

and we’re gonna go make it work.

You know, and that’s also influenced like Waymo.

There’s a whole bunch of autonomous startups

where they have a 5,000 watt server in their trunk.


But that’s because they think, well, 5,000 watts

and, you know, $10,000 is okay

because it’s replacing a driver.

Elon’s approach was that board has to be cheap enough

to put it in every single Tesla,

whether they turn on autonomous driving or not.

Which, and Mobileye was like,

we need to fit in the BOM and, you know,

cost structure that car companies do.

So they may sell you a GPS for 1500 bucks,

but the BOM for that, it’s like $25.

Well, and for Mobileye, it seems like neural networks

were not first class citizens, like the computation.

They didn’t start out as a…

Yeah, it was a CV problem.


And did classic CV and found stoplights and lines.

And they were really good at it.

Yeah, and they never, I mean,

I don’t know what’s happening now,

but they never fully pivoted.

I mean, it’s like, it’s the Nvidia thing.

And then as opposed to,

so if you look at the new Tesla work,

it’s like neural networks from the ground up, right?

Yeah, and even Tesla started with a lot of CV stuff in it

and Andrej’s basically been eliminating it.

Move everything into the network.

So without, this isn’t like confidential stuff,

but you sitting on a porch, looking over the world,

looking at the work that Andrej’s doing,

that Elon’s doing with Tesla Autopilot,

do you like the trajectory of where things are going

on the hardware side?

Well, they’re making serious progress.

I like the videos of people driving the beta stuff.

I guess taking some pretty complicated intersections

and all that, but it’s still an intervention per drive.

I mean, I have autopilot, the current autopilot,

my Tesla, I use it every day.

Do you have full self driving beta or no?


So you like where this is going?

They’re making progress.

It’s taking longer than anybody thought.

You know, my wonder is, you know, hardware three,

is it enough computing? Off by two, off by five,

off by 10, off by a hundred?


And I thought it probably wasn’t enough,

but they’re doing pretty well with it now.


And one thing is the data set gets bigger,

the training gets better.

And then there’s this interesting thing is you sort of train

and build an arbitrary size network that solves the problem.

And then you refactor the network down to the thing

that you can afford to ship, right?

So the goal isn’t to build a network that fits in the phone.

It’s to build something that actually works.

And then how do you make that most effective

on the hardware you have?

And they seem to be doing that much better

than a couple of years ago.

Well, the one really important thing is also

what they’re doing well is how to iterate that quickly,

which means like it’s not just about one time deployment,

one building, it’s constantly iterating the network

and trying to automate as many steps as possible, right?

And that’s actually the principles of the Software 2.0,

like you mentioned with Andrej, is it’s not just,

I mean, I don’t know what the actual,

his description of Software 2.0 is.

If it’s just high level philosophical or there are specifics,

but the interesting thing about what that actually looks like

in the real world is that it’s what I think Andrej calls

the data engine, it’s like it’s the iterative improvement

of the thing.

You have a neural network that does stuff,

fails on a bunch of things and learns from it

over and over and over.

So you’re constantly discovering edge cases.

So it’s very much about like data engineering,

like figuring out, it’s kind of what you were talking about

with Tenstorrent is you have the data landscape.

And you have to walk along that data landscape

in a way that is constantly improving the neural network.

And that feels like that’s the central piece of it.

And there’s two pieces of it.

Like you find edge cases that don’t work

and then you define something that goes and

gets you data for that.

But then the other constraint is whether you have

to label it or not.

Like the amazing thing about like the GPT3 stuff

is it’s unsupervised.

So there’s essentially infinite amount of data.

Now there’s obviously infinite amount of data available

from cars of people successfully driving.

But the current pipelines are mostly running

on labeled data, which is human limited.

So when that becomes unsupervised,

it’ll create unlimited amount of data,

which then they’ll scale.

Now the networks that may use that data

might be way too big for cars,

but then there’ll be the transformation from now

we have unlimited data, I know exactly what I want.

Now can I turn that into something that fits in the car?

And that process is gonna happen all over the place.

Every time you get to the place where you have

unlimited data, and that’s what software 2.0 is about,

unlimited data training networks to do stuff

without humans writing code to do it.

And ultimately also trying to discover,

like you’re saying, the self supervised formulation

of the problem.

So the unsupervised formulation of the problem.

Like in driving, there’s this really interesting thing,

which is you look at a scene that’s before you,

and you have data about what a successful human driver did

in that scene one second later.

It’s a little piece of data that you can use

just like with GPT3 as training.
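
As a rough sketch of that idea, the Python below turns a recorded drive into training pairs without any human labeling: each camera frame is paired with whatever the driver did about one second later. The log format, the function name `make_training_pairs`, and the `horizon_s` and `tolerance_s` parameters are assumptions for illustration; nothing here describes Tesla's or anyone else's actual pipeline.

```python
def make_training_pairs(log, horizon_s=1.0, tolerance_s=0.05):
    """Build (frame, future_action) pairs from a driving log.

    log: list of (timestamp_s, frame, action) tuples sorted by timestamp.
    The 'label' for each frame is simply the action recorded roughly
    horizon_s seconds later, so no human annotation is needed.
    """
    pairs = []
    j = 0
    for t, frame, _ in log:
        target = t + horizon_s
        # Advance to the first record at or after the target time window.
        while j < len(log) and log[j][0] < target - tolerance_s:
            j += 1
        if j < len(log) and abs(log[j][0] - target) <= tolerance_s:
            pairs.append((frame, log[j][2]))
    return pairs


drive_log = [
    (0.0, "frame0", "steer_left"),
    (0.5, "frame1", "straight"),
    (1.0, "frame2", "brake"),
    (1.5, "frame3", "straight"),
]
pairs = make_training_pairs(drive_log)
```

Frames near the end of the log have no record one second ahead, so they are dropped rather than guessed at.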

Currently, even though Tesla says they’re using that,

it’s an open question to me, how far can you,

can you solve all of the driving

with just that self supervised piece of data?

And like, I think.

Well, that’s what Comma AI is doing.

That’s what Comma AI is doing,

but the question is how much data.

So what Comma AI doesn’t have is as good

of a data engine, for example, as Tesla does.

That’s where the, like the organization of the data.

I mean, as far as I know, I haven’t talked to George,

but they do have the data.

The question is how much data is needed,

because we say infinite very loosely here.

And then the other question, which you said,

I don’t know if you think it’s still an open question is,

are we on the right order of magnitude

for the compute necessary?

That is this, is it like what Elon said,

this chip that’s in there now is enough

to do full self driving,

or do we need another order of magnitude?

I think nobody actually knows the answer to that question.

I like the confidence that Elon has, but.

Yeah, we’ll see.

There’s another funny thing is you don’t learn to drive

with infinite amounts of data.

You learn to drive with an intellectual framework

that understands physics and color and horizontal surfaces

and laws and roads and all your experience

from manipulating your environment.

Like, look, there’s so many factors go into that.

So then when you learn to drive,

like driving is a subset of this conceptual framework

that you have, right?

And so with self driving cars right now,

we’re teaching them to drive with driving data.

You never teach a human to do that.

You teach a human all kinds of interesting things,

like language, like don’t do that, watch out.

There’s all kinds of stuff going on.

Well, this is where you, I think previous time

we talked about where you poetically disagreed

with my naive notion about humans.

I just think that humans will make

this whole driving thing really difficult.

Yeah, all right.

I said, humans don’t move that slow.

It’s a ballistics problem.

It’s a ballistics, humans are a ballistics problem,

which is like poetry to me.

It’s very possible that in driving

they’re indeed purely a ballistics problem.

And I think that’s probably the right way to think about it.

But I still, they still continue to surprise me,

those damn pedestrians, the cyclists,

other humans in other cars and.

Yeah, but it’s gonna be one of these compensating things.

So like when you’re driving,

you have an intuition about what humans are going to do,

but you don’t have 360 cameras and radars

and you have an attention problem.

So the self driving car comes in with no attention problem,

360 cameras right now, a bunch of other features.

So they’ll wipe out a whole class of accidents, right?

And emergency braking with radar

and especially as it gets AI enhanced

will eliminate collisions, right?

But then you have the other problems

of these unexpected things where

you think your human intuition is helping,

but then the cars also have a set of hardware features

that you’re not even close to.

And the key thing of course is if you wipe out

a huge number of kind of accidents,

then it might be just way safer than a human driver,

even though, even if humans are still a problem,

that’s hard to figure out.

Yeah, that’s probably what will happen.

Those autonomous cars will have a small number of accidents

humans would have avoided, but they’ll wipe,

they’ll get rid of the bulk of them.

What do you think about like Tesla’s Dojo efforts

or it can be bigger than Tesla in general.

It’s kind of like Tenstorrent trying to innovate,

like this is the dichotomy, like should a company

try to from scratch build its own

neural network training hardware?

Well, first of all, I think it’s great.

So we need lots of experiments, right?

And there’s lots of startups working on this

and they’re pursuing different things.

I was there when we started Dojo and it was sort of like,

what’s the unconstrained computer solution

to go do very large training problems?

And then there’s fun stuff like, we said,

well, we have this 10,000 watt board to cool.

Well, you go talk to guys at SpaceX

and they think 10,000 watts is a really small number,

not a big number.

And there’s brilliant people working on it.

I’m curious to see how it’ll come out.

I couldn’t tell you, I know it pivoted

a few times since I left, so.

So the cooling does seem to be a big problem.

I do like what Elon said about it, which is like,

we don’t wanna do the thing unless it’s way better

than the alternative, whatever the alternative is.

So it has to be way better than like racks of GPUs.

Yeah, and the other thing is just like,

you know, the Tesla autonomous driving hardware,

it was only serving one software stack.

And the hardware team and the software team

were tightly coupled.

You know, if you’re building a general purpose AI solution,

then you know, there’s so many different customers

with so many different needs.

Now, something Andrej said is, I think this is amazing.

10 years ago, like vision, recommendation, language,

were completely different disciplines.

He said, the people literally couldn’t talk to each other.

And three years ago, it was all neural networks,

but very different neural networks.

And recently, it’s converging on one set of networks.

They vary a lot in size, obviously, they vary in data,

vary in outputs, but the technology has converged

a good bit.

Yeah, these transformers behind GPT3,

it seems like they could be applied to video,

they could be applied to a lot of, and it’s like,

and they’re all really simple.

And it was like they literally replace letters with pixels.

It does vision, it’s amazing.

And then size actually improves the thing.

So the bigger it gets, the more compute you throw at it,

the better it gets.

And the more data you have, the better it gets.

So then you start to wonder, well,

is that a fundamental thing?

Or is this just another step to some fundamental understanding

about this kind of computation?

Which is really interesting.

Us humans don’t want to believe that that kind of thing

will achieve conceptual understandings, you were saying,

like you’ll figure out physics, but maybe it will.


Maybe it will.

Well, it’s worse than that.

It’ll understand physics in ways that we can’t understand.

I like your Stephen Wolfram talk where he said,

you know, there’s three generations of physics.

There was physics by reasoning.

Well, big things should fall faster than small things,


That’s reasoning.

And then there’s physics by equations.

Like, you know, but the number of problems in the world

that are solved with a single equation is relatively low.

Almost all programs have, you know,

more than one line of code, maybe 100 million lines of code.

So he said, then now we’re going to physics by computation,

which is his project, which is cool.

I might point out there was two generations of physics

before reasoning: habit.

Like all animals, you know, know things fall

and, you know, birds fly and, you know, predators know

how to, you know, solve a differential equation

to cut off an accelerating, you know, curving animal path.

And then there was, you know, the gods did it, right?

So, right.

So there was, you know, there’s five generations.

Now, software 2.0 says programming things

is not the last step.


So there’s going to be a physics past Stephen Wolfram’s con.

That’s not explainable to us humans.

And actually there’s no reason that I can see

why even that’s the limit.

Like, there’s something beyond that.

I mean, they’re usually, like, usually when you have

this hierarchy, it’s not like, well, if you have this step

and this step and this step and they’re all qualitatively

different and conceptually different, it’s not obvious why,

you know, six is the right number of hierarchy steps

and not seven or eight or.

Well, then it’s probably impossible for us to,

to comprehend something that’s beyond the thing

that’s not explainable.


But the thing that, you know, understands the thing

that’s not explainable to us will conceive the next one.

And like, I’m not sure why there’s a limit to it.

Like, your brain hurts.

That’s a sad story.

If we look at our own brain, which is an interesting

illustrative example in your work with Tenstorrent

and trying to design deep learning architectures,

do you think about the brain at all?

Maybe from a hardware designer perspective,

if you could change something about the brain,

what would you change or do?

Funny question.

Like, how would you do it?

So your brain is really weird.

Like, you know, your cerebral cortex where we think

we do most of our thinking is what,

like six or seven neurons thick?


Like, that’s weird.

Like all the big networks are way bigger than that.

Like way deeper.

So that seems odd.

And then, you know, when you’re thinking if it’s,

if the input generates a result you can use,

it goes really fast.

But if it can’t, that generates an output

that’s interesting, which turns into an input

and then your brain, to the point where you mull things

over for days and how many trips

through your brain is that, right?

Like it’s, you know, 300 milliseconds or something

to get through seven levels of neurons.

I forget the number exactly.

But then it does it over and over and over as it searches.

And the brain clearly looks like some kind of graph

because you have a neuron with connections

and it talks to other ones

and it’s locally very computationally intense,

but it also does sparse computations

across a pretty big area.

There’s a lot of messy biological type of things

and it’s meaning like, first of all,

there’s mechanical, chemical and electrical signals.

It’s all that’s going on.

Then there’s the asynchronicity of signals.

And there’s like, there’s just a lot of variability

that seems continuous and messy

and just the mess of biology.

And it’s unclear whether that’s a good thing

or it’s a bad thing, because if it’s a good thing

that we need to run the entirety of the evolution,

well, we’re gonna have to start with basic bacteria

to create something.

So imagine we could control,

you could build a brain with 10 layers.

Would that be better or worse?

Or more connections or less connections,

or we don’t know to what level our brains are optimized.

But if I was changing things,

like you can only hold like seven numbers in your head.

Like why not a hundred or a million?

Never thought of that.

And why can’t we have like a floating point processor

that can compute anything we want

and see it all properly?

Like that would be kind of fun.

And why can’t we see in four or eight dimensions?

Because 3D is kind of a drag.

Like all the hard math transforms

are up in multiple dimensions.

So you could imagine a brain architecture

that you could enhance with a whole bunch of features

that would be really useful for thinking about things.

It’s possible that the limitations you’re describing

are actually essential for like the constraints

are essential for creating like the depth of intelligence.

Like that, the ability to reason.

It’s hard to say

because like your brain is clearly a parallel processor.

10 billion neurons talking to each other

at a relatively low clock rate.

But it produces something

that looks like a serial thought process.

It’s a serial narrative in your head.

That’s true.

But then there are people famously who are visual thinkers.

Like I think I’m a relatively visual thinker.

I can imagine any object and rotate it in my head

and look at it.

And there are people who say

they don’t think that way at all.

And recently I read an article about people

who say they don’t have a voice in their head.

They can talk.

But when they, you know, it’s like,

well, what are you thinking?

No, they’ll describe something that’s visual.

So that’s curious.

Now, if you’re saying,

if we dedicated more hardware to holding information,

like, you know, 10 numbers or a million numbers,

like would that distract us from our ability

to form this kind of singular identity?

Like it dissipates somehow.

But maybe, you know, future humans

will have many identities

that have some higher level organization

but can actually do lots more things in parallel.

Yeah, there’s no reason, if we’re thinking modularly,

there’s no reason we can’t have multiple consciousnesses

in one brain.

Yeah, and maybe there’s some way to make it faster

so that the, you know, the area of the computation

could still have a unified feel to it

while still having way more ability

to do parallel stuff at the same time.

Could definitely be improved.

Could be improved?


Okay, well, it’s pretty good right now.

Actually, people don’t give it enough credit.

The thing is pretty nice.

The, you know, the fact that the right ends

seem to be, give a nice, like,

spark of beauty to the whole experience.

I don’t know.

I don’t know if it can be improved easily.

It could be more beautiful.

I don’t know how, I, what?

What do you mean, what do you mean how?

All the ways you can’t imagine.

No, but that’s the whole point.

I wouldn’t be able to,

the fact that I can imagine ways

in which it could be more beautiful means.

So do you know, you know, Iain Banks, his stories?

So the super smart AIs there live,

mostly live in the world of what they call infinite fun

because they can create arbitrary worlds.

So they interact in, you know, the story has it.

They interact in the normal world and they’re very smart

and they can do all kinds of stuff.

And, you know, a given mind can, you know,

talk to a million humans at the same time

because we’re very slow and for reasons,

you know, artificial, the story,

they’re interested in people and doing stuff,

but they mostly live in this other land of thinking.

My inclination is to think that the ability

to create infinite fun will not be so fun.

That’s sad.

Well, there are so many things to do.

Imagine being able to make a star, move planets around.

Yeah, yeah, but because we can imagine that

is why life is fun, if we actually were able to do it,

it would be a slippery slope

where fun wouldn’t even have a meaning

because we just consistently desensitize ourselves

by the infinite amounts of fun we’re having.

And the sadness, the dark stuff is what makes it fun.

I think that could be the Russian.

It could be the fun makes it fun

and the sadness makes it bittersweet.

Yeah, that’s true.

Fun could be the thing that makes it fun.

So what do you think about the expansion,

not through the biology side,

but through the BCI, the brain computer interfaces?

Yeah, you got a chance to check out the Neuralink stuff.

It’s super interesting.

Like humans like our thoughts to manifest as action.

You know, like as a kid, you know,

like shooting a rifle was super fun,

driving a mini bike, doing things.

And then computer games, I think,

for a lot of kids became the thing

where they can do what they want.

They can fly a plane, they can do this, they can do this.

But you have to have this physical interaction.

Now imagine, you could just imagine stuff and it happens.

Like really richly and interestingly.

Like we kind of do that when we dream.

Like dreams are funny because like if you have some control

or awareness in your dreams,

like it’s very realistic looking,

or not realistic looking, it depends on the dream.

But you can also manipulate that.

And you know, what’s possible there is odd.

And the fact that nobody understands it, it’s hilarious, but.

Do you think it’s possible to expand

that capability through computing?


Is there some interesting,

so from a hardware designer perspective,

is there, do you think it’ll present totally new challenges

in the kind of hardware required that like,

so this hardware isn’t standalone computing.

Well, this is networking with the brain.

So today, computer games are rendered by GPUs.


Right, so, but you’ve seen the GAN stuff, right?

Where trained neural networks render realistic images,

but there’s no pixels, no triangles, no shaders,

no light maps, no nothing.

So the future of graphics is probably AI, right?


AI is heavily trained by lots of real data, right?

So if you have an interface with an AI renderer, right?

So if you say render a cat, it won’t say,

well, how tall’s the cat and how big it,

you know, it’ll render a cat.

And you might say, oh, a little bigger, a little smaller,

you know, make it a tabby, shorter hair.

You know, like you could tweak it.

Like the amount of data you’ll have to send

to interact with a very powerful AI renderer

could be low.

But the question is brain computer interfaces

would need to render not onto a screen,

but render onto the brain and like directly

so that there’s a bandwidth.

Well, it could do it both ways.

I mean, our eyes are really good sensors.

They could render onto a screen

and we could feel like we’re participating in it.

You know, they’re gonna have, you know,

like the Oculus kind of stuff.

It’s gonna be so good when a projection to your eyes,

you think it’s real.

You know, they’re slowly solving those problems.

And I suspect when the renderer of that information

into your head is also AI mediated,

they’ll be able to give you the cues that, you know,

you really want for depth and all kinds of stuff.

Like your brain is partly faking your visual field, right?

Like your eyes are twitching around,

but you don’t notice that.

Occasionally they blank, you don’t notice that.

You know, there’s all kinds of things.

Like you think you see over here,

but you don’t really see there.

It’s all fabricated.

Yeah, peripheral vision is fascinating.

So if you have an AI renderer that’s trained

to understand exactly how you see

and the kind of things that enhance the realism

of the experience, it could be super real actually.

So I don’t know what the limits to that are,

but obviously if we have a brain interface

that goes inside your visual cortex

in a better way than your eyes do, which is possible,

it’s a lot of neurons, maybe that’ll be even cooler.

Well, the really cool thing is that it has to do

with the infinite fun that you were referring to,

which is our brains seem to be very limited.

And like you said, computations.

It’s also very plastic.

Very plastic, yeah.

Yeah, so it’s an interesting combination.

The interesting open question is the limits

of that neuroplasticity, like how flexible is that thing?

Because we haven’t really tested it.

We know about the experiments

where they put like a pressure pad on somebody’s head

and had a visual transducer pressurize it

and somebody slowly learned to see.


Especially at a young age, if you throw a lot at it,

like what can it, so can you like arbitrarily expand it

with computing power?

So connected to the internet directly somehow?

Yeah, the answer’s probably yes.

So the problem with biology and ethics

is like there’s a mess there.

Like us humans are perhaps unwilling to take risks

into directions that are full of uncertainty.

So it’s like.

No, no.

90% of the population’s unwilling to take risks.

The other 10% is rushing into the risks

unaided by any infrastructure whatsoever.

And that’s where all the fun happens in society.

There’s been huge transformations

in the last couple thousand years.

Yeah, it’s funny.

I got a chance to interact with this Matthew Johnson

from Johns Hopkins.

He’s doing this large scale study of psychedelics.

It’s becoming more and more,

I’ve gotten a chance to interact

with that community of scientists working on psychedelics.

But because of that, that opened the door to me

to all these, what do they call it?

Psychonauts, the people who, like you said,

the 10% who are like, I don’t care.

I don’t know if there’s a science behind this.

I’m taking this spaceship to,

if I’m being the first on Mars, I’ll be.

Psychedelics are interesting in the sense

that in another dimension, like you said,

it’s a way to explore the limits of the human mind.

Like, what is this thing capable of doing?

Because you kind of, like when you dream, you detach it.

I don’t know exactly the neuroscience of it,

but you detach your reality from what your mind,

the images your mind is able to conjure up

and your mind goes into weird places and entities appear.

Somehow Freudian type of trauma

is probably connected in there somehow,

but you start to have these weird, vivid worlds that like.

So do you actively dream?

Do you, why not?

I have like six hours of dreams a night.

It’s like really useful time.

I know, I haven’t, I don’t for some reason.

I just knock out and I have sometimes anxiety inducing

kind of like very pragmatic nightmare type of dreams,

but nothing fun, nothing.

Nothing fun?

Nothing fun.

I try, I unfortunately mostly have fun

in the waking world, which is very limited

in the amount of fun you can have.

It’s not that limited either.

Yeah, that’s why.

We’ll have to talk.

Yeah, I need instructions.


There’s like a manual for that.

You might wanna.

I’ll look it up.

I’ll ask Elon.

What would you dream?

You know, years ago when I read about, you know,

like, you know, a book about how to have, you know,

become aware of your dreams.

I worked on it for a while.

Like there’s this trick about, you know,

imagine you can see your hands and look out

and I got somewhat good at it.

But mostly, when I’m thinking about things

or working on problems, I prep myself before I go to sleep.

It’s like, I pull into my mind all the things

I wanna work on or think about.

And then that, let’s say, greatly improves the chances

that I’ll work on that while I’m sleeping.

And then I also, you know, basically ask to remember it.

And I often remember very detailed.

Within the dream.


Or outside the dream.

Well, to bring it up in my dreaming

and then to remember it when I wake up.

It’s just, it’s more of a meditative practice.

You say, you know, to prepare yourself to do that.

Like if you go to, you know, to sleep,

still gnashing your teeth about some random thing

that happened that you’re not that really interested in,

you’ll dream about it.

That’s really interesting.


But you can direct your dreams somewhat by prepping.

Yeah, I’m gonna have to try that.

It’s really interesting.

Like the most important, the interesting,

not like what did this guy send in an email

kind of like stupid worry stuff,

but like fundamental problems

you’re actually concerned about.


And interesting things you’re worried about.

Or books you’re reading or, you know,

some great conversation you had

or some adventure you want to have.

Like there’s a lot of space there.

And it seems to work that, you know,

my percentage of interesting dreams and memories went up.

Is there, is that the source of,

if you were able to deconstruct like

where some of your best ideas came from,

is there a process that’s at the core of that?

Like, so some people, you know, walk and think,

some people like in the shower, the best ideas hit them.

If you talk about like Newton, apple hitting him on the head.

No, I found out a long time ago,

I process things somewhat slowly.

So like in college, I had friends who could study

at the last minute, get an A the next day.

I can’t do that at all.

So I always front loaded all the work.

Like I do all the problems early, you know,

for finals, like the last three days,

I wouldn’t look at a book because I want, you know,

cause like a new fact day before finals may screw up

my understanding of what I thought I knew.

So my goal was to always get it in and give it time to soak.

And I used to, you know,

I remember when we were doing like 3D calculus,

I would have these amazing dreams of 3D surfaces

with normal, you know, calculating the gradient.

And it’s just like all come up.

So it was like really fun, like very visual.

And if I got cycles of that, that was useful.

And the other is, is don’t over filter your ideas.

Like I like that process of brainstorming

where lots of ideas can happen.

I like people who have lots of ideas.

But then there’s a, yeah, I’ll let them sit

and let it breathe a little bit

and then reduce it to practice.

Like at some point you really have to, does it really work?

Like, you know, is this real or not, right?

But you have to do both.

There’s creative tension there.

Like how do you be both open and, you know, precise?

Have you had ideas that you just,

that sit in your mind for like years before the?


It’s an interesting way to just generate ideas

and just let them sit, let them sit there for a while.

I think I have a few of those ideas.

You know, that was so funny.

Yeah, I think that’s, you know,

Creativity 101 or something.

For the slow thinkers in the room, I suppose.

As I, some people, like you said, are just like, like the.

Yeah, it’s really interesting.

There’s so much diversity in how people think.

You know, how fast or slow they are,

how well they remember or don’t.

Like, you know, I’m not super good at remembering facts,

but processes and methods.

Like in our engineering, I went to Penn State

and almost all our engineering tests were open book.

I could remember the page and not the formula.

But as soon as I saw the formula,

I could remember the whole method if I’d learned it.


So it’s just a funny, where some people could, you know,

I’d watch friends like flipping through the book,

trying to find the formula,

even knowing that they’d done just as much work.

And I would just open the book

and I was on page 27, about half,

I could see the whole thing visually.


And, you know.

And you have to learn that about yourself

and figure out what would function optimally.

I had a friend who was always concerned

he didn’t know how he came up with ideas.

He had lots of ideas, but he said they just sort of popped up.

Like, you’d be working on something, you have this idea,

like, where does it come from?

But you can have more awareness of it.

Like, how your brain works is a little murky

as you go down from the voice in your head

or the obvious visualizations.

Like, when you visualize something, how does that happen?

Yeah, that’s right.

You know, if I say, you know, visualize a volcano,

it’s easy to do, right?

And what does it actually look like when you visualize it?

I can visualize to the point where I don’t see very much

out of my eyes and I see the colors

of the thing I’m visualizing.

Yeah, but there’s a shape, there’s a texture,

there’s a color, but there’s also conceptual visualization.

Like, what are you actually visualizing

when you’re visualizing a volcano?

Just like with peripheral vision,

you think you see the whole thing.

Yeah, yeah, yeah, that’s a good way to say it.

You know, you have this kind of almost peripheral vision

of your visualizations, they’re like these ghosts.

But if, you know, if you work on it,

you can get a pretty high level of detail.

And somehow you can walk along those visualizations

and come up with an idea, which is weird.

But when you’re thinking about solving problems,

like, you’re putting information in,

you’re exercising the stuff you do know,

you’re sort of teasing the area that you don’t understand

and don’t know, but you can almost, you know,

feel, you know, that process happening.

You know, that’s how I, like,

like, I know sometimes when I’m working really hard

on something, like, I get really hot when I’m sleeping.

And, you know, it’s like, the blankets get thrown,

I wake up, all the blankets are on the floor.

And, you know, every time it’s, well,

I wake up and think, wow, that was great.

You know?

Are you able to reverse engineer

what the hell happened there?

Well, sometimes it’s vivid dreams

and sometimes it’s just kind of, like you say,

like shadow thinking that you sort of have this feeling

you’re going through this stuff, but it’s not that obvious.

Isn’t that so amazing that the mind

just does all these little experiments?

I never, you know, I always thought it’s like a river

that you can’t, you’re just there for the ride,

but you’re right, if you prep it.

No, it’s all understandable.

Meditation really helps.

You gotta start figuring out,

you need to learn the language of your own mind.

And there’s multiple levels of it, but.

The abstractions again, right?

It’s somewhat comprehensible and observable

and feelable or whatever the right word is.

You know, you’re not alone for the ride.

You are the ride.

I have to ask you, hardware engineer,

working on neural networks now, what’s consciousness?

What the hell is that thing?

Is that just some little weird quirk

of our particular computing device?

Or is it something fundamental

that we really need to crack open

if we’re to build good computers?

Do you ever think about consciousness?

Like why it feels like something to be?

I know, it’s really weird.



I mean, everything about it’s weird.

First, it’s a half a second behind reality, right?

It’s a post hoc narrative about what happened.

You’ve already done stuff

by the time you’re conscious of it.

And your consciousness generally

is a single threaded thing,

but we know your brain is 10 billion neurons

running some crazy parallel thing.

And there’s a really big sorting thing going on there.

It also seems to be really reflective

in the sense that you create a space in your head.

Like we don’t really see anything, right?

Like photons hit your eyes,

it gets turned into signals,

it goes through multiple layers of neurons.

I’m so curious that that looks glassy

and that looks not glassy.

Like how the resolution of your vision is so high

you have to go through all this processing.

Where for most of it, it looks nothing like vision.

Like there’s no theater in your mind, right?

So we have a world in our heads.

We’re literally just isolated behind our sensors.

But we can look at it, speculate about it,

speculate about alternatives, problem solve, what if.

There’s so many things going on

and that process is lagging reality.

And it’s single threaded

even though the underlying thing is like massively parallel.

So it’s so curious.

So imagine you’re building an AI computer.

If you wanted to replicate humans,

well, you’d have huge arrays of neural networks

and apparently only six or seven deep, which is hilarious.

They don’t even remember seven numbers,

but I think we can upgrade that a lot, right?

And then somewhere in there,

you would train the network to create

basically the world that you live in, right?

So like tell stories to itself

about the world that it’s perceiving.

Well, create the world, tell stories in the world

and then have many dimensions of like side shows to it.

Like we have an emotional structure,

like we have a biological structure.

And that seems hierarchical too.

Like if you’re hungry, it dominates your thinking.

If you’re mad, it dominates your thinking.

And we don’t know if that’s important

to consciousness or not,

but it certainly disrupts, intrudes in the consciousness.

Like so there’s lots of structure to that.

And we like to dwell on the past.

We like to think about the future.

We like to imagine, we like to fantasize, right?

And the somewhat circular observation of that

is the thing we call consciousness.

Now, if you created a computer system

and did all these things, create worldviews,

create the future alternate histories,

dwelled on past events, accurately or semi accurately.

Would consciousness just spring up, like, naturally?

Well, would that look and feel conscious to you?

Like you seem conscious to me, but I don’t know.

Off of the external observer sense.

Do you think a thing that looks conscious is conscious?

Like do you, again, this is like an engineering

kind of question, I think, because like.

I don’t know.

If we want to engineer consciousness,

is it okay to engineer something

that just looks conscious?

Or is there a difference between something that is?

Well, we evolved consciousness

because it’s a super effective way to manage our affairs.

Yeah, this is a social element, yeah.

Well, it gives us a planning system.

We have a huge amount of stuff.

Like when we’re talking, like the reason

we can talk really fast is we’re modeling each other

at a really high level of detail.

And consciousness is required for that.

Well, all those components together

manifest consciousness, right?

So if we make intelligent beings

that we want to interact with that we’re like

wondering what they’re thinking,

looking forward to seeing them,

when we interact with them, they’re interesting,

surprising, you know, fascinating, you know,

they will probably feel conscious like we do

and we’ll perceive them as conscious.

I don’t know why not, but you never know.

Another fun question on this,

because from a computing perspective,

we’re trying to create something

that’s humanlike or superhumanlike.

Let me ask you about aliens.


Do you think there’s intelligent alien civilizations

out there and do you think their technology,

their computing, their AI bots,

their chips are of the same nature as ours?

Yeah, I’ve got no idea.

I mean, if there’s lots of aliens out there

they’ve been awfully quiet,

you know, there’s speculation about why.

There seems to be more than enough planets out there.

There’s a lot.

There’s intelligent life on this planet

that seems quite different, you know,

like dolphins seem like plausibly understandable,

octopuses don’t seem understandable at all.

If they lived longer than a year,

maybe they would be running the planet.

They seem really smart.

And their neural architecture

is completely different than ours.

Now, who knows how they perceive things.

I mean, that’s the question is for us intelligent beings,

we might not be able to perceive other kinds of intelligence

if they become sufficiently different than us.

Yeah, like we live in the current constrained world,

you know, it’s three dimensional geometry

and the geometry defines a certain amount of physics.

And, you know, there’s like how time works seems to work.

There’s so many things that seem like

a whole bunch of the input parameters to the, you know,

another conscious being are the same.

Yes, like if it’s biological,

biological things seem to be

in a relatively narrow temperature range, right?

Because, you know, organics aren’t stable,

too cold or too hot.

Now, so if you specify the list of things that input to that,

but as soon as we make really smart, you know, beings

and they go solve about how to think

about a billion numbers at the same time

and how to think in n dimensions.

There’s a funny science fiction book

where all the society had uploaded into this matrix.

And at some point, some of the beings in the matrix thought,

I wonder if there’s intelligent life out there.

So they had to do a whole bunch of work to figure out

like how to make a physical thing

because their matrix was self sustaining

and they made a little spaceship

and they traveled to another planet when they got there,

there was like life running around,

but there was no intelligent life.

And then they figured out that there was these huge,

you know, organic matrix all over the planet

inside there where intelligent beings

had uploaded themselves into that matrix.

So everywhere intelligent life was,

soon as it got smart, it upleveled itself

into something way more interesting than 3D geometry.

Yeah, it escaped whatever this,

not escaped, uplevel is better.

The essence of what we think of as an intelligent being,

I tend to like the thought experiment of the organism,

like humans aren’t the organisms.

I like the notion of like Richard Dawkins and memes

that ideas themselves are the organisms,

like that are just using our minds to evolve.

So like we’re just like meat receptacles

for ideas to breed and multiply and so on.

And maybe those are the aliens.

Yeah, so Jordan Peterson has a line that says,

you know, you think you have ideas, but ideas have you.

Yeah, good line.

Which, and then we know about the phenomenon of groupthink

and there’s so many things that constrain us.

But I think you can examine all that

and not be completely owned by the ideas

and completely sucked into groupthink.

And part of your responsibility as a human

is to escape that kind of phenomenon,

which isn’t, it’s one of the creative tension things again,

you’re constructed by it, but you can still observe it

and you can think about it and you can make choices

about to some level, how constrained you are by it.

And it’s useful to do that.

And, but at the same time, and it could be by doing that,

you know, the group and society you’re part of

becomes collectively even more interesting.

So, you know, so the outside observer will think,

wow, you know, all these Lexes running around

with all these really independent ideas

have created something even more interesting

in the aggregate.

So, I don’t know, those are lenses to look at the situation

that’ll give you some inspiration,

but I don’t think they’re constraints.


As a small little quirk of history,

it seems like you’re related to Jordan Peterson,

like you mentioned.

He’s going through some rough stuff now.

Is there some comment you can make

about the roughness of the human journey, the ups and downs?

Well, I became an expert in benzo withdrawal,

like, which is, you take benzodiazepines,

and at some point they interact with GABA circuits,

you know, to reduce anxiety and do a hundred other things.

Like there’s actually no known list of everything they do

because they interact with so many parts of your body.

And then once you’re on them, you habituate to them

and you have a dependency.

It’s not like a drug dependency

where you’re trying to get high.

It’s a metabolic dependency.

And then if you discontinue them,

there’s a funny thing called kindling,

which is if you stop them and then go,

you know, you’ll have horrible withdrawal symptoms.

And if you go back on them at the same level,

you won’t be stable.

And that unfortunately happened to him.

Because it’s so deeply integrated

into all the kinds of systems in the body.

It literally changes the size and numbers

of neurotransmitter sites in your brain.

So there’s a process called the Ashton protocol

where you taper it down slowly over two years.

People who go through that go through unbelievable hell.

And what Jordan went through seemed to be worse

because on advice of doctors, you know,

well, stop taking these and take this.

It was a disaster.

And he got some, yeah, it was pretty tough.

He seems to be doing quite a bit better intellectually.

You can see his brain clicking back together.

I spent a lot of time with him.

I’ve never seen anybody suffer so much.

Well, his brain is also like this powerhouse, right?

So I wonder, does a brain that’s able to think deeply

about the world suffer more through these kinds

of withdrawals, like?

I don’t know.

I’ve watched videos of people going through withdrawal.

They all seem to suffer unbelievably.

And, you know, my heart goes out to everybody.

And there’s some funny math about this.

Some doctor said, as best he can tell, you know,

there’s the standard recommendations.

Don’t take them for more than a month

and then taper over a couple of weeks.

Many doctors prescribe them endlessly,

which is against the protocol, but it’s common, right?

And then something like 75% of people, when they taper,

it’s, you know, half the people have difficulty,

but 75% get off okay.

20% have severe difficulty

and 5% have life threatening difficulty.

And if you’re one of those, it’s really bad.

And the stories that people have on this

is heartbreaking and tough.

So you put some of the fault at the doctors.

Do they just not know what the hell they’re doing?

No, no, it’s hard to say.

It’s one of those commonly prescribed things.

Like one doctor said, what happens is,

if you’re prescribed them for a reason

and then you have a hard time getting off,

the protocol basically says you’re either crazy

or dependent and you get kind of pushed

into a different treatment regime.

You’re a drug addict or a psychiatric patient.

And so like one doctor said, you know,

I prescribed them for 10 years thinking

I was helping my patients

and I realized I was really harming them.

And you know, the awareness of that is slowly coming up.

The fact that they’re casually prescribed to people

is horrible and it’s bloody scary.

And some people are stable on them,

but they’re on them for life.

Like once you, you know, it’s another one of those drugs.

But benzos long range have real impacts on your personality.

People talk about the benzo bubble

where you get disassociated from reality

and your friends a little bit.

It’s really terrible.

The mind is terrifying.

We were talking about how the infinite possibility of fun,

but like it’s the infinite possibility of suffering too,

which is one of the dangers of like expansion

of the human mind.

It’s like, I wonder if all the possible experiences

that an intelligent computer can have,

is it mostly fun or is it mostly suffering?

So like if you brute force expand the set of possibilities,

like are you going to run into some trouble

in terms of like torture and suffering and so on?

Maybe our human brain is just protecting us

from much more possible pain and suffering.

Maybe the space of pain is like much larger

than we could possibly imagine.

And that.

The world’s in a balance.

You know, all the literature on religion and stuff is,

you know, the struggle between good and evil

is balanced, very finely tuned,

for reasons that are complicated.

But that’s a long philosophical conversation.

Speaking of balance that’s complicated,

I wonder because we’re living through

one of the more important moments in human history

with this particular virus.

It seems like pandemics have at least the ability

to kill off most of the human population at their worst.

And it’s just fascinating

because there’s so many viruses in this world.

There’s so many, I mean, viruses basically run the world

in the sense that they’ve been around a very long time.

They’re everywhere.

They seem to be extremely powerful

in the distributed kind of way.

But at the same time, they’re not intelligent

and they’re not even living.

Do you have like high level thoughts about this virus

that like in terms of you being fascinated or terrified

or somewhere in between?

So I believe in frameworks, right?

So like one of them is evolution.

Like we’re evolved creatures, right?


And one of the things about evolution

is it’s hyper competitive.

And it’s not competitive out of a sense of evil.

It’s competitive in the sense that there’s endless variation

and variations that work better win.

And then over time, there’s so many levels

of that competition.

Like multicellular life partly exists

because of the competition

between different kinds of life forms.

And we know sex partly exists to scramble our genes

so that we have genetic variation

against the invasion of the bacteria and the viruses.

And it’s endless.

Like I read some funny statistic,

like the density of viruses and bacteria in the ocean

is really high.

And one third of the bacteria die every day

because a virus is invading them.

Like one third of them.


Like I don’t know if that number is true,

but it was like the amount of competition

and what’s going on is stunning.

And there’s a theory as we age,

we slowly accumulate bacteria and viruses

and as our immune system kind of goes down,

that’s what slowly kills us.

It just feels so peaceful from a human perspective

when we sit back and are able

to have a relaxed conversation.

And there’s wars going on out there.

Like right now, you’re harboring how many bacteria?

And the ones, many of them are parasites on you

and some of them are helpful

and some of them are modifying your behavior

and some of them are, it’s just really wild.

But this particular manifestation is unusual

in the demographic, how it hit

and the political response that it engendered

and the healthcare response it engendered

and the technology it engendered, it’s kind of wild.

Yeah, the communication on Twitter that it led to,

all that kind of stuff, at every single level, yeah.

But what usually kills life,

the big extinctions are caused by meteors and volcanoes.

That’s the one you’re worried about

as opposed to human created bombs that we launch.

Solar flares are another good one.

Occasionally, solar flares hit the planet.

So it’s nature.

Yeah, it’s all pretty wild.

On another historic moment, this is perhaps outside

but perhaps within your space of frameworks

that you think about that just happened,

I guess a couple of weeks ago is,

I don’t know if you’re paying attention at all,

is the GameStop and WallStreetBets.

It’s super fun.

So it’s really fascinating.

There’s kind of a theme to this conversation today

because it’s like neural networks,

it’s cool how there’s a large number of people

in a distributed way, almost having a kind of fun,

were able to take on the powerful elites,

elite hedge funds, centralized powers and overpower them.

Do you have thoughts on this whole saga?

I don’t know enough about finance,

but it was like the Elon, Robinhood guy when they talked.

Yeah, what’d you think about that?

Well, Robinhood guy didn’t know

how the finance system worked.

That was clear, right?

He was treating like the people

who settled the transactions as a black box.

And suddenly somebody called him up and said,

hey, black box calling you, your transaction volume

means you need to put out $3 billion right now.

And he’s like, I don’t have $3 billion.

Like I don’t even make any money on these trades.

Why do I owe $3 billion? Well, you’re sponsoring the trade.

So there was a set of abstractions

that I don’t think either, like now we understand it.

Like this happens in chip design.

Like you buy wafers from TSMC or Samsung or Intel,

and they say it works like this

and you do your design based on that.

And then chip comes back and doesn’t work.

And then suddenly you start having to open the black boxes.

Do the transistors really work like they said?

What’s the real issue?

So there’s a whole set of things

that created this opportunity and somebody spotted it.

Now, people spot these kinds of opportunities all the time.

So there’s been flash crashes,

there’s always short squeezes are fairly regular.

Every CEO I know hates the shorts

because they’re trying to manipulate their stock

in a way that they make money

and deprive value from both the company

and the investors.

So the fact that some of these stocks were so short,

it’s hilarious that this hasn’t happened before.

I don’t know why, and I don’t actually know why

some serious hedge funds didn’t do it to other hedge funds.

And some of the hedge funds

actually made a lot of money on this.

So my guess is we know 5% of what really happened

and that a lot of the players don’t know what happened.

And the people who probably made the most money

aren’t the people that they’re talking about.


Do you think there was something,

I mean, this is the cool kind of Elon,

you’re the same kind of conversationalist,

which is like first principles questions of like,

what the hell happened?

Just very basic questions of like,

was there something shady going on?

What, who are the parties involved?

It’s the basic questions everybody wants to know about.

Yeah, so like we’re in a very hyper competitive world,

but transactions like buying and selling stock

is a trust event.

I trust the company represented themselves properly.

I bought the stock because I think it’s gonna go up.

I trust that the regulations are solid.

Now, inside of that, there’s all kinds of places

where humans over trust and this exposed,

let’s say some weak points in the system.

I don’t know if it’s gonna get corrected.

I don’t know if we have close to the real story.

Yeah, my suspicion is we don’t.

And listen to that guy, he was like a little wide eyed

about and then he did this and then he did that.

And I was like, I think you should know more

about your business than that.

But again, there’s many businesses

when like this layer is really stable,

you stop paying attention to it.

You pay attention to the stuff that’s bugging you or new.

You don’t pay attention to the stuff

that just seems to work all the time.

You just, sky’s blue every day, California.

And every once in a while it rains

and everybody’s like, what do we do?

Somebody go bring in the lawn furniture.

It’s getting wet.

You don’t know why it’s getting wet.

Yeah, it doesn’t always work.

It was blue for like a hundred days and now it’s, so.

But part of the problem here with Vlad,

the CEO of Robinhood is the scaling

that we’ve been talking about is there’s a lot

of unexpected things that happen with the scaling

and you have to be, I think the scaling forces you

to then return to the fundamentals.

Well, it’s interesting because when you buy and sell stocks,

the scaling is, the stocks don’t only move

in a certain range and if you buy a stock,

you can only lose that amount of money.

On the short market, you can lose a lot more

than you can benefit.

Like it has a weird cost function

or whatever the right word for that is.

So he was trading in a market

where he wasn’t actually capitalized for the downside.

If it got outside a certain range.

Now, whether something nefarious has happened,

I have no idea, but at some point,

the financial risk to both him and his customers

was way outside of his financial capacity

and his understanding of how the system worked was clearly weak

or he didn’t represent himself.

I don’t know the person and when I listened to him,

it could have been the surprise question was like,

and then these guys called and it sounded like

he was treating stuff as a black box.

Maybe he shouldn’t have, but maybe he has a whole pile

of experts somewhere else and it was going on.

I don’t know.

Yeah, I mean, this is one of the qualities

of a good leader is under fire, you have to perform.

And that means to think clearly and to speak clearly.

And he dropped the ball on those things

because and understand the problem quickly,

learn and understand the problem at this basic level.

What the hell happened?

And my guess is, at some level it was amateurs trading

against experts slash insiders slash people

with special information.

Outsiders versus insiders.

Yeah, and the insiders, my guess is the next time

this happens, will make money on it.

The insiders always win?

Well, they have more tools and more incentive.

I mean, this always happens.

Like the outsiders are doing this for fun.

The insiders are doing this 24/7.

But there’s numbers in the outsiders.

This is the interesting thing is it could be

a new chapter. There’s numbers

on the insiders too.

Different kind of numbers, yeah.

But this could be a new era because, I don’t know,

at least I didn’t expect that a bunch of Redditors could,

there’s millions of people who can get together.

It was a surprise attack.

The next one will be a surprise.

But don’t you think the crowd, the people are planning

the next attack?

We’ll see.

But it has to be a surprise.

It can’t be the same game.

And so the insiders.

It’s like, it could be there’s a very large number

of games to play and they can be agile about it.

I don’t know.

I’m not an expert.

Right, that’s a good question.

The space of games, how restricted is it?

Yeah, and the system is so complicated

it could be relatively unrestricted.

And also during the last couple of financial crashes,

what set it off was sets of derivative events

where Nassim Taleb’s thing is they’re trying

to lower volatility in the short run

by creating tail events.

And the systems always evolve towards that

and then they always crash.

The S curve is the start low, ramp, plateau, crash.

It’s 100% effective.

In the long run.

Let me ask you some advice to put on your profound hat.

There’s a bunch of young folks who listen to this thing

for no good reason whatsoever.

Undergraduate students, maybe high school students,

maybe just young folks, or the young at heart

looking for the next steps to take in life.

What advice would you give to a young person today

about life, maybe career, but also life in general?

Get good at some stuff.

Well, get to know yourself, right?

Get good at something that you’re actually interested in.

You have to love what you’re doing to get good at it.

You really gotta find that.

Don’t waste all your time doing stuff

that’s just boring or bland or numbing, right?

Don’t let old people screw you.

Well, people get talked into doing all kinds of shit

and racking up huge student debts

and there’s so much crap going on.

And then drains your time and drains your energy.

The Eric Weinstein thesis that the older generation

won’t let go and they’re trapping all the young people.

Do you think there’s some truth to that?

Yeah, sure.

Just because you’re old doesn’t mean you stop thinking.

I know lots of really original old people.

I’m an old person.

But you have to be conscious about it.

You can fall into the ruts and then do that.

I mean, when I hear young people spouting opinions

that sound like they come from Fox News or CNN,

I think they’ve been captured by groupthink and memes.

They’re supposed to think on their own.

So if you find yourself repeating

what everybody else is saying,

you’re not gonna have a good life.

Like, that’s not how the world works.

It seems safe, but it puts you at great jeopardy

for being boring or unhappy.

How long did it take you to find the thing

that you have fun with?

Oh, I don’t know.

I’ve been a fun person since I was pretty little.

So everything.

I’ve gone through a couple periods of depression in my life.

For a good reason or for a reason

that doesn’t make any sense?

Yeah, like some things are hard.

Like you go through mental transitions in high school.

I was really depressed for a year

and I think I had my first midlife crisis at 26.

I kind of thought, is this all there is?

Like I was working at a job that I loved,

but I was going to work and all my time was consumed.

What’s the escape out of that depression?

What’s the answer to is this all there is?

Well, a friend of mine, I asked him,

because he was working his ass off,

I said, what’s your work life balance?

Like there’s work, friends, family, personal time.

Are you balancing any of that?

And he said, work 80%, family 20%.

And I try to find some time to sleep.

Like there’s no personal time.

There’s no passionate time.

Like the young people are often passionate about work.

So I was certainly like that.

But you need to have some space in your life

for different things.

And that creates, that makes you resistant

to the whole, the deep dips into depression kind of thing.

Yeah, well, you have to get to know yourself too.

Meditation helps.

Some physical, something physically intense helps.

Like the weird places your mind goes kind of thing.

Like, and why does it happen?

Why do you do what you do?

Like triggers, like the things that cause your mind

to go to different places kind of thing,

or like events like.

Your upbringing for better or worse,

whether your parents are great people or not,

you come into adulthood with all kinds of emotional burdens.

And you can see some people are so bloody stiff

and restrained, and they think the world’s

fundamentally negative, like you maybe.

You have unexplored territory.


Or you’re afraid of something.

Definitely afraid of quite a few things.

Then you gotta go face them.

Like what’s the worst thing that can happen?

You’re gonna die, right?

Like that’s inevitable.

You might as well get over that.

Like 100%, that’s right.

Like people are worried about the virus,

but you know, the human condition is pretty deadly.

There’s something about embarrassment

that’s, I’ve competed a lot in my life,

and I think the, if I’m to introspect it,

the thing I’m most afraid of is being like humiliated,

I think.

Yeah, nobody cares about that.

Like you’re the only person on the planet

that cares about you being humiliated.


It’s like a really useless thought.

It is.

It’s like, you’re all humiliated.

Something happened in a room full of people,

and they walk out, and they didn’t think about it

one more second.

Or maybe somebody told a funny story to somebody else.

And then it dissipates it throughout, yeah.


No, I know it too.

I mean, I’ve been really embarrassed about shit

that nobody cared about but myself.


It’s a funny thing.

So the worst thing ultimately is just.

Yeah, but that’s a cage,

and then you have to get out of it.


Like once you, here’s the thing.

Once you find something like that,

you have to be determined to break it.

Because otherwise you’ll just,

so you accumulate that kind of junk,

and then you die as a mess.

So the goal, I guess it’s like a cage within a cage.

I guess the goal is to die in the biggest possible cage.

Well, ideally you’d have no cage.

People do get enlightened.

I’ve met a few.

It’s great.

You’ve found a few?

There’s a few out there?

I don’t know.

Of course there are.

I don’t know.

Either that or it’s a great sales pitch.

There’s enlightened people writing books

and doing all kinds of stuff.

It’s a good way to sell a book.

I’ll give you that.

You’ve never met somebody you just thought,

they just kill me.

Like they just, like mental clarity, humor.

No, 100%, but I just feel like

they’re living in a bigger cage.

They have their own.

You still think there’s a cage?

There’s still a cage.

You secretly suspect there’s always a cage.

There’s nothing outside the universe.

There’s nothing outside the cage.

You’ve worked in a bunch of companies,

you’ve led a lot of amazing teams.

I’m not sure if you’ve ever been

like in the early stages of a startup,

but do you have advice for somebody

that wants to do a startup or build a company,

like build a strong team of engineers that are passionate

and just want to solve a big problem?

Like, is there a more specifically on that point?

Well, you have to be really good at stuff.

If you’re going to lead and build a team,

you better be really interested

in how people work and think.

The people or the solution to the problem.

So there’s two things, right?

One is how people work and the other is the…

Well, actually there’s quite a few successful startups.

It’s pretty clear the founders

don’t know anything about people.

Like the idea was so powerful that it propelled them.

But I suspect somewhere early,

they hired some people who understood people

because people really need a lot of care and feeding

to collaborate and work together

and feel engaged and work hard.

Like startups are all about out producing other people.

Like you’re nimble because you don’t have any legacy.

You don’t have a bunch of people

who are depressed about life just showing up.

So startups have a lot of advantages that way.

Do you like the, Steve Jobs talked about this idea

of A players and B players.

I don’t know if you know this formulation.

Yeah, no.

Organizations that get taken over by B player leaders

often really underperform; they hire C players.

That said, in big organizations,

there’s so much work to do.

And there’s so many people who are happy

to do what the leadership or the big idea people

would consider menial jobs.

And you need a place for them,

but you need an organization that both values and rewards

them but doesn’t let them take over the leadership of it.

Got it.

So you need to have an organization

that’s resistant to that.

But in the early days, the notion with Steve

was that like one B player in a room of A players

will be like destructive to the whole.

I’ve seen that happen.

I don’t know if it’s like always true.

You run into people who are clearly B players

but they think they’re A players

and so they have a loud voice at the table

and they make lots of demands for that.

But there’s other people who are like, I know who I am.

I just wanna work with cool people on cool shit

and just tell me what to do and I’ll go get it done.

So you have to, again, this is like people skills.

What kind of person is it?

I’ve met some really great people I love working with

that weren’t the biggest idea people or the most productive

ever but they show up, they get it done.

They create connection and community that people value.

It’s pretty diverse so I don’t think

there’s a recipe for that.

I gotta ask you about love.

I heard you’re into this now.

Into this love thing?

Yeah, is this, do you think this is your solution

to your depression?

No, I’m just trying to, like you said,

delighting people and occasionally trying to sell a book.

I’m writing a book about love.

You’re writing a book about love?

No, I’m not, I’m not.

I have a friend of mine, he’s gonna,

he said you should really write a book

about your management philosophy.

He said it’d be a short book.

Well, that one was thought pretty well.

What role do you think love, family, friendship,

all that kind of human stuff play in a successful life?

You’ve been exceptionally successful in the space

of running teams, building cool shit in this world,

creating some amazing things.

What, did love get in the way?

Did love help? Did family get in the way?

Did family help? Friendship?

You want the engineer’s answer?


But first, love is functional, right?

It’s functional in what way?

So we habituate ourselves to the environment.

And actually, Jordan Peterson told me this line.

So you go through life and you just get used to everything,

except for the things you love.

They remain new.

Like, this is really useful for, you know,

like other people’s children and dogs and trees.

You just don’t pay that much attention to them.

Your own kids, you monitor them really closely.

Like, and if they go off a little bit,

because you love them, if you’re smart,

if you’re gonna be a successful parent,

you notice it right away.

You just don’t habituate to things you love.

And if you want to be successful at work,

if you don’t love it,

you’re not gonna put the time in. Somebody else will,

somebody else that loves it.

Like, because it’s new and interesting,

and that lets you go to the next level.

So it’s the thing, it’s just a function

that generates newness and novelty

and surprises, you know, all those kinds of things.

It’s really interesting.

There’s people who figured out lots of frameworks for this.

Like, humans seem to go,

in partnership, go through interests.

Like, suddenly somebody’s interesting,

and then you’re infatuated with them,

and then you’re in love with them.

And then you, you know, different people have ideas

about parental love or mature love.

Like, you go through a cycle of that,

which keeps us together,

and it’s super functional for creating families

and creating communities and making you support somebody

despite the fact that you don’t love them.

Like, and it can be really enriching.

You know, now, in the work life balance scheme,

if all you do is work,

you think you may be optimizing your work potential,

but if you don’t love your work

or you don’t have family and friends

and things you care about,

your brain isn’t well balanced.

Like, everybody knows the experience of,

you worked on something all week.

You went home, took two days off, and you came back in.

The odds of you working on the thing,

you picking up right where you left off is zero.

Your brain refactored it.

But being in love is great.

It, like, changes the color of the light in the room.

It creates a spaciousness that’s different.

It helps you think.

It makes you strong.

Bukowski had this line about love being a fog

that dissipates with the first light of reality

in the morning.

That’s depressing.

I think it’s the other way around.

It lasts.

Well, like you said, it’s a function.

It’s a thing that generates.

It can be the light that actually enlivens your world

and creates the interest and the power and the strength

to go do something.

Well, it’s like, that sounds like,

you know, there’s like physical love, emotional love,

intellectual love, spiritual love, right?

Isn’t it all the same thing, kind of?


You should differentiate that.

Maybe that’s your problem.

In your book, you should refine that a little bit.

Is it different chapters?

Yeah, there’s different chapters.

What’s these, aren’t these just different layers

of the same thing, the stack of physical?

People, some people are addicted to physical love

and they have no idea about emotional or intellectual love.

I don’t know if they’re the same things.

I think they’re different.

That’s true.

They could be different.

I guess the ultimate goal is for it to be the same.

Well, if you want something to be bigger and interesting,

you should find all its components and differentiate them,

not clump it together.

Like, people do this all the time.

Yeah, the modularity.

Get your abstraction layers right

and then you have room to breathe.

Well, maybe you can write the foreword to my book

about love.

Or the afterword.

And the after.

You really tried.

I feel like Lex has made a lot of progress in this book.

Well, you have things in your life that you love.

Yeah, yeah.

And they are, you’re right, they’re modular.

It’s quality.

And you can have multiple things with the same person

or the same thing.

But, yeah.

Depending on the moment of the day.

Yeah, there’s, like what Bukowski described

is that moment when you go from being in love

to having a different kind of love.


And that’s a transition.

But when it happens, if you read the owner’s manual

and you believed it, you would have said,

oh, this happened.

It doesn’t mean it’s not love.

It’s a different kind of love.

But maybe there’s something better about that.

As you grow old, if all you do is regret how you used to be,

it’s sad.


You should have learned a lot of things

because like who you can be in your future self

is actually more interesting and possibly delightful

than being a mad kid in love with the next person.

Like, that’s super fun when it happens.

But that’s, you know, 5% of the possibility.

Yeah, that’s right.

There’s a lot more fun to be had in the long lasting stuff.

Yeah, or meaning, you know, if that’s your thing.

Which is a kind of fun.

It’s a deeper kind of fun.

And it’s surprising.

You know, that’s, like the thing I like is surprises.

You know, and you just never know what’s gonna happen.

But you have to look carefully and you have to work at it

and you have to think about it and you know, it’s.

Yeah, you have to see the surprises when they happen, right?

You have to be looking for it.

From the branching perspective, you mentioned regrets.

Do you have regrets about your own trajectory?

Oh yeah, of course.

Yeah, some of it’s painful,

but you wanna hear the painful stuff?


I would say, like in terms of working with people,

when people did stuff I didn’t like,

especially if it was a bit nefarious,

I took it personally and I also felt it was personal

about them.

But a lot of times, like humans are,

you know, most humans are a mess, right?

And then they act out and they do stuff.

And a psychologist I heard a long time ago said,

you tend to think somebody does something to you.

But really what they’re doing is they’re doing

what they’re doing while they’re in front of you.

It’s not that much about you, right?

And as I got more interested in,

you know, when I work with people,

I think about them and probably analyze them

and understand them a little bit.

And then when they do stuff, I’m way less surprised.

And if it’s bad, I’m way less hurt.

And I react way less.

Like I sort of expect everybody’s got their shit.

Yeah, and it’s not about you as much.

It’s not about me that much.

It’s like, you know, you do something

and you think you’re embarrassed, but nobody cares.

Like, and somebody’s really mad at you,

the odds of it being about you.

No, they’re getting mad the way they’re doing that

because of some pattern they learned.

And you know, and maybe you can help them

if you care enough about it.

But, or you could see it coming and step out of the way.

Like, I wish I was way better at that.

I’m a bit of a hothead.

And in support of that.

You said with Steve, that was a feature, not a bug.

Yeah, well, he was using it as the counter force

to orderliness that would crush his work.

Well, you were doing the same.

Yeah, maybe.

I don’t think I, I don’t think my vision was big enough.

It was more like I just got pissed off and did stuff.

I’m sure that’s the, yeah, you’re telling me.

I don’t know if it had the,

it didn’t have the amazing effect

of creating the trillion dollar company.

It was more like I just got pissed off and left

and, or made enemies that I shouldn’t have.

And yeah, it’s hard.

Like, I didn’t really understand politics

until I worked at Apple where, you know,

Steve was a master player of politics

and his staff had to be, or they wouldn’t survive him.

And it was definitely part of the culture.

And then I’ve been in companies where they say

it’s political, but it’s all, you know,

fun and games compared to Apple.

And it’s not that the people at Apple are bad people.

It’s just, they operate politically at a higher level.

You know, it’s not like, oh, somebody said something bad

about somebody, somebody else, which is most politics.

It’s, you know, they had strategies

about accomplishing their goals.

Sometimes, you know, over the dead bodies of their enemies.

You know, with sophistication, yeah,

more Game of Thrones than sophistication

and like a big time factor rather than a, you know.

Wow, that requires a lot of control over your emotions,

I think, to have a bigger strategy in the way you behave.

Yeah, and it’s effective in the sense

that coordinating thousands of people

to do really hard things where many of the people

in there don’t understand themselves,

much less how they’re participating,

creates all kinds of, you know, drama and problems

where, you know, the solution is political in nature.

Like how do you convince people?

How do you leverage them?

How do you motivate them?

How do you get rid of them?

How do you, you know, like there’s so many layers

of that that are interesting.

And even though some of it, let’s say, may be tough,

it’s not evil unless, you know, you use that skill

to evil purposes, which some people obviously do.

But it’s a skill set that operates, you know.

And I wish I’d, you know, I was interested in it,

but I, you know, it was sort of like,

I’m an engineer, I do my thing.

And, you know, there’s times

when I could have had a way bigger impact

if I, you know, knew how to,

if I paid more attention and knew more about that.

Yeah, about the human layer of the stack.

Yeah, that human political power, you know,

expression layer of the stack.

Just complicated.

And there’s lots to know about it.

I mean, people who are good at it are just amazing.

And when they’re good at it,

and let’s say, relatively kind and oriented

in a good direction, you can really feel,

you can get lots of stuff done and coordinate things

that you never thought possible.

But all people like that also have some pretty hard edges

because, you know, it’s a heavy lift.

And I wish I’d spent more time like that when I was younger.

But maybe I wasn’t ready.

You know, I was a wide eyed kid for 30 years.

Still a bit of a kid.

Yeah, I know.

What do you hope your legacy is

when there’s a book like Hitchhiker’s Guide to the Galaxy,

and this is like a one sentence entry about Jim Keller

from like that guy lived at some point.

There’s not many, you know,

not many people would be remembered.

You’re one of the sparkling little human creatures

that had a big impact on the world.

How do you hope you’ll be remembered?

My daughter was trying to get,

she edited my Wikipedia page

to say that I was a legend and a guru.

But they took it out, so she put it back in.

She’s 15.

I think that was probably the best part of my legacy.

She got her sister, and they were all excited.

They were like trying to put it in the references

because there’s articles with that in the title.

So in the eyes of your kids, you’re a legend.

Well, they’re pretty skeptical

because they know me better than that.

They’re like dad.

So yeah, that kind of stuff is super fun.

In terms of the big legends stuff, I don’t care.

You don’t care.

I don’t really care.

You’re just an engineer.

Yeah, I’ve been thinking about building a big pyramid.

So I had a debate with a friend

about whether pyramids or craters are cooler.

And he realized that there’s craters everywhere,

but they built a couple of pyramids 5,000 years ago.

And they remember you for a while.

We’re still talking about it.

So I think that would be cool.

Those aren’t easy to build.

Oh, I know.

And they don’t actually know how they built them,

which is great.

It’s either AGI or aliens could be involved.

So I think you’re gonna have to figure out

quite a few more things than just

the basics of civil engineering.

So I guess you hope your legacy is pyramids.

That would be cool.

And my Wikipedia page, you know,

getting updated by my daughter periodically.

Like those two things would pretty much make it.

Jim, it’s a huge honor talking to you again.

I hope we talk many more times in the future.

I can’t wait to see what you do with Tenstorrent.

I can’t wait to use it.

I can’t wait for you to revolutionize

yet another space in computing.

It’s a huge honor to talk to you.

Thanks for talking to me.

This was fun.

See you next time.
