Lex Fridman Podcast - #147 - Dmitri Dolgov: Waymo and the Future of Self-Driving Cars


The following is a conversation with Dmitri Dolgov, the CTO of Waymo, which is an autonomous driving company that started as the Google self-driving car project in 2009 and became Waymo in 2016. Dmitri was there all along. Waymo is currently leading in the fully autonomous vehicle space in that they actually have an at-scale deployment of publicly accessible autonomous vehicles driving passengers around with no safety driver, with nobody in the driver's seat.

This to me is an incredible accomplishment of engineering on one of

the most difficult and exciting artificial intelligence challenges of

the 21st century.

Quick mention of our sponsors, followed by some thoughts related to the episode. Thank you to Trial Labs, a company that helps businesses apply machine learning to solve real-world problems; Blinkist, an app I use for reading through summaries of books; BetterHelp, online therapy with a licensed professional; and Cash App, the app I use to send money to friends. Please check out the sponsors in the description to get a discount and to support this podcast.

As a side note, let me say that autonomous and semi-autonomous driving was the focus of my work at MIT and is a problem space that I find fascinating and full of open questions, from both a robotics and a human psychology perspective. There's quite a bit that I could say here about my experiences in academia on this topic that revealed to me, let's say, the less admirable side of human beings, but I choose to focus on the positive, on solutions, on brilliant engineers like Dmitri and the team at Waymo, who work tirelessly to innovate and to build amazing technology that will define our future. Because of Dmitri and others like him, I'm excited for this future. And who knows, perhaps I too will help contribute something of value to it.

If you enjoy this thing, subscribe on YouTube, review it with five stars on Apple Podcasts, follow on Spotify, support on Patreon, or connect with me on Twitter at Lex Fridman.

And now, here's my conversation with Dmitri Dolgov.

When did you first fall in love with robotics, or even computer science more generally?

Computer science first at a fairly young age, then robotics happened much later.

I think my first interesting introduction to computers was in the late 80s, when we got our first computer. I think it was an IBM, an IBM AT, those things that had a turbo button on the front that you would press to make the thing go faster.

Did that already have floppy disks?

Yeah.

Yeah.

Like the five and a quarter inch ones.

I think there were bigger ones too, and then the five-inch and the three-inch ones.

Yeah, I think that was the five-inch. Maybe before that there were the giant plates, but we didn't get those. It was definitely not the three-inch ones.

Anyway, so we got that computer, and I spent the first few months just playing video games, as you would expect. I got bored of that, so I started messing around and trying to figure out how to make the thing do other stuff. I got into exploring programming, and a couple of years later it got to the point where I actually wrote a game, and a game developer, a Japanese game developer, actually offered to buy it from me for a few hundred bucks. But, you know, for a kid in Russia, that's a big deal.

That’s a big deal.

Yeah.

I did not take the deal.

Wow.

Integrity.

Yeah.

Yes, that was not the most astute financial move I've made in my life, looking back at it now. I instead, well, I had a reason, I put it online. What did you call it back in the day? It was a freeware thing, right? It was not open source, but you could upload the binaries, you would put the game online, and the idea was that people would like it and then send you little donations. So I did my quick math: of course, thousands and millions of people are going to play my game, send me a couple of bucks apiece, I should definitely do that. As I said, not the best.

You’re already playing with business models at that young age.

Remember what language it was?

What programming language was it?

It was Pascal.

Pascal. And did it have a graphical component, so it's not text-based?

Yeah, it was, like, I think 320 by 200, whatever it was, I think that was the resolution back then. And I actually think the reason why this company wanted to buy it was not the fancy graphics or the implementation. It was maybe the idea of my actual game, the idea of the game.

Well, it's so funny, I used to play this game called Golden Axe, and the simplicity of the graphics and something about the simplicity of the music, it still haunts me. I don't know if that's a childhood thing. I don't know if it's the same thing for Call of Duty these days for young kids, but I still think that when the games are simple, that simple purity allows your imagination to take over and thereby creates a more magical experience. Now, with better and better graphics, it feels like your imagination doesn't get to create worlds, which is kind of interesting. It could be just an old man on a porch waving at kids these days that have no respect, but I still think that graphics almost get in the way of the experience. I don't know.

Flappy Bird?

Yeah, I don’t know if the imagination is closed.

I don’t yet, but that that’s more about games that op like that’s more

like Tetris world where they optimally masterfully, like create a fun, short

term dopamine experience versus I’m more referring to like role playing

games where there’s like a story you can live in it for months or years.

Um, like, uh, there’s an elder scroll series, which is probably my favorite

set of games that was a magical experience.

And that the graphics are terrible.

The characters were all randomly generated, but they’re, I don’t know.

That’s it pulls you in.

There’s a story.

It’s like an interactive version of an elder scrolls Tolkien world.

And you get to live in it.

I don’t know.

I miss it.

It’s one of the things that suck about being an adult is there’s no, you have

to live in the real world as opposed to the elder scrolls world, you know, whatever

brings you joy, right?

Minecraft, right?

Minecraft is a great example.

You create, like it’s not the fancy graphics, but it’s the creation of your own worlds.

Yeah, that one is crazy. You know, one of the pitches for being a parent that people give me is that you can use the excuse of parenting to go back into the video game world, and that it's, you know, father-son or father-daughter time, but really you just get to play video games with your kids. So anyway, at that time, did you have any ridiculously ambitious dreams of where you might go as a creator, as an engineer? What did you think of yourself as, an engineer, a tinkerer, or did you want to be an astronaut or something like that?

You know, I’m tempted to make something up about, you know, robots, uh, engineering

or, you know, mysteries of the universe, but that’s not the actual memory that

pops into my mind when you, when you asked me about childhood dreams.

So I’ll actually share the, the, the real thing, uh, when I was maybe four or five

years old, I, you know, as we all do, I thought about, you know, what I wanted

to do when I grow up and I had this dream of being a traffic control cop.

Uh, you know, they don’t have those today’s I think, but you know, back in

the eighties and in Russia, uh, you probably are familiar with that Lex.

They had these, uh, you know, police officers that would stand in the middle

of intersection all day and they would have their like stripe back, black and

white batons that they would use to control the flow of traffic and, you

know, for whatever reasons, I was strangely infatuated with this whole

process and like that, that was my dream.

Uh, that’s what I wanted to do when I grew up and, you know, my parents, uh,

both physics profs, by the way, I think were, you know, a little concerned, uh,

with that level of ambition coming from their child.

Uh, uh, you know, that age.

Well, that it’s an interesting, I don’t know if you can relate,

but I very much love that idea.

I have a OCD nature that I think lends itself very close to the engineering

mindset, which is you want to kind of optimize, you know, solve a problem by

create, creating an automated solution, like a, like a set of rules, that set

of rules you can follow and then thereby make it ultra efficient.

I don’t know if that’s, it was of that nature.

I certainly have that.

There’s like fact, like SimCity and factory building games, all those

kinds of things kind of speak to that engineering mindset, or

did you just like the uniform?

I think it was more of the latter. I think it was the uniform and the striped baton that made cars go in the right directions. But I guess I did end up working in the transportation industry, one way or another. No uniform, but, that's right, maybe it was my deep inner infatuation with the traffic control batons that led to this career.

Okay.

When was the leap from programming to robotics?

That happened later. That was after grad school, and actually, self-driving cars were, I think, my first real hands-on introduction to robotics. I never really had that much hands-on experience in school and training. I worked on applied math and physics, then more abstract computer science, and it was after grad school that I really got involved in robotics, which was actually self-driving cars. And, you know, that was a big flip.

What grad school?

I went to grad school in Michigan and then did a postdoc at Stanford, and that was the postdoc where I got to play with self-driving cars.

Yeah.

So we’ll return there.

Let’s go back to, uh, to Moscow.

So, you know, for episode 100, I talked to my dad, and also I grew up with my dad, I guess, so I had to put up with him for many years. He went to Phystech, or MIPT, it's weird to say in English because I've heard all of this in Russian: the Moscow Institute of Physics and Technology. And to me, as a child, I met some super interesting characters who came out of there. It felt to me like the greatest university in the world, the most elite university in the world, and the people that I met who came out of there were not only brilliant but also special humans. It seems like that place really tested the soul, both technically and spiritually. That could just be my romanticization of the place, I'm not sure, but maybe you can speak to it. Is it correct to say that you spent some time at Phystech?

Yeah, that’s right.

Six years.

Uh, I got my bachelor’s and master’s in physics and math there.

And it’s actually interesting cause my, my dad, and actually both my parents,

uh, went there and I think all the stories that I heard, uh, like, just

like you, Alex, uh, growing up about the place and, you know, how interesting

and special and magical it was, I think that was a significant, maybe the

main reason, uh, I wanted to go there, uh, for college, uh, enough so that

I actually went back to Russia from the U S I graduated high school in the U S.

Um, and you went back there.

I went back there.

Yeah, that’s exactly the reaction most of my peers in college had.

But, you know, perhaps a little bit stronger that like, you know, point

me out as this crazy kid, were your parents supportive of that?

Yeah.

Yeah.

As with the games in your previous question, they supported me and, you know, let me pursue my passions and the things that I was interested in.

That’s a bold move.

Wow.

What was it like there?

It was interesting, you know, definitely fairly hardcore on the fundamentals

of, you know, math and physics and, uh, you know, lots of good memories,

uh, from, you know, from those times.

So, okay.

So Stanford.

How’d you get into autonomous vehicles?

I had the great fortune and great honor to join Stanford's DARPA Urban Challenge team in 2006. This was the third in the sequence of the DARPA challenges. There were two Grand Challenges prior to that, and then in 2007 they held the DARPA Urban Challenge. So, you know, I was doing my postdoc, I joined the team, and I worked on motion planning for that competition.

So, okay.

So for people who might not know: autonomous vehicles is a funny world. In a certain circle of people, everybody knows everything, and then in another circle, the general public, nobody knows anything. So it's interesting, it's a good question what to talk about, but I do think that the Urban Challenge is worth revisiting. It's a fun little challenge, one that, first of all, sparked so many incredible minds to focus on one of the hardest problems of our time in artificial intelligence. So that's a success from the perspective of a single little challenge. But can you talk about what the challenge involved? Were there pedestrians, were there other cars, what was the goal, who was on the team, how long did it take, any fun sort of specs?

Sure, sure, sure.

So the way the challenge was constructed, and just a little bit of background, as I mentioned, this was the third competition in that series. The first two were the Grand Challenges. The goal there was to just drive in a completely static environment, you know, you had to drive in a desert. That was very successful, so then DARPA followed with what they called the Urban Challenge, where the goal was to build vehicles that could operate in more dynamic environments and share them with other vehicles. There were no pedestrians there, but what DARPA did is they took over an abandoned air force base, and it was kind of like a little fake city that they built out there. And they had a bunch of robots, you know, cars that were autonomous, in there all at the same time, mixed in with other vehicles driven by professional drivers. Each car had a mission, and so there was a crude map that they received at the beginning, and they had a mission: go here and then there and over here. And they were all sharing this environment at the same time. They had to interact with each other, they had to interact with the human drivers. So these were very first, very rudimentary versions of a self-driving car that could operate in an environment shared with other dynamic actors, which, as you said, in many ways kickstarted this whole industry.

Okay.

So who was on the team and how’d you do?

I forget.

We came in second. Perhaps that was my contribution to the team. I think the Stanford team came in first in the Grand Challenge, but then I joined the team and, you know...

You were the one with the bug in the code. I mean, do you have memories of some particularly challenging things? One of the cool things is, this isn't a product, this isn't a thing, so you have a little bit more freedom to experiment, you can take risks, and you can make mistakes. So are there interesting mistakes, interesting challenges that stand out to you, something that taught you a good technical lesson or a good philosophical lesson from that time?

Yeah.

You know, definitely a very memorable time. Not really a challenge, but one of the most vivid memories I have from that time, and I think it was actually one of the days that really got me hooked on this whole field, was the first time I got to run my software on the car. I was working on a part of our planning algorithm that had to navigate in parking lots, something called free space motion planning. The very first version of that we tried on the car on Stanford's campus in the middle of the night, with this little course constructed with cones in the middle of a parking lot. So we were there at like 3 a.m. by the time we got the code to compile and turn over, and it drove. It actually did something quite reasonable. It was, of course, very buggy at the time and had all kinds of problems, but it was pretty darn magical. I remember going back later that night and trying to fall asleep, and just being unable to fall asleep for the rest of the night. My mind was blown. And that's what I've been doing ever since, for more than a decade. In terms of challenges and interesting memories, on the day of the competition it was pretty nerve-wracking.

I remember standing there with Mike Montemerlo, who was the software lead and wrote most of the code. I think I did one little part of the planner; Mike, incredibly, wrote pretty much all of the rest of it, with a bunch of other incredible people. But I remember standing there on the day of the competition, watching the cars with Mike, and the cars are completely empty, right? They're all lined up at the beginning of the race, and then DARPA sends them on their missions one by one. So they leave, and they had these sirens, they all had their different sirens, right? Each siren had its own personality, if you will.

So off they go and you don't see them. And then every once in a while they come a little bit closer to where the audience is, and you can kind of hear the sound of your car, and it seems to be moving along, so that gives you hope. And then it goes away, and when you can't hear it for too long, you start getting anxious, right? It's a little bit like sending your kids to college: you've invested in them, you hope you built them properly, but it's still anxiety-inducing.

So that was an incredibly fun few days. In terms of bugs, as you mentioned, whether it was my bug that cost us first place is still a debate that I occasionally have with people on the CMU team. CMU came in first, I should mention, in case you haven't heard of them. It's a small school, and it's really a glitch that they happened to succeed at something robotics-related.

Very scenic though. So most people go there for the scenery.

Yeah, it's a beautiful campus. Unlike Stanford.

Yeah, that's true. Unlike Stanford. For people who don't know, CMU, Carnegie Mellon University, is one of the great robotics and artificial intelligence universities in the world. Okay, sorry, go ahead.

Good PSA. So in the part that I contributed to, which was navigating parking lots, the way that part of the mission worked is that in a parking lot you would get from DARPA an outline of the map. You basically got this giant polygon that defined the perimeter of the parking lot, and there would be an entrance, maybe multiple entrances or accesses to it, and then you would get a goal within that open space, an X, Y, and heading where the car had to park, and you had no information about the obstacles that the car might encounter there. So it had to navigate completely free space from the entrance to the parking lot into that parking spot, and then, once parked there, it had to exit the parking lot, while of course accounting for and reasoning about all the obstacles that it encounters in real time.
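To make the term concrete for readers who haven't seen it, here is a minimal, illustrative sketch of free space motion planning in its simplest form: searching for a path across an open region from an entrance to a goal, given only the obstacles observed so far. This is plain A* over a 2D grid and ignores the vehicle's heading and kinematics; it is not the Stanford team's or Waymo's actual planner, just a toy version of the idea.

```python
# Toy sketch of free space motion planning: find a path across an open
# parking-lot region from an entrance cell to a goal cell, given only the
# obstacles observed so far. Plain A* on a 2D grid; the real planners of
# that era searched over (x, y, heading) with vehicle kinematics.
import heapq

def astar(start, goal, is_free, width, height):
    def h(c):  # admissible heuristic: Manhattan distance to the goal
        return abs(c[0] - goal[0]) + abs(c[1] - goal[1])

    frontier = [(h(start), 0, start, None)]   # entries are (f, g, cell, parent)
    parents, best_g = {}, {start: 0}
    while frontier:
        _, g, cell, parent = heapq.heappop(frontier)
        if cell in parents:                    # already expanded
            continue
        parents[cell] = parent
        if cell == goal:                       # walk back to the entrance
            path = [cell]
            while parents[path[-1]] is not None:
                path.append(parents[path[-1]])
            return list(reversed(path))
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nxt[0] < width and 0 <= nxt[1] < height and is_free(*nxt):
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    heapq.heappush(frontier, (ng + h(nxt), ng, nxt, cell))
    return None                                # blocked by the known obstacles

# A 20x10 lot with a row of cones discovered along the way; in practice you
# would replan whenever the sensed obstacle set changes.
cones = {(x, 5) for x in range(3, 17)}
path = astar((0, 0), (19, 9), lambda x, y: (x, y) not in cones, 20, 10)
```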

So our interpretation, or at least my interpretation, of the rules was that you had to reverse out of the parking spot, and that's what our car did, even if there was no obstacle in front. That's not what CMU's car did; it just kind of drove right through. And of course, as you stop and then reverse out and go out a different way, that costs you some time. So there's still a debate whether it was my poor implementation that cost us extra time or whether it was CMU violating an important rule of the competition, and I have my own opinion here. In terms of other bugs, I have to apologize to Mike Montemerlo for sharing this on air, but it is actually one of the more memorable ones, and it's something that has become a bit of a metaphor and a label in the industry since then, at least in some circles. It's called the victory circle, or victory lap. And our car did that.

So in one of the missions in the Urban Challenge, on one of the courses, there was this big oval right by the start and finish of the race. DARPA had a lot of the missions finish in that same location, and it was pretty cool because you could see the cars come by, finish that leg of the mission, and then go on and finish the rest of it. Other vehicles would come, hit their waypoint, exit the oval, and off they would go. Our car, on the other hand, would hit the checkpoint, then do an extra lap around the oval, and only then leave and go on its merry way. So over the course of the full day it accumulated some extra time. The problem was that we had a bug where it wouldn't start reasoning about the next waypoint and planning a route to get to that next point until it hit the previous one. And in that particular case, by the time it hit that one, it was too late to consider the next one and make the lane change, so every time we would do an extra lap. So, you know, that's the Stanford victory lap.

The victory lap.

Oh, that’s there’s, I feel like there’s something philosophically

profound in there somehow, but, uh, I mean, ultimately everybody is

a winner in that kind of competition.

And it led to sort of famously to the creation of, um, Google self driving

car project and now Waymo.

So can we, uh, give an overview of how is Waymo born?

How’s the Google self driving car project born?

What’s the, what is the mission?

What is the hope?

What is it is the engineering kind of, uh, set of milestones that

it seeks to accomplish, there’s a lot of questions in there.

Uh, yeah, uh, I don’t know, kind of the DARPA urban challenge and the DARPA

and previous DARPA grand challenges, uh, kind of led, I think to a very large

degree to that next step and then, you know, Larry and Sergey, um, uh, Larry

Page and Sergey Brin, uh, uh, Google founders course, uh, I saw that

competition and believed in the technology.

So, you know, the Google self driving car project was born, you know, at that time.

And we started in 2009, it was a pretty small group of us, about a dozen people,

uh, who came together, uh, to, to work on this project at Google.

At that time we saw an incredible early result in the DARPA urban challenge.

I think we’re all incredibly excited, uh, about where we got to and we believed

in the future of the technology, but we still had a very, you know,

very, you know, rudimentary understanding of the problem space.

So the first goal of this project in 2009 was to really better

understand what we’re up against.

Uh, and, you know, with that goal in mind, when we started the project, we created a

few milestones for ourselves, uh, that.

Maximized learnings.

The two milestones were: one was to drive a hundred thousand miles in autonomous mode, which was at that time orders of magnitude more than anybody had ever done. And the second milestone was to drive ten routes. Each one was a hundred miles long, and they were specifically chosen to be extra spicy, extra complicated, and to sample the full complexity of that domain. And you had to drive each one from beginning to end with no intervention, no human intervention. So you would get to the beginning of the course, you would press the button that would engage autonomy, and you had to go for a hundred miles, beginning to end, with no interventions. It sampled, again, the full complexity of driving conditions. Some were on freeways; we had one route that went through all the freeways and all the bridges in the Bay Area. We had some that went around Lake Tahoe and mountain roads. We had some that drove through dense urban environments, like downtown Palo Alto and through San Francisco. So it was incredibly interesting to work on, and it took us just under two years, about a year and a half, a little bit more, to finish both of these milestones.

And in that process, it was an incredible amount of fun, probably the most fun I've had in my professional career, and you're just learning so much. The goal here is to learn and prototype; you're not yet starting to build a production system. So this is when you're working 24/7 and you're hacking things together.

And you also don’t know how hard this is.

I mean, that’s the point.

Like, so, I mean, that’s an ambitious, if I put myself in that mindset, even

still, that’s a really ambitious set of goals.

Like just those two picking, picking 10 different, difficult, spicy challenges.

And then having zero interventions.

So like not saying gradually we’re going to like, you know, over a period of 10

years, we’re going to have a bunch of routes and gradually reduce the number

of interventions, you know, that literally says like, by as soon as

possible, we want to have zero and on hard roads.

So like, to me, if I was facing that, it’s unclear that whether that takes

two years or whether that takes 20 years.

I mean, it took us under two. I guess that speaks to a really big difference between doing something once, having a prototype where you are going after learning about the problem, versus how you go about engineering a product where you properly do evaluation, you look at metrics, you drive them down, and you're confident that you can do that. And I guess that's why it took a dozen people sixteen months, or a little bit more than that, back in 2009 and 2010, with the technology of more than a decade ago, to achieve that milestone of ten routes, a hundred miles each, and no interventions, and why it took us a little bit longer to get to a full driverless product that customers use.

That’s another really important moment.

Is there some memories of technical lessons or just one, like, what did you

learn about the problem of driving from that experience?

I mean, we can, we can now talk about like what you learned from modern day

Waymo, but I feel like you may have learned some profound things in those

early days, even more so because it feels like what Waymo is now is to trying

to, you know, how to do scale, how to make sure you create a product, how to

make sure it’s like safety and all those things, which is all fascinating

challenges, but like you were facing the more fundamental philosophical

problem of driving in those early days.

Like what the hell is driving as an autonomous, or maybe I’m again

romanticizing it, but is it, is there, is there some valuable lessons you

picked up over there at those two years?

A ton.

The most important one is probably that we came to believe it's doable. We had gotten far enough into the problem, and we had, I think, only a glimpse of the true complexity of the domain. It's a little bit like climbing a mountain, where you see the next peak and you think that's the summit, but then you get to it and you see that this is just the start of the journey. But we had sampled enough of the problem space and made enough rapid progress, even with the technology of 2009 and 2010, that it gave us confidence to then pursue this as a real product.

So, okay.

So the next step: you mentioned the milestones that you had in those two years. What are the next milestones that then led to the creation of Waymo and beyond?

Yeah, it was a really interesting journey, and Waymo came a little bit later. We completed those milestones in 2010. That was the pivot, when we decided to focus on actually building a product using this technology. The initial couple of years after that, we were focused on a freeway, what you would call a driver-assist, maybe an L3 driver-assist program. Then around 2013, we had learned enough about the space and thought more deeply about the product that we wanted to build, and we pivoted towards this vision of building a driver and deploying fully driverless vehicles without a person. That's the path that we've been on since then, and it was exactly the right decision for us.

So there was a moment where you were also considering, what is the right trajectory here? What is the right role of automation in the task of driving? It wasn't obvious from the early days that you want to go fully autonomous?

From the early days, it was not. I think it was around 2013, maybe, that it became very clear, and we made that pivot. It also became very clear that the way you go about building a driver-assist system is fundamentally different from how you go about building a fully driverless vehicle. So we pivoted towards the latter, and that's what we've been working on ever since.

So that was around 2013. Then there has been a sequence of really meaningful, really important, defining milestones since then. In 2015, we had our first, actually the world's first, fully driverless ride on public roads. It was in a custom-built vehicle that we had. You must've seen those; we called them the Firefly, that funny-looking, marshmallow-looking thing. And we put a passenger, his name was Steve Mahan, a great friend of our project from the early days, the man happens to be blind, we put him in that vehicle. The car had no steering wheel, no pedals. It was an uncontrolled environment: no lead or chase cars, no police escorts. And we did that trip a few times in Austin, Texas. So that was a really big milestone.

But that was in Austin.

Yeah.

Okay.

And, you know, it took a tremendous amount of engineering and a tremendous amount of validation to get to that point, but we only did it a few times. It was a fixed route. It was not a controlled environment, but it was a fixed route, and we only did it a few times.

Then at the end of 2016, beginning of 2017, is when we founded Waymo, the company. That was the next phase of the project, where we believed in the commercial vision of this technology, and it made sense to create an independent entity within that Alphabet umbrella to pursue this product at scale.

Beyond that, later in 2017 was another really huge step for us, a really big milestone. I think it was October of 2017 when we started regular driverless operations on public roads. On that first day of operations, we drove a hundred miles in driverless fashion in one day. But the most important thing about that milestone was not the hundred miles in one day; it was that it was the start of regular, ongoing driverless operations.

And when you say driverless, it means no driver.

That's exactly right. So on that first day, we actually had a mix. In some of them, we didn't want to be on YouTube and Twitter that same day, so in many of the rides we had somebody in the driver's seat, but they could not disengage the car. But on that first day, some of the miles were driven with a completely empty driver's seat.

And this is the key distinction that I think people don't realize: oftentimes when you talk about autonomous vehicles, there's a driver in the seat that's ready to take over, what's called a safety driver. Waymo is really one of the only companies, at least that I'm aware of, or at least that is as bold and careful about it, that actually has cases, and we'll talk about it more and more, where there's literally no driver. The case where the driver is not supposed to disengage is an interesting middle ground: they're still there, but they're not supposed to disengage. But really there's the case when there's nobody, and there's something magical about there being nobody in the driver's seat.

Just like, to me, you mentioned the first time you wrote some code for free space navigation of the parking lot, that was a magical moment. To me, just as an observer of robots, the first magical moment is seeing an autonomous vehicle turn, like make a left turn, apply sufficient torque to the steering wheel to where there's a lot of rotation, and there's nobody in the driver's seat. For some reason that communicates that here's a being with power that makes a decision. There's something about the steering wheel, because we perhaps romanticize the notion of the steering wheel; it's so essential to our 20th-century conception of a car, and it turning with nobody in the driver's seat, to me, and I think maybe to others, is really powerful. Like, this thing is in control, and then there's this leap of trust that you give: I'm going to put my life in the hands of this thing that's in control. So in that sense, when there's no driver in the driver's seat, that's a magical moment for robots.

So I’m, I’ve gotten a chance to last year to take a ride in a, in a

way more vehicle and that, that was the magical moment. There’s like nobody in

the driver’s seat. It’s, it’s like the little details. You would think it

doesn’t matter whether there’s a driver or not, but like if there’s no driver

and the steering wheel is turning on its own, I don’t know. That’s magical.

It’s absolutely magical. I, I have taken many of these rides and like completely

empty car, no human in the car pulls up, you know, you call it on your cell phone.

It pulls up, you get in, it takes you on its way. There’s nobody in the car, but

you, right? That’s something called, you know, fully driverless, you know, our

writer only mode of operation. Yeah. It, it is magical. It is, you know,

transformative. This is what we hear from our writers. It kind of really

changes your experience. And not like that, that really is what unlocks the

real potential of this technology. But, you know, coming back to our journey,

you know, that was 2017 when we started, you know, truly driverless operations.

Then in 2018, we’ve launched our public commercial service that we called

Waymo One in Phoenix. In 2019, we started offering truly driverless writer

only rides to our early rider population of users. And then, you know, 2020 has

also been a pretty interesting year. One of the first ones, less about

technology, but more about the maturing and the growth of Waymo as a company.

We raised our first round of external financing this year, you know, we were

part of Alphabet. So obviously we have access to, you know, significant resources

but as kind of on the journey of Waymo maturing as a company, it made sense

for us to, you know, partially go externally in this round. So, you know,

we’re raised about $3.2 billion from that round. We’ve also started putting

our fifth generation of our driver, our hardware, that is on the new vehicle,

but it’s also a qualitatively different set of self driving hardware.

That is now on the JLR pace. So that was a very important step for us.

Hardware specs, fifth generation. I think it'd be fun, and I apologize if I'm interrupting, to talk about the generations, with a focus on the fifth generation in terms of hardware specs, like what's on this car.

Sure. So we separate out the actual car that we are driving from the self-driving hardware we put on it. This is, as I mentioned, the fifth generation. We started building our own hardware many, many years ago; that Firefly vehicle also had a hardware suite that was mostly designed, engineered, and built in-house. Lidars are one of the more important components that we design and build from the ground up. On the fifth generation of our driver, of our self-driving hardware, that we're switching to right now, we have, as with previous generations, in terms of sensing, lidars, cameras, and radars, and we have a pretty beefy computer that processes all that information and makes decisions in real time on board the car. It's really a qualitative jump forward in terms of the capabilities and the various parameters and specs of the hardware, compared to what we had before and compared to what you can get off the shelf in the market today.

Meaning from fifth to fourth or from fifth to first?

Definitely from first to fifth, but also from the fourth.

That was the world’s dumbest question.

Definitely from fourth to fifth as well; that last step is a big step forward.

So everything’s in house. So like LIDAR is built in house and cameras are

built in house?

You know, it’s different. We work with partners and there’s some components

that we get from our manufacturing and supply chain partners. What exactly

is in house is a bit different. We do a lot of custom design on all of

our sensing modalities, lighters, radars, cameras, you know, exactly.

There’s lighters are almost exclusively in house and some of the

technologies that we have, some of the fundamental technologies there

are completely unique to Waymo. That is also largely true about radars

and cameras. It’s a little bit more of a mix in terms of what we do

ourselves versus what we get from partners.

Is there something super sexy about the computer that you can mention that's not top secret? For people who enjoy computers: I mean, there's a lot of machine learning involved, but there's a lot of just basic compute. You probably have to do a lot of signal processing on all the different sensors, you have to integrate everything, it has to be in real time, and there's probably some kind of redundancy type of situation. Is there something interesting you can say about the computer, for the people who love hardware?

It does have all of the characteristics, all the properties that you just mentioned.

Redundancy, very beefy compute for general processing as well as inference on ML models. Some of it is the more sensitive stuff that I don't want to get into for IP reasons, but I can share a little bit in terms of the specs of the sensors that we have on the car. We've actually shared some videos of what our lidars see in the world. We have 29 cameras, we have five lidars, and we have six radars on these vehicles, and you can get a feel for the amount of data that they're producing, all of which has to be processed in real time to do perception and to do complex reasoning. That kind of gives you some idea of how beefy those computers are, but I don't want to get into the specifics of exactly how we build them.

Okay, well, let me try some more questions that you can get

into the specifics of. Like, GPU-wise, is that something you can get into? I know that Google works with GPUs and so on. I mean, for machine learning folks it's kind of interesting. Or is there no... how do I ask it? I've been talking to people in the government about UFOs and they don't answer any questions, so this is how I feel right now asking about GPUs. But is there something interesting that you could reveal, or do you leave it up to our imagination, some of the compute? Is there any fun trickery? Like, I talked to Chris Lattner for a second time, and he was a key person on TPUs, and there's a lot of fun stuff going on at Google in terms of hardware that optimizes for machine learning. Is there something you can reveal in terms of how much, you mentioned customization, how much customization there is in the hardware for machine learning purposes?

I'm going to be like that government person being asked about UFOs. I guess I will say that it's really... compute is really important. We have very data-hungry

and compute-hungry ML models all over our stack. And this is where both being part of Alphabet, as well as designing our own sensors and the entire hardware suite together, really helps: on one hand you get access to really rich raw sensor data that you can pipe from your sensors into your compute platform, and build the whole pipeline from raw sensor data to the massive compute that processes all that data. And this is where we're finding that having a lot of control over that hardware part of the stack is really

advantageous.

One of the fascinating, magical places to me, again, you might not be able to speak to the details, but is the other compute: we're just talking about a single car, but the driving experience is a source of a lot of fascinating data. You have a huge amount of data coming in on the car, and there's the infrastructure of storing some of that data to then train on, or to analyze, and so on. That's a fascinating piece of it. I understand a single car; I don't understand how you pull it all together in a nice way.

Is that something that you could speak to, in terms of the challenges of seeing the network of cars, bringing the data back, analyzing the edge cases of driving, being able to learn on them to improve the system, to see where things went wrong and where things went right, and analyzing all that kind of stuff? Is there something interesting there from an engineering perspective?

Oh, there’s an incredible amount of really interesting

work that’s happening there, both in the real time operation

of the fleet of cars and the information that they exchange

with each other in real time to make better decisions as well

as on the kind of the off board component where you have to

deal with massive amounts of data for training your ML

models, evaluating the ML models for simulating the entire

system and for evaluating your entire system. And this is

where being part of Alphabet has once again been tremendously

advantageous because we consume an incredible amount of

compute for ML infrastructure. We build a lot of custom

frameworks to get good at data mining, finding the

interesting edge cases for training and for evaluation of

the system for both training and evaluating some components

and your sub parts of the system and various ML models,

as well as the evaluating the entire system and simulation.

Okay. That first piece that you mentioned, cars communicating with each other, essentially, perhaps through a centralized point, that's fascinating too. How much does that help you? If you imagine, right now the number of Waymo vehicles is whatever X, I don't know if you can talk to what that number is, but it's not in the hundreds of millions yet. And imagine if the whole world is Waymo vehicles; that changes potentially the power of connectivity. The more cars you have, I guess, actually, if you look at Phoenix, because there's enough vehicles, when there's some level of density, you can start to probably do some really interesting stuff with the fact that cars can negotiate, can communicate with each other and thereby make decisions. Is there something interesting there that you can talk about, like how does that help with the driving problem, as compared to just a single car solving the driving problem by itself?

Yeah, it’s a spectrum. I first and say that, you know, it’s,

it helps and it helps in various ways, but it’s not required

right now with the way we build our system, like each cars can

operate independently. They can operate with no connectivity.

So I think it is important that, you know, you have a fully

autonomous, fully capable driver that, you know, computerized

driver that each car has. Then, you know, they do share

information and they share information in real time. It

really, really helps. So the way we do this today is, you know,

whenever one car encounters something interesting in the

world, whether it might be an accident or a new construction

zone, that information immediately gets, you know,

uploaded over the air and it’s propagated to the rest of the

fleet. So, and that’s kind of how we think about maps as

priors in terms of the knowledge of our drivers, of our fleet of

drivers that is distributed across the fleet and it’s

updated in real time. So that’s one use case. And
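Conceptually, and only conceptually, since Waymo's actual infrastructure is not described in the conversation, "maps as priors updated across the fleet" can be pictured as a versioned log of map updates that every car syncs against. All names below are hypothetical:

```python
# Hypothetical sketch of the idea, not Waymo's system: one car publishes a
# map-relevant observation, and other cars merge it into their local copy
# of the shared prior whenever they sync.
from dataclasses import dataclass, field
import time

@dataclass
class MapUpdate:
    kind: str         # e.g. "construction_zone", "accident"
    region: tuple     # rough bounding box (min_x, min_y, max_x, max_y)
    timestamp: float = field(default_factory=time.time)

class FleetMapService:
    """Central log of accepted map updates, each tagged with a version."""
    def __init__(self):
        self.version = 0
        self.log = []

    def publish(self, update: MapUpdate):
        self.version += 1
        self.log.append((self.version, update))

    def updates_since(self, version: int):
        return [(v, u) for v, u in self.log if v > version]

class CarMapClient:
    """Each car keeps its own copy of the prior and syncs when connected."""
    def __init__(self):
        self.version = 0
        self.local_prior = []

    def sync(self, service: FleetMapService):
        for version, update in service.updates_since(self.version):
            self.local_prior.append(update)
            self.version = version

# One car reports a new construction zone; another car picks it up on sync.
service = FleetMapService()
service.publish(MapUpdate("construction_zone", (10.0, 10.0, 60.0, 40.0)))
other_car = CarMapClient()
other_car.sync(service)
```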

And you can imagine, as the density of these vehicles goes up, that they can exchange more information in terms of what they're planning to do and start influencing how they interact with each other, as well as potentially sharing some observations. If you have enough density of these vehicles, one car might be seeing something that is relevant to another car and that is very dynamic, you know, not part of updating your static prior of the map of the world, but more dynamic information that could be relevant to the decisions another car is making in real time. So you can see them exchanging that information, and you can build on that. But again, I see that as an advantage, not a requirement.

So what about the human in the loop? So when I

got a chance to take a ride in a Waymo, you know, there's customer service, so there is somebody that's able to dynamically tune in and help you out. What role does the human play in that picture? That's fascinating, the idea of teleoperation, being able to remotely control a vehicle. Here, what we're talking about is a human being able to, in a frictionless way, sort of help you out. I don't know if they're able to actually control the vehicle. Is that something you could talk to?

Yes. Okay, to be clear, we don't do teleoperation. I don't really believe in teleoperation, for various reasons. That's not what we have in our cars. We do, as you mentioned, have a

version of customer support. We call it live help. In fact, we find that it's very important for the ride experience, especially if it's your first trip and you've never been in a fully driverless, rider-only Waymo vehicle. You get in, there's nobody there, and you can imagine having all kinds of questions in your head about how this thing works. So we've put a lot of thought into guiding our riders, our customers, through that experience, especially for the first time. They get some information on the phone if a fully driverless vehicle is used to service their trip. When they get into the car, we have an in-car screen and audio that guide them and explain what to expect. They also have a button that they can push that will connect them to a real-life human being that they can talk to about this whole process. So that's one aspect of it.

I should mention that there is another function that humans provide to our cars, but it's not teleoperation. You can think of it a little bit more like fleet assistance, kind of like the traffic control that you have, where our cars, again, are responsible on their own for making all of the driving decisions that don't require connectivity. Anything that is safety- or latency-critical is done purely autonomously by our onboard system.

But there are situations where, if connectivity is available, when a car encounters a particularly challenging situation, you can imagine a super hairy scene of an accident, the cars will do their best. They will recognize that it's an off-nominal situation and do their best to come up with the right interpretation, the best course of action in that scenario. But if connectivity is available, they can ask for confirmation from a human assistant to kind of confirm those actions and perhaps provide a little bit of contextual information and guidance.
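A rough sketch of that fleet-assistance pattern, with every name hypothetical and no claim about Waymo's real interfaces: the car always computes its own plan onboard, and only asks a remote human to confirm, never to drive, when the scene is off-nominal and connectivity happens to be available.

```python
# Illustrative only; all names are hypothetical, not Waymo's API. The car
# decides on its own; a remote assistant can confirm the proposed action or
# add context when the situation is off-nominal and a connection exists.
def choose_action(scene, onboard_planner, fleet_assistant=None):
    proposal = onboard_planner.best_action(scene)   # always computed onboard
    if scene.is_off_nominal and fleet_assistant is not None:
        try:
            # Not latency-critical: the human confirms the proposed course of
            # action or supplies contextual guidance; the car still drives.
            proposal = fleet_assistant.confirm(scene, proposal)
        except ConnectionError:
            pass                                    # fall back to the onboard plan
    return proposal
```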

So October 8th, what you're talking about, was when Waymo launched the public version of its fully driverless, that's the right term, I think, service in Phoenix. Is that October 8th?

That's right. It was the

introduction of fully driverless, rider-only vehicles into our public Waymo One service.

Okay, so that's amazing. So it's like anybody can get a Waymo in Phoenix?

We previously had people in our early rider program taking fully driverless rides in Phoenix.

And just a little while ago, on October 8th, we opened that mode of operation to the public.

So I

can download the app and go on a ride.

There's a lot more demand right now for that service than we have capacity, so we're kind of managing that. But that's

exactly the way to describe it. Yeah, that’s interesting. So

there’s more demand than you can handle. Like what has been

reception so far? I mean, okay, so this is a product,

right? That’s a whole nother discussion of like how

compelling of a product it is. Great. But it’s also like one

of the most kind of transformational technologies of

the 21st century. So it’s also like a tourist attraction.

Like it’s fun to, you know, to be a part of it. So it’d be

interesting to see like, what do people say? What do people,

what have been the feedback so far? You know, still early

days, but so far, the feedback has been incredible, incredibly

positive. They, you know, we asked them for feedback during

the ride, we ask them for feedback after the ride as part of their trip, we ask them some questions, and we ask them to rate the performance of our driver. By far, most of our riders give us five stars in our app, which is absolutely great to see, and they're also giving us feedback on things we can improve. That's one of the main reasons we're doing this in Phoenix. Over the last couple of years, and every day today, we are just learning a tremendous amount of new stuff from our users. There's no substitute for actually doing the real thing, actually having a fully driverless product out there in the field with users that are actually paying us money to get from point A to point B.

So this is legitimate, like,

there’s a paid service. That’s right. And the idea is you use

the app to go from point A to point B. And then what what are

the A’s? What are the what’s the freedom of the of the starting

and ending places? It’s an area of geography where that

service is enabled. It’s a decent size of geography of

territory. It’s actually larger than the size of San Francisco.

And you know, within that, you have full freedom of, you know,

selecting where you want to go. You know, of course, there’s

some and you on your app, you get a map, you tell the car

where you want to be picked up, where you want the car to pull

over and pick you up. And then you tell it where you want to

be dropped off. All right. And of course, there are some

exclusions, right? You want to be you know, you were in terms

of where the car is allowed to pull over, right? So that you

can do. But you know, besides that, it’s amazing. It’s not

like a fixed just would be very I guess. I don’t know. Maybe

that’s what’s the question behind your question. But it’s

not a, you know, preset set of yes, I guess. So within the

geographic constraints with that within that area anywhere

else, it can be you can be picked up and dropped off

anywhere. That’s right. And you know, people use them on like

all kinds of trips. We have an incredible spectrum of riders. We actually have car seats in them, and we have people taking their kids on rides; I think the youngest riders we've had in the cars are one or two years old. And the full spectrum of use cases: people take them to school, to go grocery shopping, to restaurants, to bars, to run errands, to go shopping, et cetera, et cetera. You can go to your office, right? The full spectrum of use cases, and people use them in their daily lives to get around. We see all kinds of really interesting use cases, and that's providing us incredibly valuable experience that we then use to improve our product.

So, as somebody who's done a few long rants

with Joe Rogan and others about the toxicity of the internet

and the comments and the negativity in the comments, I’m

fascinated by feedback. I believe that most people are

good and kind and intelligent and can provide, like, even in

disagreement, really fascinating ideas. So on a product

side, it’s fascinating to me, like, how do you get the richest

possible user feedback, like, to improve? What’s, what are the

channels that you use to measure? Because one of the magical things about autonomous vehicles is that it's a frictionless interaction with the human; you don't get to, you know, it's just giving a ride. So how do you get feedback from people in order to improve?

Yeah, great question, various mechanisms. So as part of the

normal flow, we ask people for feedback, they as the car is

driving around, we have on the phone and in the car, and we

have a touchscreen in the car, you can actually click some

buttons and provide real time feedback on how the car is

doing, and how the car is handling a particular situation,

you know, both positive and negative. So that’s one

channel, we have, as we discussed, customer support or live help, where, you know, if a customer has a question or some sort of concern, they can talk to a

person in real time. So that that is another mechanism that

gives us feedback. At the end of a trip, you know, we also ask

them how things went, they give us comments, and you know, star

rating. And we also, you know, ask them to explain what went well and what could be improved. And we have our riders providing very rich feedback. A large fraction of them is very passionate,

very excited about this technology. So we get really

good feedback. We also run UXR studies, right, that are more specific and go more in depth. And we run both kind of lateral and longitudinal studies, where we have deeper engagement with our customers; we have our user experience research team tracking things over time. The longitudinal stuff is cool. That's exactly right. And you know, that's another really valuable source of feedback.

And we’re just covering a tremendous amount, right?

People go grocery shopping, and they like want to load, you

know, 20 bags of groceries in our cars, and that's one workflow that you maybe don't think about, you know, getting just right when you're building the driverless product. We have people, you know, who bike as part of

their trip. So they, you know, bike somewhere, then they get to our cars, they take apart their bike, they load it into our vehicle, and then go. And, you know, where we want to pull over and how that get-in and get-out process works provides very useful feedback in terms of what makes a good pickup and drop off location. We get

really valuable feedback. And in fact, we had to do some really

interesting work with high definition maps, and thinking

about walking directions. And if you imagine you’re in a store,

right in some giant space, and then you know, you want to be

picked up somewhere, like if you just drop a pin at a current

location, which is maybe in the middle of a shopping mall, like

what’s the best location for the car to come pick you up? And

you can have simple heuristics where you're just going to take, you know, the Euclidean distance and find the nearest

spot where the car can pull over that’s closest to you. But

oftentimes, that’s not the most convenient one. You know, I have

many anecdotes where that heuristic breaks in horrible

ways. One example that I often mentioned is somebody wanted to

be, you know, dropped off in Phoenix. And you know, the car picked a location that was the closest to where the pin was dropped on the map in terms of,

you know, latitude and longitude. But it happened to be

on the other side of a parking lot that had this row of

cacti. And the poor person had to like walk all around the

parking lot to get to where they wanted to be in 110 degree

heat. So that, you know, that was bad. So then we take all of that feedback from our users and incorporate it into our system and improve it.
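To make the pickup-spot heuristic concrete, here is a minimal Python sketch. The spot names, coordinates, and walking distances are invented for illustration, and this is not Waymo's actual logic; it just shows how picking by straight-line (Euclidean) distance can lose badly to picking by walking distance.

```python
import math

# Hypothetical candidate pull-over spots near a dropped pin: straight-line
# position in meters plus a precomputed walking distance (e.g., from a
# pedestrian routing graph). All numbers are made up for illustration.
pin = (0.0, 0.0)
spots = [
    {"name": "far side of the parking lot", "xy": (40.0, 0.0), "walk_m": 350.0},  # row of cacti in between
    {"name": "mall entrance curb",          "xy": (90.0, 20.0), "walk_m": 110.0},
]

def nearest_by_straight_line(pin, spots):
    """Naive heuristic: closest spot as the crow flies."""
    return min(spots, key=lambda s: math.dist(pin, s["xy"]))

def nearest_by_walking(spots):
    """Better heuristic: shortest actual walking distance."""
    return min(spots, key=lambda s: s["walk_m"])

print(nearest_by_straight_line(pin, spots)["name"])  # far side of the parking lot
print(nearest_by_walking(spots)["name"])             # mall entrance curb
```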

feel like that’s like requires AGI to solve the problem of

like, when you’re, which is a very common case, when you’re in

a big space of some kind, like apartment building, it doesn’t

matter, it’s some large space. And then you call the, like a

Waymo from there, right? Like, whatever, it doesn’t matter,

ride share vehicle. And like, where’s the pin supposed to

drop? I feel like that’s, you don’t think, I think that

requires AGI. I’m gonna, in order to solve. Okay, the

alternative, which I think the Google search engine has taught us,

is like, there’s something really valuable about the

perhaps slightly dumb answer, but a really powerful one,

which is like, what was done in the past by others? Like, what

was the choice made by others? That seems to be like in terms

of Google search, when you have like billions of searches, you

could, you could see which, like when they recommend what you

might possibly mean, they suggest based on not some machine

learning thing, which they also do, but like, on what was

successful for others in the past and finding a thing that

they were happy with. Is that integrated at all? Waymo, like

what, what pickups worked for others? It is. I think you’re

exactly right. So there’s a real, it’s an interesting

problem. Naive solutions have interesting failure modes. So

there’s definitely lots of things that can be done to

improve, both learning from, you know, what works and what doesn't work in the actual field, and from getting richer data and

getting more information about the environment and richer

maps. But you’re absolutely right, that there’s something

like there’s some properties of solutions that in terms of the

effect that they have on users so much, much, much better than

others, right? And predictability and

understandability is important. So you can have maybe

something that is not quite as optimal, but is very natural

and predictable to the user and kind of works the same way all

the time. And that matters, that matters a lot for the user

experience. And but you know, to get to the basics, the pretty

fundamental property is that the car actually arrives where you

told it to, right? Like, you can always, you know, change it,

see it on the map, and you can move it around if you don’t

like it. And but like, that property that the car actually

shows up reliably is critical, which, you know, where compared

to some of the human driven analogs, I think, you know, you

can have more predictability. Actually, if I take a little bit of a detour here, I think the fact that it's, you know, your phone and the car, two computers talking

to each other, can lead to some really interesting things we

can do in terms of the user interfaces, both in terms of

function, like the car actually shows up exactly where you told

it, you want it to be, but also some, you know, really

interesting things on the user interface, like as the car is

driving, as you call it, and it’s on the way to come pick

you up. And of course, you get the position of the car and the

route on the map. But and they actually follow that route, of

course. But it can also share some really interesting

information about what it’s doing. So, you know, our cars, as

they are coming to pick you up, if a car is coming up to a stop sign, it will actually show you that it's sitting there because it's at a stop sign, or at a traffic light it will show you that it's, you know, sitting at a red light. So, you know, they're like little

things, right? But I find those little touches really

interesting, really magical. And it’s just, you know, little

things like that, that you can do to kind of delight your

users. You know, this makes me think of, there’s some products

that I just love. Like, there’s a there’s a company called

Rev, Rev.com, where I like for this podcast, for example, I

can drag and drop a video. And then they do all the

captioning. It’s humans doing the captioning, but they

connect, they automate everything of connecting you to

the humans, and they do the captioning and transcription.

It’s all effortless. And it like, I remember when I first

started using them, I was like, life’s good. Like, because it

was so painful to figure that out earlier. The same thing

with something called iZotope RX, this company I use for

cleaning up audio, like the sound cleanup they do. It’s

like drag and drop, and it just cleans everything up very

nicely. Another experience like that I had with Amazon

OneClick purchase, first time. I mean, other places do that

now, but just the effortlessness of purchasing,

making it frictionless. It kind of communicates to me, like,

I’m a fan of design. I’m a fan of products that you can just

create a really pleasant experience. The simplicity of

it, the elegance just makes you fall in love with it. So on

the, do you think about this kind of stuff? I mean, it’s

exactly what we’ve been talking about. It’s like the little

details that somehow make you fall in love with the product.

Is that, we went from like urban challenge days, where

love was not part of the conversation, probably. And to

this point where there’s a, where there’s human beings and

you want them to fall in love with the experience. Is that

something you’re trying to optimize for? Try to think

about, like, how do you create an experience that people love?

Absolutely. I think that’s the vision is removing any friction

or complexity from getting our users, our riders to where

they want to go. Making that as simple as possible. And then,

you know, beyond that, just transportation, making things

and goods get to their destination as seamlessly as

possible. You talked about a drag and drop experience where you kind of express your intent and then it just magically happens. And for our riders, that's what we're trying to get to: you download an app, and you click, and a car shows up. It's

the same car. It’s very predictable. It’s a safe and

high quality experience. And then it gets you in a very

reliable, very convenient, frictionless way to where you

want to be. And along the journey, I think we also want to

do little things to delight our users. Like the ride sharing

companies, because they don’t control the experience, I

think they can’t make people fall in love necessarily with

the experience. Or maybe they, they haven’t put in the effort,

but I think if I were to speak to the ride sharing experience

I currently have, it’s just very, it’s just very

convenient, but there’s a lot of room for like falling in love

with it. Like we can speak to sort of car companies, car

companies do this. Well, you can fall in love with a car,

right? And be like a loyal car person, like whatever. Like I

like badass hot rods, I guess, 69 Corvette. And at this point,

you know, you can’t really, cars are so, owning a car is so

20th century, man. But is there something about the Waymo

experience where you hope that people will fall in love with

it? Is that part of it? Or is it part of, is it just about

making a convenient ride, not ride sharing, I don’t know what

the right term is, but just a convenient A to B autonomous

transport or like, do you want them to fall in love with

Waymo? To maybe elaborate a little bit. I mean, almost like

from a business perspective, I’m curious, like how, do you

want to be in the background invisible or do you want to be

like a source of joy that’s in very much in the foreground? I

want to provide the best, most enjoyable transportation

solution. And that means building our product and building our service in a way that people use in a very seamless, frictionless way in their day to day lives. And I think that does mean, you know, in some way falling in love with that product, right, it just kind of becomes part of your routine. It comes down, in my mind, to safety,

predictability of the experience, and privacy aspects

of it, right? Our cars, you get the same car, you get very predictable behavior, and that is important if you're going to use it in your daily life. Privacy, when you're in a car, you can do other things. It's just another space where you're spending a significant part of your life.

And so not having to share it with other people who you don’t

want to share it with, I think is a very nice property. Maybe

you want to take a phone call or do something else in the

vehicle. And, you know, the safety and the quality of the driving,

as well as the physical safety of not having to share that

ride is important to a lot of people. What about the idea

that when there’s somebody like a human driving, and they do

a rolling stop on a stop sign, like sometimes like, you know,

you get an Uber or Lyft or whatever, like human driver,

and, you know, they can be a little bit aggressive as

drivers. It feels like not all aggression is bad. Now

that may be a wrong, again, 20th century conception of

driving. Maybe it’s possible to create a driving experience.

Like if you’re in the back, busy doing something, maybe

aggression is not a good thing. It’s a very different kind of

experience perhaps. But it feels like in order to navigate

this world, you need to, how do I phrase this? You need to kind

of bend the rules a little bit, or at least test the rules. I

don’t know what language politicians use to discuss this,

but whatever language they use, you like flirt with the rules.

I don’t know. But like you sort of have a bit of an aggressive

way of driving that asserts your presence in this world,

thereby making other vehicles and people respect your

presence and thereby allowing you to sort of navigate

through intersections in a timely fashion. I don’t know if

any of that made sense, but like, how does that fit into the

experience of driving autonomously? Is that?

It’s a lot of thoughts. This is you’re hitting on a very

important point of a number of behavioral components and, you

know, parameters that make your driving feel assertive and

natural and comfortable and predictable. Our cars will

follow rules, right? They will do the safest thing possible in

all situations. Let me be clear on that. But if you think of

really, really good drivers, just think about

professional limo drivers, right? They will follow the

rules. They’re very, very smooth, and yet they’re very

efficient. But they’re assertive. They’re comfortable

for the people in the vehicle. They’re predictable for the

other people outside the vehicle that they share the

environment with. And that’s the kind of driver that we want

to build. And you can think of it, maybe there's a sports analogy there, right? In very many sports, the true

professionals are very efficient in their movements,

right? They don’t do like, you know, hectic flailing, right?

They’re, you know, smooth and precise, right? And they get

the best results. So that’s the kind of driver that we want to

build. In terms of, you know, aggressiveness. Yeah, you can

like, you know, roll through the stop signs. You can do crazy

lane changes. It typically doesn’t get you to your

destination faster. It's typically not the safest or most predictable or most comfortable thing to do. But

there is a way to do both. And that’s what we’re

doing. We’re trying to build the driver that is safe,

comfortable, smooth, and predictable. Yeah, that’s a

really interesting distinction. I think in the early days of

autonomous vehicles, the vehicles felt cautious as

opposed to efficient. And that's probably still true for some, but when I rode in the Waymo, I mean, it was quite assertive. It moved pretty quickly. Yeah, that was one of the surprising feelings, that it actually went fast. And it didn't feel awkwardly cautious, like an autonomous vehicle. So I've also programmed autonomous vehicles, and everything I've ever built felt awkward, either overly aggressive, especially when it was my code, or awkwardly cautious is the way I would put it. And Waymo's vehicle felt assertive, and I think efficient is the right terminology here. And I also like the professional limo driver comparison,

because we often think like, you know, an Uber driver or a

bus driver or a taxi. The funny thing is, people think that taxi drivers are professionals. I mean, that's like saying I'm a professional walker just because I've been walking all my life. I think there's an art to it, right? And if you take

it seriously as an art form, then there’s a certain way that

mastery looks like. It’s interesting to think about what

does mastery look like in driving? And perhaps what we

associate with aggressiveness is unnecessary, like, it's not part of the experience of driving. It's unnecessary fluff. With efficiency, you can create a good driving experience within the rules.

That’s, I mean, you’re the first person to tell me this.

So it’s, it’s kind of interesting. I need to think

about this, but that’s exactly what it felt like with Waymo.

I kind of had this intuition. Maybe it’s the Russian thing.

I don’t know that you have to break the rules in life to get

anywhere, but maybe, maybe it’s possible that that’s not the

case in driving. I have to think about that, but it

certainly felt that way on the streets of Phoenix when I was

there in the Waymo, that that was a very pleasant experience, and it wasn't frustrating in that, like, come on, move already kind of feeling. That wasn't

there. Yeah. I mean, that’s what, that’s what we’re going

after. I don’t think you have to pick one. I think truly good

driving. It gives you both efficiency, a certainness, but

also comfort and predictability and safety. And, you know, it’s,

that’s what fundamental improvements in the core

capabilities truly unlock. And you can kind of think of it as,

you know, a precision and recall trade off. You have certain

capabilities of your model. And then it’s very easy when, you

know, you have some curve of precision and recall, you can

move things around and choose your operating point in your trade-off of precision versus recall, false positives

versus false negatives. Right. But then, and you know, you can

tune things on that curve and be kind of more cautious or more

aggressive, but then aggressive is bad or, you know, cautious is

bad, but true capabilities come from actually moving the whole

curve up. And then you are kind of on a very different plane of

those trade offs. And that's what we're trying to do here: to move the whole curve up.
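As a toy illustration of that precision/recall framing (the detector scores below are made up, and this is not Waymo's evaluation code): sweeping a threshold slides the operating point along one curve, trading false positives against false negatives, while a stronger model scores better at every threshold, which is the "moving the whole curve up" effect.

```python
# Made-up detection scores for the same 8 examples from a weaker and a stronger model.
labels       = [1, 1, 0, 1, 0, 0, 1, 0]
weak_model   = [0.6, 0.4, 0.5, 0.7, 0.3, 0.55, 0.45, 0.2]
strong_model = [0.9, 0.8, 0.3, 0.95, 0.2, 0.35, 0.85, 0.1]

def precision_recall(scores, labels, threshold):
    preds = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(preds, labels))
    fp = sum(p and not y for p, y in zip(preds, labels))
    fn = sum((not p) and y for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Sliding the operating point: lower threshold = more "aggressive" (higher recall),
# higher threshold = more "cautious" (higher precision). The stronger model is
# at least as good at every threshold, i.e., the whole curve moves up.
for t in (0.3, 0.5, 0.7):
    print(t, precision_recall(weak_model, labels, t), precision_recall(strong_model, labels, t))
```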

Before I forget, let's talk about trucks a little bit. So I also got a chance to check out

some of the Waymo trucks. I’m not sure if we want to go too

much into that space, but it’s a fascinating one. So maybe we

can mention at least briefly, you know, Waymo is also now

doing autonomous trucking and how different like

philosophically and technically is that whole space of

problems. It’s one of our two big products and you know,

commercial applications of our driver, right? Right. Hailing

and deliveries. You know, we have Waymo One and Waymo Via

moving people and moving goods. You know, trucking is an

example of moving goods. We've been working on trucking since 2017. It is a very interesting space. And your question of

how different is it? It has this really nice property that

the first order challenges, like the science, the hard

engineering, whether it’s, you know, hardware or, you know,

onboard software or off board software, all of the, you know,

systems that you build for, you know, training your ML models

for, you know, evaluating your entire system. Like those

fundamentals carry over. Like the true challenges of, you

know, driving perception, semantic understanding,

prediction, decision making, planning, evaluation, the

simulator, ML infrastructure, those carry over. Like the data

and the application and kind of the domains might be

different, but the most difficult problems, all of that

carries over between the domains. So that’s very nice.

So that’s how we approach it. We’re kind of build investing

in the core, the technical core. And then there’s

specialization of that core technology to different

product lines, to different commercial applications. So just to tease it apart a little bit on trucks. So starting with

the hardware, the configuration of the sensors is different.

They’re different physically, geometrically, you know, different

vehicles. So for example, we have two of our main lasers on the trucks, on both sides, so that we, you know, don't have the blind spots. Whereas on the JLR I-PACE, we have, you know, one of them sitting at the very top. But the actual sensors are almost the same, largely the same. So all of the

investment that over the years we’ve put into building our

custom LiDARs, custom radars, pulling the whole system

together, that carries over very nicely. Then, you know, on the

perception side, the like the fundamental challenges of

seeing, understanding the world, whether it’s, you know, object

detection, classification, you know, tracking, semantic

understanding, all that carries over. You know, yes, there’s

some specialization when you’re driving on freeways, you know,

range becomes more important. The domain is a little bit

different. But again, the fundamentals carry over very,

very nicely. Same as you get into prediction or decision making, right, the fundamentals of what it takes to predict what other people are going to do, to find the long tail, to improve your system in that long tail of behavior prediction and response, that carries over, right, and so on and

so on. So I mean, that’s pretty exciting. By the way, does

Waymo via include using the smaller vehicles for

transportation of goods? That’s an interesting distinction. So

I would say there’s three interesting modes of operation.

So one is moving humans, one is moving goods, and one is like

moving nothing, zero occupancy, meaning like you're going to the destination in an empty vehicle. I mean, the third is the least of it, and if that's the entirety of it, it's the less, you know, exciting one from the commercial perspective.

Well, I mean, in terms of like, if you think about what’s

inside a vehicle as it’s moving, because it does, you

know, some significant fraction of the vehicle’s movement has

to be empty. I mean, it’s kind of fascinating. Maybe just on

that small point, is there different control and like

policies that are applied for a zero occupancy vehicle? So a vehicle with nothing in it, or does it just move as if there is a person inside, maybe with some subtle differences?

As a first order approximation, there are no differences. And

if you think about, you know, safety and comfort and quality

of driving, only part of it has to do with the people or the

goods inside of the vehicle. You want to drive smoothly, as we discussed, not purely for the benefit of whatever you have inside the

car, right? It’s also for the benefit of the people outside

kind of fitting naturally and predictably into that whole

environment, right? So, you know, yes, there are some

second order things you can do, you can change your route, and

you optimize maybe kind of your fleet, things at the fleet

scale. And you would take into account whether some of your

you know, some of your cars are actually, you know, serving a

useful trip, whether with people or with goods, whereas, you

know, other cars are, you know, driving completely empty to that

next valuable trip that they’re going to provide. But that those

are mostly second order effects. Okay, cool. So Phoenix

is, is an incredible place. And what you’ve announced in

Phoenix is, it’s kind of amazing. But, you know, that’s

just like one city. How do you take over the world? I mean,

I’m asking for a friend. One step at a time.

Is that the cartoon, Pinky and the Brain? Yeah. Okay. But, you know, gradually is the true answer. So I think the heart of

your question is, can you ask a better question than I asked?

You’re asking a great question. Answer that one. I’m just

gonna, you know, phrase it in the terms that I want to

answer. Exactly right. Brilliant. Please. You know,

where are we today? And, you know, what happens next? And

what does it take to go beyond Phoenix? And what does it

take to get this technology to more places and more people

around the world, right? So our next big area of focus is

exactly that. Larger scale commercialization and just,

you know, scaling up. And, you know, Phoenix gives us that platform and gives us that foundation upon which we can build. And there are a few really challenging aspects of this

whole problem that you have to pull together in order to build

the technology in order to deploy it into the field to go

from a driverless car to a fleet of cars that are providing a

service, and then all the way to commercialization. So, and

then, you know, this is what we have in Phoenix. We’ve taken

the technology from a proof point to an actual deployment

and have taken our driver from, you know, one car to a fleet

that can provide a service. Beyond that, if I think about

what it will take to scale up and, you know, deploy in, you

know, more places with more customers, I tend to think about

three main dimensions, three main axes of scale. One is the

core technology, you know, the hardware and software core

capabilities of our driver. The second dimension is

evaluation and deployment. And the third one is the, you know,

product, commercial, and operational excellence. So I can talk a bit about where we are along, you know, each one of those three dimensions, where we are today and, you know, what will happen next. On, you know, the core

technology, you know, the hardware and software, you

know, together comprise a driver, we, you know, obviously

have that foundation that is providing fully driverless

trips to our customers as we speak, in fact. And we’ve

learned a tremendous amount from that. So now what we’re

doing is we are incorporating all those lessons into some

pretty fundamental improvements in our core technology, both on

the hardware side and on the software side to build a more

general, more robust solution that then will enable us to

massively scale beyond Phoenix. So on the hardware side, all of

those lessons are now incorporated into this fifth

generation hardware platform that is, you know, being

deployed right now. And that’s the platform, the fourth

generation, the thing that we have right now driving in

Phoenix, it’s good enough to operate fully driverlessly,

you know, night and day, you know, various speeds and

various conditions, but the fifth generation is the platform

upon which we want to go to massive scale. There, we've really made qualitative improvements in terms of the capability of the system, the simplicity of the architecture, the reliability, the redundancy. It is designed to be

manufacturable at very large scale and, you know, provides

the right unit economics. So that’s the next big step for us

on the hardware side. That’s already there for scale,

the version five. That’s right. And is that coincidence or

should we look into a conspiracy theory that it’s the

same version as the pixel phone? Is that what’s the

hardware? They neither confirm nor deny. All right, cool. So,

sorry. So that’s the, okay, that’s that axis. What else?

So similarly, you know, hardware is a very discrete jump, but, you know, similar to how we're making that change

from the fourth generation hardware to the fifth, we’re

making similar improvements on the software side to make it

more, you know, robust and more general and allow us to kind of

quickly scale beyond Phoenix. So that’s the first dimension of

core technology. The second dimension is evaluation and

deployment. How do you measure your system? How do you

evaluate it? How do you build a release and deployment process

where, you know, with confidence, you can, you know,

regularly release new versions of your driver into a fleet?

How do you get good at it so that it is not, you know, a huge tax on your researchers and engineers? How do you build all these, you know, processes, the

frameworks, the simulation, the evaluation, the data science,

the validation, so that, you know, people can focus on

improving the system and kind of the releases just go out the

door and get deployed across the fleet. So we’ve gotten really

good at that in Phoenix. That’s been a tremendously difficult

problem, but that’s what we have in Phoenix right now that gives

us that foundation. And now we’re working on kind of

incorporating all the lessons that we’ve learned to make it

more efficient, to go to new places, you know, and scale up

and just kind of, you know, stamp things out. So that’s that

second dimension of evaluation and deployment. And the third

dimension is product, commercial, and operational

excellence, right? And again, Phoenix there is providing an

incredibly valuable platform. You know, that’s why we’re doing

things end to end in Phoenix. We're learning, as we discussed a little earlier today, a tremendous amount of really valuable lessons from our users, getting really

incredible feedback. And we’ll continue to iterate on that and

incorporate all those lessons into making our product, you

know, even better and more convenient for our users.

So you’re converting this whole process in Phoenix into

something that could be copy and pasted elsewhere. So like,

perhaps you didn’t think of it that way when you were doing

the experimentation in Phoenix, but so how long did you

basically, and you can correct me, but you’ve, I mean, it’s

still early days, but you’ve taken the full journey in

Phoenix, right? As you were saying of like what it takes to

basically automate. I mean, it’s not the entirety of Phoenix,

right? But I imagine it can encompass the entirety of

Phoenix. That’s some near term date, but that’s not even

perhaps important. Like as long as it’s a large enough

geographic area. So what, how copy pastable is that process

currently and how like, you know, like when you copy and

paste in Google docs, I think now in, or in word, you can

like apply source formatting or apply destination formatting.

So how, when you copy and paste the Phoenix into like, say

Boston, how do you apply the destination formatting? Like

how much of the core of the entire process of bringing an

actual public transportation, autonomous transportation

service to a city is there in Phoenix that you understand

enough to copy and paste into Boston or wherever? So we’re

not quite there yet. We’re not at a point where we’re kind of

massively copy and pasting all over the place. But Phoenix,

what we did in Phoenix, and we very intentionally have chosen

Phoenix as our first full deployment area, you know,

exactly for that reason to kind of tease the problem apart,

look at each dimension and focus on the fundamentals of

complexity and de risking those dimensions, and then bringing

the entire thing together to get all the way and force

ourselves to learn all those hard lessons on technology,

hardware and software, on the evaluation deployment, on

operating a service, operating a business using actually

serving our customers all the way so that we’re fully

informed about the most difficult, most important

challenges to get us to that next step of massive copy and

pasting as you said. And that’s what we’re doing right now.

We’re incorporating all those things that we learned into

that next system that then will allow us to kind of copy and

paste all over the place and to massively scale to, you know,

more users and more locations. I mean, you know, just talk a

little bit about, you know, what does that mean along those

different dimensions? So on the hardware side, for example,

again, it’s that switch from the fourth to the fifth

generation. And the fifth generation is designed to kind

of have that property. Can you say what other cities you’re

thinking about? Like, I’m thinking about, sorry, we’re in

San Francisco now. I thought I want to move to San Francisco,

but I’m thinking about moving to Austin. I don’t know why

people are not being very nice about San Francisco currently,

but maybe it’s a small, maybe it’s in vogue right now.

But Austin seems, I visited there and it was, I was in a

Walmart. It’s funny, these moments like turn your life.

There’s this very nice woman with kind eyes, just like stopped

and said, he looks so handsome in that tie, honey, to me. This

has never happened to me in my life, but just the sweetness of

this woman is something I’ve never experienced, certainly on

the streets of Boston, but even in San Francisco where people

wouldn’t, that’s just not how they speak or think. I don’t

know. There’s a warmth to Austin that love. And since

Waymo does have a little bit of a history there, is that a

possibility? Is this your version of asking the question

of like, you know, Dimitri, I know you can’t share your

commercial and deployment roadmap, but I’m thinking about

moving to San Francisco, Austin, like, you know, blink twice if

you think I should move to it. That’s true. That’s true. You

got me. You know, we’ve been testing all over the place. I

think we’ve been testing more than 25 cities. We drive

in San Francisco. We drive in, you know, Michigan for snow.

We are doing a significant amount of testing in the Bay Area, including San Francisco. Which is not, like, because we're talking about a very different thing, which is like a full-on, large geographic area public service. You can't share, and you, okay. What about Moscow? When is that happening?

Take on Yandex. I’m not paying attention to those folks.

They’re doing, you know, there’s a lot of fun. I mean,

maybe as a way of a question, you didn’t speak to sort of like

policy or like, is there tricky things with government and so

on? Like, is there other friction that you’ve

encountered except sort of technological friction of

solving this very difficult problem? Is there other stuff

that you have to overcome when deploying a public service in

a city? That’s interesting. It’s very important. So we

put significant effort in creating those partnerships and

you know, those relationships with governments at all levels,

local governments, municipalities, state level,

federal level. We’ve been engaged in very deep

conversations from the earliest days of our projects.

At all of these levels, whenever we go to test or operate in a new area, we always lead

with a conversation with the local officials.

But the result of that investment is that, no, these are not challenges we have to overcome, but it is very important that we continue to have these conversations.

Oh, yeah. I love politicians too. Okay, so Mr. Elon Musk said that

LiDAR is a crutch. What are your thoughts?

I wouldn't characterize it exactly that way. You know, I think LiDAR is

very important. It is a key sensor that we use just like

other modalities, right? As we discussed, our cars use cameras, LiDAR

and radars. They are all very important. They are

at the kind of the physical level. They are very different. They have very

different, you know, physical characteristics.

Cameras are passive. LiDARs and radars are active.

They use different wavelengths. So that means they complement each other

very nicely and together combined, they can be used to

build a much safer and much more capable system.

So, you know, to me it’s more of a question,

you know, why the heck would you handicap yourself and not use one

or more of those sensing modalities when they, you know, undoubtedly just make your

system more capable and safer. Now,

it, you know, what might make sense for one product or

one business might not make sense for another one.

So if you’re talking about driver assist technologies, you make certain design

decisions and you make certain trade offs and make different ones if you are

building a driver that you deploy in fully driverless

vehicles. And, you know, in LiDAR specifically, when this question comes up,

I, you know, typically the criticisms that I hear, or, you know, the counterpoints, are cost and aesthetics.

And I don’t find either of those, honestly, very compelling.

So on the cost side, there’s nothing fundamentally prohibitive

about, you know, the cost of LiDARs. You know, radars used to be very expensive

before people started, you know, before people made certain advances in

technology and, you know, started to manufacture them at massive scale and

deploy them in vehicles, right? You know, similar with LiDARs. And this is

where the LiDARs that we have on our cars, especially the fifth generation,

you know, we’ve been able to make some pretty qualitative discontinuous

jumps in terms of the fundamental technology that allow us to

manufacture those things at very significant scale and at a fraction

of the cost of both our previous generation

as well as a fraction of the cost of, you know, what might be available

on the market, you know, off the shelf right now. And, you know, that improvement

will continue. So I think, you know, cost is not a

real issue. Second one is, you know, aesthetics.

You know, I don’t think that’s, you know, a real issue either.

Beauty is in the eye of the beholder. Yeah. You can make LiDAR sexy again.

I think you're exactly right. I think it is sexy. Like, honestly, I think form follows function. Well, okay. You know, I was actually, somebody brought this up to

me. I mean, all forms of LiDAR, even

like the ones that are like big, you can make

look, I mean, you can make look beautiful.

There’s no sense in which you can’t integrate it into design.

Like, there’s all kinds of awesome designs. I don’t think

small and humble is beautiful. It could be

like, you know, brutalism or like, it could be

like harsh corners. I mean, like I said, like hot rods. Like, I don’t like, I don’t

necessarily like, like, oh man, I’m going to start so much

controversy with this. I don’t like Porsches. Okay.

The Porsche 911, like everyone says it’s the most beautiful.

No, no. It’s like, it’s like a baby car. It doesn’t make any sense.

But anyway, beauty is in the eye of the beholder. You're already looking at me like, what is this kid talking about? I'm happy to keep talking about it. You're digging your own hole. The form and function and my take on the

beauty of the hardware that we put on our vehicles,

you know, I will not comment on your Porsche monologue.

Okay. All right. So, but aesthetics, fine. But there’s an underlying, like,

philosophical question behind the kind of lighter question is

like, how much of the problem can be solved

with computer vision, with machine learning?

So I think without sort of disagreements and so on,

it’s nice to put it on the spectrum because Waymo is doing a lot of machine

learning as well. It’s interesting to think how much of

driving, if we look at five years, 10 years, 50 years down the road,

what can be learned in a more and more end to end way. If we look at what Tesla is doing

with, as a machine learning problem, they’re doing a multitask learning

thing where it’s just, they break up driving into a bunch of learning tasks

and they have one single neural network and they’re just collecting huge amounts

of data that’s training that. I’ve recently hung out with George

Hotz. I don’t know if you know George.

I love him so much. He’s just an entertaining human being.

We were off mic talking about Hunter S. Thompson. He’s the Hunter S. Thompson

of autonomous driving. Okay. So he, I didn’t realize this with comma

AI, but they're like really trying to do end to end. Like, looking at the machine learning problem, they're really not doing multitask learning, but they're computing the drivable area as a machine learning task

and hoping that like down the line, this level two system, this driver

assistance will eventually lead to

allowing you to have a fully autonomous vehicle. Okay. There’s an underlying

deep philosophical question there, technical question

of how much of driving can be learned. So LiDAR is an effective tool today

for actually deploying a successful service in Phoenix, right? That’s safe,

that’s reliable, et cetera, et cetera. But the question,

and I’m not saying you can’t do machine learning on LiDAR, but the question is

that like how much of driving can be learned eventually.

Can we do fully autonomous? That’s learned.

Yeah. You know, learning is all over the place

and plays a key role in every part of our system.

As you said, I would, you know, decouple the sensing modalities

from the, you know, ML and the software parts of it.

LiDAR, radar, cameras, like it’s all machine learning.

All of the object detection classification, of course, like that’s

what, you know, these modern deep nets and ConvNets are very good at. You feed them raw data, massive amounts of raw data, and that's actually what our custom-built LiDARs and radars are really good

at. And radars, they don’t just give you point

estimates of, you know, objects in space, they give you raw,

like, physical observations. And then you take all of that raw information,

you know, the colors of the pixels, or, you know, LiDAR returns and some auxiliary information. It's not just distance and angle, right? It's much richer information that you get

from those returns, plus really rich information from the

radars. You fuse it all together and you feed it into those massive

ML models that then, you know, lead to the best results in terms of, you

know, object detection, classification, state estimation.
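As a schematic sketch of that fusion point (the grid sizes, feature channels, and the toy "detector" below are invented, not Waymo's architecture): per-modality features on a shared bird's-eye-view grid are concatenated per cell and fed to one joint model, instead of running separate per-sensor detectors and merging their outputs afterwards.

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-cell features on a bird's-eye-view grid around the car (toy shapes):
camera_feat = rng.normal(size=(64, 64, 8))   # e.g., projected image features
lidar_feat  = rng.normal(size=(64, 64, 4))   # e.g., height / intensity statistics
radar_feat  = rng.normal(size=(64, 64, 2))   # e.g., range rate / cross-section

# Early/mid-level fusion: stack the modalities per cell, then run ONE joint model.
fused = np.concatenate([camera_feat, lidar_feat, radar_feat], axis=-1)  # (64, 64, 14)

def joint_detector(features):
    """Stand-in for a learned model over the fused representation."""
    w = rng.normal(size=features.shape[-1])   # placeholder weights
    return (features @ w) > 2.0               # per-cell "object present" decision

print(fused.shape, joint_detector(fused).sum())
```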

So, sorry to interrupt, but there is a fusion. I mean, that's something

that people didn’t do for a very long time,

which is like at the sensor fusion level, I guess,

like early on fusing the information together, so that the sensory information that the vehicle receives from the different

modalities or even from different cameras is

combined before it is fed into the machine learning models.

Yeah, so I think this is one of the trends you’re seeing more of that you

mentioned end to end. There’s different interpretation of end to end. There is

kind of the purest interpretation of I’m going to

like have one model that goes from raw sensor data to like,

you know, steering torque and, you know, gas and brakes. That, you know,

that’s too much. I don’t think that’s the right way to do it.

There’s more, you know, smaller versions of end to end

where you're kind of doing more end to end learning or co-training or backpropagation of kind of signals back and forth across

the different stages of your system. And, you know, it gets into some fairly complex design choices where, on one hand, you want modularity and decomposability of your system. But on the other hand, you don't want to create interfaces that are too narrow or too brittle, too engineered, where you're giving up on the generality of the solution, or you're unable to properly propagate signal, you know, rich signal forward and losses, you know, back, so you can optimize the whole system jointly.

So I would decouple and I guess what you’re seeing in terms of the fusion

of the sensing data from different modalities as well as kind of fusion

at in the temporal level going more from, you know, frame by frame

where, you know, you would have one net that would do frame by frame detection

in camera, and then, you know, something that does frame by frame in LiDAR, and then radar, and then you fuse it, you know, in a weaker, engineered way later. Like the field over the last, you know,

decade has been evolving in more kind of joint fusion, more end to end models that

are, you know, solving some of these tasks, you know, jointly and there’s

tremendous power in that and, you know, that’s the

progression that kind of our technology, our stack has been on as well.

Now, to your question, you know, I would decouple the kind of sensing and how that information is fused from the role of ML in the entire stack.

And, you know, I guess there are trade offs in, you know, modularity and how you inject inductive bias into your system. There's tremendous power in being able to do that. So, you know, there's no part of our system that does not heavily, you know, leverage

data driven development or state of the art ML.

But there’s mapping, there’s a simulator, there’s perception, you know, object

level, you know, perception, whether it’s

semantic understanding, prediction, decision making, you know, so forth and

so on.

It’s, you know, of course, object detection and classification, like you’re

finding pedestrians and cars and cyclists and, you know, cones and signs

and vegetation and being very good at estimating

kind of detection, classification, and state estimation. That's just table stakes, like that's step zero of this whole stack. You can be incredibly good at that, whether you use cameras or LiDAR or radar, but that's just, you know, table stakes, that's just step zero.

Beyond that, you get into the really interesting challenges of semantic

understanding at the perception level, you get into scene level reasoning, you

get into very deep problems that have to do with prediction and joint

prediction and interaction, so the interaction

between all the actors in the environment, pedestrians, cyclists, other

cars, and you get into decision making, right? So, how do you build all of these systems? We leverage ML very heavily in all of

these components. I do believe that the best results you

achieve by kind of using a hybrid approach and

having different types of ML, having

different models with different degrees of inductive bias

that you can have, and combining kind of model-free approaches, you know, with some model-based approaches and some

rule based, physics based systems. So, you know, one example I can give

you is traffic lights. There’s a problem of the detection of

traffic light state, and obviously that's a great problem for, you know, computer vision ConvNets, you know, that's their bread and butter, right? That's how you build that. But then the interpretation of, you know, a traffic light, you don't need to learn that, right? You don't need to build some,

you know, complex ML model that, you know, infers

with some, you know, precision and recall that red means stop.

Like, it’s a very clear engineered signal

with very clear semantics, right? So you want to inject that bias, and how you inject that bias, whether, you know, it's a constraint or a cost function in your stack, but like

it is important to be able to inject that, like, clear semantic

signal into your stack. And, you know, that’s what we do.
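A minimal sketch of that hybrid split, with invented names and numbers rather than Waymo's actual stack: the light state comes from a learned detector, the rule that red means stop for the ego car is injected as an engineered constraint rather than learned, while whether another driver will actually stop stays a learned, probabilistic prediction (here just a stand-in logistic score).

```python
import math
from dataclasses import dataclass

@dataclass
class TrafficLightObservation:
    state: str          # "red" | "yellow" | "green", from a learned detector
    confidence: float   # detector confidence in [0, 1]

def ego_must_stop(obs: TrafficLightObservation, min_confidence: float = 0.5) -> bool:
    # Engineered semantics: if we believe the light is red or yellow, we stop.
    return obs.state in ("red", "yellow") and obs.confidence >= min_confidence

def other_agent_stop_probability(features: dict) -> float:
    # Placeholder for a learned behavior-prediction model; in reality this would
    # be an ML model over rich scene features, not a hand-tuned linear score.
    score = 2.0 * features.get("decelerating", 0.0) - 1.5 * features.get("speed_over_limit", 0.0)
    return 1.0 / (1.0 + math.exp(-score))

obs = TrafficLightObservation(state="red", confidence=0.97)
print(ego_must_stop(obs))                                                            # hard constraint
print(other_agent_stop_probability({"decelerating": 0.1, "speed_over_limit": 0.8}))  # learned estimate
```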

But that's when you apply it to yourself, when you are making decisions whether you want to stop

for a red light, you know, or not.

But if you think about how other people treat traffic lights,

we’re back to the ML version of that. You know they’re supposed to stop

for a red light, but that doesn’t mean they will.

So then you’re back in the, like, very heavy

ML domain where you’re picking up on, like, very subtle cues about,

you know, they have to do with the behavior of objects, pedestrians, cyclists,

cars, and the whole, you know, entire configuration of the scene

that allow you to make accurate predictions on whether they will, in

fact, stop or run a red light. So it sounds like already for Waymo,

like, machine learning is a huge part of the stack.

So it’s a huge part of, like, not just, so obviously the first, the level

zero, or whatever you said, which is, like,

just the object detection of things that, you know, only machine learning can do, but also starting to do behavior prediction and so on to

model what the other parties, the entities in the scene, are going to do.

So machine learning is more and more playing a role in that

as well. Of course. Oh, absolutely. I think, going back to the, you know, earliest days, like, you know, the DARPA Grand Challenge, our team was leveraging, you know, machine

learning. It was, like, pre, you know, ImageNet, and it was a very

different type of ML, but, and I think actually it was before

my time, but the Stanford team during the Grand Challenge had a very

interesting machine learned system that would, you know, use

LiDAR and camera. We’ve been driving in the

desert, and we had built a model where it would kind of

extend the range of free space reasoning. We get a

clear signal from LiDAR, and then it had a model that said, hey, like,

this stuff on camera kind of sort of looks like this stuff in LiDAR, and I

know this stuff that I’m seeing in LiDAR, I’m very confident it’s free space,

so let me extend that free space zone into the camera range that would allow

the vehicle to drive faster.
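A toy sketch of that free-space-extension idea, not the Stanford team's actual code: pixels that LiDAR confirms as free space define what "road" looks like in the camera image, and visually similar pixels beyond LiDAR range get treated as likely free space, extending the drivable horizon. The array sizes, colors, and threshold are all made up.

```python
import numpy as np

def extend_free_space(image, lidar_free_mask, color_threshold=30.0):
    """image: HxWx3 uint8; lidar_free_mask: HxW bool of LiDAR-confirmed near-range road."""
    road_pixels = image[lidar_free_mask].astype(np.float64)
    road_mean = road_pixels.mean(axis=0)                      # average road color
    dist = np.linalg.norm(image.astype(np.float64) - road_mean, axis=2)
    return lidar_free_mask | (dist < color_threshold)         # near road OR looks like road

# Tiny synthetic example: gray "road" in the lower rows, LiDAR only covers the last two.
img = np.full((6, 4, 3), 90, dtype=np.uint8)   # mostly road-colored pixels
img[:2] = (30, 120, 200)                       # "sky" in the top rows
lidar_mask = np.zeros((6, 4), dtype=bool)
lidar_mask[4:] = True                          # LiDAR-confirmed free space (near range)
print(extend_free_space(img, lidar_mask).sum())  # free-space pixels after extension
```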

And then we've been building on top of that, kind of pushing the state of the art in ML,

in all kinds of different ML over the years. And in fact,

from the early days, I think, you know, 2010 is probably the year

where Google, maybe 2011 probably, got pretty heavily involved in

machine learning, kind of deep nets, and at that time it was probably the only

company that was very heavily investing in kind of state of the art ML and

self driving cars. And they go hand in hand.

And we’ve been on that journey ever since. We’re doing, pushing

a lot of these areas in terms of research at Waymo, and we

collaborate very heavily with the researchers in

Alphabet, and all kinds of ML, supervised ML,

unsupervised ML, published some

interesting research papers in the space,

especially recently. It's just a super active area as well.

Yeah, so super, super active. Of course, there’s, you know, kind of the more

mature stuff, like, you know, ConvNets for, you know, object detection.

But there’s some really interesting, really active work that’s happening

in kind of more, you know, in bigger models and, you know,

models that have more structure to them,

you know, not just, you know, large bitmaps and reason about temporal sequences.

And some of the interesting breakthroughs that, you know, we've seen in language models, right? You know, transformers, you know, GPT-3, for instance. There's some really interesting applications of some

of the core breakthroughs to those problems

of, you know, behavior prediction, as well as, you know, decision making and

planning, right? You can think about it: kind of the behavior, how, you know, the paths, the trajectories of how people drive, they share a lot of the fundamental structure of this problem. There's a, you know, sequential nature. There's a lot of structure in this representation.

There is a strong locality, kind of like in sentences, you know, words that follow

each other. They’re strongly connected, but there’s

also kind of larger context that doesn’t have that locality, and you also see that

in driving, right? What, you know, is happening in the scene

as a whole has very strong implications on,

you know, the kind of the next step in that sequence where

whether you’re, you know, predicting what other people are going to do, whether

you’re making your own decisions, or whether in the simulator you’re

building generative models of, you know, humans walking, cyclists

riding, and other cars driving.
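To make the sequence-modeling analogy concrete, here is a tiny, purely illustrative Python sketch: logged trajectories are discretized into motion "tokens," and a bigram count model, standing in for a real transformer over rich scene features, predicts the most likely next token.

```python
from collections import Counter, defaultdict

def discretize(delta):
    """Map a per-step (dx, dy) displacement in meters to a coarse motion token."""
    dx, dy = delta
    if abs(dx) < 0.2 and abs(dy) < 0.2:
        return "STOP"
    if abs(dx) >= abs(dy):
        return "FWD" if dx > 0 else "BACK"
    return "LEFT" if dy > 0 else "RIGHT"

# Hypothetical logged trajectories (made up for illustration).
trajectories = [
    [(1.0, 0.0), (1.0, 0.1), (0.8, 0.5), (0.2, 0.9)],   # drifts into a left turn
    [(1.0, 0.0), (1.0, 0.0), (0.1, 0.0), (0.0, 0.0)],   # slows to a stop
]

# "Train" a bigram model: count which token tends to follow which.
counts = defaultdict(Counter)
for traj in trajectories:
    tokens = [discretize(d) for d in traj]
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1

def predict_next(token):
    """Most likely next motion token given the previous one."""
    return counts[token].most_common(1)[0][0] if counts[token] else "STOP"

print(predict_next("FWD"))  # what typically follows forward motion in this toy data
```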

That's all really fascinating, like how it's fascinating to think that transformer models and all this,

all the breakthroughs in language and NLP that might be applicable to like

driving at the higher level, at the behavioral level, that’s kind of

fascinating. Let me ask about pesky little creatures

called pedestrians and cyclists. They seem, so humans are a problem. If we

can get rid of them, I would. But unfortunately, they’re all sort of

a source of joy and love and beauty, so let’s keep them around.

They're also our customers. From your perspective, yes, yes,

for sure. They’re a source of money, very good.

But I don’t even know where I was going. Oh yes,

pedestrians and cyclists, you know,

they’re a fascinating injection into the system of

uncertainty of like a game theoretic dance of what to do. And also

they have perceptions of their own, and they can tweet

about your product, so you don’t want to run them over

from that perspective. I mean, I don’t know, I’m joking a lot, but

I think in seriousness, like, you know, pedestrians are a complicated

computer vision problem, a complicated behavioral problem. Is there something

interesting you could say about what you’ve learned

from a machine learning perspective, from also an autonomous vehicle,

and a product perspective about just interacting with the humans in this

world? Yeah, just to state on record, we care

deeply about the safety of pedestrians, you know, even the ones that don’t have

Twitter accounts. Thank you. All right, cool.

Not me. But yes, I’m glad, I’m glad somebody does.

Okay. But you know, in all seriousness, safety

of vulnerable road users, pedestrians or cyclists, is one of our

highest priorities. We do a tremendous amount of testing

and validation, and put a very significant emphasis

on, you know, the capabilities of our systems that have to do with safety

around those unprotected vulnerable road users.

You know, as we just, you know, discussed earlier, in Phoenix we have completely

empty cars, completely driverless cars, you know, driving in this very large area,

and you know, some people use them to, you know, go to school, so they’ll drive

through school zones, right? So, kids are kind of a very special class of those vulnerable road users, right? You want to be,

you know, super, super safe, and super, super cautious around those. So, we take

it very, very, very seriously. And you know, what does it take to

be good at it? You know,

an incredible amount of performance across your whole stack. You know,

starts with hardware, and again, you want to use all

sensing modalities available to you. Imagine driving on a residential road

at night, and kind of making a turn, and you don’t have, you know, headlights

covering some part of the space, and like, you know, a kid might

run out. And you know, LiDARs are amazing at that. They

see just as well in complete darkness as they do during the day, right? So, just

again, it gives you that extra,

you know, margin in terms of, you know, capability, and performance, and safety,

and quality. And in fact, we oftentimes, in these

kinds of situations, we have our system detect something,

in some cases even earlier than our trained operators in the car might do,

right? Especially, you know, in conditions like, you know, very dark nights.

So, starts with sensing, then, you know, perception

has to be incredibly good. And you have to be very, very good

at kind of detecting pedestrians in all kinds of situations, and all kinds

of environments, including, you know, people in weird poses,

people kind of running around, and you know, being partially occluded.

So, you know, that’s step number one, right?

Then, you have to have very high accuracy and very low latency in terms of your reactions

to, you know, what, you know, these actors might do, right? And we’ve put a

tremendous amount of engineering, and tremendous amount of validation, in to

make sure our system performs properly. And, you know, oftentimes, it

does require a very strong reaction to do the safe thing. And, you know, we

actually see a lot of cases like that. That’s the long tail of really rare,

you know, really, you know, crazy events that contribute to the safety

around pedestrians. Like, one example that comes to mind, that we actually

happened in Phoenix, where we were driving

along, and I think it was a 45 mile per hour road, so you have pretty high speed

traffic, and there was a sidewalk next to it, and

there was a cyclist on the sidewalk. And we were in the right lane, right next to the sidewalk, so it was a multi lane road, and as we got close

to the cyclist on the sidewalk, it was a woman, you know, she tripped and fell.

Just, you know, fell right into the path of our vehicle, right?

And our, you know, car, you know, this was actually with a

test driver, and our test driver did exactly the right thing.

They kind of reacted and came to a stop. It required both very strong steering and, you know, strong application of the brakes. And then we simulated what our

system would have done in that situation, and it did, you know,

exactly the same thing. And that speaks to, you know, all of

those components of really good state estimation and

tracking. And, like, imagine, you know, a person

on a bike, and they’re falling over, and they’re doing that right in front of you,

right? So you have to react really quickly, because, like, things are changing. The appearance of

that whole thing is changing, right? And a person goes one way, they’re falling on

the road, they’re, you know, being flat on the ground in front of

you. You know, the bike goes flying the other direction.

Like, the two objects that used to be one are now, you know, splitting apart, and the car has to, like, detect all of that.

Like, milliseconds matter, and it doesn’t, you know, it’s not good enough to just

brake. You have to, like, steer and brake, and there’s traffic around you.

So, like, it all has to come together, and it was really great

to see in this case, and other cases like that, that we’re actually seeing in the

wild, that our system is, you know, performing

exactly the way that we would have liked, and is able to,

you know, avoid collisions like this.
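As a rough back-of-the-envelope illustration of why braking alone may not be enough and why a combined brake-and-swerve can still work, here is a small sketch. The deceleration, reaction latency, and geometry values are assumptions chosen for the example, not numbers from Waymo’s planner.

```python
# Illustrative sketch only: given an obstacle that suddenly appears in the
# lane, check whether braking alone avoids a collision and, if not, whether a
# combined brake-and-swerve maneuver fits in the available distance.
import math

def stopping_distance(speed_mps, decel=8.0, latency=0.2):
    """Distance covered while reacting plus braking to a stop."""
    return speed_mps * latency + speed_mps ** 2 / (2.0 * decel)

def lateral_clearance_time(offset_m, lateral_accel=4.0):
    """Time to shift laterally by offset_m under constant lateral acceleration."""
    return math.sqrt(2.0 * offset_m / lateral_accel)

def choose_maneuver(speed_mps, gap_m, needed_offset_m):
    if stopping_distance(speed_mps) <= gap_m:
        return "brake"
    # Can we clear the obstacle laterally before we reach it?
    time_to_obstacle = gap_m / speed_mps   # conservative: ignores speed lost to braking
    if lateral_clearance_time(needed_offset_m) <= time_to_obstacle:
        return "brake and steer"
    return "emergency stop (collision may be unavoidable)"

# 45 mph is about 20 m/s; obstacle 25 m ahead; need 2 m of lateral offset.
print(choose_maneuver(20.0, 25.0, 2.0))  # -> "brake and steer"
```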

That’s such an exciting space for robotics.

Like, in that split second to make decisions of life and death.

I don’t know. The stakes are high, in a sense, but it’s also beautiful

that for somebody who loves artificial intelligence, the possibility that an AI

system might be able to save a human life.

That’s kind of exciting as a problem, like, to wake up to.

It’s terrifying, probably, for an engineer to wake up,

and to think about, but it’s also exciting because it’s, like,

it’s in your hands. Let me try to ask a question that’s often brought up about

autonomous vehicles, and it might be fun to see if you have

anything interesting to say, which is about the trolley problem.

So, a trolley problem is an interesting philosophical construct

that highlights, and there’s many others like it, the difficult ethical decisions that we humans have before us in this complicated world. So, specifically, it’s the choice, if you are forced to choose, between killing a group X of people versus a group Y of people. Say, if you did nothing, you would kill five people, and if you decide to swerve out of the way, you would only kill one person. Do you do nothing, or do you choose to do

something? You can construct all kinds of, sort of,

ethical experiments of this kind that, I think, at least on a positive note,

inspire you to think about, like, introspect

what are the physics of our morality, and there’s usually not

good answers there. I think people love it because it’s just an exciting

thing to think about. I think people who build autonomous

vehicles usually roll their eyes, because this one, as constructed, like, literally never comes up in reality. You never have to choose between killing one or the other of two groups of people, but I wonder if you can speak to, is there something interesting

to you as an engineer of autonomous vehicles that’s within the trolley

problem, or maybe more generally, are there

difficult ethical decisions that you find

that an algorithm must make? On the specific version of the trolley problem,

which one would you do, if you’re driving? The question itself

is a profound question, because we humans ourselves cannot answer it, and that’s the very point. I would kill both.

Yeah, humans, I think you’re exactly right in that, you know, humans are not

particularly good. I think these are kind of phrased as, like, what would a computer do, but, like, humans, you know, are not very good, and actually, oftentimes, I think you end up, you know, freezing and kind of not doing anything, because, like, you’ve taken a few extra milliseconds to just process, and then you end up, like, doing the worst of the possible outcomes, right? So,

I do think that, as you’ve pointed out, it can be

a bit of a distraction, and it can be a bit of a kind of red herring. I think

it’s an interesting, you know, discussion

in the realm of philosophy, right? But in terms of

what, you know, how that affects the actual

engineering and deployment of self driving vehicles,

it’s not how you go about building a system, right? We’ve talked

about how you engineer a system, how you, you know, go about evaluating

the different components and, you know, the safety of the entire thing.

How do you kind of inject the, you know, various

model-based, safety-based arguments, and, like, yes, you reason about parts of the system, you know, you reason about the probability of a collision, the severity of that collision, right?
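A toy sketch of that kind of risk reasoning, probability of collision times severity, averaged over an uncertain prediction of what the other road user does. The candidate maneuvers, the futures, and all numbers below are made-up placeholders rather than anything from an actual system.

```python
# Illustrative sketch only: scoring candidate trajectories by expected harm,
# i.e. probability of collision times an assumed severity, averaged over an
# uncertain prediction of what another road user will do.
def expected_risk(candidate, pedestrian_futures):
    """pedestrian_futures: list of (probability, future) pairs summing to ~1."""
    risk = 0.0
    for prob, future in pedestrian_futures:
        p_collision, severity = collision_model(candidate, future)
        risk += prob * p_collision * severity
    return risk

def collision_model(candidate, future):
    # Hypothetical stand-in: in a real stack this would come from geometry,
    # dynamics, and injury models, not a lookup table.
    table = {
        ("keep_speed", "crosses"): (0.6, 1.0),
        ("keep_speed", "waits"):   (0.0, 0.0),
        ("slow_down",  "crosses"): (0.1, 0.4),
        ("slow_down",  "waits"):   (0.0, 0.0),
    }
    return table[(candidate, future)]

futures = [(0.3, "crosses"), (0.7, "waits")]
for candidate in ("keep_speed", "slow_down"):
    print(candidate, expected_risk(candidate, futures))
# The lower-risk candidate ("slow_down") is the emergent defensive behavior.
```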

And that is incorporated, and there’s, you know, you have to properly reason

about the uncertainty that flows through the system, right? So,

you know, those, you know, factors definitely play a role in how

the cars then behave, but they tend to be more

of, like, the emergent behavior. And what you see, like, you’re absolutely right

that these, you know, clear theoretical problems, you don’t encounter them in the system. And really, going back to our previous discussion of, like, you know, which one do you choose? Well, you know, oftentimes, like, you made a mistake earlier. Like, you shouldn’t be in that situation in the first place, right? And in reality, that’s how it plays out.

If you build a very good, safe, and capable driver,

you have enough, you know, clues in the environment that you

drive defensively, so you don’t put yourself in that situation, right? And

again, you know, if you go back to that analogy of, you know, precision and recall, like, okay, you can make a, you know, very hard trade-off, but, like, neither answer is really good.
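For readers less familiar with the precision/recall analogy, a tiny illustration with assumed detector scores: at a fixed model, moving the confidence threshold only trades precision against recall, which is why the emphasis is on lifting the whole curve.

```python
# Toy illustration of the precision/recall trade-off (assumed scores and
# labels, not real detector output): changing the confidence threshold at a
# fixed model quality only trades one metric against the other.
def precision_recall(scores, labels, threshold):
    predicted = [s >= threshold for s in scores]
    tp = sum(p and l for p, l in zip(predicted, labels))
    fp = sum(p and not l for p, l in zip(predicted, labels))
    fn = sum((not p) and l for p, l in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall

scores = [0.95, 0.9, 0.8, 0.6, 0.4, 0.3]
labels = [True, True, False, True, False, False]
for t in (0.5, 0.85):
    print(t, precision_recall(scores, labels, t))
# Lower threshold: higher recall, lower precision; higher threshold: the reverse.
```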

But what instead you focus on is kind of moving

the whole curve up, and then you focus on building the right capability and the right defensive driving, so that, you know, you don’t put yourself in a situation like this. I don’t know if you have a good answer

for this, but people love it when I ask this question

about books. Are there books in your life that you’ve enjoyed,

philosophical, fiction, technical, that had a big impact on you as an engineer or

as a human being? You know, everything from science fiction

to a favorite textbook. Are there three books that stand out that

you can think of? Three books. So, one that, you know, impacted me, I would say, and this one is, you probably know it well, but it’s not generally well known, I think, in the U.S., or kind of internationally: The Master and Margarita. It’s actually one of my

favorite books. It is, you know, by

Russian, it’s a novel by Russian author Mikhail Bulgakov, and it’s just, it’s a

great book. It’s one of those books that you can, like,

reread your entire life, and it’s very accessible. You can read it as a kid,

and, like, it’s, you know, the plot is interesting. It’s, you know, the

devil, you know, visiting the Soviet Union,

and, you know, but it, like, you read it, reread it

at different stages of your life, and you enjoy it for

different, very different reasons, and you keep finding, like, deeper and deeper

meaning, and, you know, it kind of affected, you know, definitely had, like, an imprint on me, you know, mostly from, probably, kind of the cultural, stylistic aspect. Like, it’s one of those books that, you know, is good and makes you think, but also has,

like, this really, you know, silly, quirky, dark sense of, you know,

humor. It captures the Russian soul more than

many, perhaps, many other books. On that, like, slight note,

just out of curiosity, one of the saddest things is I’ve read that book

in English. Did you, by chance, read it in English or in Russian?

In Russian, only in Russian, and I actually, that is a question I had,

kind of posed to myself every once in a while, like, I wonder how well it

translates, if it translates at all, and there’s the

language aspect of it, and then there’s the cultural aspect, so

I, actually, I’m not sure if, you know, either of those would

work well in English. Now, I forget their names, but, so, when the COVID lifts a

little bit, I’m traveling to Paris for several reasons. One is just, I’ve

never been to Paris, I want to go to Paris, but

the most famous translators of Dostoevsky, Tolstoy, of most of Russian literature live there. They’re a couple, they’re famous,

a man and a woman, and I’m going to, sort of, have a series of conversations with

them, and in preparation for that, I’m starting

to read Dostoevsky in Russian, so I’m really embarrassed to say that I read

this, everything I’ve read in Russian literature of, like,

serious depth has been in English, even though

I can also read, I mean, obviously, in Russian, but

for some reason, it seemed,

in the optimization of life, it seemed like the improper decision, to read in

Russian, like, you know, like, I don’t need to,

I need to think in English, not in Russian, but now I’m changing my mind on

that, and so, the question of how well it translates is a really fundamental one, like, even with Dostoevsky.

So, from what I understand, Dostoevsky translates easier, others don’t as much. Obviously, the poetry doesn’t translate as well. I’m also a big fan of the music of Vladimir Vysotsky; he obviously doesn’t translate well, people have tried. But Master and Margarita, I don’t know, I don’t know about that one. I just know it in English, you know, it’s fun as hell in English, so, but

it’s a curious question, and I want to study it rigorously from both the

machine learning aspect, and also because I want to do a

couple of interviews in Russia, and I’m still unsure of how to properly conduct an interview across a language barrier that ultimately communicates to an American audience; it’s a fascinating question. There’s a few

Russian people that I think are truly special human beings,

and I feel, like, I sometimes encounter this with some

incredible scientists, and maybe you encounter this

as well at some point in your life, that it feels like because of the language

barrier, their ideas are lost to history. It’s a sad thing, I think about, like,

Chinese scientists, or even authors that, like,

that we don’t, in an English speaking world, don’t get to appreciate

some, like, the depth of the culture because it’s lost in translation,

and I feel like I would love to show that to the world,

like, I’m just some idiot, but because I have this,

like, at least some semblance of skill in speaking Russian,

I feel like, and I know how to record stuff on a video camera,

I feel like I want to catch, like, Grigori Perelman, who’s a mathematician, I’m not sure if you’re familiar with him. I want to talk to him, like, he’s a fascinating mind, and bringing him to a wider English-speaking audience would be fascinating, but that requires being rigorous about this question

of how well Bulgakov translates. I mean, I know it’s a silly

concept, but it’s a fundamental one, because how do you translate, and

that’s the thing that Google Translate is also facing as a machine learning problem, but I wonder, as a bigger problem for AI, how do we capture the magic

that’s there in the language? I think that’s a really interesting,

really challenging problem. If you do read it, Master and Margarita

in English, sorry, in Russian, I’d be curious

to get your opinion, and I think part of it is language, but part of it’s just,

you know, centuries of culture, that, you know, the cultures are

different, so it’s hard to connect that.

Okay, so that was my first one, right? You had two more. The second one I

would probably pick is the science fiction by the

Strugatsky brothers. You know, it’s up there with, you know, Isaac Asimov and, you know, Ray Bradbury and, you know, company. The Strugatsky brothers kind of appealed more to me. I think it made more of an

impression on me growing up. I apologize if I’m

showing my complete ignorance. I’m so weak on sci fi. What did

they write? Oh, Roadside Picnic, Hard to Be a God, Beetle in the Anthill, Monday Starts on Saturday. Like, it’s

not just science fiction. It also has very interesting, you know,

interpersonal and societal questions, and some of the

language is just completely hilarious.

That’s the one. Oh, interesting. Monday Starts on Saturday. So,

I need to read. Okay, oh boy. You put that in the category of science fiction?

That one is, I mean, this was more of a silly,

you know, humorous work. I mean, there is kind of…

It’s profound too, right? Science fiction, right? It’s about, you know, this

research institute, and it has deep parallels to

serious research, but the setting, of course,

is that they’re working on, you know, magic, right? And there’s a

lot of stuff. And that’s their style, right?

And, you know, other books are very different, right? You know,

Hard to Be a God, right? It’s about kind of this higher society being injected

into this primitive world, and how they operate there,

and some of the very deep ethical questions there,

right? And, like, they’ve got this full spectrum. Some is, you know, more about

kind of more adventure style. But, like, I enjoy all of

their books. There’s just, you know, probably a couple.

Actually, one I think that they consider their most important work.

I think it’s Snail on the Slope. I’m not exactly sure how it

translates. I tried reading a couple times. I still don’t get it.

But everything else I fully enjoyed. And, like, for one of my birthdays as a kid, I got, like, their entire collection; it occupied a giant shelf in my room, and

then, like, over the holidays, I just, like,

you know, my parents couldn’t drag me out of the room, and I read the whole thing

cover to cover. And I really enjoyed it.

And that’s one more. For the third one, you know, maybe a little bit

darker, but, you know, comes to mind is Orwell’s

1984. And, you know, you asked what made an

impression on me and the books that people should read. That one, I think,

falls in the category of both. You know, definitely it’s one of those

books that you read, and you just kind of, you know, put it

down and you stare into space for a while. You know, that kind of work. I think

there’s, you know, lessons there that people should not ignore. And, you know, nowadays, with, like,

everything that’s happening in the world, I,

like, can’t help but, you know, have my mind jump to some,

you know, parallels with what Orwell described. And, like, there’s this whole,

you know, concept of doublethink and ignoring logic and, you know, holding completely contradictory opinions in your mind and not having that bother you, and, you know, sticking to the party line

at all costs. Like, you know, there’s something there.

If anything, 2020 has taught me, and I’m a huge fan of Animal Farm, which is kind of a friendly companion to 1984 by Orwell.

It’s kind of another thought experiment of how our society

may go in directions that we wouldn’t like it to go.

But if anything has been kind of heartbreaking to an optimist about 2020, it’s that society is kind of fragile. Like, we have this,

this is a special little experiment we have going on.

And it’s not unbreakable. Like, we should be careful to, like, preserve whatever special thing we have going on. I mean, I think 1984 and these books, Brave New World, they’re

helpful in thinking, like, stuff can go wrong

in nonobvious ways. And it’s, like, it’s up to us to preserve it.

And it’s, like, it’s a responsibility. It’s been weighing heavy on me because, like,

for some reason, like, more than just my mom follows me on Twitter, and I feel like I now somehow have a responsibility to this world. And it dawned on me that, like, me and millions of others are, like, the little ants that maintain this little colony, right? So we have a responsibility not to, I don’t know what the right analogy is, but not to put a flamethrower to the place. We want to

not do that. And there’s interesting complicated ways of doing that as 1984

shows. It could be through bureaucracy. It could

be through incompetence. It could be through misinformation.

It could be through division and toxicity.

I’m a huge believer, like, that love will somehow be the solution. So, love and robots. Love and robots, yeah.

I think you’re exactly right. Unfortunately, I think it’s less of a

flamethrower type of thing. It’s more of a,

in many cases, it’s going to be more of a slow boil. And that’s the

danger. Let me ask, it’s a fun thing to make

a world class roboticist, engineer, and leader uncomfortable with a

ridiculous question about life. What is the meaning of life,

Dimitri, from a robotics and a human perspective?

You only have a couple minutes, or one minute to answer, so.

I don’t know if that makes it more difficult or easier, actually.

You know, I’m very tempted to quote one of the stories by Isaac Asimov, actually, appropriately titled The Last Question. It’s a short story where, you know, the

plot is that, you know, humans build this supercomputer,

you know, this AI intelligence, and, you know, once it

gets powerful enough, they pose this question to it, you know,

how can the entropy in the universe be reduced, right? So the computer replies,

as of yet, insufficient information to give a meaningful answer,

right? And then, you know, thousands of years go by, and they keep posing the

same question, and the computer, you know, gets more and more powerful, and keeps

giving the same answer, you know, as of yet, insufficient

information to give a meaningful answer, or something along those lines,

right? And then, you know, it keeps, you know, happening, and

happening, you fast forward, like, millions of years into the future, and,

you know, billions of years, and, like, at some point, it’s just the only entity in

the universe, it’s, like, absorbed all humanity,

and all knowledge in the universe, and it, like, keeps posing the same question

to itself, and, you know, finally, it gets to the

point where it is able to answer that question, but, of course, at that point,

you know, there’s, you know, the heat death of the universe has occurred, and

that’s the only entity, and there’s nobody else to provide that

answer to, so the only thing it can do is to,

you know, answer it by demonstration, so, like, you know, it recreates the big bang,

right, and resets the clock, right?

But, like, you know, I can try to give kind of a

different version of the answer, you know, maybe

not on the behalf of all humanity, I think that that might be a little

presumptuous for me to speak about the meaning of life on the behalf of all

humans, but at least, you know, personally,

it changes, right? I think if you think about kind of what

gives, you know, you and your life meaning and purpose, and kind of

what drives you, it seems to

change over time, right, and that lifespan

of, you know, kind of your existence. You know, when you just enter this world, right, it’s all about kind of new

experiences, right? You get, like, new smells, new sounds, new emotions, right,

and, like, that’s what’s driving you, right? You’re experiencing

new amazing things, right, and that’s magical, right? That’s pretty awesome, right? That gives you kind of meaning.

Then, you know, you get a little bit older, you start more intentionally

learning about things, right? I guess, actually, before you start intentionally

learning, it’s probably fun. Fun is a thing that gives you kind of

meaning and purpose and the thing you optimize for, right?

And, like, fun is good. Then you, you know, start learning, and I guess this joy of comprehension

and discovery is another thing that, you know, gives you

meaning and purpose and drives you, right? Then, you know, you

learn enough stuff and you want to give some of it back, right? And so

impact and contributions back to, you know, technology or society,

you know, people, you know, local or more globally

becomes a new thing that, you know, drives a lot of kind of your behavior

and is something that gives you purpose and

that you derive, you know, positive feedback from, right?

You know, then you go and so on and so forth. You go through various stages of

life. If you have kids,

like, that definitely changes your perspective on things. You know, I have three, and that definitely flips some bits in your

head in terms of, you know, what you care about and what you

optimize for and, you know, what matters, what doesn’t matter, right?

So, you know, and so on and so forth, right? And I,

it seems to me that, you know, it’s all of those things, and as you kind of go through life, you know,

you want these to be additive, right? New experiences,

fun, learning, impact. Like, you want to, you know, be accumulating.

I don’t want to, you know, stop having fun or, you know, experiencing new things and

I think it’s important that, you know, it just kind of becomes

additive as opposed to a replacement or subtraction.

But, you know, that’s as far as I’ve gotten; ask me in a few years, I might have one or two more to add to the list.

And before you know it, time is up, just like it is for this conversation,

but hopefully it was a fun ride. It was a huge honor to meet you.

As you know, I’ve been a fan of yours and a fan of Google Self Driving Car and

Waymo for a long time. I can’t wait. I mean, it’s one of the

most exciting, if we look back in the 21st century, I

truly believe it’ll be one of the most exciting things we

descendants of apes have created on this earth. So,

I’m a huge fan and I can’t wait to see what you do

next. Thanks so much for talking to me. Thanks, thanks for having me. I’m also a huge fan of the work you’re doing, honestly, and I really enjoyed it. Thank you. Thanks for listening to this

conversation with Dmitri Dolgov, and thank you to our sponsors,

Triolabs, a company that helps businesses apply machine learning to

solve real world problems, Blinkist, an app I use for reading

through summaries of books, BetterHelp, online therapy with a licensed

professional, and CashApp, the app I use to send money to

friends. Please check out these sponsors in the

description to get a discount and to support this podcast. If you

enjoy this thing, subscribe on YouTube, review it with five stars on Apple Podcast, follow on Spotify, support on Patreon,

or connect with me on Twitter at Lex Fridman. And now,

let me leave you with some words from Isaac Asimov.

Science can amuse and fascinate us all, but it is engineering

that changes the world. Thank you for listening and hope to see you

next time.