Lex Fridman Podcast - #341 - Guido van Rossum: Python and the Future of Programming

The following is a conversation with Guido van Rossum,

his second time on this podcast.

He is the creator of the Python Programming Language

and is Python’s Emeritus BDFL, Benevolent Dictator for Life.

And now a quick few second mention of each sponsor.

Check them out in the description.

It’s the best way to support this podcast.

We’ve got GiveDirectly for philanthropy,

Eight Sleep for naps, Fundrise for real estate investing,

InsideTracker for biodata,

and Athletic Greens for nutritional health.

Choose wisely, my friends.

And now onto the full ad reads.

As always, no ads in the middle.

I try to make these interesting, but if you skip them,

please still check out our sponsors.

I enjoy their stuff, maybe you will too.

This show is brought to you by GiveDirectly,

a nonprofit that lets you send cash directly

to the people that need it.

GiveDirectly donors include previous guests

of this podcast, Jack Dorsey, Elon Musk,

Vitalik Buterin, all of whom happen to be,

actually not Jack, Jack only has been on once.

But I’m pretty sure he’s going to be on again

many more times, he’s a fascinating human being.

Anyway, Elon too, he’ll be back on soon.

And of course Vitalik as well, after the Merge.

Anyway, all of that to say that the idea of giving directly

to the people that need it is actually

a really powerful way to help people.

There’s a lot of science, there’s a lot of studies

behind it that showed that getting cash directly

to the people that need it is actually

the best way to help them.

So I think a lot of philanthropy is based on the idea

that you could have a lot of middlemen

and you kind of fund that chain that’s able

to optimally redistribute the funding.

And okay, there might be some interesting aspects

to that idea, but what I think is especially interesting

is that when you remove all of those middlemen

and you give directly to the people that need it,

it’s actually really effective.

I frankly love that idea.

You can visit givedirectly.org slash Lex to learn more

and send cash directly to someone living in extreme poverty.

That’s givedirectly.org slash Lex.

This episode is also brought to you by Eight Sleep

and its new Pod 3 mattress.

I love taking naps.

I got very little sleep last night.

I just had a crazy pile of tasks to finish

and I had a crazy busy day today.

So after I’m done recording these very words

that I’m speaking to you, I’m going to go lay down

on an Eight Sleep bed, cold mattress, warm blanket,

and I’m going to pass out for 20, 25, 30 minutes.

And I’m going to, as opposed to feeling a little bit tired,

a little bit mentally exhausted, you know,

maybe a little bit not so motivated to do work,

I’m going to feel like a new human being,

excited to get right back into the trenches of programming

and taking on the rest of the day,

just powering through, super productive.

All of that thanks to a comfortable,

enriching, joyful nap.

Check it out and get special holiday savings

of up to $400 when you go to eightsleep.com slash Lex.

This episode is also brought to you by Fundrise,

spelled F-U-N-D-R-I-S-E.

It’s a platform that allows you to invest

in private real estate.

As you know, looking at different markets

and investment opportunities and all the crazy

financial stuff that’s going on out there in the world,

I think it’s a good idea to diversify.

And that’s why private real estate

is an interesting place to diversify in.

And of course, when you do that kind of diversification,

you should be using the services that make it super easy.

You don’t need to be an expert in what’s required

in this kind of investment.

You don’t need to go through super complicated paperwork

and so on.

They’ve got everything for you.

They figure out what are good real estate projects

to invest in and make it super easy for you

to do so. Over 150,000 investors use it.

Check it out.

It takes just a few minutes to get started

at Fundrise.com slash Lex.

This show is also brought to you by InsideTracker,

a service I use to track biological data

that comes from my body, is fed in as the raw signal

into machine learning algorithms, the DNA data,

fitness tracker data, blood data, all that kind of stuff,

goes into the machine learning algorithm

and gives a prediction, a recommendation

about what I should do with my diet and lifestyle changes.

This is obviously the early steps

in what will make the 21st century incredible,

which is the personalization, the deep personalization.

Both medicine, lifestyle, diet, everything,

career choices, mentorship, everything

should be coming from rich, raw signal

coming from your body.

Maybe one day coming from your brain.

If you have BCIs, brain computer interfaces

like Neuralink operating, this is such an exciting future.

So I’m a huge supporter of this kind of future.

If it’s done right, if it’s done in a way

that respects people’s privacy and people’s rights,

this is really, really a great way

to help individuals optimize their lives.

Get special savings for a limited time

when you go to insidetracker.com slash Lex.

This show is also brought to you by Athletic Greens

and its AG1 drink, which is an all-in-one daily drink

to support better health and peak performance.

It’s my zen temple that I return to twice a day.

After a cup of coffee,

it’s the first thing I drink every day.

I also drink it after a long run.

And it gives me a certainty

that I have all my nutritional bases covered.

So all the crazy kind of diet stuff I’m doing,

if I’m fasting, if I’m doing one meal a day,

if I’m doing carnivore, just a little carb,

all that kind of stuff.

Or if I’m fasting for multiple days

or if I’m doing crazy stretches of work

or exercise to both the mental and the physical fatigue,

everything I’m doing to my body

in terms of challenging it,

I know I have those bases covered.

And so it’s like a multivitamin,

but an amazing multivitamin.

I highly, highly recommend it.

And plus, it also tastes good.

They’ll give you one month’s supply of fish oil

when you sign up at athleticgreens.com slash Lex.

This is the Lex Fridman Podcast.

To support it,

please check out our sponsors in the description.

And now, dear friends, here’s Guido van Rossum.

Python 3.11 is coming out very soon.

In it, CPython is claimed to be

10 to 60% faster.

How’d you pull that off?

And what’s CPython?

CPython is the last Python implementation standing,

also the first one that was ever created.

The original Python implementation

that I started over 30 years ago.

So what does it mean that Python,

the programming language,

is implemented in another programming language called C?

What kind of audience do you have in mind here?

People who know programming?

No, there’s somebody on a boat that’s into fishing

and has never heard about programming,

but also some world-class programmers.

So you’re gonna have to speak to both.

Imagine a boat with two people.

One of them has not heard about programming,

is really into fishing.

And the other one is like an incredible

Silicon Valley programmer that’s programmed in everything.

C, C++, Python, Rust, Java.

He knows the entire history of programming languages.

So you’re gonna have to speak to both.

I imagine that boat in the middle of the ocean.


I’m gonna please the guy

who knows how to fish first.

Yes, please.

He seems like the most useful in the middle of the ocean.

You gotta make him happy.

I’m sure he has a cell phone.

So he’s probably very suspicious

about what goes on in that cell phone,

but he must have heard that inside a cell phone

is a tiny computer.

And a programming language is computer code

that tells the computer what to do.

It’s a very low level language.

It’s zeros and ones, and then there’s assembly.

And then-

Oh yeah.

We don’t talk about these really low levels

because those just confuse people.

I mean, when we’re talking about human language,

we’re not usually talking about vocal tracts

and how you position your tongue.

I was talking yesterday about how

when you have a Chinese person and they speak English,

this is a bit of a stereotype, they often don’t know,

they can’t seem to make the difference well

between an L and an R.

And I have a theory about that,

and I’ve never checked this with linguists,

that it probably has to do with the fact that in Chinese,

there is not really a difference.

And it could be that there are regional variations

in how native Chinese speakers pronounce that one sound

that sounds like L to some of them, like R to others.

So it’s both the sounds you produce with your mouth

throughout the history of your life

and what you’re used to listening to.

I mean, every language has that.

Russian has-


The Slavic languages have sounds like zh,

the letter zh.

Like Americans or English speakers

don’t seem to know the sound zh.

They seem uncomfortable with that sound.


So I’m, oh yes.

Okay, so we’re not going to the shapes of tongues

and the sounds that the mouth can make.

Fine, words.

And similarly, we’re not going into the ones and zeros

or machine language.

I would say a programming language

is a list of instructions,

like a cookbook recipe

that sort of tells you how to do a certain thing,

like make a sandwich.

Well, acquire a loaf of bread, cut it in slices,

take two slices, put mustard on one,

put jelly on the other or something,

then add the meat, then add the cheese.

I’ve heard that science teachers

can actually do great stuff with recipes like that

and trying to interpret their students’ instructions

incorrectly until the students

are completely unambiguous about it.

With language, see, that’s the difference

between natural languages and programming languages.

I think ambiguity is a feature,

not a bug in human spoken languages.

Like that’s the dance of communication between humans.

Well, for lawyers, ambiguity certainly is a feature.

For plenty of other cases,

the ambiguity is not much of a feature,

but we work around it, of course.

What’s more important is context.

So with context, the precision of the statement

becomes more and more concrete, right?

But when you say I love you

to a person that matters a lot to you,

the person doesn’t try to compile that statement

and return an error saying please define love, right?

No, but I imagine that my wife and my son

interpret it very differently.


Even though it’s the same three words.

But imprecisely still.

Oh, for sure.

Lawyers have a lot of follow-up questions for you.

Nevertheless, the context is already different in that case.

Yes, fair enough.

So that’s what a programming language is:

the ability to unambiguously state a recipe.

Actually, let’s go back.

Let’s go to PEP 8.

You go through in PEP 8, the style guide for Python code,

some ideas of what this language should look like,

feel like, read like.

And the big idea there is that code readability counts.

What does that mean to you?

And how do we achieve it?

So this recipe should be readable.

That’s a thing between programmers.

Because on the one hand,

we always explain the concept of programming language

as computers need instructions

and computers are very dumb

and they need very precise instructions

because they don’t have much context.

In fact, they have lots of context

but their context is very different.

But what we’ve seen emerge

during the development of software

starting probably in the late forties

is that software is a very social activity.

A software developer is not a mad scientist

who sits alone in his lab writing brilliant code.

Software is developed by teams of people.

Even the mad scientist sitting alone in his lab

can type fast enough to produce enough code

so that by the time he’s done with his coding,

he no longer remembers what the first few lines he wrote mean.

So even the mad scientist coding alone in his lab

would be sort of wise to adopt conventions

on how to format the instructions

that he gives to the computer.

So the thing is, there is a difference

between a cookbook recipe and a computer program.

The cookbook recipe,

the author of the cookbook writes it once

and then it’s printed in a hundred thousand copies.

And then lots of people in their kitchens

try to recreate that recipe,

that particular pie or dish from the recipe.

And so there the goal of the cookbook author

is to make it clear to the human reader of the recipe,

the human amateur chef in most cases.

When you’re writing a computer program,

you have two audiences at once.

It needs to tell the computer what to do,

but it also is useful if that program

is readable by other programmers.

Because computer software,

unlike the typical recipe for a cherry pie,

is so complex that you don’t get all of it right at once.

You end up with the activity of debugging

and you end up with the activity of…

So debugging is trying to figure out

why your code doesn’t run the way

you thought it should run.

That means broadly, it could be stupid little errors

or it could be big logical errors.

It could be anything.


Yeah, it could be anything from a typo

to a wrong choice of algorithm

to building something that does what you tell it to do,

but that’s not useful.

Yeah, it seems to work really well 99% of the time,

but does weird things 1% of the time on some edge cases.

That’s pretty much all software nowadays.

All good software, right?

Well, yeah, for bad software.

That 99 goes down a lot.

But it’s not just about the complexity of the program.

Like you said, it is a social endeavor

in that you’re constantly improving that recipe

for the cherry pie.

But you’re in a group of people improving that recipe.

Or the mad scientist is improving the recipe

that he created a year ago and making it better.

Or adding something.

He decides that he wants, I don’t know,

he wants some decoration on his pie or icing or…

So there’s broad philosophical things

and there’s specific advice on style.

So first of all, the thing that people first experience

when they look at Python,

there is a, it is very readable,

but there’s also like a spatial structure to it.

Can you explain the indentation style of Python

and what is the magic to it?

Spaces are important for readability of any kind of text.

If you take a cookbook recipe

and you remove all the sort of,

all the bullets and other markup

and you just crunch all the text together,

maybe you leave the spaces between the words,

but that’s all you leave.

When you’re in the kitchen trying to figure out,

oh, what are the ingredients and what are the steps?

And where does this step end and the next step begin?

You’re gonna have a hard time

if it’s just one solid block of text.

On the other hand, what a typical cookbook does

if the paper is not too expensive,

each recipe starts on its own page.

Maybe there’s a picture next to it.

The list of ingredients comes first.

There’s a standard notation.

There’s shortcuts so that you don’t have to

sort of write two sentences on how you have to cut the onion

because there are only three ways

that people ever cut onions in a kitchen,

small, medium, and in slices.

Or something like that.


None of my examples make any sense

to real cooks, of course, but.


We’re talking to programmers

with a metaphor of cooking, I love it.

But there is a strictness to the spacing

that Python defines.

So there’s some looser things, some stricter things,

but the four spaces for the indentation

is really interesting.

It really defines what the language looks

or feels like.

Because indentation is sort of taking a block of text

and then having inside that block of text

a smaller block of text that is indented further

as sort of a group.

It’s like you have a bulleted list

in a complex business document

and inside some of the bullets are other bulleted lists.

You will indent those too.

If each bulleted list is indented several inches,

then at two levels deep, there’s no space left on the page

to put any of the words of the text.

So you can’t indent too far.

On the other hand, if you don’t indent at all,

you can’t tell whether something is a top level bullet

or a second level bullet or a third level bullet.

So you have to have some compromise

and based on ancient conventions

and sort of the typical width of a computer screen

in the 80s and all sorts of things,

sort of we came up with sort of four spaces as a compromise.

I mean, there are groups,

there are large groups of people

who code with two spaces per indent level.

For example, the Google style guide,

all the Google Python code,

and I think also all the Google C++ code

is indented with only two spaces per block.

If you’re not used to that,

it’s harder to at a glance understand the code

because sort of the high-level structure

is determined by the indentation.

On the other hand, there are other programming languages

where the indentation is eight spaces

or a whole tab stop in sort of classic Unix.

And to me, that looks weird

because you sort of after three indent levels,

you’ve got no room left.

Well, there’s some languages

where the indentation is a recommendation.

It’s a stylistic one.

The code compiles even without any indentation.

And then in Python, really,

indentation is a fundamental part of the language, right?

It doesn’t have to be four spaces.

So you can code Python with two spaces per block

or six spaces or 12 if you really want to go wild,

but sort of everything that belongs to the same block

needs to be indented the same way.
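
The rule described here can be sketched in a few lines of Python. This is a minimal illustration, not anything shown on the podcast: the function names `describe` and `classify` and the string `bad_source` are made up for the example. It shows that two-space and four-space blocks are both accepted, and that a block whose lines do not line up is rejected before the program runs.

```python
def describe(n):
  # This function uses two spaces per indent level.
  if n % 2 == 0:
    return "even"
  return "odd"


def classify(n):
    # This function uses four spaces per level, the PEP 8 recommendation.
    if n < 0:
        return "negative"
    return "non-negative"


# A line that dedents to a level matching no enclosing block
# is rejected when the source is compiled.
bad_source = "if True:\n    x = 1\n  y = 2\n"
try:
    compile(bad_source, "<example>", "exec")
    rejected = False
except IndentationError:
    rejected = True

print(describe(4), classify(-1), rejected)
```

Mixing widths across different functions works, as above, although a style guide would normally pick one width for the whole code base.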

In practice, in most other languages,

people recommend doing that anyway.

If you look at C or Rust or C++,

all those languages, Java,

don’t have a requirement of indentation,

but except in extreme cases,

they’re just as anal about

having their code properly indented.

So any IDE that does syntax highlighting

that works with Java or C++,

they will yell at you aggressively

if you don’t do proper indentation.

They’d suggest the proper indentation for you.

Like in C, you type a few words

and then you type a curly brace,

which is their notion of sort of begin an indented block.

Then you hit return

and then it automatically indents four or eight spaces,

depending on your style preferences

or how your editor is configured.

Was there a possible universe

in which you considered having braces in Python?

Absolutely, yeah.

Was it 60, 40, 70, 30 in your head?

What was the trade-off?

For a long time,

I was actually convinced that the indentation

was just better.

Without context,

I would still claim that indentation is better.

It reduces clutter.

However, as I started to say earlier,

context is almost everything.

And in the context of coding,

most programmers are familiar with multiple languages,

even if they’re only good at one or two.

And apart from Python and maybe Fortran,

I don’t know how that’s written these days anymore,

but all the other languages,

Java, Rust, C, C++, JavaScript, TypeScript, Perl,

are all using curly braces to sort of indicate blocks.

And so Python is the odd one out.

So it’s a radical idea.

Do you still, as a radical renegade revolutionary,

do you still stand behind this idea

of indentation versus braces?

Like what, can you dig into it a little bit more?

Why you still stand behind indentation?

Because context is not the whole story.

History, in a sense, provides more context.

So for Python, there is no chance that we can switch.

Python is using curly braces for something else,

dictionaries mostly.

We would get in trouble if we wanted to switch.
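
As a small illustration of why the braces are already taken, here is a sketch with made-up names (`prices`, `toppings`, `empty`). Braces build dictionaries and, in current Python, sets as well, while blocks are marked by a colon plus indentation.

```python
# Curly braces already have jobs in Python expressions.
prices = {"bread": 2.50, "mustard": 1.25}   # dictionary literal
toppings = {"meat", "cheese", "cheese"}     # set literal, duplicates collapse
empty = {}                                  # empty braces mean an empty dict

# Blocks are delimited by a colon plus indentation instead.
total = 0.0
for item, price in prices.items():
    total += price

print(type(prices).__name__, type(toppings).__name__, type(empty).__name__, total)
```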

Just like you couldn’t redefine C to use indentation,

even if you agree that indentation

sort of in a greenfield environment would be better.

You can’t change that kind of thing in a language.

It’s hard enough to reach agreement

over much more minor details.

Maybe, I mean, in the past in Python,

we did have a big debate about tabs versus spaces

and four spaces versus fewer or more.

And we sort of came up with a recommended standard

and sort of options for people who want to be different.

But yes, I guess the thought experiment

I’d like you to consider is if you could travel back

to a time when compatibility is not an issue

and you started Python all over again,

can you make the case for indentation still?

Well, it frees up a pair of matched brackets

of which there are never enough in the world

for other purposes.

It really makes the language slightly

sort of easier to grasp for people

who don’t already know another programming language.

Because sort of one of the things,

and I mostly got this from my mentors

who taught me programming language design

in the early 80s.

When you’re teaching programming

for the total newbie who has not coded before

not in any other language,

a whole bunch of concepts in programming are very alien

or sort of new and maybe very interesting,

but also distracting and confusing.

And there are many different things you have to learn.

You have to sort of,

in a typical 13 week programming course,

you have to, if it’s like really learning

to program from scratch,

you have to cover algorithms,

you have to cover data structures,

you have to cover syntax,

you have to cover variables, loops, functions,

recursion, classes, expressions, operators.

There are so many concepts if you sort of,

if you can spend a little less time

having to worry about the syntax.

The classic example was often,

oh, the compiler complains every time I put a semicolon

in the wrong place or I forget to put a semicolon.

Python doesn’t have semicolons in that sense.

So you can’t forget them.

And you are also not sort of misled

into putting them where they don’t belong

because you don’t learn about them in the first place.

The flip side of that is forcing the strictness

onto the beginning programmer

to teach them that programming

values attention to details.

You don’t get to just write the way you write in English.

Plenty of other details that they have to pay attention to.

So I think they’ll still get the message

about paying attention to details.

The interesting design choice,

I still program quite a bit in PHP

and I’m sure there’s other languages like this,

but the dollar sign before a variable,

that was always an annoying thing for me.

It didn’t quite fit into my understanding

of why this is good for a programming language.

I’m not sure if you ever thought about that one.

That is a historical thing.

There is a whole lineage of programming languages.

PHP is one.

Perl was one.

The Unix shell is one of the oldest

or all the different shells.

The dollar was invented for that purpose

because the very earliest shells had a notion of scripting,

but they did not have a notion

of parameterizing the scripting, right?

And so a script is just a few lines of text

where each line of text is a command

that is read by a very primitive command processor

that then sort of takes the first word on the line

as the name of a program

and passes all the rest of the line as text

into the program for the program

to figure out what to do with as arguments.

And so by the time scripting was slightly more mature

than the very first script,

there was a convention that just like the first word

on the line is the name of the program,

the following words could be names of files.

input.txt, output.html, things like that.

The next thing that happens is,

oh, it would actually be really nice

if we could have variables

and especially parameters for scripts.

Parameters are usually what starts this process.

But now you have a problem

because you can’t just say the parameters are X, Y, and Z.

And so now we call, say, let’s say X is the input file

and Y is the output file,

and let’s forget about Z for now.

I have my program and I write program X, Y.

Well, that already has a meaning

because that presumably means X itself is the file.

It’s a file name.

It’s not a variable name.

And so the inventors of things like the Unix shell

and I’m sure job command language at IBM before that,

had to use something that made it clear

to the script processor,

here is an X that is not actually the name of a file,

which you just pass through to the program you’re running.

Here is an X that is the name of a variable.

And when you’re writing a script processor,

you try to keep it as simple as possible

because certainly in the 50s and 60s,

the thing that interprets the script

itself had to be a very small program

because it had to fit in a very small part of memory.

And so saying, oh, just look at each character.

And if you see a dollar sign,

you jump to another section of the code

and then you gobble up characters

say until the next space or something,

and you say that’s the variable name.

And so it was sort of invented

as a clever way to make parsing of things

that contain both variable and fixed parts,

very easy in a very simple script processor.
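
The scanning idea described above can be sketched in Python. This is a toy under stated assumptions, not how any real shell is implemented: the names `expand` and `parse_command` are hypothetical, a variable name is taken to run until the next whitespace, and an unknown variable is assumed to expand to an empty string.

```python
def expand(line, variables):
    """Replace each $name in line with its value from variables."""
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "$":
            # Gobble up characters until the next space:
            # everything in between is the variable name.
            j = i + 1
            while j < len(line) and not line[j].isspace():
                j += 1
            name = line[i + 1:j]
            # Assumption: unknown variables expand to nothing.
            out.append(variables.get(name, ""))
            i = j
        else:
            # Fixed text is passed through unchanged.
            out.append(ch)
            i += 1
    return "".join(out)


def parse_command(line, variables):
    """First word is the program name, the rest are its arguments."""
    words = expand(line, variables).split()
    if not words:
        return None, []
    return words[0], words[1:]


params = {"X": "input.txt", "Y": "output.html"}
print(parse_command("convert $X $Y", params))
print(parse_command("convert X Y", params))
```

Without the dollar sign, `X` and `Y` are passed through as literal file names, which is exactly the ambiguity the prefix was introduced to resolve.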

It also helps even then,

it also helps the human author

and the human reader of the script to quickly see,

oh, 20 lines down in the script,

I see a reference to X, Y, Z.

Oh, it has a dollar in front of it.

So now we know that X, Y, Z

must be one of the parameters of the script.

Well, this is fascinating.

Several things to say,

which is the leftovers

from the simple script processor languages

are now in code bases like behind Facebook

or behind most of the backend.

I think PHP probably still runs

most of the backend of the internet.

Oh yeah, I think there’s a lot of it in Wikipedia too,

for example.

It’s funny that those decisions,

or not funny, it’s fascinating

that those decisions permeate through time.

Just like biological systems, right?

I mean, the inner workings of DNA

have been stable for,

well, I don’t know how long it was,

like 300 million years, half a billion years.

And there are all sorts of weird quirks there

that don’t make a lot of sense

if you were to design a system

like self-replicating molecules from scratch.

Well, that system has a lot of interesting resilience.

It has redundancy that results,

like it messes up in interesting ways

that still is resilient

when you look at the system level of the organism.

Code doesn’t necessarily have that,

a computer programming code.

You’d be surprised how much resilience modern code has.

I mean, if you look at the number of bugs per line of code,

even in very well-tested code

that in practice works just fine,

there are actually lots of things that don’t work fine.

And there are error correcting

or self-correcting mechanisms at many levels.

Including probably the user of the code.

Well, in the end, the user who sort of is told,

well, you got to reboot your PC,

is part of that system.

And a slightly less drastic thing is reload the page,

which we all know how to do without thinking about it

when something weird happens.

You try to reload a few times before you say,

oh, there’s something really weird.

Okay, or try to click the button again

if the first time didn’t work.

Well, yeah, we should all have learned not to do that

because that’s probably just gonna turn the light back off.

Yeah, true.

So do it three times.

That’s the right lesson.

And I wonder how many people actually like the dollar sign.

Like you said, it is documentation.

So to me, it’s whatever the opposite of syntactic sugar

is syntactic poison.

To me, it is such a pain in the ass

that I have to type in a dollar sign.

Also super error prone.

So it’s not self-documenting.

It’s like a bug generating thing.

It is a kind of documentation that’s the pro

and the con is it’s a source of a lot of bugs.

But actually I have to ask you,

this is a really interesting idea of bugs per line of code.

If you look at all the computer systems out there,

from the code that runs nuclear weapons

to the code that runs all the amazing companies

that you’ve been involved with and not,

the code that runs Twitter and Facebook and Dropbox

and Google and Microsoft Windows and so on.

And if we, like, laid it out,

wouldn’t that be a cool like table?

Bugs per line of code.

And let’s put like actual companies aside.

Do you think we’d be surprised by the number we see there

for all these companies?

That depends on whether you’ve ever read about research

that’s been done in this area before.

And I don’t know, the last time

I saw some research like that,

that was probably in the nineties

and the research might’ve been done in the eighties.

But the conclusion was across

a wide range of different software,

different languages, different companies,

different development styles.

The number of bugs is always,

I think it’s in the order of about one bug per thousand lines

in sort of mature software that is considered

as good as it gets.

Can I give you some facts here?

There’s a lot of papers.

So you said mature software, right?

So here’s a report from a programming analytics company.

Now this is from a developer perspective.

Let me just say what it says

because this is very weird and surprising.

On average, a developer creates 70 bugs

per 1000 lines of code.

15 bugs per 1000 lines of code

find their way to the customers.

This is in software they’ve analyzed.

I was wrong by an order of magnitude there.

Fixing a bug takes 30 times longer

than writing a line of code.

That I can believe.

Yeah, totally.

75% of a developer’s time is spent on debugging.

That’s for an average developer.

By their analysis, that is 1,500 hours a year.

In the US alone, $113 billion is spent annually

on identifying and fixing bugs.

And I imagine this is marketing literature

for someone who claims to have a golden bullet

or a silver bullet that makes all that investment

in fixing bugs go away.

But that is usually not going to happen.

Well, I mean, they’re referencing a lot of stuff,

of course, but it is a page that is,

you know, there’s a contact us button at the bottom.

Presumably, if you just spend a little bit less

than $100 billion, we’re willing to solve

the problem for you.


And there’s also a report on Stack Exchange

and Stack Overflow on the exact same topic.

But when I open it up at the moment,

the page says Stack Overflow is currently offline

for maintenance.

Oh, that is ironic.


By the way, their error page is awesome.

Anyway, I mean, can you believe that number of bugs?

Oh, absolutely.

Isn’t that scary?

That 70 bugs per 1000 lines of code.

So even 10 bugs per 1000 lines of code.

Well, that’s about one bug every 15 lines,

and that’s when you’re first typing it in.
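
To put the quoted figures side by side, here is a rough arithmetic sketch. The helper name `lines_per_bug` is made up, and the rates are simply the numbers mentioned in the conversation (70 and 15 from the report, 1 from the older research), not independently verified data.

```python
def lines_per_bug(bugs_per_thousand_lines):
    """Average number of lines of code per bug at a given bug rate."""
    return 1000 / bugs_per_thousand_lines


# Rates quoted in the conversation, in bugs per 1000 lines of code.
quoted_rates = {
    "created by a developer": 70,
    "reaching customers": 15,
    "mature software": 1,
}

for label, rate in quoted_rates.items():
    print(f"{label}: one bug every {lines_per_bug(rate):.1f} lines")
```

At 70 bugs per 1000 lines this works out to roughly one bug every 14 lines, which matches the "about one every 15 lines" estimate given here.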

Yeah, from a developer, but like,

how many bugs are going to be found?

If you’re typing it in?

Well, the development process is extremely iterative.


Typically, you don’t make a plan for what software

you’re going to release a year from now.


And work out all the details,

because actually all the details themselves consist,

they sort of compose a program.

And that being a program,

all your plans will have bugs in them too,

and inaccuracies.

But what you actually do is,

you do a bunch of typing,

and I’m actually really,

I’m a really bad typist.

That’s just, I’ve never learned to type with 10 fingers.

How many do you use?

Well, I use all 10 of them, but not very well.

But I never took a typing class,

and I never sort of corrected that.

So the first time I seriously learned,

I had to learn the layout of a QWERTY keyboard,

was actually in college, in my first programming classes,

where we used punch cards.

And so with my two fingers,

I sort of pecked out my code.

Watch anyone give you a little coding demonstration.

They’ll have to produce like four lines of code,

and now see how many times they use the backspace key.


Because they made a mistake.

And some people, especially when someone else is looking,

will backspace over 20, 30, 40 characters

to fix a typo earlier in a line.

If you’re slightly more experienced,

of course you use your arrow buttons to go,

or your mouse to,

but the mouse is usually slower than the arrows.

But a lot of people,

when they type a 20 character word,

which is not unusual,

and they realize they made a mistake

at the start of the word,

they backspace over the whole thing,

and then retype it.

And sometimes it takes three, four times to get it right.

So I don’t know what your definition of bug is,

arguably mistyping a word,

and then correcting it immediately is not a bug.

On the other hand,

you already do sort of lose time.

And every once in a while,

there’s sort of a typo that you don’t get in that process.

And now you’ve typed like 10 lines of code,

and somewhere in the middle of it,

you don’t know where yet is a typo,

or maybe a thinko where you forgot

that you had to initialize a variable or something.

But those are two different things.

And I would say, yes,

you have to actually run the code to discover that typo,

but forgetting to initialize a variable

is a fundamentally different thing,

because that thing can go undiscovered.

That depends on the language.

In Python, it will not.

And sort of modern compilers

are usually pretty good at catching that, even for C.

So for that specific thing,

but actually deeper,

there might be another variable that is initialized,

but logically speaking, the one you meant-



It’s like name the same, but it’s a different thing.

And you forgot to initialize whatever,

like some counter or some basic variable

that you’re using-

I can tell that you’ve coded.
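
A minimal sketch of the two kinds of mistake discussed above, assuming hypothetical helper names (`count_positive`, `average_positive`) that are not from the conversation. The first function forgets to initialize a variable, which Python reports as soon as the line runs. The second uses a different, correctly initialized variable where another one was meant, which raises no error at all.

```python
def count_positive(values):
    # Mistake: `count = 0` was never written before the loop.
    for v in values:
        if v > 0:
            count += 1  # raises UnboundLocalError when first reached
    return count


def average_positive(values):
    total = 0
    count = 0
    seen = 0  # a different counter, correctly initialized
    for v in values:
        seen += 1
        if v > 0:
            total += v
            count += 1
    # Thinko: `seen` is used where `count` was meant. Nothing complains.
    return total / seen


try:
    count_positive([3, -1, 4])
    error_message = None
except UnboundLocalError as exc:
    error_message = str(exc)

# The intended answer is 7 / 2; the code silently returns 7 / 3.
wrong_average = average_positive([3, -1, 4])
```

The first mistake only surfaces when the code is actually run, which matches the point made in the conversation; the second one needs a test or a careful reader to catch.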

By the way, I should mention

that I use a Kinesis keyboard,

which has the backspace under the thumb.

And one of the biggest reasons I use that keyboard

is because you realize,

in order to use the backspace on a usual keyboard,

you have to stretch your pinky out.

And like, for most normal keyboards,

the backspace is under the pinky.

And so I don’t know if people realize

the pain they go through in their life

because of the backspace key being so far away.

So with the Kinesis, it’s right under the thumb,

so you don’t have to actually move your hands.

The backspace and the delete-

What do you do if you’re ever not with your own keyboard

and you have to use someone else’s PC keyboard

that has that standard layout?

So first of all, it turns out

that you can actually go your whole life

always having the keyboard with you.

So this-

Well, except for that little tablet

that you’re using for note-taking right now, right?

Yeah, so it’s very inefficient note-taking,

but I’m not, I’m just looking stuff up.

But in most cases,

I would be actually using the keyboard here right now.

I just don’t anticipate.

You have to calculate,

how much typing do you anticipate?

If I anticipate quite a bit,

then I’ll just, I have a keyboard-

You pull it out.

And same with, I mean, the embarrassing,

I’ve accepted being the weirdo that I am,

but when I go on an airplane

and I anticipate to do programming or a lot of typing,

I will have a laptop

that will pull out a Kinesis keyboard

in addition to the laptop,

and it’s just who I am.

You have to accept who you are.

But also, for a lot of people,

for me certainly, there’s a comfort space

where there’s a certain kind of setups

that maximize productivity.

And it’s like some people have a warm blanket

that they like when they watch a movie.

I like the Kinesis keyboard.

It takes me to a place of focus.

And I still mostly,

I’m trying to make sure I use the state-of-the-art IDEs

for everything, but my comfort place,

just like the Kinesis keyboard, is still Emacs.

So I still use, I still, I mean,

that’s one of some of the debates I have with myself

about everything from a technology perspective

is how much to hold on to the tools you’re comfortable with

versus how much to invest in using modern tools.

And the signal that the communities provide you with

is the noisy one,

because a lot of people year to year

get excited about new tools.

And you have to make a prediction.

Are these tools defining a new generation

or something that will transform programming?

Or is this just a fad that will pass?

Certainly with JavaScript frameworks

and front-end and back-end of the web,

there’s a lot of different styles that came and went.

I remember learning, what was it called, ActionScript.

I remember for Flash, learning how to program in Flash,

learning how to design, do graphic animation,

all that kind of stuff in Flash.

Same with Java applets.

I remember creating quite a lot of Java applets,

thinking that this potentially defines

the future of the web, and it did not.

Well, you know, in most cases like that,

the particular technology eventually gets replaced,

but many of the concepts that the technology introduced

or made accessible first are preserved, of course,

because yeah, we’re not using Java applets anymore,

but the notion of reactive webpages

that sort of contain little bits of code

that respond directly to something you do

like pressing a button or a link or hovering even,

has certainly not gone away.

And that those animations that were made

painfully complicated with Flash,

I mean, Flash was an innovation when it first came up.

And when it was replaced by JavaScript-equivalent stuff,

it was a somewhat better way to do animations,

but those animations are still there.

Not all of them, but sort of, again,

there is an evolution and often, so often with technology,

that the sort of the technology

that was eventually thrown away or replaced

was still essential to sort of get started.

There wouldn’t be jet planes without propeller planes.

I bet you.

But from a user perspective, yes, from the feature set, yes,

but from a programmer perspective,

it feels like all the time I’ve spent with ActionScript,

all the time I’ve spent with Java on the applet side

for the GUI development, well, no, Java, I have to push back.

That was useful because it transfers,

but the Flash doesn’t transfer.

So some things you learn and invest time in.

Yeah, what you learned,

the skill you picked up learning ActionScript

was sort of, it was perhaps a super valuable skill

at the time you picked it up.

If you learned ActionScript early enough,

but that skill is no longer in demand.

Well, that’s the calculation you have to make

when you’re learning new things.

Like today, people start learning programming.

Today, I’m trying to see what are the new languages to try?

What are the new systems to try?

What are the new ideas to try to keep improving?

That’s why we start when we’re young, right?

But that seems very true to me,

that when you’re young,

you have your whole life ahead of you

and you’re allowed to make mistakes.

In fact, you should feel encouraged

to do a bit of stupid stuff.

Try not to get yourself killed or seriously maimed,

but try stuff that deviates

from what everybody else is doing.

And like nine out of 10 times,

you’ll just learn why everybody else is not doing that,

or why everybody else is doing it some other way.

And one out of 10 times,

you sort of, you discover something that’s better

or that somehow works.

I mean, there are all sorts of crazy things

that were invented by accident,

by people trying stuff together.

That’s great advice to try random stuff,

make a lot of mistakes.

Once you’re married with kids,

you’re probably going to be a little more risk averse

because now there’s more at stake

and you’ve already hopefully had some time

where you were experimenting with crazy shit.

I like how marriage and kids solidifies

your choice of programming language.

How does that, the Robert Frost poem

with the road less taken,

which I think is misinterpreted by most people.

But anyway, I feel like the choices you make early on,

especially if you go all in,

they’re gonna define the rest of your life’s trajectory

in a way that, like you basically are picking a camp.

So, you know, there’s, if you invest a lot in PHP,

if you invest a lot in .NET,

if you invest a lot in JavaScript,

you’re going to stick there.

That’s your life journey.

It’s very hard to tell.

Well, only as far as that technology remains relevant.

Yes, yes.

I mean, if at age 16, you learn coding in C

and by the time you’re 26, C is like a dead language,

then there’s still time to switch.

There’s probably some kind of survivor bias

or whatever it’s called in sort of your observation

that you pick a camp

because there are many different camps to pick.

And if you pick .NET,

then you can coast for the rest of your life

because that technology is now so ubiquitous, of course,

that even if it’s bound to die,

it’s gonna take a very long time.

Well, for me personally,

I had a very difficult and in my own head brave leap

that I had to take relevant to our discussion,

which is most of my life I programmed in C and C++.

And so having that hammer, everything looked like a nail.

So I would literally even do scripting in C++.

Like I would create programs that do script like things.

And when I first came to Google and before then,

it became already before TensorFlow, before all of that,

there was a growing realization that C++

is not the right tool for machine learning.

We could talk about why that is.

It’s unclear why that is.

A lot of things has to do with community and culture

and how it emerges and stuff like that.

But for me to decide to take the leap to Python,

like all out, basically switch completely from C++

except for highly performant robotics applications.

There was still a culture of C++ in the space of robotics.

That was a big leap.

Like I had to, you know, like people have like

existential crises or midlife crises or whatever.

You have to realize almost like walking away

from a person you love.

Because I was sure that C++

would have to be a lifelong companion.

For a lot of problems I would want to solve,

C++ would be there.

And it was a question to say,

well, that might not be the case.

Because C++ is still one of the most popular languages

in the world, one of the most used,

one of the most depended on.

It’s also still evolving quite a bit.

I mean, that is not sort of a fossilizing community.


They are doing great innovative work actually.

A lot.

But yet that sort of their innovations are hard to follow

if you’re not already a hardcore C++ user.

Well, this was the thing.

It pulls you in.

It’s a rabbit hole.

I was a hardcore.

The old metaprogramming, template programming.

Like I would start using the modern C++ as it developed.


Not just the shared pointer and the garbage collection

that makes it easier for you to work with some of the flaws.

But the detail, like the metaprogramming,

the crazy stuff that’s coming out there.

But then you have to just empirically look and step back

and say, what language am I more productive in?

Sorry to say, what language do I enjoy my life with more?

And readability and able to think through

and all that kind of stuff.

Those questions are harder to ask

when you already have a loved one,

which in my case was C++.

And then there’s Python, like that meme.

The grass is greener on the other side.

Am I just infatuated with a new fad, new cool thing?

Or is this actually going to make my life better?

And I think a lot of people face that kind of decision.

It was a difficult decision for me when I made it.

At this time, it’s an obvious switch

if you’re into machine learning.

But at that time, it wasn’t quite yet so obvious.

So it was a risk.

And you have the same kind of stuff.

I still, because of my connection to WordPress,

I still do a lot of backend programming in PHP.

And the question is, Node.js, Python,

do you switch backend to any of those programming languages?

There’s the case for Node.js for me.

Well, more and more and more of the front end,

it runs in JavaScript.

And fascinating cool stuff is done in JavaScript.

Maybe use the same programming language

for the backend as well.

The case for Python for the backend is,

well, you’re doing so much programming

outside of the web in Python.

So maybe use Python for the backend.

And then the case for PHP,

well, most of the web still runs in PHP.

You have a lot of experience with PHP.

Why fix something that’s not broken?

Those are my own personal struggles,

but I think they reflect the struggles of a lot of people

with different programming languages,

with different problems they’re trying to solve.

It’s a weird one.

And there’s not a single answer, right?

Because depending on how much time

you have to learn new stuff,

where you are in your life,

what you’re currently working on,

who you want to work with, what communities you like,

there’s not one right choice.

Maybe if you sort of, if you can look back 20 years,

you can say, well, that whole detour through ActionScript

was a waste of time, but nobody could know that.

So you can’t beat yourself up over that.

You just need to accept that not every choice you make

is going to be perfect.

Maybe sort of keep a plan B in the back of your mind,

but don’t overthink it.

Don’t try to sort of, don’t create a spreadsheet

with like, where you’re trying to estimate,

well, if I learn this language,

I expect to make X million dollars in a lifetime.

And if I learn that language,

I expect to make Y million dollars in a lifetime.

And which is higher and which has more risk

and where’s the chance that, it’s like picking a stock.

Kind of, kind of, but I think with stocks,

you can do, diversifying your investment is good.

With productivity in life,

boy, that spreadsheet is possible to construct.

Like if you actually carefully analyze

what your interests in life are,

where you think you can maximally impact the world,

there really is better and worse choices

for a programming language.

They’re not just about the syntax, but about the community,

about where you predict the community’s headed,

what large systems are programmed in that.

But can you create that spreadsheet?

Because that’s sort of, you’re mentioning

a whole bunch of inputs that go into that spreadsheet

where you have to estimate things

that are very hard to measure

and even harder.

I mean, they’re hard to measure retroactively

and they’re even harder to predict.

Like, what is the better community?

Well, better is one of those incredibly difficult words.

What’s better for you is not better for someone else.

But we’re not doing a public speech about what’s better.

We’re doing a personal spiritual journey.

I can determine a circle of friends,

circle one and circle two,

and I can have a bunch of parties with one

and a bunch of parties with two,

and then write down or take a mental note

of what made me happier, right?

And that, you know, you have,

if you’re a machine learning person,

you wanna say, okay, I want to build a large company

that does, that is grounded in machine learning,

but also has a sexy interface

that has a large impact on the world.

What languages do I use?

You look at what Facebook is using,

you look at what Twitter is using.

Then you look at performant,

more newer languages like Rust,

or you look at languages that have taken,

that most of the community uses

in the machine learning space, that’s Python.

And you can, like, think through,

you can hang out and think through it.

And it’s always a, and the level of activity

of the community is also really interesting.

Like you said, C++ and Python are super active

in terms of the development of the language itself.

But do you think that you can make objective choices there?

No, no.


But there’s a gut you build up.

Like, don’t you believe in that gut feeling about-

Oh, everything is very subjective.

And yes, you most certainly can have a gut feeling

and your gut can also be wrong.

That’s why there are billions of people

because they’re not all right.

I mean, clearly there are more people

living in the Bay Area who have plans

to sort of create a Google-sized company

than there’s room in the world for Google-sized companies.

And they’re gonna have to duke it out

in the market space.

And there’s many more choices

than just the programming language.

Speaking of which, let’s go back to the boat

with the fisherman who’s tuned out long ago.

Let’s talk to the programmer.

Let’s jump around and go back to CPython

that we tried to define as the reference implementation.

And one of the big things that’s coming out in 3.11,

what’s the right way to-

We tend to say 3.11 because it really was like

we went 3.8, 3.9, 3.10, 3.11,

and we’re planning to go up to 3.99.


What happens after 99?

Probably just 3.100, if I make it there.


And go all the way to 420.

I got it.

Forever Python v3.

We’ll talk about 4, but more for fun.

So, 3.11 is coming out.

One of the big sexy things in it is it’ll be much faster.

So how did you, beyond hiring a great team

or working with a great team, make it faster?

What are some ideas that makes it faster?

It has to do with simplicity of software versus performance.

And so even though C is known to be a low-level language,

which is great for writing sort of

a high-performance language interpreter,

when I originally started Python or CPython,

I didn’t expect there would be

great success and fame in my future.

So I tried to get something working and useful

in about three months.

And so I sort of, I cut corners.

I borrowed ideas left and right

when it comes to language design as well as implementation.

I also wrote much of the code as simple as it could be.

And there are many things that you can code

more efficiently by adding more code.

It’s a bit of a sort of a time-space trade-off

where you can compute a certain thing

from a small number of inputs.

And every time you get presented with a new input,

you do the whole computation from the top.

That can be simple-looking code.

It’s easy to understand.

It’s easy to reason about that.

You can tell quickly that it’s correct,

at least in the sort of mathematical sense of correct.

Because it’s implemented in C,

maybe it performs relatively well,

but over time as sort of,

as the requirements for that code

and the need for performance go up,

you might be able to rewrite that same algorithm

using more memory, maybe remember previous results

so you don’t have to recompute everything from scratch.

Like the classic example is computing prime numbers.

Like is 10 a prime number?

Well, you sort of, is it divisible by two?

Is it divisible by three?

Is it divisible by four?

And we go all the way to, is it divisible by nine?

And it is not, well, actually 10 is divisible by two,

so there we stop, but say 11.

Is it divisible by 10?

The answer is no, 10 times in a row.

So now we know 11 is a prime number.

On the other hand, if we already know

that two, three, five, and seven are prime numbers,

and you know a little bit about the mathematics

of how prime numbers work,

you know that if you have a rough estimate

for the square root of 11,

you don’t actually have to check,

is it divisible by four or is it divisible by five?

All you have to check in the case of 11

is, is it divisible by two?

Is it divisible by three?

Because take 12, if it’s divisible by four,

well, 12 divided by four is three,

so you should have come across the question,

is it divisible by three first?

So if you know basically nothing about prime numbers

except the definition, maybe you go for X from two

through N minus one, is N divisible by X?

And then at the end, if you got all nos

for every single one of those questions,

you know, oh, it must be a prime number.

Well, the first thing is you can stop iterating

when you find a yes answer.

And the second is you can also stop iterating

when you have reached the square root of N

because you know that if it has a divisor

larger than the square root,

it must also have a divisor smaller than the square root.

Then you say, oh, except for two,

we don’t need to bother with checking for even numbers

because all even numbers are divisible by two.

So if it’s divisible by four,

we would already have come across the question,

is it divisible by two?

And so now you go special case check,

is it divisible by two?

And then you just check three, five, seven, 11.

And so now you’ve sort of reduced your search space

by 50% again, by skipping all the even numbers

except for two.
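
The refinements described so far can be sketched in Python as follows. This is an illustration only; the function names `is_prime_naive` and `is_prime_faster` are made up for this example. The first version follows the bare definition, and the second applies the three shortcuts: stop at the first divisor, stop at the square root, and skip even candidates after checking two.

```python
import math


def is_prime_naive(n):
    """Bare definition: ask 'is n divisible by x?' for every x in 2..n-1."""
    if n < 2:
        return False
    found_divisor = False
    for x in range(2, n):
        if n % x == 0:
            found_divisor = True  # keeps asking anyway
    return not found_divisor


def is_prime_faster(n):
    """Same answers, with the three shortcuts applied."""
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:  # special-case check for two
        return False
    # Only odd candidates, and only up to the integer square root of n.
    for x in range(3, math.isqrt(n) + 1, 2):
        if n % x == 0:
            return False  # stop at the first "yes"
    return True


small_primes = [n for n in range(20) if is_prime_faster(n)]
```

The second version is longer and needs a little mathematical justification, which is the simplicity versus performance trade-off being described.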

If you think a bit more about it,

or you just read in your book about the history of math,

one of the first algorithms ever written down,

all you have to do is check,

is it divisible by any of the previous prime numbers

that are smaller than the square root?

And before you get to a better algorithm than that,

you have to have several PhDs in discrete math.

So that’s as much as I know.
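
The last step, checking only against previously found primes up to the square root, also illustrates the earlier point about spending memory to remember previous results. A rough sketch, with the hypothetical name `primes_up_to`:

```python
def primes_up_to(limit):
    """Collect primes up to limit, testing each candidate only against
    the primes already found that do not exceed its square root."""
    primes = []  # remembered results: this is the extra memory
    for n in range(2, limit + 1):
        is_prime = True
        for p in primes:
            if p * p > n:  # past the square root, no divisor possible
                break
            if n % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(n)
    return primes


first_primes = primes_up_to(30)
```

This is trial division by known primes, as described in the conversation; it is not meant as a claim about which historical algorithm was being referred to.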

So of course that same story applies

to a lot of other algorithms.

String matching is a good example

of how to come up with an efficient algorithm.

And sometimes the more efficient algorithm

is not so much more complex than the inefficient one.

But that’s an art, and it’s not always the case.

In the general cases, the more performant the algorithm,

the more complex it’s gonna be.

There’s a kind of trade-off.

The simpler algorithms are also the ones

that people invent first.

Because when you’re looking for a solution,

you look at the simplest way to get there first.

And so if there is a simple solution,

even if it’s not the best solution,

not the fastest or the most memory efficient or whatever,

a simple solution, and simple is fairly subjective,

but mathematicians have also thought about

sort of what is a good definition for simple

in the case of algorithms.

But the simpler solutions tend to be easier to follow

for other programmers who haven’t made a study

of a particular field.

And when I started with Python,

I was a good programmer in general.

I knew sort of basic data structures.

I knew the C language pretty well.

But there were many areas where I was

only somewhat familiar with the state of the art.

And so I picked, in many cases,

the simplest way I could solve a particular sub-problem.

Because when you’re designing and implementing a language,

you have to like,

you have many hundreds of little problems to solve.

And you have to have solutions for every one of them

before you can sort of say,

I’ve invented a programming language.

First of all, so CPython, what kind of things does it do?

It’s an interpreter.

It takes in this readable language that we talked about

that is Python.

What is it supposed to do?

The interpreter, basically,

it’s sort of a recipe for understanding recipes.

So instead of a recipe that says, bake me a cake,

we have a recipe for, well, given the text of a program,

how do we run that program?

And that is sort of the recipe for building a computer.

The recipe for the baker and the chef.


What are the algorithmically tricky things

that happen to be low-hanging fruit

that could be improved on?

Maybe throughout the history of Python, but also now,

how is it possible that 3.11 in year 2022,

it’s possible to get such a big performance improvement?

We focused on a few areas

where we still felt there was low-hanging fruit.

The biggest one is actually the interpreter itself.

And this has to do with details of how Python is defined.

So I didn’t know if the fisherman

is going to follow this story.

He already jumped off the boat.

He’s just, I’m bored.


This is stupid.

Python is actually,

even though it’s always called an interpreted language,

there’s also a compiler in there.

It just doesn’t compile to machine code.

It compiles to bytecode,

which is sort of code for an imaginary computer

that is called the Python interpreter.

So it’s compiling code that is more easily digestible

by the interpreter, or is digestible at all?

It is the code that is digested by the interpreter.
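
The compile step can be observed from Python itself with the built-in `compile` function and the standard `dis` module. A small sketch (the function `add` and the filename `<example>` are just placeholders for this illustration):

```python
import dis

source = "def add(a, b):\n    return a + b\n"

# Step 1: the compiler turns source text into a code object (bytecode).
module_code = compile(source, "<example>", "exec")

# Step 2: the interpreter runs that bytecode, which defines `add`.
namespace = {}
exec(module_code, namespace)
add = namespace["add"]

# The raw bytecode of the function is just a bytes object.
raw_bytecode = add.__code__.co_code

# dis decodes it into named instructions for the "imaginary computer".
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)
```

The exact instruction names depend on the Python version. On CPython 3.10 and earlier the addition appears as `BINARY_ADD`; from 3.11 on it appears as `BINARY_OP`.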

That’s the compiler.

We tweaked very minor bits of the compiler.

Almost all the work was done in the interpreter,

because when you have a program, you compile it once,

and then you run the code a whole bunch of times.

Or maybe there’s one function in the code

that gets run many times.

Now, I know that sort of people who know this field

are expecting me to, at some point,

say we built a just-in-time compiler.

Actually, we didn’t.

We just made the interpreter a little more efficient.

What’s a just-in-time compiler?

That is a thing from the Java world,

although it’s now applied to almost all programming languages,

especially interpreted ones.

So you see the compiler inside Python,

not like a just-in-time compiler,

but it’s a compiler that creates bytecode

that is then fed to the interpreter.

And the compiler,

was there something interesting to say about the compiler?

It’s interesting that you haven’t changed that,

tweaked that at all, or much.

We changed some parts of the bytecode,

but not very much.

And so we only had to change the parts of the compiler

where we decided that the breakdown

of a Python program in bytecode instructions

had to be slightly different.

But that didn’t gain us the performance improvements.

The performance improvements

were like making the interpreter faster in part

by sort of removing the fat

from some internal data structures used by the interpreter.

But the key idea is an adaptive specializing interpreter.

Let’s go.

What is adaptive about it?

What is specialized about it?

Well, let me first talk about the specializing part

because the adaptive part is the sort of

the second order effect, but they’re both important.

So bytecode is a bunch of machine instructions

but it’s an imaginary machine.

But the machine can do things like call a function,

add two numbers, print a value.

Those are sort of typical instructions in Python.

And if we take the example of adding two numbers,

actually in Python, the language,

there’s no such thing as adding two numbers.

There’s just, the compiler doesn’t know

that you’re adding two numbers.

You might as well be adding two strings or two lists

or two instances of some user defined class

that happened to implement this operator called add.

That’s a very interesting

and fairly powerful mathematical concept.

It’s mostly a user interface trick

because it means that a certain category of functions

can be written using a single symbol, the plus sign

and sort of a bunch of other functions can be written

using another single symbol, the multiply sign.
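
To make this concrete: the same `a + b` source line works for numbers, strings, lists, and any user-defined class that implements the add operator. In the sketch below, `Vec2` is a made-up class for illustration.

```python
class Vec2:
    """A hypothetical user-defined type that implements `+`."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        if not isinstance(other, Vec2):
            return NotImplemented  # let Python try other options or fail
        return Vec2(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        return isinstance(other, Vec2) and (self.x, self.y) == (other.x, other.y)


def add(a, b):
    # The compiler emits the same generic add instruction for this line,
    # whatever the argument types turn out to be at run time.
    return a + b


results = [
    add(1, 2),                    # integers
    add(1.5, 2.5),                # floating point numbers
    add("ab", "cd"),              # strings
    add([1], [2]),                # lists
    add(Vec2(1, 2), Vec2(3, 4)),  # user-defined class
]
```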

So if we take addition, the way traditionally in Python

the add bytecode was executed is

pointers, pointers, and more pointers.

So first we have two objects.

An object is basically a pointer to a bunch of memory

that contains more pointers.

Pointers all the way down.

Well, not quite, but there are a lot of them.

So to simplify a bit, we look up in one of the objects

what is the type of that object?

And does that object type define an add operation?

And so you can imagine that there is a sort of

a type integer that knows how to add itself

to another integer.

And there is a type floating point number

that knows how to add itself

to another floating point number.

And the integers and floating point numbers

are sort of important, I think, mostly historically

because in the first computers

you used the sort of, the same bit pattern

when interpreted as a floating point number

had a very different value

than when interpreted as an integer.

Can I ask a dumb question here?

Please do.

Given the basics of int and float and add,

who carries the knowledge of how to add two integers?

Is it the integer?

It’s the type integer versus?

It’s the type integer and the type float.

What about the operator?

Does the operator just exist as a platonic form

possessed by the integer?

The operator is more like,

it’s an index in a list of functions

that the integer type defines.

And so the integer type

is really a collection of functions

and there is an add function

and there’s a multiply function

and there are like 30 other functions for other operations.

There’s a power function, for example.

And you can imagine that in memory

there is a distinct slot for the add operations.

Let’s say the add operation is the first operation of a type

and the multiply is the second operation of a type.

So now we take the integer type

and we take the floating point type.

In both cases, the add operation is the first slot

and multiply is the second slot.

But each slot contains a function

and the functions are different

because the add-two-integers function

interprets the bit patterns as integers.

The add-two-floats function

interprets the same bit pattern

as a floating point number.

And then there is the string data type,

which again, interprets the bit pattern

as the address of a sequence of characters.

There are lots of lies in that story,

but that’s sort of a basic idea.
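
A toy model of that idea, written in Python rather than C. The slot numbers `ADD` and `MUL`, the `SLOTS` table, and `binary_op` are all invented for this sketch; CPython's real tables live in C structures and are laid out differently. The point is only that each type carries its own function in the same slot position.

```python
# Hypothetical slot positions: add is slot 0, multiply is slot 1.
ADD, MUL = 0, 1

# Each type is modeled as a tuple of functions, one per slot.
SLOTS = {
    int: (int.__add__, int.__mul__),
    float: (float.__add__, float.__mul__),
    str: (str.__add__, str.__mul__),
}


def binary_op(slot, left, right):
    # Look at the object's type, fetch the function in that slot, call it.
    function = SLOTS[type(left)][slot]
    return function(left, right)


int_sum = binary_op(ADD, 2, 3)            # uses int's add function
float_sum = binary_op(ADD, 1.5, 2.0)      # uses float's add function
text_sum = binary_op(ADD, "ab", "cd")     # uses str's add function
text_repeat = binary_op(MUL, "ab", 2)     # str's multiply slot repeats

# The integer add function checks its other argument and declines
# when it is not an integer.
declined = binary_op(ADD, 2, 3.0)
```

In this model `declined` is the special value `NotImplemented`: the integer add function has looked at the other argument, found the wrong type, and handed the decision back, which is the double check mentioned a little later in the conversation.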

I can tell the fake news and the fabrication

going on here at the table,

but where’s the optimization?

Is it on the operators?

Is it different inside the integer?

The optimization is the observation

that in a particular line of code,

so now you write your little Python program

and you write a function

and that function sort of takes a bunch of inputs

and at some point it adds two of the inputs together.

Now I bet you even if you call your function a thousand times

that all those calls are likely

all going to be about integers

because maybe your program is all about integers

or maybe on that particular line of code

where there’s that plus operator,

every time the program hits that line,

the variables A and B that are being added together

happen to be strings.

And so what we do is instead of having

this single bytecode that says,

here’s an add operation

and the implementation of add is fully generic.

It looks at the object, and from the object,

it looks at the type,

then it takes the type

and it looks up the function pointer,

then it calls the function.

Now the function has to look at the other argument

and it has to double check

that the other argument has the right type.

And then there’s a bunch of error checking

before it can actually just go ahead

and add the two bit patterns in the right way.

What we do is every time we execute

an add instruction like that,

we keep a little note of,

in the end, after we hit the code

that did the addition for a particular type,

what type was it?

And then after a few times through that code,

if it’s the same type all the time,

we say, oh, so this add operation,

even though it’s the generic add operation,

it might as well be the add integer operation.

And add integer operation is much more efficient

because it just says,

assume that A and B are integers,

do the addition operation,

do it right there in line

and produce the result.

And the big lie here is that in Python,

even if you have great evidence that in the past,

it was always two integers that you were adding,

at some point in the future,

that same line of code could still be hit

with two floating points or two strings,

or maybe a string and an integer.

It’s not a great lie.

That’s just the fact of life.

I didn’t account for what should happen in that case

in the way I told the story.

There is some accounting for that.

And so what we actually have to do

is when we have the add integer operation,

we still have to check,

are the two arguments in fact integers?

We applied some tricks to make those checks efficient.

And we know statistically

that the outcome is almost always,

yes, they are both integers.

And so we quickly make that check

and then we proceed with the sort of add integer operation.

And then there is a fallback mechanism where we say,

oops, one of them wasn’t an integer.

Now we’re gonna pretend

that it was just the fully generic add operation.

We wasted a few cycles believing it was going

to be two integers and then we had to back up,

but we didn’t waste that much time

and statistically most of the time.

Basically, we’re sort of hoping

that most of the time we guess right,

because if it turns out that we guessed wrong too often,

or we didn’t have a good guess at all,

things might actually end up running a little slower.
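
A rough sketch of the idea described above, written in plain Python rather than in the interpreter's C code. The class name `AddSite`, the `WARMUP` threshold, and the counters are hypothetical names chosen for illustration; CPython's real mechanism rewrites bytecode instructions and differs in many details. The sketch only shows the shape of the trick: take notes on the generic path, switch to a guarded fast path, and fall back when the guess turns out wrong.

```python
class AddSite:
    """One `a + b` location in a program, with a tiny specialization note."""

    WARMUP = 2  # hypothetical threshold, not CPython's real value

    def __init__(self):
        self.specialized = False  # are we currently betting on int + int?
        self.int_streak = 0       # consecutive int + int executions seen
        self.misses = 0           # how often the bet turned out wrong

    def generic_add(self, a, b):
        # Fully generic path: Python looks up the right __add__/__radd__.
        return a + b

    def execute(self, a, b):
        if self.specialized:
            # Cheap guard: exact type check on both operands.
            if type(a) is int and type(b) is int:
                return int.__add__(a, b)  # fast path
            # The guess was wrong: back up and use the generic operation.
            self.misses += 1
            self.specialized = False
            self.int_streak = 0
            return self.generic_add(a, b)

        # Not specialized yet: run the generic path and keep a little note.
        result = self.generic_add(a, b)
        if type(a) is int and type(b) is int:
            self.int_streak += 1
            if self.int_streak >= self.WARMUP:
                self.specialized = True
        else:
            self.int_streak = 0
        return result


site = AddSite()
for pair in [(1, 2), (3, 4), (5, 6), ("a", "b")]:
    print(pair, "->", site.execute(*pair), "specialized:", site.specialized)
```

The last call passes two strings to a site that had been betting on integers, so it still produces the right answer but pays for the failed guard first.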

So someone armed with this knowledge

and a copy of the implementation,

someone could easily construct a counter example

where they say, oh, I have a program

and then now it runs five times as slow in Python 3.11

than it did in Python 3.10.

But that’s a very unrealistic program.

That’s just like an extreme fluke.

It’s a fun reverse engineering task though.

Oh yeah.

So there’s…

People like fun, yes.

So there’s some presumably heuristic

of what defines a momentum of saying,

you seem to be working adding two integers,

not two generic types.

So how do you figure out that heuristic?

I think that the heuristic is actually,

we assume that the weather tomorrow

is gonna be the same as the weather today.

So you don’t need two days of the weather?


That is already so much better than guessing randomly.

So how do you find this idea?

Hey, I wonder if instead of adding two generic types,

we start assuming that the weather tomorrow

is the same as the weather today.

Where do you find the idea for that?

Because that ultimately, for you to do that,

you have to kind of understand

how people are using the language, right?

Python is not the first language to do a thing like this.

This is a fairly well-known trick,

especially from other interpreted languages

that had reason to be sped up.

We occasionally look at papers about HHVM,

which is Facebook’s efficient compiler for PHP.

There are tricks known from the JVM,

and sometimes it just comes from academia.

So the trick here is that the type itself doesn’t,

the variable doesn’t know what type it is.

So this is not a statically typed language

where you can afford to have a shortcut to saying it’s ints.

This is a trick that is especially important

for interpreted languages with dynamic typing,

because if the compiler could read in the source,

these X and Y that we’re adding are integers,

the compiler can just insert a single add machine code

that hardware machine instruction

that exists on every CPU and ditto for floats.

But because in Python,

you don’t generally declare the types of your variables,

you don’t even declare the existence of your variables.

They just spring into existence when you first assign them,

which is really cool and sort of helps those beginners

because there is less bookkeeping

they have to learn how to do

before they can start playing around with code.

But it makes the interpretation of the code less efficient.

And so we’re sort of trying to make the interpretation

more efficient without losing

the super dynamic nature of the language.

That’s always the challenge.

3.5 got the PEP 484 type hints.

What is type hinting and is it used by the interpreter,

the hints, or is it just syntactic sugar?

So the type hints is an optional mechanism

that people can use.

And it’s especially popular with sort of larger companies

that have very large code bases written in Python.

Do you think of it as almost like documentation

saying these two variables are this type?

More than documentation.

I mean, so it is a sub language of Python

where you can express the types of variables.

So here’s a variable and it’s an integer.

And here’s an argument to this function and it’s a string.

And here is a function that returns a list of strings.

But that’s not checked when you run the code.

But exactly, there is a separate piece of software

called a static type checker

that reads all your source code without executing it

and thinks long and hard about what it looks like

from just reading the code that code might be doing

and double checks if that makes sense

if you take the types as annotated into account.
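
A small illustration of the three kinds of annotation mentioned above, and of the fact that the interpreter does not enforce them. The function names `repeat` and `split_words` are made up for this sketch, and `list[str]` assumes Python 3.9 or later.

```python
count: int = 3  # a variable annotated as an integer


def repeat(text: str, times: int) -> str:
    """An argument annotated as a string, and a string result."""
    return text * times


def split_words(line: str) -> list[str]:
    """A function annotated as returning a list of strings."""
    return line.split()


print(repeat("ab", count))
print(split_words("type hints are optional"))

# The interpreter ignores the annotations when running the code, so this
# call works even though a list is passed where `str` is annotated.
# A static type checker reading this file would be expected to flag it.
print(repeat([1, 2], 2))
```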

So this is something you’re supposed to run as you develop.

It’s like a linter, yeah.

That’s definitely a development tool,

but the type annotations currently are not used

for speeding up the interpreter.

And there are a number of reasons.

Many people don’t use them.

Even when they do use them, they sometimes contain lies

where the static type checker says, everything’s fine.

I cannot prove that this integer is ever not an integer,

but at runtime, somehow someone manages

to violate that assumption.

And the interpreter ends up doing just fine.

If we started enforcing type annotations in Python,

many Python programs would no longer work.

And some Python programs wouldn’t even be possible

because they’re too dynamic.

And so we made a choice of not using the annotations.

There is a possible future where eventually

three, four, five releases in the future,

we could start using those annotations

to sort of provide hints,

because we can still say,

well, the source code leads us to believe

that these X and Y are both integers.

And so we can generate an add integer instruction,

but we can still have a fallback that says,

oh, if somehow the code at runtime provided something else,

maybe it provided two decimal numbers,

we can still use that generic add operation as a fallback,

but we’re not there.

Is there currently a mechanism

or do you see something like that

where you can almost add like an assert

inside a function that says,

please check that my type hints

are actually mapping to reality.

Sort of like insert manual static typing.

There are third-party libraries that are in that business.

So it’s possible to do that kind of thing.

It’s possible for a third-party library

to take a hint and enforce it.

Seems like a tricky thing.

Well, what we actually do is,

and I think this is a fairly unique feature in Python.

The type hints can be introspected at runtime.

So while the program is running,

I mean, Python is a very introspectable language.

You can look at a variable and ask yourself,

what is the type of this variable?

And if that variable happens to refer to a function,

you can ask, what are the arguments to the function?

And nowadays you can also ask,

what are the type annotations for the function?

So the type annotations are there inside the variable

as it’s at runtime.

They’re mostly associated with the function object,

not with each individual variable,

but you can sort of map from the arguments to the variables.
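
A short sketch of that introspection using the standard library's `inspect` and `typing` modules. The function `greet` is a made-up example, and `list[str]` assumes Python 3.9 or later.

```python
import inspect
import typing


def greet(name: str, times: int = 1) -> list[str]:
    return [f"hello {name}"] * times


print(type(greet))                   # what kind of object the name refers to
print(inspect.signature(greet))      # the arguments of the function
print(typing.get_type_hints(greet))  # the annotations, stored on the function

# Mapping from arguments to annotations goes through the parameter names.
hints = typing.get_type_hints(greet)
for param in inspect.signature(greet).parameters:
    print(param, "is annotated as", hints[param])
```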

And that’s what a third-party library can help with.

Exactly, and the problem with that

is that all that extra runtime type checking

is going to slow your code down instead of speed it up.
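
To make the cost concrete, here is a minimal sketch of what such a library might do. The decorator `enforce_hints` is hypothetical and written only for this illustration; real libraries in this space (typeguard and beartype are two examples) are far more thorough. It checks only plain classes such as `int` or `str` and skips generic annotations.

```python
import functools
import inspect
import typing


def enforce_hints(func):
    """Hypothetical decorator: check simple class annotations on every call."""
    hints = typing.get_type_hints(func)
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes are checked; parameterized generics such as
            # list[str] have an origin and are skipped in this sketch.
            is_plain_class = (
                isinstance(expected, type) and typing.get_origin(expected) is None
            )
            if is_plain_class and not isinstance(value, expected):
                raise TypeError(
                    f"{name} should be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_hints
def add(x: int, y: int) -> int:
    return x + y


print(add(1, 2))
try:
    add("1", 2)
except TypeError as exc:
    print("rejected:", exc)
```

Every call now binds the arguments and loops over them before the real work starts, which is the extra runtime cost being described.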

I think to reference this sales pitchy blog post

that says 75% of developers’ time is spent on debugging,

I would say that in some cases that might be okay.

It might be okay to pay the cost of performance

for the catching of the types, the type errors.

And in most cases, doing it statically

before you ship your code to production

is more efficient than doing it at runtime piecemeal.


Can you tell me about mypy project?

What is it, what’s the mission,

and in general, what is the future

of static typing in Python?

Well, so mypy was started by a Finnish developer,

Jukka Lehtosalo.

So many cool things out of Finland, I gotta say.

Just that part of the world.

I guess people have nothing better to do

in those long, cold winters.

I don’t know, I think Jukka lived in England

when he invented that stuff, actually.

But mypy is the original static type checker for Python.

And the type annotations that were introduced

with PEP 484 were sort of developed together

with the static type checker.

And in fact, Jukka had first invented a different syntax

that wasn’t quite compatible with Python.

And Jukka and I sort of met at a Python conference

in, I think, in 2013.

And we sort of came up with a compromise syntax

that would not require any changes to Python,

and that would let mypy sort of be an add-on

static type checker for Python.

Just out of curiosity,

was it like double colon or something?

What was he proposing that would break Python?

I think he was using angular brackets for types

like in C++ or Java generics.

Yeah, you can’t use angular brackets in Python.

It would be too tricky for template type stuff.

Well, the key thing is that we already had

syntax for annotations.

We just didn’t know what to use them for yet.

So type annotations were just the sort of most logical thing

to use that existing dummy syntax for.

So there was no syntax for defining generics

directly syntactically in the language.
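
A sketch of what "existing syntax" means here. Function annotations accept arbitrary expressions, which the interpreter evaluates and stores without interpreting them, and generic types can be spelled with ordinary subscripting instead of angle brackets. The functions `tagged` and `first` are made-up examples.

```python
from typing import List, TypeVar


# Annotations are ordinary expressions; Python stores them and moves on.
def tagged(x: "any expression is allowed", y: 1 + 1 = 0) -> "result":
    return x


print(tagged.__annotations__)

# Generics without new syntax: square brackets on typing.List, plus a
# TypeVar object standing in for the type parameter.
T = TypeVar("T")


def first(items: List[T]) -> T:
    return items[0]


print(first([3, 4]))
```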

mypy literally meant my version of Python,

where my refers to Jukka.

He had a parser that translated mypy

into Python by like doing the type checks

and then removing the annotations

and all the angular brackets from the positions

where he was using them.

But a preprocessor model doesn’t work very well

with the typical workflow of Python development projects.

That’s funny.

I mean, that could have been another major split

if it became successful.

Like if you watch TypeScript versus JavaScript

is like a split in the community over types, right?

That seems to be stabilizing now.

It’s not necessarily a split.

There are certainly plenty of people

who don’t use TypeScript,

but just use the original JavaScript notation,

just like there are many people in the Python world

who don’t use type annotations

and don’t use static type checkers.

No, I know, but there is a bit of a split

between TypeScript and old school JavaScript,

ES, whatever.

Well, in the JavaScript world,

transpilers are sort of the standard way of working anyway,

which is why TypeScript being a transpiler itself

is not a big deal.

And transpilers, for people who don’t know,

it’s exactly the thing you said with mypy.

It’s the code, I guess you call it preprocessing,

code that translates from one language to the other.

And that’s part of the culture,

part of the workflow of the JavaScript community.

So that’s right.

At the same time,

an interesting development

in the JavaScript slash TypeScript world at the moment

is that there is a proposal under consideration.

It’s only a stage one proposal

that proposes to add a feature to JavaScript

where just like Python,

it will ignore certain syntax

when running the JavaScript code.

And what it ignores is more or less a superset

of the TypeScript annotation syntax.


So that would mean that eventually, if you wanted to,

you could take TypeScript

and you could shove it directly

into a JavaScript interpreter without transpilation.

The interesting thing in the JavaScript world,

at least the web browser world,

the web browsers have changed how they deploy

and they sort of update their JavaScript engines

much more quickly than they used to in the early days.

And so there’s much less of a need

for transpilation in JavaScript itself

because most browsers just support

the most recent version of ECMAScript.

Just on a tangent of a tangent,

do you see, if you were to recommend somebody use a thing,

would you recommend TypeScript or JavaScript?

I would recommend TypeScript.

Just because of the strictness of the typing?

It’s an enormously helpful extra tool

that helps you sort of keep your head straight

about what your code is actually doing.

I mean, it helps with editing your code.

It helps with ensuring that your code is not too incorrect.

And it’s actually quite compatible with JavaScript,

nevermind this syntactic sort of hack

that is still years in the future.

But any library that is written in pure JavaScript

can still be used from TypeScript programs.

And also the other way around,

you can write a library in TypeScript

and then export it in a form

that is totally consumable by JavaScript.

That sort of compatibility is sort of the key

to the success of TypeScript.

Yeah, just to look at it,

it’s almost like a biological system that’s evolving.

It’s fascinating to see JavaScript evolve the way it does.

Well, maybe we should consider

that biological systems

are just engineering systems too, right?


Just very advanced with more history.

But it’s almost like the most visceral

in the JavaScript world

because there’s just so much code written in JavaScript

that for its history was messy.

If you’re talking about bugs per line of code,

I just feel like JavaScript eats the cake

or whatever the terminology is.

It beats Python by a lot in terms of the number of bugs,

meaning like way more bugs in JavaScript.

And then obviously the browsers are developed.

I mean, just there’s so much active development.

It feels a lot more like evolution

where a bunch of stuff is born and dies

and there’s experimentation and debates

versus Python is more, all that stuff is happening,

but there’s just a longer history

of stable working giant software systems

written in Python versus JavaScript

is just a giant, beautiful, I would say, mess of code.

It’s a very different culture.

And to some extent, differences in culture are random,

but to some extent,

the differences have to do with the environment.

And the fact that JavaScript is primarily

the language for developing web applications,

especially the client side,

and the fact that it’s basically the only language

for developing web applications

makes that community sort of just have a different nature

than the community of other languages.

Plus the graphical component

and the fact that they’re deploying it

on all kinds of shapes of screens and devices

and all that kind of stuff,

it just creates a beautiful chaos.

Anyway, back to mypy.

So what, okay, you met,

you talked about a syntax that could work.

Where does it currently stand?

What’s the future of static typing in Python?

It is still controversial,

but it is much more accepted

than when mypy and PEP 484 were young.

What’s the connection between PEP 484 type hints and mypy?

mypy was the original static type checker.

So mypy quickly evolved from Jukka’s own variant of Python

to a static type checker for Python

and sort of PEP 484, that was like

a very productive year where like many hundreds of messages

were exchanged debating the merits

of every aspect of that PEP.

And so mypy is a static type checker for Python.

It is itself written in Python.

Most additional static typing features

that we introduced in the time since 3.6

were also prototyped through mypy.

mypy being an open source project

with a very small number of maintainers.

It was successful enough that people said

this static type checking stuff for Python

is actually worth an investment for our company.


But somehow they chose not to support

making mypy faster, say, or adding new features to mypy,

but both Google and Facebook and later Microsoft

developed their own static type checker.

I think Facebook was one of the first.

They decided that they wanted to use the same technology

that they had successfully used for HHVM

because they sort of, they had a bunch of compiler writers

and sort of static type checking experts

who had written the HHVM compiler

and it was a big success within the company.

And they had done it in a certain way, sort of.

They wrote a big, highly parallel application

in an obscure language named OCaml,

which is apparently mostly very good

for writing static type checkers.


I have a lot of questions about

how to write a static type checker then.

That’s very confusing.

Facebook wrote their version

and they worked on it in secret for about a year

and then they came clean and went open source.

Google, in the meantime,

was developing something called PyType,

which was mostly interesting because it,

as you may have heard, they have one gigantic monorepo.

So all the code is checked into a single repository.

Facebook has a different approach.

So Facebook developed Pyre, which was written in OCaml,

which worked well with Facebook’s development workflow.

Google developed something they called PyType,

which was actually itself written in Python.

And it was meant to sort of fit well

in their static type checking needs

in Google’s gigantic monorepo.

So Google wasn’t one giant, got it.

So just to clarify,

this static type checker, philosophically,

is a thing that’s supposed to exist

outside of the language itself.

And it’s just a workflow, like a debugger for the programmers.

It’s a linter.

For people who don’t know, a linter,

maybe you can correct me.

But it’s a thing that runs through the code continuously,

pre-processing to find issues based on style,


I mean, there’s all kinds of linters, right?

It can check that.

What usual things does a linter do?

Maybe check that you haven’t too many characters

in a single line?

Linters often do static analysis

where they try to point out things that are likely mistakes,

but not incorrect according to the language specification.

Like maybe you have a variable that you never use.

For the compiler, that is valid.

You might be planning to use it

in a future version of the code,

and the compiler might just optimize it out.

But the compiler’s not gonna tell you,

hey, you’re never using this variable.

A linter will tell you that variable is not used.

Maybe there’s a typo somewhere else

where you meant to use it,

but you accidentally used something else,

or there are a number of sort of common scenarios.

And a linter is often a big collection of little heuristics

where by looking at the combination

of how your code is laid out,

maybe how it’s indented,

maybe the comment structure,

but also just things like definition of names,

use of names,

it’ll tell you likely things that are wrong.
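
One of those little heuristics, the unused-variable check, can be sketched with the standard `ast` module. The function `unused_names` is a made-up name, and this version deliberately ignores scopes, imports, and attributes, which real linters handle with much more care.

```python
import ast


def unused_names(source: str) -> list[str]:
    """Return names that are assigned somewhere but never read."""
    stored, loaded = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                stored.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
    return sorted(stored - loaded)


# `totl` is a likely typo for `total`: valid Python, but probably a mistake.
SAMPLE = """
total = 0
totl = 5
print(total)
"""

print(unused_names(SAMPLE))
```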

And in some cases, linters are really style checkers.

For Python, there are a number of linters

that check things like,

do you use the PEP-8 recommended naming scheme

for your functions and classes and variables?

Because like classes start with an uppercase

and the rest starts with a lowercase,

and there’s like differences there.

And so the linter can tell you,

hey, you have a class whose first letter

is not an uppercase letter.

And that’s just, I just find it annoying.

If I wanted that to be an uppercase letter,

I would have typed an uppercase letter,

but other people find it very comforting

that if the linter is no longer complaining

about their code,

that they have followed all the style rules.

Maybe it’s a fast way for a new developer

joining a team to learn the style rules, right?

Yeah, there’s definitely that.

But the best use of a linter is probably

not so much to sort of enforce team uniformity,

but to actually help developers catch bugs

that the compilers for whatever reason don’t catch.

And there’s lots of that in Python.

And so, but a static type checker

focuses on a particular aspect of the linting,

which, I mean, mypy doesn’t care

how you name your classes and variables,

but it is meticulous about when you say

that there was an integer here

and you’re passing a string there,

it will tell you, hey, that string’s not an integer.

So something’s wrong.

Either you were incorrect when you said it was an integer

or you’re incorrect when you’re passing it a string.

If this is a race of static type checkers,

is somebody winning?

As you said, it’s interesting

that the companies didn’t choose to invest

in this centralized development of mypy.

Is there a future for mypy?

What do you see as the,

will one of the companies win out

and everybody uses like PyType,

whatever Google’s is called?

Well, Microsoft is hoping that Microsoft’s horse

in that race called Pyright is going to win.

Pyright, right, like R-I-G-H-T?


Yeah, all my word processors tend to typo correct

that as pyrite, the name of the, I don’t know what it is,

some kind of semi-precious metal.

Oh, right.

I love it.

Okay, so, okay, that’s the Microsoft hope,

but, okay, so let me ask the question a different way.

Is there going to be ever a future

where the static type checker gets integrated

into the language?

Nobody is currently excited about doing any work

towards that.

That doesn’t mean that five or 10 years from now,

the situation isn’t different.

At the moment, all the static type checkers

still evolve at a much higher speed than Python

and its annotation syntax evolve.

You get a new release of Python once a year.

Those are the only times

that you can introduce new annotation syntax.

And there are always people who invent new annotation syntax

that they’re trying to push.

And worse, once we’ve all agreed

that we are going to put some new syntax in,

we can never take it back.

At least the sort of deprecating an existing feature

takes many releases,

because you have to assume that people started using it

as soon as we announced it.

And then you can’t take it away from them right away.

You have to start telling them, well, this will go away,

but we’re not gonna tell you that it’s an error yet.

And then later it’s gonna be a warning.

And then eventually three releases in the future,

maybe we remove it.
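
The staged removal described above maps loosely onto the standard `warnings` module. The function `old_feature` and its `stage` parameter are hypothetical, used only to show the progression from silent, to warning, to gone; they do not describe any actual CPython feature.

```python
import warnings


def old_feature(stage: int) -> str:
    """Hypothetical feature moving through a staged removal."""
    if stage == 0:
        pass  # still supported; only the documentation announces the plan
    elif stage == 1:
        warnings.warn("old_feature will go away",
                      PendingDeprecationWarning, stacklevel=2)
    elif stage == 2:
        warnings.warn("old_feature is deprecated",
                      DeprecationWarning, stacklevel=2)
    else:
        raise RuntimeError("old_feature has been removed")
    return "result"


with warnings.catch_warnings(record=True) as seen:
    warnings.simplefilter("always")
    for stage in range(3):
        old_feature(stage)

print([w.category.__name__ for w in seen])
```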

On the other hand, the typical static type checker

still has a release like every month, every two months,

certainly many times a year.

Some type checkers also include a bunch of experimental ideas

that aren’t official standard Python syntax yet.

The static type checkers also just get better

at discovering things

that sort of are unspecified by the language,

but that sort of could make sense.

And so each static type checker

actually has its sort of strong and weak points.

So it’s cool, it’s like a laboratory of experiments.


Microsoft, Google and all, and you get to see.

And you see that everywhere, right?

Because there’s not one single JavaScript engine either.

There is one in Chrome, there is one in Safari,

there’s one in Firefox.

But that said, you said there’s not interest.

I think there is a lot of interest in type hinting, right?


In the PEP 484.

Actually, like how many people use that?

Do you have a sense how many people use,

because it’s optional, it’s a sugar.

I can’t put a number on it,

but from the number of packages

that do interesting things with it at runtime

and the fact that there are like now three or four

very mature type checkers

that each have their segment of the market.

And then there’s PyCharm,

which has a sort of more heuristic based type checker

that also supports the same syntax.

My assumption is that many, many people

developing Python software professionally

for some kind of production situation

are using a static type checker.

Especially anybody who has a continuous integration cycle

probably has one of the steps in there,

their testing routine that happens for basically

every commit is run a static type checker.

And in most cases, that will be mypy.

So I think it’s pretty popular topic.

According to this webpage,

20 to 30% of Python 3 codebases are using type hints.

Wow, I wonder how they measured that.

Did they just scan all of GitHub?

Yeah, that’s what it looks like.


They did a quick, not all of,

but like a random sampling.

So you mentioned PyCharm.

Let me ask you the big subjective question.

What’s the best IDE for Python?

And you’re extremely biased now

that you’re with Microsoft.

Is it PyCharm, VS Code, Vim or Emacs?

Historically, I actually started out with using Vim

but when it was still called vi.

For a very long time, I think from the early 80s to,

I’d say two years ago, I was an Emacs user.


Between, I’d say 2013 and 2018,

I dabbled with PyCharm,

mostly because it had a couple of features.

I mean, PyCharm is like driving an 18-wheeler truck

whereas Emacs is more like driving

your comfortable Toyota car.

That you’ve had for 100,000 miles

and you know what every little rattle of the car means.

I was very comfortable in Emacs

but there were certain things it couldn’t do.

It wasn’t very good at that sort of,

at least the way I had configured it.

I didn’t have very good tooling in Emacs

for finding a definition of a function.

Got it.

When I was at Dropbox,

exploring a 5 million line Python code base,

just grepping all that code

for where is there a class foobar?

Well, it turns out that if you grep

all 5 million lines of code,

there are many classes with the same name.

And so PyCharm, sort of once you fired it up

and once it had indexed

your repository, was very helpful.

But as soon as I had to edit code,

I would jump back to Emacs and do all my editing there

because I could type much faster and switch between files

when I knew which file I wanted much quicker.

And I never really got used

to the whole PyCharm user interface.

Yeah, I feel torn in that same kind of way

because I’ve used PyCharm off and on

exactly in that same way.

And I feel like I’m just being an old grumpy man

for not learning how to quickly switch between files

and all that kind of stuff.

I feel like that has to do with shortcuts,

that has to do with,

I mean, you just have to get accustomed,

just like with touch typing.

Yeah, you have to just want to learn that.

I mean, if you don’t need it much,

you don’t need touch typing either.

You can type with two fingers just fine in the short term,

but in the long term,

your life will become better psychologically

and productivity-wise

if you learn how to type with 10 fingers.

If you do a lot of keyboard input.

But for everyone, emails and stuff, right?

Like you look at the next 20, 30 years of your life,

you have to anticipate where technology is going.

Do you want to invest in handwriting notes?

Probably not.

More and more people are doing typing

versus handwriting notes.

So you can anticipate that.

So there’s no reason to actually practice handwriting.

There’s more reason to practice typing.

You can actually estimate, back to the spreadsheet,

the number of paragraphs, sentences,

or words you’ll write for the rest of your life.

You can probably estimate-

You go again with the spreadsheet of my life, huh?

The spreadsheet, yes.

All of that is not actual, like converted to a spreadsheet,

but it’s a gut feeling.

Like I have the same kind of gut feeling about books.

I’ve almost exclusively switched to Kindle now,

so ebook readers.

Even though I still love, and probably always will,

the smell, the feel of a physical book.

And the reason I switched to Kindle is like,

all right, well, this is really paving.

The future is going to be digital

in terms of consuming books and content of that nature.

So you should let your brain

get accustomed to that experience.

In that same way, it feels like PyCharm or VS Code.

I think PyCharm is the most sort of sophisticated,

featureful Python IDE.

It feels like I should probably at some point very soon

switch entire-

Like I’m not allowed to use anything else for Python

than this IDE or VS Code.

It doesn’t matter, but walk away from Emacs

for this particular application.

So I think I’m limiting myself in the same way

that using two fingers for typing is limiting myself.

This is a therapy session.

I’m not even asking questions.

But I’m sure a lot of people are thinking this way, right?

I’m not going to stop you.

I think that sort of everybody has to decide for themselves

which one they want to invest more time in.

I actually ended up giving VS Code a very tentative try

when I started out at Microsoft and really liking it.

And it sort of, it took me a while

before I realized why that was.

But, and I think that actually the founders of VS Code

may not necessarily agree with me on this.

But to me, VS Code is in a sense

the spiritual successor of Emacs.

Because as you probably know, as an old Emacs hack,

the key part of Emacs is that it’s mostly written in Lisp.

And that sort of new features of Emacs

usually update all the Lisp packages

and add new Lisp packages.

And oh yeah, there’s also some very obscure thing

improved in the part that’s not in Lisp.

But that’s usually not why you would upgrade

to a new version of Emacs.

There’s a core implementation that sort of can read a file

and it can put bits on the screen

and it can sort of manage memory and buffers.

And then what makes it an editor full of features

is all the Lisp packages.

And of course the design of how the Lisp packages

interact with each other and with that sort of

base layer of the core immutable engine.

But almost everything in that core engine in Emacs case

can still be overridden or replaced.

And so VS Code has a similar architecture

where there is like a base engine

that you have no control over.

I mean, it’s open source,

but nobody except the people who work on that part

changes it much.

And it has a sort of a package manager

and a whole series of interfaces for packages

and an additional series of conventions

for how packages should interact with the lower layers

and with each other and powerful primitive operations

that let you move the cursor around

or select pieces of text or delete pieces of text

or interact with the keyboard and the mouse

and whatever peripherals you have.

And so the sort of the extreme extensibility

and the package ecosystem that you see in VS Code

is a mirror of very similar architectural features in Emacs.

Well, I’ll have to give it a serious try

because as far as sort of the hype and the excitement

in the general programming community,

VS Code seems to dominate.

The interesting thing about PyCharm

and what is it, PhpStorm,

which are these JetBrains specific IDEs

that are designed for one programming language.

It’s interesting to, when an IDE is specialized, right?

They’re usually actually just specializations of IntelliJ

because underneath it’s all the same editing engine

with different veneer on top,

where in VS Code,

many things you do require loading third-party extensions.

In PyCharm, it is possible to have third-party extensions

but it is a struggle to create one.

Yes, and it’s not part of the culture,

all that kind of stuff.

Yeah, I remember that it might’ve been five years ago

or so we were trying to get

some better mypy integration into PyCharm

because mypy is sort of Python tooling

and PyCharm had its own type checking heuristic thing

that we wanted to replace with something based on mypy

because that was what we were using in the company.

And for the guy who was writing that PyCharm extension,

it was really a struggle to sort of find documentation

and get the development workflow going

and debug his code and all that.

So it was not a pleasant experience.

Let me talk to you about parallelism.

In your post titled,

Reasoning about asyncio.Semaphore,

you talk about a fast food restaurant in Silicon Valley

that has only one table.

Is this a real thing?

I just wanted to ask you about that.

Is that just like a metaphor you’re using

or is that an actual restaurant in Silicon Valley?

It was a metaphor, of course.

I can imagine such a restaurant.

So for people who don’t know, you should read the thing,

but it was an idea of a restaurant

where there’s only one table

and you show up one at a time and you’re prepared.

And I actually looked it up

and there are restaurants like this throughout the world.

And it just seems like a fascinating idea.

You stand in line, you show up, there’s one table.

They ask you all kinds of questions.

They cook just for you.

That’s fascinating.

It sounds like you’d find places like that in Tokyo.

It sounds like a very Japanese thing.

Or in the Bay Area, there are popular places

that probably more or less work like that.

I’ve never eaten at such a place.

The fascinating thing is you propose it’s a fast food.

This is all for a burger.

It was one of my rare sort of more literary

or poetic moments where I thought

I’ll just open with a crazy example to catch your attention.

And the rest is very dry stuff about locks and semaphores

and how a semaphore is a generalization of a lock.

Well, it was very poetic and well-delivered

and it actually made me wonder if it’s real or not

because you don’t make that explicit.

And it feels like it could be true.

And in fact, I wouldn’t be surprised

if somebody listens to this

and knows exactly a restaurant like this in Silicon Valley.

Anyway, can we step back

and can you just talk about parallelism,

concurrency, threading?

Asynchronous, all of these different terms.

What is it?

Sort of a high philosophical level.

The fisherman is back in the boat.

Well, the idea is if the fisherman has two fishing rods,

since fishing is mostly a matter of waiting

for a fish to nibble.

Well, it depends on how you do it actually.

But if you’re doing the style of fishing

where you sort of you throw it out

and then you let it sit for a while

until maybe you see a nibble,

one fisherman can easily run two or three

or four fishing rods.

And so as long as you can afford the equipment,

you can catch four times as many fish

by a small investment in four fishing rods.

And so since your time,

you sort of say you have all Saturday to go fishing,

if you can catch four times as much fish,

you have a much higher productivity.

And that’s actually I think how deep sea fishing is done.

You could just have a rod and you put in a hole

so you can have many rods.
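
The fishing-rod picture can be sketched with Python's asyncio, as a rough illustration only. The names watch_rod and go_fishing and the 0.2 second wait are made up for this sketch and are not from the conversation. Each rod is a coroutine that spends its time waiting, so one thread can tend all of them.

```python
import asyncio
import time


async def watch_rod(rod: int, wait_seconds: float) -> str:
    # Waiting for a nibble is idle time: await hands control back to the event loop.
    await asyncio.sleep(wait_seconds)
    return f"fish from rod {rod}"


async def go_fishing(rods: int = 4, wait_seconds: float = 0.2) -> tuple[list[str], float]:
    start = time.perf_counter()
    # One fisherman (a single thread) tends all the rods concurrently.
    catch = await asyncio.gather(*(watch_rod(r, wait_seconds) for r in range(rods)))
    return list(catch), time.perf_counter() - start


catch, elapsed = asyncio.run(go_fishing())
print(catch, f"{elapsed:.2f}s")
```

Under these assumptions the four waits overlap, so the total time should be close to one wait and not four.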

What, is there an interesting difference

between parallelism and concurrency and asynchronous?

Is there one a subset of the other to you?

Like how do you think about these terms?

In the computer world, there is a big difference.

When people are talking about parallelism,

like a parallel computer,

that’s usually really several complete CPUs

that are sort of tied together

and share something like memory or an IO bus.

Concurrency can be a much more abstract concept

where you have the illusion

that things happen simultaneously,

but what the computer actually does

is it spends a little time running this program for a while

and then it spends some time running that program

for a while and then spending some time

for the third program for a while.

So parallelism is the reality

and concurrency is part reality, part illusion.

Yeah, parallelism typically implies

that there is multiple copies of the hardware.

You write that implementing synchronization

primitives is hard in that blog post

and you talk about locks and semaphores.

Why is it hard to implement synchronization primitives?

Because at the conscious level,

our brains are not trained to sort of keep track

of multiple things at the same time.

Like obviously you can walk and chew gum at the same time

because they’re both activities

that require only a little bit of your conscious activity,

but try balancing your checkbook

and watching a murder mystery on TV.

You’ll mix up the digits

or you’ll miss an essential clue in the TV show.

So why does it matter that the programmer,

the human, is bad?

Because the programmer is,

at least with the current state of the art,

is responsible for writing the code correctly

and it’s hard enough to keep track of a recipe

that you just execute one step at a time.

Chop the carrots, then peel the potatoes, mix the icing.

You need your whole brain

when you’re reading a piece of code, what is going on?

Okay, we’re loading the number of mermaids in variable A

and the number of mermen in variable B

and now we take the average or whatever.

I like how we’re just jumping from metaphor to metaphor.

I like it.

You have to keep in your head what is in A,

what is in B, what is in C.

Hopefully you have better names.

And that is challenging enough.

If you have two different pieces of code

that are sort of being executed simultaneously,

whether it’s using the parallel or the concurrent approach,

if like A is the number of fishermen

and B is the number of programmers,

but in another part of the code,

A is the number of mermaids and B is the number of mermen,

and somehow that’s the same variable.

If you do it sequentially,

if first you do your mermaid merpeople computation

and then you do your people in the boat computation,

it doesn’t matter that the variables are called A and B

and that is literally the same variable

because you’re done with one use of that variable.

But when you mix them together,

suddenly the number of merpeople

replaces the number of fishermen

and your computation goes dramatically wrong.

And there’s all kinds of ordering of operations

that could result in the assignment of those variables

and so you have to anticipate all possible orderings.

And you think you’re smart and you’ll put a lock around it

and in practice, in terms of bugs per line,

per 1,000 lines of code,

this is an area where everything is worse.
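
A small sketch of the shared-variable problem described here, using asyncio so that the interleaving is repeatable. The dictionary shared, the function average, and the numbers are hypothetical and chosen only for illustration.

```python
import asyncio

shared = {"a": 0, "b": 0}  # one pair of variables reused by two computations


async def average(label: str, a: int, b: int, results: dict) -> None:
    shared["a"] = a
    shared["b"] = b
    await asyncio.sleep(0)  # a point where the other computation may run
    results[label] = (shared["a"] + shared["b"]) / 2


async def run_sequentially() -> dict:
    results: dict = {}
    await average("merpeople", 10, 20, results)
    await average("boat", 2, 4, results)
    return results


async def run_interleaved() -> dict:
    results: dict = {}
    await asyncio.gather(
        average("merpeople", 10, 20, results),
        average("boat", 2, 4, results),
    )
    return results


print(asyncio.run(run_sequentially()))
print(asyncio.run(run_interleaved()))
```

Run one after the other, the two computations give 15.0 and 3.0. Mixed together, the boat numbers overwrite the merpeople numbers before the first average is taken, so both results come out as 3.0.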

So a lock is a mechanism by which you ensure

only one chef can access the oven at a time.

Something like that.

And then semaphores allow you to do what?

Multiple ovens?

That’s not a bad idea because if you’re sort of,

if you’re preparing, if you’re baking cakes

and you have multiple people all baking cakes

but there’s only one oven,

then maybe you can tell that the oven is in use

but maybe it’s preheating.

And so you have to, maybe you make a sign

that says oven in use and you flip the sign over

and it says oven is free when you’re done baking your cake.

That’s a lock, that’s sort of,

and what do you do when you have two ovens

or maybe you have 10 ovens?

You can put a separate sign on each oven

or maybe you can sort of,

someone who comes in wants to see at a glance

and maybe there’s an electronic sign that says

there are still five ovens available.

Or maybe there are already three people waiting for an oven

so you can, if you see an oven that’s not in use,

it’s already reserved for someone else

who got in line first.

And that’s sort of what the restaurant metaphor

was trying to explain.
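
The oven sign can be sketched with asyncio.Semaphore from the standard library. This is a minimal illustration and not the code from the blog post. The names bake and kitchen and the state dictionary are assumptions made for the example.

```python
import asyncio


async def bake(baker: int, ovens: asyncio.Semaphore, state: dict) -> None:
    async with ovens:  # wait until an oven is free, then take it
        state["in_use"] += 1
        state["peak"] = max(state["peak"], state["in_use"])
        await asyncio.sleep(0.01)  # baking time
        state["in_use"] -= 1
    # leaving the block releases the oven for the next baker in line


async def kitchen(bakers: int, oven_count: int) -> int:
    ovens = asyncio.Semaphore(oven_count)  # oven_count=1 behaves like a lock
    state = {"in_use": 0, "peak": 0}
    await asyncio.gather(*(bake(i, ovens, state) for i in range(bakers)))
    return state["peak"]  # the most ovens ever in use at once


print(asyncio.run(kitchen(bakers=6, oven_count=2)))
print(asyncio.run(kitchen(bakers=6, oven_count=1)))
```

With two ovens, at most two bakers are inside the block at once. With one oven, the semaphore acts like a lock.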

Yeah, so you’re now tasked,

you’re sitting as a designer of Python

with a team of brilliant core developers

and have to try to figure out to what degree

can any of these ideas be integrated and not?

So maybe this is a good time to ask,

what is AsyncIO and how has it evolved since Python 3.4?

Wow, yeah, so we had this really old library

for doing things concurrently,

especially things that had to do with IO

and networking IO was especially sort of a popular topic.

And in the Python standard library,

we had a brief period where there was lots of development

and I think it was late 90s, maybe early 2000s

and like two little modules were added

that were the state of the art of doing asynchronous IO

or sort of non-blocking IO,

which means that you can keep multiple network connections

open and sort of service them all in parallel

like a typical web server does.

So IO is input and outputs,

you’re writing either to the network

or read from the network connection

or reading and writing to a hard drive, to storage.

Also possible.

And the idea is you could do multiple of those

while also doing computation.

So running some code that does some fancy stuff.

Yeah, like when you’re writing a web server,

when a request comes in,

a user sort of needs to see a particular web page,

you have to find that page maybe in the database

and format it properly and send it back to the client.

And there’s a lot of waiting, waiting for the database,

waiting for the network.

And so you can handle hundreds or thousands

or millions of requests concurrently on one machine.

Anyway, ways of doing that in Python were kind of stagnated.

And I forget, it might’ve been around 2012, 2014,

when someone for the umpteenth time actually said,

these asynchat and asyncore modules

that you have in a standard library

are not quite enough to solve my particular problem.

Can we add one tiny little feature?

And everybody said, no, that stuff is not to,

you’re not supposed to use that stuff.

Write your own using a third-party library.

And then everybody started a debate

about what the right third-party library was.

And somehow I felt that there was actually a cue

for, well, maybe we need a better state-of-the-art module

in the standard library for multiplexing input-output

from different sources.

You could say that it spiraled out of control a little bit.

It was, at the time, it was the largest

Python enhancement proposal that was ever proposed.

And you were deeply involved with that.

At the time, I was very much involved with that.

I was like the lead architect.

I ended up talking to people

who had already developed serious third-party libraries

that did similar things and sort of taking ideas from them

and getting their feedback on my design.

And eventually we put it in the standard library.

And after a few years, I got distracted.

I think the thing, the big thing that distracted me

was actually type annotations.

But other people kept it alive and kicking

and it’s been quite successful actually

in the world of Python web clients.

So initially, what are some of the design challenges there

in that debate for the PEP?

And what are some things that got rejected?

What are some things that got accepted that stand out to you?

There are a couple of different ways

you can handle parallel IO.

And this happens sort of at an architectural level

in operating systems as well.

Like Windows prefers to do it one way

and Unix prefers to do it the other way.

You sort of, you have an object

that represents a network endpoint,

say a connection with a web browser that’s your client.

And say you’re waiting for an incoming request.

Two fundamental approaches are,

okay, I’m waiting for an incoming request.

I’m doing something else.

Come wake me up or sort of come tell me

when something interesting happened,

like a packet came in on that network connection.

And the other paradigm is we’re on a team

of a whole bunch of people with maybe a little mind

and we can only manage one web connection at a time.

So I’m just sitting,

looking at this web connection

and I’m just blocked until something comes in.

And then I’m already waiting for it.

I get the data, I process the data

and then I go back to the top and say,

no, sort of, I’m waiting for the next packet.

Those are about the two paradigms.

One is a paradigm where there is sort of notionally

a thread of control,

whether it’s an actual operating system thread

or more of an abstraction. In asyncio, we call them tasks.

But a task in asyncio or a thread in other contexts

is devoted to one thing

and it has logic for all the stages.

Like when it’s a web request,

like first wait for the first line of the web request,

parse it because then you know if it’s a get or a post

or a put or whatever, or an error.

Then wait until you have a bunch of lines

until there’s a blank line,

then parse that as headers and then interpret that

and then wait for the rest of the data to come in

if there is any more that you expect

that sort of standard web stuff.

And the other thing is,

and there’s always endless debate

about which approach is more efficient

and which approach is more error prone,

where I just have a whole bunch of stacks in front of me

and whenever a packet comes in,

I sort of look at the number of the packet,

that there’s some number on the packet

and I say, oh, that packet goes in this pile

and then I can do a little bit

and then sort of that pile provides my context.

And as soon as I’m done with the processing,

I sort of, I can forget everything about what’s going on

because the next packet will come in

from some random other client

and it’s that pile or it’s this pile.

And every time a pile is maybe empty or full

or whatever the criteria is,

I can toss it away or use it for a new space.

But several traditional third party libraries

for asynchronous IO processing in Python

chose the model of a callback

and that’s the idea where you have a bunch

of different stacks of paper in front of you

and every time someone gives you a piece,

gives you a new sheet,

you decide which stack it belongs to.

And that leads to a certain style of spaghetti code

that I find sort of aesthetically not pleasing

and I was sort of never very successful

and I had heard many stories about people

who were also sort of complaining

about that style of coding.

It was very prevalent in JavaScript at the time at least

because it was like how the JavaScript event loop

basically works.

And so I thought, well, the task-based model

where each task has a bunch of logic,

we had mechanisms in the Python language

that we could easily reuse for that.

And I thought I want to build a whole library

for asynchronous networking IO

and all the other things that may need

to be done asynchronously based on that paradigm.

And so I just chose a paradigm

and tried to see how far I could get with that.

And it turns out that it’s a pretty good paradigm.

So people enjoy that kind of paradigm programming

for asynchronous IO relative to callbacks.
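
A rough sketch of the task-based style, where one coroutine holds the logic for every stage of a web request: wait for the first line, parse it, then wait for header lines until the blank line. The function names read_request and demo and the sample request are hypothetical. The StreamReader is created and fed by hand here only to keep the example self-contained. In a real server it would normally come from asyncio.start_server.

```python
import asyncio


async def read_request(reader: asyncio.StreamReader) -> tuple[str, str, dict[str, str]]:
    # Stage 1: wait for the request line, then parse the method and path.
    request_line = (await reader.readline()).decode().rstrip("\r\n")
    method, path, _version = request_line.split(" ", 2)

    # Stage 2: keep waiting for header lines until the blank line.
    headers: dict[str, str] = {}
    while True:
        line = (await reader.readline()).decode().rstrip("\r\n")
        if not line:
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, headers


async def demo() -> tuple[str, str, dict[str, str]]:
    # Assumption for illustration: feed a canned request into a reader by hand.
    reader = asyncio.StreamReader()
    reader.feed_data(b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n")
    reader.feed_eof()
    return await read_request(reader)


print(asyncio.run(demo()))
```

The context for the request lives in ordinary local variables of the coroutine, which is the contrast with the callback style, where each incoming packet has to be matched to its pile first.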

Okay, beautiful.

So how does that all interplay with the infamous GIL,

the Global Interpreter Lock?

Maybe can you say what the GIL is

and how does it dance beautifully with asyncio?

The Global Interpreter Lock solves the problem

that Python originally was not written

with either asynchrony or parallelism in mind at all.

There was no concurrency in the language.

There was no parallelism.

There were no threads.

Only a small number of years

into Python’s initial development,

all the new cool operating systems

like SunOS and Silicon Graphics IRIX

and then eventually POSIX and Windows

all came with threading libraries

that let you do multiple things in parallel.

And there is a certain sort of principle

which is the operating system handles the threads for you.

And the program can pretend that there are as many CPUs

as there are threads in the program.

And those CPUs work completely independently.

And if you don’t have enough CPUs,

the operating system sort of simulates those extra CPUs.

On the other hand, if you have enough CPUs,

you can get a lot of work done

by deploying those multiple CPUs.

But Python wasn’t written to do that.

And so as libraries for multi-threading were added to C,

but every operating system vendor

was adding their own version of that.

We thought, and maybe we were wrong,

but at the time we thought,

well, we quickly want to be able

to support these multiple threads

because they seemed at the time in the early 90s

when they were new, at least to me,

they seemed a cool, interesting programming paradigm.

And one of the things that Python, at least at the time,

felt was nice about the language

was that we could give a safe version

of all kinds of cool new operating system toys

to the Python programmer.

Like I remember one or two years before threading,

I had spent some time adding networking sockets to Python

and they were very literal translation

of the networking sockets

that were in the BSD operating system, so Unix BSD.

But the nice thing was if you were using sockets from Python

then all the things you can do wrong with sockets in C

would automatically give you a clear error message

instead of just ending up

with a malfunctioning hanging program.

And so we thought, well,

we’ll do the same thing with threading,

but we didn’t really want to rewrite the interpreter

to be thread safe because that was like,

there would be a very complex refactoring

of all the interpreter code and all the runtime code

because all the objects were written with the assumption

that there’s only one thread.

And so we said, okay, well, we’ll take our losses,

we’ll provide something that looks like threads.

And as long as you only have a single CPU on your computer,

which most computers at the time did,

it feels just like threads

because the whole idea of multiple threads in the OS

was that even if your computer only had one CPU,

you could still fire up as many threads as you wanted.

Well, within reason, maybe 10 or 12, not 5,000.

And so we thought we had conquered

the abstraction of threads pretty well

because multi-core CPUs were not

in most Python programmers’ hands anyway.

And then of course, a couple of more iterations

of Moore’s law and computers getting faster.

And at some point,

the chip designers decided

that they couldn’t make the CPUs faster,

but they could still make them smaller.

And so they could put multiple CPUs on one chip.

And suddenly there was all this pressure

to do things in parallel.

And that’s where the solution we had in Python didn’t work.

And that’s sort of the moment

that the GIL became infamous.

Because the GIL was the solution we used

to sort of take this single interpreter

and share it between all the different operating system

threads that you could create.

And so as long as the hardware

physically only had one CPU, that was all fine.

And then as hardware vendors were suddenly telling us all,

oh, you got to parallelize.

Everything’s got to be parallelized.

People started saying, oh,

but we can use multiple threads in Python.

And then they discovered, oh,

but actually all threads run on a single core.
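
A hedged way to see this effect for yourself is to time a CPU-bound pure-Python loop run twice in sequence and then in two threads. The names count_down, timed, sequential, threaded, and the value of N are made up for this sketch. Actual timings depend on the machine and on the Python build, so no particular numbers are claimed here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_down(n: int) -> int:
    # Pure-Python, CPU-bound loop: with a GIL only one thread runs bytecode at a time.
    while n > 0:
        n -= 1
    return n


def timed(fn) -> float:
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


N = 2_000_000


def sequential() -> None:
    count_down(N)
    count_down(N)


def threaded() -> None:
    with ThreadPoolExecutor(max_workers=2) as pool:
        list(pool.map(count_down, [N, N]))


print(f"sequential:  {timed(sequential):.2f}s")
print(f"two threads: {timed(threaded):.2f}s")
```

On an interpreter with a GIL, the two-thread run is generally not expected to be faster than the sequential one for this kind of work, even on a multi-core machine.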


I mean, is there a way,

is there ideas in the future to remove

the global interpreter lock, the GIL?

Like maybe multiple sub-interpreters,

some tricky interpreters on top of interpreters

kind of thing?

Yeah, there are a couple of possible futures there.

The most likely future is that we’ll get

multiple sub-interpreters,

which each run a completely independent Python program.


But there’s still some benefit

of sort of faster communication

between those programs.

But it’s also managing for you

this running of multiple Python programs.


So it’s hidden from you, right?

It’s hidden from you,

but you have to spend more time communicating

between those programs,

because the sort of,

the attractive thing about the multi-threaded model

is that the threads can share objects.

At the same time, that’s also the downfall

of the multi-threaded programming model,

because when you do share objects,

and you didn’t necessarily intend to share them,

or there were aspects of those objects

that were not reusable,

you get all kinds of concurrency bugs.

And so the reason I wrote that little blog post

about semaphores was that concurrency bugs are just harder.

It would be nice if Python had no global interpreter lock,

and it had the so-called free threading,

but it would also cause a lot more software bugs.

The interesting thing is that there is still

a possible future where we are actually going to,

or where we could experiment at least with that,

because there is a guy working for Facebook

who has developed a fork of CPython

that he called the nogil interpreter,

where he removed the GIL

and made a whole bunch of optimizations

so that the single-threaded case doesn’t run too much slower,

and multi-threaded case will actually use

all the cores that you have.

And so that would be an interesting possibility

if we would be willing as Python core developers

to actually maintain that code indefinitely.

And if we’re willing to put up

with the additional complexity of the interpreter

and the additional sort of overhead

for the single-threaded case,

and I’m personally not convinced

that there are enough people

needing the speed of multiple threads

with their Python programs

that it’s worth to sort of take that performance hit

and that complexity hit.

And I feel that the GIL actually is a pretty nice

Goldilocks point between no threads

and all threads all the time,

but not everybody agrees on that.

So that is definitely a possible future.

The sub-interpreters look like a fairly safe bet for 3.12,

so say a year from now.

Yeah, so the goal is to do a new version every year

for Python.

Let me ask you perhaps a fun question,

but there’s a philosophy to it too.

Will there ever be a Python 4.0?

Now, before you say it’s currently a joke

and probably not, we’re gonna go to 3.99 or 3.999,

can you imagine possible features

that Python 4.0 might have

that would necessitate the creation of the new 4.0?

Given the amount of pain and joy,

suffering and triumph that was involved

in the move between version two and version three?

Yeah, well, as a community and as a core development team,

we have a large amount of painful memories

about the Python 3 transition.

Which is one reason that sort of everybody is happy

that we’ve decided there’s not going to be a 4.0

at least, not anytime soon.

And if there is going to be one,

we’ll sort of plan the transition very differently.

Because clearly we underestimated the pain

the transition caused for our users in the Python 3 case.

And had we known we could have sort of designed Python 3

somewhat differently without making it any worse,

we just thought that we had a good plan,

but we underestimated what sort of the users

were capable of when it comes to that kind of transition.

By the way, I think we talked way before,

like a year and a half before the Python 2 officially.

End of life.

End of life.

Oh, yeah.

What was that?

What was your memory of the end of life?

Did you shed a tear on January 1st, 2020?

Was there, were you standing alone?

Everyone on the core team had basically moved

on years before.


It was purely, it was a little symbolic moment

to signal to the remaining users that

there was no longer going to be any new releases

or support for Python 2.7.

Did you shed a single tear

while looking out over the horizon?

I’m not a very poetic person

and I don’t shed tears like that, but no.

No, we actually had planned a party,

but the party was planned for the Python,

the US Python conference that year,

which never happened, of course,

because of the pandemic.

Oh, was it like in March or something?

Yeah, the conference was going to be,

I think, late April that year.

So that was a very difficult decision to cancel it,

but they did.

So anyway, if we’re going to have a Python 4,

we’re going to have to have both a different reason

for having that and a different process

for managing the transition.

Can you imagine a possible process that,

so I think you’re implying that if there is a 4.0,

in some ways it would break back compatibility?

Well, so here is a concrete thought I’ve had,

and I’m not unique, but not everyone agrees with this,

so this is definitely a personal opinion.

If we were to try something like that nogil Python,

my expectation is that it would feel just different enough,

at least for the part of the Python ecosystem

that is heavily based on C extensions.

And that is like the entire machine learning,

data science, scientific Python world

is all based on C extensions for Python.

And so those people would likely feel the pain the most,

because they, even if we don’t change anything

about the syntax of the language

and the semantics of the language

when you’re writing Python code,

we could even say, suppose that after Python,

say 3.19 instead of 3.20, we’ll have 4.0.

Suppose that’s the time when we flip the switch to 4.0,

we’ll not have a GIL.

Imagine it was like that.

So I would probably say that particular year,

the release that we name 4.0 will be syntactically,

it will not have any new syntactical features,

no new modules in the standard library,

no new built-in functions.

Everything at the Python level

will be purely compatible with Python 3.19.

However, extension modules will have to make a change.

They will have to be recompiled.

They will not have the same binary interface,

the semantics and APIs for some things

that are frequently accessed by C extensions

will be different.

And so for a pure Python user, 4.0 would be a breeze,

except that there are very few pure Python users left

because everybody who is using Python

for something significant is using third-party extensions.

There are like, I don’t know,

several hundreds of thousands of third-party extensions

on the PyPI service.

And I’m not saying they’re all good,

but there is a large list of extensions

that would have to do work.

And some of those extensions

are currently already low on maintainers

and they’re struggling to keep afloat.

So there you can give a huge heads up to them

if you go to 4.0 to really keep developing it.

Yeah, we’d probably have to do something like

several years before, who knows,

maybe five years earlier, like 3.15,

we would have to say,

and I’m just making the specific numbers up,

but at some point we’d have to say

the nogil Python could be an option.

It might be a compile-time option.

If you want to use nogil Python,

you have to recompile Python from source

for your platform using your tool set.

All you have to do is change one configuration variable

and then you just run make or configure and make

and it will build it for you.

But now you also have to use

the nogil-compatible versions

of all extension modules you want to use.

And so as long as many extension modules

don’t have fully functional sort of variants

that work within the nogil world,

that’s not a very practical thing for Python users,

but it would allow extension developers

to test the waters,

see what they need to do syntactically

to be able to compile at all.

Maybe they’re using functions

that are defined by the Python 3 runtime

that won’t be in the Python 4 runtime.

Those functions will not work.

They’ll have to find an alternative,

but they can experiment with that

and sort of write test applications.

And that would be a way to transition

and that could be a series of releases

where the Python 4 is more and more imminent

and we have supported more and more

third-party extension modules to have solid support

that works for nogil Python for that new API.

And then sort of Python 4.0 is like the official moment

that the mayor comes out and cuts the ribbon

and now Python, now the sort of nogil mode

is the default and maybe the only mode there is.

The internet wants to know from Reddit.

It’s a small and fun question.

There’s many fun questions,

but out of the PyPI packages,

do you have ones you like?

In your opinion, are there must-have PyPI libraries

or ones you use all the time constantly?

Oh my, I should really have a standard answer

for that question, but like a positive standard answer,

but my current standard answer is that

I’m not a big user of third-party packages.

When I write Python code,

I’m usually developing some tooling

around building Python itself.

And the last thing we want is dependencies

on third-party packages.

So I tend to just use the standard library.

That’s where your focus is, that’s where your mind is.

But do you keep an eye on what’s out there

to understand where the standard library

could be moving, should be moving?

It’s a good kind of landscape

of what’s missing from the standard library.

Well, usually when something’s missing

from the standard library nowadays,

it is a relatively new idea

and there is a third-party implementation

or maybe possibly multiple third-party implementations,

but they evolve at a much higher rate

than they could when they’re in the standard library.

So it would be a big reduction in activity

to incorporate things like that in the standard library.

So I like that there is a lively package ecosystem

and that sort of recent trends in the standard library

are actually that we’re doing the occasional spring cleaning

where we’re just,

we’re choosing some modules

that have not had a lot of change in a long time

and that maybe would be better off

not existing at all at this point

because there might be a better third-party

alternative anyway,

and we’re sort of slowly removing those.

Often those are things that I sort of,

I spiked somewhere in 1992 or 1993.

And if you look through the commit history,

it’s very sad, like all cosmetic changes,

like changes in the indentation style

or the name of this other standard library module

got changed or nothing of any substance.

The API is identical to what it was 20 years ago.

So speaking of packages,

they have a lot of impact on a lot of people’s lives.

Does it make sense to you

why Python has become the primary,

the dominant language for the machine learning community?

So packages like PyTorch, TensorFlow,

scikit-learn, and even like the lower level stuff

like NumPy, SciPy, Pandas, Matplotlib with visualization.

Can you like, does it make sense to you

why it permeated the entire data science,

machine learning, AI community?

Well, part of it is an effect

that’s as simple as we’re all driving

on the right side of the road, right?

It’s compatibility.


And part of it is not quite as fundamental

as driving on the right side of the road,

which you have to do for safety reasons.

I mean, you have to agree on something.

They could have picked JavaScript or Perl.

There was a time in the early 2000s

that it really looked like Perl

was going to dominate like biosciences

because DNA search was all based on regular expressions

and Perl has the fastest

and most comprehensive regular expression engine, still does.

I spent quite a long time with Perl.

That was another letting go.


Letting go of this kind of data processing system.

The reasons why Python became the lingua franca

of scientific code and machine learning in particular

and data science, it really had a lot to do

with anything was better than C or C++.

Recently, a guy who worked

at Lawrence Livermore National Laboratory

in the sort of computing division wrote me his memoirs

and he had his own view of how he helped

something he called computational steering into existence.

And this was the idea that you take libraries

that in his days were written in Fortran

that solved universal mathematical problems

and those libraries still work

but the scientists that use the libraries

use them to solve continuously different

specific applications and answer different questions.

And so those poor scientists were required to use say Fortran

because Fortran was the language

that the library was written in.

And then the scientists would have to write an application

that sort of uses the library to solve a particular equation

or answer a set of questions

and the same for C++ because there’s interoperability.

So the dusty decks are written either in C++

or Fortran and so Paul Dubois was one of the people

who I think in the mid 90s saw

that you needed a higher level language

for the scientists to sort of tie together

the fundamental mathematical algorithms

of linear algebra and other stuff.

And so gradually some libraries started appearing

that did very fundamental stuff

with arrays of numbers in Python.

I mean, when I first created Python

I was not expecting it to be used for arrays of numbers much

I thought that was like an outdated data type

and everything was like objects and strings

and like Python was good

and fast at string manipulation and objects obviously

but arrays of numbers were not very efficient

and the multidimensional arrays didn’t even exist

in the language at all.

But there were people who realized

that Python had extensibility that was flexible enough

that they could write third party packages

that did support large arrays of numbers

and operations on them very efficiently.

And somehow they got a foothold

through sort of different parts of the scientific community.

I remember that the Hubble Space Telescope people

in Baltimore were somehow big Python fans in the late 90s.

And at various points, small improvements were made

and more people got in touch with using Python

to drive these libraries of interesting algorithms.

And like once you have a bunch of scientists

who are working on similar problems

say they’re all working on stuff that comes in

from the Hubble Space Telescope

but they’re looking at different things.

Some are looking at stars in this galaxy

others are looking at galaxies.

The math is completely different

but the underlying libraries are still the same.

And so they exchange code.

They say, well, I wrote this Python program

or I wrote a Python library to solve this class of problems.

And the other guys either say, oh, I can use that library too

or if you make a few changes, I can use that library too.

Why start from scratch in Perl or JavaScript

where there’s not that infrastructure

for arrays of numbers yet whereas in Python, you have it.

And so more and more scientists at different places

doing different work discovered Python

and then people who had an idea

for an important new fundamental library decided,

oh, Python is actually already known to our users.

So let’s use Python as the user interface.

I think that’s how TensorFlow,

I imagine at least that’s how TensorFlow ended up

with Python as the user interface.

Right, but with TensorFlow,

there’s a deeper history of what the community,

so it’s not just like what packages it needs.

It’s like what the community leans on

for a programming language

because TensorFlow had a prior library

that was internal to Google,

but there were also competing machine learning frameworks

like Theano, Caffe that were in Python.

There was some Scala, some other languages,

but Python was really dominating it.

And it’s interesting because there’s other languages

from the engineering space like MATLAB

that a lot of people used,

but different design choices by the company,

by the core developers led to it not spreading.

And one of the choices of MATLAB by MathWorks

is to not make it open source, right?

Or not having people pay.

It was a very expensive product

and so universities especially disliked it

because it was a price per seat, I remember hearing.

Yeah, but I think that’s not why it failed

or it failed to spread.

I think the universities didn’t like it,

but they would still pay for it.

The thing is it didn’t feed into that GitHub

open source packages culture.

So like, and that’s somehow a precondition

for viral spreading, the hacker culture,

like the tinkerer culture.

With Python, it feels like you can build a package

from scratch or solve a particular problem

and get excited about sharing that package with others.

And that creates an excitement about a language.

I tend to like Python’s approach to open source

in particular because it’s sort of,

it’s almost egalitarian.

There’s little hierarchy.

There’s obviously some because like you all need to decide

whether you drive on the left or the right side

of the road sometimes.

But there is a lot of access for people with little power.

You don’t have to work for a big tech company

to make a difference in the Python world.

We have affordable events that really care about community

and support people and sort of the community

is like a big deal at our conferences and in the PSF.

When the PSF funds events,

it’s always about growing the community.

The PSF funds very little development.

They do some, but most of the development,

most of the money that the PSF forks out

is to community fostering things.

So speaking of egalitarian,

last time we talked four years ago,

it was just after you stepped down from your role

as the benevolent dictator for life, BDFL.

Looking back, what are your insights and lessons

you learned from that experience

about the Python developer community, about human nature,

about human civilization, life itself?

Oh my, I probably held on to the position too long.

I remember being just extremely stressed for a long time

and it wasn’t very clear to me what was leading,

what was causing the stress.

And looking back, I should have sort of relinquished

my central role as BDFL sooner.

What were the pros and cons of the BDFL role?

Like what were the, you not relinquishing it,

what are the benefits of that for the community?

And what are the drawbacks?

Well, the benefits for the community would be things like

clarity of vision and sort of a clear direction

because I had certain ideas in mind when I created Python.

And while I sort of let myself be influenced

by many other ideas as Python evolved

and became more successful and more complex and more used,

I also stuck to certain principles.

And it’s still hard to say

what are Python’s core principles.

But the fact that I was playing that role

and sort of always very active

grew the community in a certain way.

It modeled to the community how to think

about how to solve a certain problem.


That was a source of stress, but it was also beneficial.

It was a source of stress for me personally,

but it was beneficial for the community

because people sort of over time had learned

how I was thinking and could predict

how I would decide about a particular issue

and not always perfectly, of course,

but there wasn’t a lot of jerking around.

Like this year, we’re all,

but this year the Democrats are in power

and we’re doing these kinds of things.

And now the Republicans are in power

and they roll all that back and do those kinds of things.

There is a clear, fairly straight path ahead.

And so fortunately the successor structure

with the steering council has sort of found a similar way

of leading the community in a fairly steady direction

without stagnating.

And for me personally, it’s more fun

because there are things I can just ignore.


Oh yeah, there’s a bug in multiprocessing.

Let someone else decide whether that’s important

to solve or not.

I’ll stick to typing and asyncio

and the faster interpreter.

Yeah, it allows you to focus a little bit more.


What are interesting differences in culture,

if you can comment on, between Google, Dropbox

and Microsoft from a Python programming perspective,

all places you’ve been to, the positive.

Well, is there a difference or is it just about people

and there’s great people everywhere

or are there cultural differences?

Sort of Dropbox is much smaller

than the other two in your list.


So that is a big difference.

The set of products they provide is narrower

so they’re more focused.

And yeah, and Dropbox sort of,

at least during the time I was there,

had the tendency of sort of making a big plan,

putting the whole company behind that plan for a year

and then evaluate and then suddenly find that

everything was wrong about the plan

and then they had to do something completely different.

And so there was like, the annual engineering reorg

was sort of an unpleasant tradition at Dropbox

because like, oh, there’s a new VP of engineering

and so now all the directors are being reshuffled

and this guy was in charge of infrastructure one year

and the next year he was made in charge of,

I don’t know, product development.

It’s fascinating because like,

you don’t think about these companies internally,

but I, you know, Dropbox to me from the very beginning

was one of my favorite services.

There’s certain like programs and online services

that make me happy, make me more efficient

and all that kind of stuff.

But one of the powers of those kinds of services,

they disappear.

You’re not supposed to think about how it all works,

but it’s incredible to me that you can sync stuff

effortlessly across so many machines so quickly

and like, don’t have to worry about conflicts.

They take care of the, you know,

as a person who comes from version control repositories

and all that kind of stuff,

where merging is super difficult

and just keeping different versions of different files

is very tricky.

The fact that they could take care of that is just,

I don’t know, the engineering behind the scenes

must be super difficult,

both on the compute infrastructure and the software.

A lot of internal sort of hand-wringing

about things like that,

but the product itself always worked very smoothly.

Yeah, but there’s probably a lot of lessons to that.

You can have a lot of turmoil inside

on the engineering side,

but if the product is good, the product is good.

And maybe don’t mess with that either.

You know, when it’s good,

keep, it’s like with Google,

focus on the search and the ads, right?

And the money will come.

Yeah, and make sure that’s done extremely well

and don’t forget what you do extremely well.

In what ways do you provide value and happiness

to the world?

Make sure you do that well.

Is there something else to say

about Google and Microsoft?

Microsoft has had a very fascinating shift recently

with a new CEO,

with, you know, recent CEO,

with purchasing GitHub,

embracing open source culture,

embracing the developer culture.

It’s pretty interesting to see.

That’s like why I joined Microsoft.

I mean, after retiring and thinking

that I would stay retired for the rest of my life,

which of course was a ridiculous thought,

but I was done working for a bit

and then the pandemic made me realize

that work can also provide a source of fulfillment,

keep you out of trouble.

Microsoft is a very interesting company

because it has this incredible,

very long and varied history

and this amazing catalog of products

many of which also date way back.

I mean,

I’ve been talking to a bunch of Excel people lately

and Excel is like 35 years old

and they can still read spreadsheets

that they might find on an old floppy drive.


Yeah, there’s, man,

there’ve been so many incredible tools through the years.

Excel, one of the great shames of my life

is that I’ve never learned how to use Excel well.

I mean, it just always felt like so many features are there.

It’s similar with IDEs like PyCharm.

It feels like I converge quickly

to the dumbest way to use a thing to get the job done

when clearly there’s so much more power at your fingertips.


But I do think there’s probably expert users of Excel.

Oh, Excel is a cash cow actually.

Oh, it actually brings in money.

Oh, yeah.

A lot of the engineering sort of,

if you look deep inside Excel,

there’s some very good engineering,

very, very impressive stuff.

Okay, now I need to definitely learn Excel a little better.

I had issues because I’m a keyboard person

so I had issues coming up with shortcuts.

I mean, Microsoft sometimes,

it’s changed over the years,

but sometimes they kind of wanna make things easier for you

on the surface and therefore make it harder

for people that like to have shortcuts

and all that kind of stuff to optimize their workflow.

Now, Excel’s probably, people are probably yelling at me.

It’s like, no, Excel probably has a lot of ways

to optimize workflow.

In fact, I keep discovering that there are many features

in Excel that only exist as keyboard shortcuts.

Yeah, that’s the sense I have.

I’m embarrassed that it’s just-

You just have to know what they are.

That’s like, there’s no logic or reason

to the assignment of the keyboard shortcuts

because they go back even longer than 35 years.

Can you maybe comment about Satya Nadella

and how hard it is for a CEO to sort of pivot a company

towards open source, towards developer culture?

Is there something you could say about like,

what’s the role of leadership in such a pivot

and definition of a new vision?

I’ve never met him, but I hear

he’s just a really sharp thinker,

but he also has an incredible business sense.

He took the organization that had very solid pieces,

but that was also struggling

with all sorts of shameful things,

especially in the Steve Ballmer time.

I imagine in part through his personal charm and thinking,

and of course the great trust

that the rest of the leadership has in him,

he managed to really turn the company around

and sort of change it from openly hostile to open source,

to actively embracing open source.

And that doesn’t mean that suddenly

Excel is going to go open source,

but that means that there’s room for a product like VS Code,

which is open source.

Yeah, that’s fascinating.

It gives me faith that large companies

with good leadership can grow, can expand,

can change and pivot and so on, develop,

because it gets harder and harder as the company gets large.

You wrote a blog post in response to a person

looking for advice about whether with a CS degree

to choose a nine to five job or to become an entrepreneur.

It’s an interesting question.

If you just think from first principles right now,

somebody who has spent a few years in programming,

has loved software engineering,

in some sense creating Python is an entrepreneurial endeavor.

That’s a choice that a lot of people

that are good programmers have to make.

Do I work for a big company or do I create something new?

Or you can work for a big company

and create something new there.

Oh, inside the…

Yeah, I mean, big companies have individuals

who create new stuff that eventually grows big all the time.

And if you’re the person that creates a new thing

and it grows big, you’ll have a chance

to move up quickly in the company, to run that thing.

If that’s your aspiration.

What can also happen is that someone is a brilliant engineer

and sort of builds a great first version of a product

and has no aspirations to then become a manager

and grow the team from five people to 20 people

to 100 people to 1000 people

and be in charge of hiring and meetings.

And they move on to inventing another crazy thing

inside the same company.

Or sometimes they found a startup

or they move to a different great large or small company.

There’s all sorts of models.

And sometimes people sort of do have this whole trajectory

from engineer buckling down, writing code,

not nine to five, but more like noon till midnight,

seven days a week, and coming up with a product

and sort of staying in charge.

I mean, if you take Drew Houston, Dropbox’s founder,

he is still the CEO.

And at least when I was there,

he had not checked out or anything.

He was a good CEO,

but he had started out as the technical inventor

or co-inventor.

And so he was someone who,

I don’t know if he always aspired that.

I think when he was 16, he already started a company.

So maybe he did, but he sort of,

it turned out that he did have the personal

sort of skillset needed to grow and stay on top.

And other people sort of are brilliant engineers

and horrible at management.

I count myself at least in the second category.

So your first love and still your love

is to be the quote-unquote individual contributor.

So the programmer.

Do you have advice for a programming beginner

on how to learn Python the right way?

Find something you actually want to do with it.

If you say, I want to learn skill X,

that’s not enough motivation.

You need to pick something,

and it can be a crazy problem you want to solve.

It can be completely unrealistic.

But something that challenges you

into actually learning coding in some language.

And there’s so many projects out there you can look for.

Like that doesn’t have to be some big ambitious thing.

It could be writing a small bot.

If you’re into social media,

you can write a Reddit bot or a Twitter bot

or some aspect of automating something

that you do every single day.

Processing files, all that kind of stuff.

Nowadays you can take machine learning components

and sort of plug those things together.

So you can do cool stuff with them.

So that’s actually a really good example.

So if you’re interested in machine learning,

the state of machine learning is such

that a tutorial that takes an hour

can get you to start using pre-trained models

to do something super cool.

And that’s a good way to learn Python

because you learn just enough to run this model.

And that’s like a sneaky way to get in there

to figure out how to import stuff,

how to write basic IO, how to run functions.

And I’m not sure if it’s the best way

to learn the basics in Python,

but it could be nice to just fall in love first

and then figure out the basics, right?

Yeah, you can’t expect to learn Python

from a one hour video.

I’m blanking out on the name of someone

who wrote a very funny blog post where he said,

I see all these ads for things like,

learn Python in 10 days or so.

And he said, the goal should be learn Python in 10 years.

That’s hilarious, but I completely disagree with that.

I think the criticism behind that

is that the places just like the blog post from earlier,

the places that tell you learn Python

in five minutes or 10 minutes,

they’re actually usually really bad tutorials.

So the thing is I do believe

that you can learn a thing in an hour

to get some interesting, quick, it hooks you.

But it just takes a tremendous amount of skill

to be that kind of educator.

Richard Feynman was able to condense a lot of ideas

in physics in a very short amount of time,

but that takes a deep, deep understanding.

And so yes, of course, the actual,

I think the 10 years is about the experience,

the pain along the way,

and there’s something fundamental.

Well, you have to practice.

You can memorize the syntax,

but, well, I couldn’t, but maybe someone else can,

but that doesn’t make you a coder.

Yeah, actually, coding has changed in fascinating ways

because so much of coding is copying, pasting

from Stack Overflow and then adjusting,

which is another way of coding.

And I don’t want to talk down

to that kind of style of coding

because it’s kind of nicely efficient.

But do you know where that is going?

Code generation?

No, seriously, GitHub Copilot.

Yeah, Copilot.

I use it every day and it-


Yeah, it writes a lot of code for me.

And usually it’s slightly wrong,

but it still saves me typing

because all I have to do is change one word

in a line of text that otherwise it generated perfectly.

And how many times are you looking for,

oh, what was I doing this morning?

I was looking for a begin marker

and I was looking for an end marker.

And so begin is blah, blah, blah, search for begin.

This is the begin token.

And then the next line I type E

and it completes the whole line with end instead of begin.

That’s a very simple example.
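
As an editorial aside, the begin/end marker search described above might look roughly like the sketch below. The function name and the marker strings are hypothetical and do not come from the conversation; the point is only that the "end" half closely mirrors the "begin" half, which is why a completion tool can fill it in.

```python
def extract_between(text, begin_marker="BEGIN", end_marker="END"):
    """Return the text between the first begin marker and the next end marker.

    Returns None if either marker is missing.
    """
    # Search for the begin token.
    begin = text.find(begin_marker)
    if begin == -1:
        return None
    begin += len(begin_marker)

    # Search for the end token, starting after the begin token.
    end = text.find(end_marker, begin)
    if end == -1:
        return None

    return text[begin:end]


print(extract_between("header BEGIN payload END footer"))
```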

Sometimes it sort of, if I name my function right,

it writes a five or 10 line function.

And you know Python enough

to very quickly then detect the issues.

So it becomes a really good dance partner then.

It doesn’t save me a lot of thinking,

but since I’m a poor typist,

I’m very much appreciative of all the typing it does for me

much better actually than the previous generation

of suggestions that are also still built into VS Code.

Where when you hit like a dot,

it tries to guess what the type is of the variable

to the left of the dot.

And then it gives you a list,

a drop-down menu of what the attributes of that object are.

But Copilot is much, much smoother than that.
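
As an editorial aside, the older style of completion described above boils down to "work out the object's type, then list its attributes". The sketch below imitates that at runtime with Python's built-in `dir()`. Real editors generally infer types statically without running the code, so this is an analogy under that assumption and not how VS Code implements it. The helper name is hypothetical.

```python
def public_attributes(obj):
    """List the attribute names a completion menu might offer after a dot."""
    # dir() returns the names available on the object; skip private/dunder names.
    return [name for name in dir(obj) if not name.startswith("_")]


# What might be offered after typing a dot on a list or a string.
print(public_attributes([])[:5])
print(public_attributes("")[:5])
```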

Well, it’s fascinating to hear that you use GitHub Copilot.

Do you think, do you worry about the future of that?

The automatic code generation,

the increasing amount of that kind of capability,

are programmers’ jobs threatened

or is there still a significant role for humans?

Are programmers’ jobs threatened

by the existence of Stack Overflow?

I don’t think so.

It helps you take care of the boring stuff.

And you shouldn’t try to use it to do something

that you have no way of understanding what you’re doing yet.

A tool like that is always best

when the question you’re asking is,

please remind me of how I do this,

which I could do, I could look up how to do it,

but right now I’ve forgotten

whether the method is called foo or bar

or what the shape of the API is.

Does it use a builder object or a constructor

or a factory or something else?

And what are the parameters?

It serves that role.

It’s like a great assistant,

but the creative work of sort of deciding

what you want the code to do is totally yours.

What do you think is the future of Python

in the next 10, 20, 50 years, 100 years?

You look forward, you ever imagine a future

of human civilization living inside the metaverse,

on Mars, humanoid robots everywhere?

What part does Python play in that?

It’ll eventually become sort of a legacy language

that plays an important role,

but that most people have never heard of

and don’t need to know about,

just like all kinds of basic structures in biology,

like mitochondria.

So it permeates all of life, all of digital life,

but people just build on top of it.

And they only know the stuff that’s on top of it.


You know, you build layers of abstractions.

I mean, most programmers nowadays

rarely need to do binary arithmetic, right?

Yeah, or even think about it, or even learn about it,

or they could go quite far without knowing.

I started building little digital circuits

out of NAND gates that I built myself

with transistors and resistors.

So I sort of, I feel very blessed

that with that start, when I was a teenager,

I learned some of the basic, at least concepts

that go into building a computer.

And I sort of, every part,

I have some understanding what it’s for

and why it’s there and how it works.

And I can forget about all that most of the time,

but I sort of, I enjoy knowing, oh, if you go deeper,

at some point you get to NAND gates

and half adders and shift registers.

And when it comes to the point of how do you actually make

a chip out of silicon, I have no idea.

That’s just magic to me.
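
As an editorial aside, the step from NAND gates to a half adder mentioned above can be sketched in a few lines. This is the standard five-gate textbook construction, simulated with 0/1 integers. The function names are illustrative, and nothing here is taken from the circuits the speaker actually built.

```python
def nand(a, b):
    """NAND gate on 0/1 inputs: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


def half_adder(a, b):
    """Add two bits using only NAND gates; returns (sum_bit, carry_bit)."""
    n1 = nand(a, b)
    n2 = nand(a, n1)
    n3 = nand(b, n1)
    sum_bit = nand(n2, n3)   # four NANDs together form XOR
    carry_bit = nand(n1, n1)  # NAND of a signal with itself inverts it, giving AND
    return sum_bit, carry_bit


for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b))
```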

But you enjoy knowing that you can walk a while

towards the lower and lower layers, but you don’t need to.

It’s nice.

The other day, as a sort of a mental exercise,

I was trying to figure out if I could build

a flip-flop circuit out of relays.

I was just sort of trying to remember,

oh, how does a relay work?

Yeah, there’s like this electromagnetic force

that pulls a switch open or shut.

And you can have like, it can open one switch

and shut another.

And you can have multiple contacts that go at once.

And how many relays do I really need

to sort of represent one bit of information?

Can the relay just feed on itself?

And it was, I don’t think I got to the final solution,

but it was fun that I could still do a little bit

of problem-solving and thinking at that level.
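
As an editorial aside, one common answer to "can the relay just feed on itself?" is a self-holding (seal-in) latch: one of the relay's own contacts keeps its coil energized after the set button is released, until a reset button breaks the circuit. The sketch below simulates that single-relay arrangement one time step at a time. Whether it matches the circuit the speaker had in mind is an assumption, and all names are hypothetical.

```python
def relay_step(held, set_pressed, reset_pressed):
    """Return whether the relay is energized after one time step.

    The coil is powered through either the set button or the relay's own
    holding contact, and the reset button interrupts both paths.
    """
    return (set_pressed or held) and not reset_pressed


state = False
history = []
# Press set, release everything, press reset, release everything.
for set_pressed, reset_pressed in [(True, False), (False, False),
                                   (False, True), (False, False)]:
    state = relay_step(state, set_pressed, reset_pressed)
    history.append(state)

print(history)
```

In this model the stored bit survives the step where no button is pressed, which is the "memory" behavior a flip-flop needs.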

And it’s cool how we build on top of each other.

So there’s people that are just,

you stood on the shoulders of giants

and there’s others who’ll stand on your shoulders.

And it’s a giant, beautiful hierarchy.

Yeah, I feel I sort of covered this middle layer

of the technology stack where it sort of peters out

below the level of NAND gates.

And at the top, I sort of, I lose track

when it gets to machine learning.

And then eventually the machine learning

will build higher and higher layers

that will help us understand the lowest layer

of the physics and thereby the universe figures out

how it itself works.

Maybe, maybe not.

Yeah, I did.

I mean, it’s possible.

I mean, if you think of human consciousness,

if that’s even the right concept,

it’s interesting that sort of,

we have this super parallel brain

that does all these incredible parallel operations

like image recognition.

I recognize your face.

There’s a huge amount of processing

that goes on in parallel.

There’s lots of nerves between my eyes and my brain

and the brain does a whole bunch of stuff all at once

because it’s actually really slow circuits,

but there are many of them that all work together.

On the other hand, when I’m speaking,

everything is completely sequential.

I have to sort of string words together one at a time.

And when I’m thinking about stuff,

when I’m understanding the world,

I’m also thinking of everything like one step at a time.

And so, we’ve sort of,

we’ve got all this incredible parallel circuitry

in our brains and eventually we use that

to simulate a single-threaded,

much, much higher level interpreter.

It’s exactly, I mean, that’s the illusion of it.

That’s the illusion of it for us

that it’s a single sequential set of thoughts

and all of that came from a single cell

through the process of embryogenesis.

So, DNA is the code.

DNA holds the entirety of the code,

the information and how to use that information

to build up an organism.

The entire, like, the arms, the legs, the brain.

You don’t buy a computer, you buy like a-

You buy a seed, a diagram.

And then you plant the computer

and it builds itself in almost the same way

and then does the computation

and then eventually dies.

It gets stale but gives birth to young computers

more and more and gives them lessons

but they figure stuff out on their own

and over time, it goes on that way.

And those computers, when they go to college,

try to figure out how to program

and they build their own little computers.

They’re increasingly more intelligent,

increasingly higher and higher levels of abstractions.

Isn’t it interesting that you sort of,

you see the same thing appearing at different levels though

because you have like cells that create new cells

and eventually that builds a whole organism

but then the animal or the plant or the human

has its own mechanism of replication.

That is sort of connected in a very complicated way

to the mechanism of replication of the cells.

And then if you look inside the cell,

if you see how DNA and proteins are connected,

then there is yet another completely different mechanism

whereby proteins are mass produced using enzymes

and a little bit of code from DNA.

And of course, viruses break into it at that level.

And while the mechanisms might be different,

it seems like the nature of the mechanism is the same

and it carries across natural languages

and programming languages, humans,

maybe even human civilizations

or intelligent civilizations

and then all the way down to the single cell organisms.

It is fascinating to see what abstraction levels

are built on top of individual humans

and how you have like whole societies

that sort of have a similar self-preservation,

I don’t know what it is, instinct, nature, abstraction

as the individuals have and the cells have.

And they self replicate and breed in different ways.

It’s hard for us humans to introspect it

because we’re very focused

on our particular layer of abstraction.

But from an alien perspective, looking on Earth,

they’ll probably see the higher level organism

of human civilization as part of this bigger organism

of life on Earth itself.

In fact, that could be an organism just alone,

just life, life, life on Earth.

This has been a wild,

both philosophical and technical conversation,

Guido, you’re an amazing human being.

You were gracious enough to talk to me

when I was first doing this podcast

and one of the earliest, first people I’ve talked to,

somebody I admired for a long time.

It’s just a huge honor that you did it at that time

and you do it again.

You’re awesome.

Thank you, Lex.

Thanks for listening to this conversation

with Guido van Rossum.

To support this podcast,

please check out our sponsors in the description.

And now, let me leave you with some words from Oscar Wilde.

Experience is the name that everyone gives

to their mistakes.

Thank you for listening and hope to see you next time.

