The following is a conversation with Guido van Rossum, creator of Python, one of the most popular
programming languages in the world, used in almost any application that involves computers
from web back end development to psychology, neuroscience, computer vision, robotics, deep
learning, natural language processing, and almost any subfield of AI. This conversation is part of
the MIT course on Artificial General Intelligence and the Artificial Intelligence podcast.
If you enjoy it, subscribe on YouTube, iTunes, or your podcast provider of choice, or simply connect
with me on Twitter at Lex Fridman, spelled F R I D. And now, here’s my conversation with Guido van
Rossum. You were born in the Netherlands in 1956. Your parents and the world around you were deeply
impacted by World War Two, as was my family from the Soviet Union. So with that context,
what is your view of human nature? Are some humans inherently good,
and some inherently evil? Or do we all have both good and evil within us?
Guido van Rossum Ouch, I did not expect such a deep one. I, I guess we all have good and evil
potential in us. And a lot of it depends on circumstances and context.
Lex Fridman Out of that world, at least on the Soviet Union side in Europe, sort of out of
suffering, out of challenge, out of that kind of set of traumatic events, often emerges beautiful
art, music, literature. In an interview I read or heard, you said you enjoyed Dutch literature
when you were a child. Can you tell me about the books that had an influence on you in your
childhood?
Guido van Rossum Well, as a teenager, my favorite Dutch author was a guy named Willem
Frederik Hermans, whose writing, certainly his early novels, was all about sort of
ambiguous things that happened during World War Two. I think he was a young adult during that time.
And he wrote about it a lot, and very interesting, very good books, I thought.
Lex Fridman In a nonfiction way?
Guido van Rossum No, it was all fiction, but it was
very much set in the ambiguous world of resistance against the Germans,
where often you couldn’t tell whether someone was truly in the resistance or really a spy for the
Germans. And some of the characters in his novels sort of crossed that line, and you never really
find out what exactly happened.
Lex Fridman And in his novels, is there always a
good guy and a bad guy, the nature of good and evil? Is it clear there’s a hero?
Guido van Rossum No, his heroes are often more,
his main characters are often anti heroes. And so they’re not very heroic. They’re often,
they fail at some level to accomplish their lofty goals.
Lex Fridman And looking at the trajectory
through the rest of your life, has literature, Dutch or English or in translation, had an impact
outside the technical world that you existed in?
Guido van Rossum I still read novels.
I don’t think that it impacts me that much directly.
Lex Fridman It doesn’t impact your work?
Guido van Rossum It’s a separate world.
My work is highly technical and sort of the world of art and literature doesn’t really
directly have any bearing on it.
Lex Fridman You don’t think there’s a creative element
to the design? You know, some would say design of a language is art.
Guido van Rossum I’m not disagreeing with that.
I’m just saying that sort of I don’t feel direct influences from more traditional art
on my own creativity.
Lex Fridman Right. Of course, that you don’t feel it doesn’t mean
it’s not somehow deeply there in your subconscious.
Guido van Rossum Who knows?
Lex Fridman Who knows? So let’s go back to your early
teens. Your hobbies were building electronic circuits, building mechanical models.
If you can just put yourself back in the mind of that young Guido, 12, 13, 14, was
that grounded in a desire to create a system? So to create something? Or was it more just
tinkering? Just the joy of puzzle solving?
Guido van Rossum I think it was more the latter, actually.
Maybe towards the end of my high school period, I felt confident enough that
I designed my own circuits that were sort of interesting somewhat. But a lot of that
time, I literally just took a model kit and followed the instructions, putting the things
together. I mean, I think the first few years that I built electronics kits, I really did
not have enough understanding of sort of electronics to really understand what I was doing. I mean,
I could debug it, and I could sort of follow the instructions very carefully, which has
always stayed with me. But I had a very naive model of, like, how to build a circuit,
of, like, how a transistor works. And I don’t think that in those days, I had any understanding
of coils and capacitors, which actually sort of was a major problem when I started to build
more complex digital circuits, because I was unaware of the sort of the analog part of
the – how they actually work. And I would have things that – the schematic looked
– everything looked fine, and it didn’t work. And what I didn’t realize was that
there was some megahertz level oscillation that was throwing the circuit off, because
I had a sort of – two wires were too close, or the switches were kind of poorly built.
Lex Fridman But through that time, I think it’s really interesting and instructive to think about,
because echoes of it are in this time now. So in the 1970s, the personal computer was
being born. So did you sense, in tinkering with these circuits, did you sense the encroaching
revolution in personal computing? So if at that point, we would sit you down and ask
you to predict the 80s and the 90s, do you think you would be able to do so successfully
to unroll the process that’s happening?
Guido van Rossum No, I had no clue. I remember, I think, in
the summer after my senior year – or maybe it was the summer after my junior year – well,
at some point, I think, when I was 18, I went on a trip to the Math Olympiad in Eastern
Europe, and there was like – I was part of the Dutch team, and there were other nerdy
kids that sort of had different experiences, and one of them told me about this amazing
thing called a computer. And I had never heard that word. My own explorations in electronics
were sort of about very simple digital circuits, and I had sort of – I had the idea that
I somewhat understood how a digital calculator worked. And so there is maybe some echoes
of computers there, but I never made that connection. I didn’t know that when my parents
were paying for magazine subscriptions using punched cards, that there was something called
a computer that was involved that read those cards and transferred the money between accounts.
I was also not really interested in those things. It was only when I went to university
to study math that I found out that they had a computer, and students were allowed to use
it.
And there were some – you’re supposed to talk to that computer by programming it.
What did that feel like, finding –
Yeah, that was the only thing you could do with it. The computer wasn’t really connected
to the real world. The only thing you could do was sort of – you typed your program
on a bunch of punched cards. You gave the punched cards to the operator, and an hour
later the operator gave you back your printout. And so all you could do was write a program
that did something very abstract. And I don’t even remember what my first forays into programming
were, but they were sort of doing simple math exercises and just to learn how a programming
language worked.
Did you sense, okay, first year of college, you see this computer, you’re able to have
a program and it generates some output. Did you start seeing the possibility of this,
or was it a continuation of the tinkering with circuits? Did you start to imagine that
one, the personal computer, but did you see it as something that is a tool, like a word
processing tool, maybe for gaming or something? Or did you start to imagine that it could
be going to the world of robotics, like the Frankenstein picture that you could create
an artificial being? There’s like another entity in front of you. You did not see the
computer.
I don’t think I really saw it that way. I was really more interested in the tinkering.
It’s maybe not a sort of a complete coincidence that I ended up sort of creating a programming
language which is a tool for other programmers. I’ve always been very focused on the sort
of activity of programming itself and not so much what happens with the program you
write.
Right.
I do remember, and I don’t remember, maybe in my second or third year, probably my second
actually, someone pointed out to me that there was this thing called Conway’s Game of Life.
You’re probably familiar with it. I think –
In the 70s, I think is when they came up with it.
So there was a Scientific American column by someone who did a monthly column about
mathematical diversions. I’m also blanking out on the guy’s name. It was very famous
at the time and I think up to the 90s or so. And one of his columns was about Conway’s
Game of Life and he had some illustrations and he wrote down all the rules and sort of
there was the suggestion that this was philosophically interesting, that that was why Conway had
called it that. And all I had was like the two pages photocopy of that article. I don’t
even remember where I got it. But it spoke to me and I remember implementing a version
of that game for the batch computer we were using where I had a whole Pascal program that
sort of read an initial situation from input and read some numbers that said do so many
generations and print every so many generations and then out would come pages and pages of
sort of things.
I remember much later I’ve done a similar thing using Python but that original version
I wrote at the time I found interesting because I combined it with some trick I had learned
during my electronics hobbyist times. I essentially first on paper I designed a simple circuit
built out of logic gates that took nine bits of input which is sort of the cell and its
neighbors and produced a new value for that cell and it’s like a combination of a half
adder and some other clipping. It’s actually a full adder. And so I had worked that out
and then I translated that into a series of Boolean operations on Pascal integers where
you could use the integers as bitwise values. And so I could basically generate 60 bits
of a generation in like eight instructions or so.
Nice.
So I was proud of that.
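The trick described above, computing many cells of a generation at once with Boolean operations on integers, can be sketched in Python. This is only an illustration under stated assumptions, not the original Pascal program: the names `full_add` and `step` are hypothetical, each row of the board is stored as one integer with one bit per cell, and cells outside the board are assumed dead.

```python
def full_add(a, b, c):
    """Bitwise full adder: (sum, carry) for every bit position at once."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def step(rows, width):
    """Compute one Game of Life generation; each row is an int bitmask."""
    mask = (1 << width) - 1
    padded = [0] + list(rows) + [0]  # dead border rows above and below
    result = []
    for i in range(1, len(padded) - 1):
        up, mid, down = padded[i - 1], padded[i], padded[i + 1]
        # The eight neighbor planes: shifting a row lines up the left and
        # right neighbors of every cell with that cell's own bit position.
        n = [(up << 1) & mask, up, up >> 1,
             (mid << 1) & mask, mid >> 1,
             (down << 1) & mask, down, down >> 1]
        # Add the eight one-bit planes into a bit-sliced neighbor count.
        s0, c0 = full_add(n[0], n[1], n[2])
        s1, c1 = full_add(n[3], n[4], n[5])
        s2, c2 = full_add(n[6], n[7], 0)
        ones, c3 = full_add(s0, s1, s2)   # weight-1 bit of the count
        t0, d0 = full_add(c0, c1, c2)
        twos, d1 = full_add(t0, c3, 0)    # weight-2 bit of the count
        four_or_more = d0 | d1            # set when the count is 4 or more
        # Alive next if count == 3, or count == 2 and the cell is alive now.
        result.append(twos & ~four_or_more & (ones | mid) & mask)
    return result


if __name__ == "__main__":
    blinker = [0b00000, 0b00100, 0b00100, 0b00100, 0b00000]
    for row in step(blinker, 5):
        print(format(row, "05b"))
```

Because Python integers are arbitrarily wide, the same handful of operations updates a whole row regardless of its width, which is the same idea as handling 60 cells per machine word.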
It’s funny that you mentioned, so for people who don’t know Conway’s Game of Life, it’s
a cellular automaton where there’s single compute units that kind of look at their neighbors
and figure out what they look like in the next generation based on the state of their
neighbors and this is a deeply distributed system in concept at least. And then there’s simple
rules that all of them follow and somehow out of this simple rule when you step back
and look at what occurs, it’s beautiful. There’s an emergent complexity. Even though the underlying
rules are simple, there’s an emergent complexity. Now the funny thing is you’ve implemented
this and the thing you’re commenting on is you’re proud of a hack you did to make it
run efficiently. What you’re not commenting on is that it’s a beautiful implementation, you’re
not commenting on the fact that there’s an emergent complexity, that you’ve coded a simple
program and when you step back and you print out generation after generation,
stuff that you may not have predicted would happen is happening.
And is that magic? I mean, that’s the magic that all of us feel when we program. When
you create a program and then you run it and whether it’s Hello World or it shows something
on screen, if there’s a graphical component, are you seeing the magic in the mechanism
of creating that?
I think I went back and forth. As a student, we had an incredibly small budget of computer
time that we could use. It was actually measured. I once got in trouble with one of my professors
because I had overspent the department’s budget. It’s a different story.
I actually wanted the efficient implementation because I also wanted to explore what would
happen with a larger number of generations and a larger size of the board. Once the implementation
was flawless, I would feed it different patterns and then I think maybe there was a follow
up article where there were patterns that were like gliders, patterns that repeated
themselves after a number of generations but translated one or two positions to the right
or up or something like that. I remember things like glider guns. Well, you can Google Conway’s
Game of Life. People still go aww and ooh over it.
For a reason because it’s not really well understood why. I mean, this is what Stephen
Wolfram is obsessed about. We don’t have the mathematical tools to describe the kind of
complexity that emerges in these kinds of systems. The only thing you can do is to run
it.
I’m not convinced that it’s sort of a problem that lends itself to classic mathematical
analysis.
One theory of how you create an artificial intelligence or artificial being is you kind
of have to, same with the Game of Life, you kind of have to create a universe and let
it run. That creating it from scratch in a design way, coding up a Python program that
creates a fully intelligent system may be quite challenging. You might need to create
a universe just like the Game of Life.
You might have to experiment with a lot of different universes before there is a set
of rules that doesn’t essentially always just end up repeating itself in a trivial
way.
Yeah, and Stephen Wolfram works with these simple rules, says that it’s kind of surprising
how quickly you find rules that create interesting things. You shouldn’t be able to, but somehow
you do. And so maybe our universe is laden with rules that will create interesting things
that might not look like humans, but emergent phenomena that’s interesting may not be as
difficult to create as we think.
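As a small illustration of how little rule it takes, here is a sketch of a one-dimensional, two-state cellular automaton of the kind Wolfram studies. The choice of Rule 30 as the example, the function name `ca_step`, and the assumption that cells beyond the edges are 0 are all illustrative choices, not something stated in the conversation.

```python
def ca_step(cells, rule=30):
    """Apply one step of an elementary cellular automaton.

    The new value of a cell depends only on (left, center, right); that
    3-bit neighborhood selects one bit of the 8-bit rule number.
    """
    padded = [0] + list(cells) + [0]  # assume cells beyond the edges are 0
    out = []
    for i in range(1, len(padded) - 1):
        left, center, right = padded[i - 1], padded[i], padded[i + 1]
        index = (left << 2) | (center << 1) | right
        out.append((rule >> index) & 1)
    return out


if __name__ == "__main__":
    row = [0] * 15 + [1] + [0] * 15
    for _ in range(12):
        print("".join("#" if c else "." for c in row))
        row = ca_step(row)
```

Running it prints a triangle whose interior quickly stops looking regular, even though the whole rule fits in a single byte.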
Sure.
But let me sort of ask, at that time, some of the world, at least in popular press, was
kind of captivated, perhaps at least in America, by the idea of artificial intelligence, that
these computers would be able to think pretty soon. And did that touch you at all? In science
fiction or in reality in any way?
I didn’t really start reading science fiction until much, much later. I think as a teenager
I read maybe one bundle of science fiction stories.
Was it in the background somewhere, like in your thoughts?
The idea of using computers to build something intelligent always felt unconvincing to me, because
I felt I had so much understanding of what actually goes on inside a computer. I knew
how many bits of memory it had and how difficult it was to program. And sort of, I didn’t believe
at all that you could just build something intelligent out of that, that would really
sort of satisfy my definition of intelligence. I think the most influential thing that I
read in my early twenties was Gödel, Escher, Bach. That was about consciousness, and that was
a big eye opener in some sense.
In what sense? So, on your own brain, did you at the time or do you now see your own
brain as a computer? Or is there a total separation of the way? So yeah, you very pragmatically,
practically know the limits of memory, the limits of this sequential computing or weakly
parallelized computing, and you just know what we have now, and it’s hard to see how it creates.
But it’s also easy to see, it was in the 40s, 50s, 60s, and now at least similarities between
the brain and our computers.
Oh yeah, I mean, I totally believe that brains are computers in some sense. I mean, the rules
they use to play by are pretty different from the rules we can sort of implement in our
current hardware, but I don’t believe in, like, a separate thing that infuses us with
intelligence or consciousness or any of that. There’s no soul, I’ve been an atheist
probably from when I was 10 years old, just by thinking a bit about math and the universe,
and well, my parents were atheists. Now, I know that you could be an atheist and still
believe that there is something sort of about intelligence or consciousness that cannot
possibly emerge from a fixed set of rules. I am not in that camp. I totally see that,
sort of, given how many millions of years evolution took its time, DNA is a particular
machine that sort of encodes information and an unlimited amount of information in chemical
form and has figured out a way to replicate itself.
I thought that that was, maybe it’s 300 million years ago, but I thought it was closer
to half a billion years ago, that that’s sort of originated and it hasn’t really changed,
that the sort of the structure of DNA hasn’t changed ever since. That is like our binary
code that we have in hardware. I mean…
The basic programming language hasn’t changed, but maybe the programming itself…
Obviously, it did sort of, it happened to be a set of rules that was good enough to
sort of develop endless variability and sort of the idea of self replicating molecules
competing with each other for resources and one type eventually sort of always taking
over. That happened before there were any fossils, so we don’t know how that exactly
happened, but I believe it’s clear that that did happen.
Can you comment on consciousness and how you see it? Because I think we’ll talk about
programming quite a bit. We’ll talk about, you know, intelligence connecting to programming
fundamentally, but consciousness is this whole other thing. Do you think about it often as
a developer of a programming language and as a human?
Those are pretty sort of separate topics. Sort of my line of work working with programming
does not involve anything that goes in the direction of developing intelligence or consciousness,
but sort of privately as an avid reader of popular science writing, I have some thoughts
which is mostly that I don’t actually believe that consciousness is an all or nothing thing.
I have a feeling that, and I forget what I read that influenced this, but I feel that
if you look at a cat or a dog or a mouse, they have some form of intelligence. If you
look at a fish, it has some form of intelligence, and that evolution just took a long time,
but I feel that the sort of evolution of more and more intelligence that led to sort of
the human form of intelligence followed the evolution of the senses, especially the visual
sense. I mean, there is an enormous amount of processing that’s needed to interpret
a scene, and humans are still better at that than computers are.
And I have a feeling that there is a sort of, the reason that like mammals in particular
developed the levels of consciousness that they have and that eventually sort of going
from intelligence to self awareness and consciousness has to do with sort of being a robot that
has very highly developed senses.
Has a lot of rich sensory information coming in, so that’s a really interesting thought
that whatever that basic mechanism of DNA, whatever that basic building blocks of programming,
if you just add more abilities, more high resolution sensors, more sensors, you just
keep stacking those things on top that this basic programming in trying to survive develops
very interesting things that start to us humans to appear like intelligence and consciousness.
As far as robots go, I think that the self driving cars have that sort of the greatest
opportunity of developing something like that, because when I drive myself, I don’t just
pay attention to the rules of the road.
I also look around and I get clues from that, oh, this is a shopping district, oh, here’s
an old lady crossing the street, oh, here is someone carrying a pile of mail, there’s
a mailbox, I bet you they’re going to cross the street to reach that mailbox.
And I slow down, and I don’t even think about that.
And so, there is so much where you turn your observations into an understanding of what
other consciousnesses are going to do, or what other systems in the world are going
to be, oh, that tree is going to fall.
I see sort of, I see much more of, I expect somehow that if anything is going to become
conscious, it’s going to be the self driving car and not the network of a bazillion computers
in a Google or Amazon data center that are all networked together to do whatever they
do.
So, in that sense, so you actually highlight, because that’s what I work on, autonomous vehicles,
you highlight the big gap between what we currently can do and what we truly need
to be able to do to solve the problem.
Under that formulation, then consciousness and intelligence is something that basically
a system should have in order to interact with us humans, as opposed to some kind of
abstract notion of a consciousness.
Consciousness is something that you need to have to be able to empathize, to be able to
fear, understand what the fear of death is, all these aspects that are important for interacting
with pedestrians, you need to be able to do basic computation based on our human desires
and thoughts.
And if you sort of, yeah, if you look at the dog, the dog clearly knows, I mean, I’m
not the dog owner, but I have friends who have dogs, the dogs clearly know what the
humans around them are going to do, or at least they have a model of what those humans
are going to do and they learn.
Some dogs know when you’re going out and they want to go out with you, they’re sad when
you leave them alone, they cry, they’re afraid because they were mistreated when they were
younger.
We don’t assign sort of consciousness to dogs, or at least not all that much, but I also
don’t think they have none of that.
So I think it’s consciousness and intelligence are not all or nothing.
The spectrum is really interesting.
But in returning to programming languages and the way we think about building these
kinds of things, about building intelligence, building consciousness, building artificial
beings.
So I think one of the exciting ideas came in the 17th century with Leibniz, Hobbes,
Descartes, where there’s this feeling that you can convert all thought, all reasoning,
all the things that we find very special in our brains, you can convert all of that into
logic.
So you can formalize it, formal reasoning, and then once you formalize everything, all
of knowledge, then you can just calculate and that’s what we’re doing with our brains
is we’re calculating.
So there’s this whole idea that this is possible, that we can actually program this.
But they weren’t aware of the concept of pattern matching in the sense that we are aware of
it now.
They sort of thought they had discovered incredible bits of mathematics like Newton’s calculus
and their sort of idealism, their sort of extension of what they could do with logic
and math sort of went along those lines and they thought there’s like, yeah, logic.
There’s like a bunch of rules and a bunch of input.
They didn’t realize that how you recognize a face is not just a bunch of rules but is
a shit ton of data plus a circuit that sort of interprets the visual clues and the context
and everything else and somehow can massively parallel pattern match against stored rules.
I mean, if I see you tomorrow here in front of the Dropbox office, I might recognize you.
Even if I’m wearing a different shirt?
Yeah, but if I see you tomorrow in a coffee shop
in Belmont, I might have no idea that it was you or on the beach or whatever.
I make those kind of mistakes myself all the time.
I see someone that I only know as like, oh, this person is a colleague of my wife’s and
then I see them at the movies and I didn’t recognize them.
But do you see those, you call it pattern matching, do you think that rules are unable
to encode that?
Everything you see, all the pieces of information you look around this room, I’m wearing a black
shirt, I have a certain height, I’m a human, all these, there’s probably tens of thousands
of facts you pick up moment by moment about this scene.
You take them for granted and you aggregate them together to understand the scene.
You don’t think all of that could be encoded to where at the end of the day, you can just
put it all on the table and calculate?
I don’t know what that means.
I mean, yes, in the sense that there is no actual magic there, but there are enough layers
of abstraction from the facts as they enter my eyes and my ears to the understanding of
the scene that I don’t think that AI has really covered enough of that distance.
It’s like if you take a human body and you realize it’s built out of atoms, well, that
is a uselessly reductionist view, right?
The body is built out of organs, the organs are built out of cells, the cells are built
out of proteins, the proteins are built out of amino acids, the amino acids are built
out of atoms and then you get to quantum mechanics.
So that’s a very pragmatic view.
I mean, obviously as an engineer, I agree with that kind of view, but you also have
to consider the Sam Harris view of, well, intelligence is just information processing.
Like you said, you take in sensory information, you do some stuff with it and you come up
with actions that are intelligent.
That makes it sound so easy.
I don’t know who Sam Harris is.
Oh, well, he’s a philosopher.
So like this is how philosophers often think, right?
And essentially that’s what Descartes was, is wait a minute, if there is, like you said,
no magic, so he basically says it doesn’t appear like there’s any magic, but we know
so little about it that it might as well be magic.
So just because we know that we’re made of atoms, just because we know we’re made
of organs, the fact that we know very little how to get from the atoms to organs in a way
that’s recreatable means that you shouldn’t get too excited just yet about the fact that
you figured out that we’re made of atoms.
Right, and the same about taking facts as our sensory organs take them in and turning
that into reasons and actions, that sort of, there are a lot of abstractions that we haven’t
quite figured out how to deal with those.
I mean, sometimes, I don’t know if I can go on a tangent or not, so if I take a simple
program that parses, say I have a compiler that parses a program, in a sense the input
routine of that compiler, of that parser, is a sensing organ, and it builds up a mighty
complicated internal representation of the program it just saw, it doesn’t just have
a linear sequence of bytes representing the text of the program anymore, it has an abstract
syntax tree, and I don’t know how many of your viewers or listeners are familiar with
compiler technology, but there’s…
Fewer and fewer these days, right?
That’s also true, probably.
People want to take a shortcut, but there’s sort of, this abstraction is a data structure
that the compiler then uses to produce output that is relevant, like a translation of that
program to machine code that can be executed by hardware, and then that data structure
gets thrown away.
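The pipeline described here, text in, a tree-shaped internal representation, executable output, and then the tree discarded, can be seen in miniature with Python's standard `ast` module. This is a minimal sketch: the helper name `run_source` and the sample source string are assumptions made for illustration.

```python
import ast


def run_source(source):
    """Parse text into an abstract syntax tree, compile it, and execute it.

    Returns the tree (for inspection) and the namespace the code ran in.
    """
    tree = ast.parse(source)                  # text -> abstract syntax tree
    code = compile(tree, "<example>", "exec")  # tree -> byte code
    namespace = {}
    exec(code, namespace)                     # byte code runs in namespace
    return tree, namespace


if __name__ == "__main__":
    tree, namespace = run_source("result = (2 + 3) * 7")
    # The tree is a nested structure, not a linear sequence of characters.
    print(ast.dump(tree))
    print(namespace["result"])
```

In normal use nobody keeps the tree around: once the byte code exists, the intermediate data structure is simply garbage collected.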
When a fish or a fly sees, sort of gets visual impulses, I’m sure it also builds up some
data structure, and for the fly that may be very minimal, a fly may have only a few, I
mean, in the case of a fly’s brain, I could imagine that there are few enough layers of
abstraction that it’s not much more than when it’s darker here than it is here, well
it can sense motion, because a fly sort of responds when you move your arm towards it,
so clearly its visual processing is intelligent, well, not intelligent, but it has an abstraction
for motion, and we still have similar things in, but much more complicated in our brains,
I mean, otherwise you couldn’t drive a car if you couldn’t, if you didn’t have an
incredibly good abstraction for motion.
Yeah, in some sense, the same abstraction for motion is probably one of the primary
sources of our, of information for us, we just know what to do, I think we know what
to do with that, we’ve built up other abstractions on top.
We build much more complicated data structures based on that, and we build more persistent
data structures, sort of after some processing, some information sort of gets stored in our
memory pretty much permanently, and is available on recall, I mean, there are some things that
you sort of, you’re conscious that you’re remembering it, like, you give me your phone
number, I, well, at my age I have to write it down, but I could imagine, I could remember
those seven numbers, or ten digits, and reproduce them in a while, if I sort of repeat them
to myself a few times, so that’s a fairly conscious form of memorization.
On the other hand, how do I recognize your face, I have no idea.
My brain has a whole bunch of specialized hardware that knows how to recognize faces,
I don’t know how much of that is sort of coded in our DNA, and how much of that is
trained over and over between the ages of zero and three, but somehow our brains know
how to do lots of things like that, that are useful in our interactions with other humans,
without really being conscious of how it’s done anymore.
Right, so our actual day to day lives, we’re operating at the very highest level of abstraction,
we’re just not even conscious of all the little details underlying it.
There’s compilers on top of, it’s like turtles on top of turtles, or turtles all the way
down, there’s compilers all the way down, but that’s essentially, you say that there’s
no magic, that’s what I was trying to get at, I think, is that Descartes started
this whole train of saying that there’s no magic, I mean, there’s all this beforehand.
Well didn’t Descartes also have the notion though that the soul and the body were fundamentally
separate?
Separate, yeah, I think he had to write in God in there for political reasons, so I don’t
know actually, I’m not a historian, but there’s notions in there that all of reasoning, all
of human thought can be formalized.
I think that continued in the 20th century with Russell and with Gödel’s incompleteness
theorem, this debate of what are the limits of the things that could be formalized, that’s
where the Turing machine came along, and this exciting idea, I mean, underlying a lot of
computing that you can do quite a lot with a computer.
You can encode a lot of the stuff we’re talking about in terms of recognizing faces and so
on, theoretically, in an algorithm that can then run on a computer.
And in that context, I’d like to ask programming in a philosophical way, what does it mean
to program a computer?
So you said you write a Python program or a C++ program that compiles to some
byte code, it’s forming layers, you’re programming a layer of abstraction that’s higher, how
do you see programming in that context?
Can it keep getting higher and higher levels of abstraction?
I think at some point the higher levels of abstraction will not be called programming
and they will not resemble what we call programming at the moment.
There will not be source code, I mean, there will still be source code sort of at a lower
level of the machine, just like there are still molecules and electrons and sort of
proteins in our brains, but, and so there’s still programming and system administration
and who knows what, to keep the machine running, but what the machine does is a different level
of abstraction in a sense, and as far as I understand the way that for the last decade
or more people have made progress with things like facial recognition or the self driving
cars is all by endless, endless amounts of training data where at least as a lay person,
and I feel myself totally as a lay person in that field, it looks like the researchers
who publish the results don’t necessarily know exactly how their algorithms work, and
I often get upset when I sort of read a sort of a fluff piece about Facebook in the newspaper
or social networks and they say, well, algorithms, and that’s like a totally different interpretation
of the word algorithm, because for me, the way I was trained or what I learned when I
was eight or ten years old, an algorithm is a set of rules that you completely understand
that can be mathematically analyzed and you can prove things.
You can like prove that the sieve of Eratosthenes produces all prime numbers and only prime
numbers.
Yeah.
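To make the sieve example concrete, here is a minimal Python sketch of the sieve of Eratosthenes. The function name `primes_up_to` is chosen for illustration and does not come from the conversation. It is the kind of algorithm described above: a short set of rules you can read in full and reason about.

```python
def primes_up_to(limit: int) -> list[int]:
    """Return every prime p with 2 <= p <= limit, in increasing order."""
    if limit < 2:
        return []
    # is_prime[n] stays True until n is crossed out as a multiple of a smaller prime.
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Smaller multiples of p were already crossed out by smaller primes.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Every composite number up to the limit has a prime factor no larger than its square root, so it gets crossed out, and a prime is never a multiple of a smaller prime, so it never does. That is the sense in which the output can be proved to be all primes and only primes.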
So I don’t know if you know who Andrej Karpathy is.
I’m afraid not.
So he’s the head of AI at Tesla now, but he was at Stanford before and he has this cheeky
way of calling this concept software 2.0.
So let me disentangle that for a second.
So kind of what you’re referring to is the traditional, the algorithm, the concept of
an algorithm, something that’s there, it’s clear, you can read it, you understand it,
you can prove it’s functioning as kind of software 1.0.
And what software 2.0 is, is exactly what you described, which is you have neural networks,
which is a type of machine learning that you feed a bunch of data and that neural network
learns to do a function.
All you specify is the inputs and the outputs you want and you can’t look inside.
You can’t analyze it.
All you can do is train this function to map the inputs to the outputs by giving a lot
of data.
And so programming becomes getting a lot of data.
That’s what programming is.
Well, that would be programming 2.0.
Programming 2.0.
I wouldn’t call that programming.
It’s just a different activity.
Just like building organs out of cells is not called chemistry.
Well, so let’s just step back and think sort of more generally, of course.
But you know, it’s like as a parent teaching your kids, things can be called programming.
In that same sense, that’s how programming is being used.
You’re providing them data, examples, use cases.
So imagine writing a function not with for loops and clearly readable text,
but more saying, well, here’s a lot of examples of what this function should take.
And here’s a lot of examples of what it should do when it takes those inputs.
And then figure out the rest.
So that’s the 2.0 concept.
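The contrast can be sketched in a few lines of Python. This is a deliberately tiny stand-in, not a neural network: it fits a straight line to example pairs by ordinary least squares, so unlike a real network it remains fully inspectable. The names `fit_line`, `fahrenheit_by_hand`, and `fahrenheit_learned` are hypothetical, chosen only for this illustration.

```python
def fit_line(examples):
    """Fit y = slope * x + intercept to (x, y) pairs by ordinary least squares."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in examples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# "Software 1.0" style: the rule is written out by hand.
def fahrenheit_by_hand(celsius):
    return celsius * 9 / 5 + 32


# "Software 2.0" style: only input/output examples are given; the rule is fitted.
examples = [(-40.0, -40.0), (0.0, 32.0), (37.0, 98.6), (100.0, 212.0)]
fahrenheit_learned = fit_line(examples)

print(fahrenheit_by_hand(20.0))
print(fahrenheit_learned(20.0))  # approximately the same value
```

In the first version the conversion rule is in the source code. In the second, the source code only says how to fit a rule, and the behavior comes from the data.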
And so the question I have for you is like, it’s a very fuzzy way.
This is the reality of a lot of these pattern recognition systems and so on.
It’s a fuzzy way of quote unquote programming.
What do you think about this kind of world?
Should it be called something totally different than programming?
If you’re a software engineer, does that mean you’re designing systems that can
be systematically tested and evaluated, that have a very specific specification, and then this
other fuzzy software 2.0 world, machine learning world, that’s something else totally?
Or is there some intermixing that’s possible?
Well the question is probably only being asked because we don’t quite know what that software
2.0 actually is.
And I think there is a truism that every task that AI has tackled in the past, at some point
we realized how it was done and then it was no longer considered part of artificial intelligence
because it was no longer necessary to use that term.
It was just, oh now we know how to do this.
And a new field of science or engineering has been developed and I don’t know if sort
of every form of learning or sort of controlling computer systems should always be called programming.
So I don’t know, maybe I’m focused too much on the terminology.
But I expect that there just will be different concepts where people with sort of different
education and a different model of what they’re trying to do will develop those concepts.
I guess if you could comment on another way to put this concept: I think with the kind of
functions that neural networks provide, you’re not able to prove upfront
that this will work for all cases you throw at it.
It’s worst case analysis versus average case analysis.
All you’re able to say is it seems on everything we’ve tested to work 99.9% of the time, but
we can’t guarantee it and it fails in unexpected ways.
We can’t even give you examples of how it fails in unexpected ways, but it’s like really
good most of the time.
Is there no room for that in current ways we think about programming?
Programming 1.0 is actually sort of getting to that point too, where the ideal
of a bug free program has been abandoned long ago by most software developers.
We only care about bugs that manifest themselves often enough to be annoying.
And we’re willing to take the occasional crash or outage or incorrect result for granted
because we can’t possibly, we don’t have enough programmers to make all the code bug free
and it would be an incredibly tedious business.
And if you try to throw formal methods at it, it becomes even more tedious.
So every once in a while the user clicks on a link and somehow they get an error and the
average user doesn’t panic.
They just click again and see if it works better the second time, which often magically
it does, or they go up and they try some other way of performing their tasks.
So that’s sort of an end to end recovery mechanism, and inside systems there are all
sorts of retries and timeouts and fallbacks, and I imagine that biological
systems are even more full of that because otherwise they wouldn’t survive.
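A minimal sketch of the retry and fallback pattern mentioned here, in Python. Everything in it is hypothetical and for illustration only: `call_with_retries`, `FlakyService`, and the "cached result" fallback are not taken from any real system, and timeouts are left out to keep the sketch short.

```python
import time


def call_with_retries(operation, fallback, attempts=3, delay_seconds=0.0):
    """Call operation(); on failure retry up to `attempts` times, then use fallback()."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            # Wait before the next try, but not after the final failed attempt.
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    return fallback()


class FlakyService:
    """Hypothetical service that fails a fixed number of times, then succeeds."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.calls = 0

    def fetch(self):
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("temporary failure")
        return "fresh result"


flaky = FlakyService(failures_before_success=2)
print(call_with_retries(flaky.fetch, fallback=lambda: "cached result"))   # fresh result

broken = FlakyService(failures_before_success=10)
print(call_with_retries(broken.fetch, fallback=lambda: "cached result"))  # cached result
```

The caller never sees the two transient failures in the first case, which is the same effect as the user who clicks the link a second time and finds that it works.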
Do you think programming should be taught and thought of as exactly what you just said?
Where I come from, you’re kind of always denying that fact.
In sort of basic programming education, the sort of the programs you’re having students
write are so small and simple that if there is a bug you can always find it and fix it.
Because the sort of programming as it’s being taught in some, even elementary, middle schools,
in high school, introduction to programming classes in college typically, it’s programming
in the small.
Very few classes sort of actually teach software engineering, building large systems.
Every summer here at Dropbox we have a large number of interns.
Every tech company on the West Coast has the same thing.
These interns are always amazed because this is the first time in their life that they
see what goes on in a really large software development environment.
Everything they’ve learned in college was almost always about a much smaller scale and
somehow that difference in scale makes a qualitative difference in how you do things and how you
think about it.
If you then take a few steps back, a few decades, to the 70s and 80s, when you were first thinking
about Python or just that world of programming languages, did you ever think that there would
be systems as large as underlying Google, Facebook, and Dropbox?
Did you, when you were thinking about Python?
I was actually always caught by surprise by sort of this, yeah, pretty much every stage
of computing.
So maybe just because you’ve spoken about it in other interviews, but I think the evolution of programming
languages is fascinating, and especially because it leads, from my perspective, towards
greater and greater degrees of intelligence.
The first programming language I played with in Russia was Logo, with the
turtle.
Logo, yeah.
And if you look, I just have a list of programming languages, all of which I’ve now played with
a little bit.
I mean, they’re all beautiful in different ways, from Fortran, COBOL, Lisp, Algol 60,
Basic, Logo again, C, to name a few. Then the object oriented languages came along in the 60s: Simula, Pascal,
Smalltalk.
All of that leads.
They’re all the classics.
The classics.
Yeah.
The classic hits, right?
Scheme, that’s built on top of Lisp.
On the database side, SQL, C++, and all of that leads up to Python, Pascal too, and that’s
before Python, MATLAB, these kind of different communities, different languages.
So can you talk about that world?
I know that sort of Python came out of ABC, a language I actually never knew.
I just, having done research for this conversation, went back to ABC, and
it has a lot of annoying qualities, like all caps and so on, but underneath
that, there are elements of Python that are already there.
That’s where I got all the good stuff.
All the good stuff.
So, but in that world, you’re swimming in these programming languages, were you focused on
just the good stuff in your specific circle, or did you have a sense of what is everyone
chasing?
You said that every programming language is built to scratch an itch.
Were you aware of all the itches in the community?
And if not, or if yes, I mean, what itch were you trying to scratch with Python?
Well, I’m glad I wasn’t aware of all the itches because I would probably not have been able
to do anything.
I mean, if you’re trying to solve every problem at once, you’ll solve nothing.
Well, yeah, it’s too overwhelming.
And so I had a very, very focused problem.
I wanted a programming language that sat somewhere in between shell scripting and C. And now,
arguably, there is like, one is higher level, one is lower level.
And Python is sort of a language of an intermediate level, although it’s still pretty much at
the high level end.
I was thinking much more about, I want a tool that I can use to be more productive
as a programmer in a very specific environment.
And I also had given myself a time budget for the development of the tool.
And that was sort of about three months for both the design, like thinking through what
are all the features of the language syntactically and semantically, and how do I implement the
whole pipeline from parsing the source code to executing it.
So I think both with the timeline and the goals, it seems like productivity was at the
core of it as a goal.
So like, for me in the 90s, and the first decade of the 21st century, I was always doing
machine learning, AI programming for my research was always in C++.
And then the other people who are a little more mechanical engineering, electrical engineering,
are MATLABby.
They’re a little bit more MATLAB focused.
Those are the worlds, and maybe a little bit of Java too,
for people who are more interested in emphasizing the object oriented nature of things.
So within the last 10 years or so, especially with the oncoming of neural networks and these
packages that are built on Python to interface with neural networks, I switched to Python
and it’s just, I’ve noticed a significant boost that I can’t exactly, because I don’t
think about it, but I can’t exactly put into words why I’m just much, much more productive.
Just being able to get the job done much, much faster.
So how do you think, whatever that qualitative difference is, I don’t know if it’s quantitative,
it could be just a feeling, I don’t know if I’m actually more productive, but how
do you think about…
You probably are.
Yeah.
Well, that’s right.
I think there’s elements. Let me just speak to one aspect that I think was affecting
my productivity: with C++, I really enjoyed creating performant code and creating a beautiful
structure, especially with the newer and newer standards of templated programming, just really
creating this beautiful formal structure, and I found myself spending most of my time doing
that as opposed to parsing a file and extracting a few keywords or whatever the task was.
So what is it about Python?
How do you think of productivity in general as you were designing it now, sort of through
the decades, last three decades, what do you think it means to be a productive programmer?
And how did you try to design it into the language?
There are different tasks and as a programmer, it’s useful to have different tools available
that sort of are suitable for different tasks.
So I still write C code, I still write shell code, but I write most of my things in Python.
Why do I still use those other languages, because sometimes the task just demands it.
And well, I would say most of the time the task actually demands a certain language because
the task is not write a program that solves problem X from scratch, but it’s more like
fix a bug in existing program X or add a small feature to an existing large program.
But even if you’re not constrained in your choice of language by context like that, there
is still the fact that if you write it in a certain language, then you have this balance
between how long does it take you to write the code and how long does the code run?
And when you’re in the phase of exploring solutions, you often spend much more time
writing the code than running it because every time you’ve run it, you see that the output
is not quite what you wanted and you spend some more time coding.
And a language like Python just makes that iteration much faster because there are fewer
details that you have to get right before your program compiles and runs.
There are libraries that do all sorts of stuff for you, so you can sort of very quickly take
a bunch of existing components, put them together, and get your prototype application running.
Just like when I was building electronics, I was using a breadboard most of the time,
so I had this sprawled out circuit that if you shook it, it would stop working because it
was not put together very well, but it functioned and all I wanted was to see that it worked
and then move on to the next schematic or design or add something to it.
Once you’ve sort of figured out, oh, this is the perfect design for my radio or light
sensor or whatever, then you can say, okay, how do we design a PCB for this?
How do we solder the components in a small space?
How do we make it so that it is robust against, say, voltage fluctuations or mechanical disruption?
I know nothing about that when it comes to designing electronics, but I know a lot about
that when it comes to writing code.
So the initial steps are efficient, fast, and there’s not much stuff that gets in the
way, but you’re kind of describing, like Darwin described the evolution of species, right?
You’re observing what is true about Python.
Now if you take a step back, if the act of creating languages is art and you had three
months to do it, initial steps, so you just specified a bunch of goals, sort of things
that you observe about Python, perhaps you had those goals, but how do you create the
rules, the syntactic structure, the features that result in those?
So I have in the beginning and I have follow up questions about through the evolution of
Python too, but in the very beginning when you were sitting there creating the lexical
analyzer or whatever.
Python was still a big part of it because I sort of, I said to myself, I don’t want
to have to design everything from scratch, I’m going to borrow features from other languages
that I like.
Oh, interesting.
So you basically, exactly, you first observe what you like.
Yeah, and so that’s why if you’re 17 years old and you want to sort of create a programming
language, you’re not going to be very successful at it because you have no experience with
other languages, whereas I was in my, let’s say mid 30s, I had written parsers before,
so I had worked on the implementation of ABC, I had spent years debating the design of ABC
with its authors, with its designers, I had nothing to do with the design, it was designed
fully as it ended up being implemented when I joined the team.
But so you borrow ideas and concepts and very concrete sort of local rules from different
languages like the indentation and certain other syntactic features from ABC, but I chose
to borrow string literals and how numbers work from C and various other things.
So then, if you take that further, you’ve had this funny sounding, but I think
surprisingly accurate, or at least practical, title of benevolent dictator for life for
quite, you know, for the last three decades or whatever, or no, not the actual title,
but functionally speaking.
So you had to make decisions, design decisions.
Can you maybe, let’s take the move from Python 2, so releasing Python 3, as an example.
It’s not backward compatible to Python 2 in ways that a lot of people know.
So what was that deliberation, discussion, decision like?
Yeah.
What was the psychology of that experience?
Do you regret any aspects of how that experience went?
Well, yeah, so it was a group process really.
At that point, even though I was BDFL in name and certainly everybody sort of respected
my position as the creator and the current sort of owner of the language design, I was
looking at everyone else for feedback.
Sort of Python 3.0 in some sense was sparked by other people in the community pointing
out, oh, well, there are a few issues that sort of bite users over and over.
Can we do something about that?
And for Python 3, we took a number of those Python warts as they were called at the time
and we said, can we try to sort of make small changes to the language that address those
warts?
And we had sort of in the past, we had always taken backwards compatibility very seriously.
And so many Python warts in earlier versions had already been resolved because they could
be resolved while maintaining backwards compatibility or sort of using a very gradual path of evolution
of the language in a certain area.
And so we were stuck with a number of warts that were widely recognized as problems, not
like roadblocks, but nevertheless sort of things that some people trip over and you know that
that’s always the same thing that people trip over when they trip.
And we could not think of a backwards compatible way of resolving those issues.
But it’s still an option to not resolve the issues, right?
And so yes, for a long time, we had sort of resigned ourselves to, well, okay, the language
is not going to be perfect in this way and that way and that way.
And we sort of, certain of these, I mean, there are still plenty of things where you
can say, well, that particular detail is better in Java or in R or in Visual Basic or whatever.
And we’re okay with that because, well, we can’t easily change it.
It’s not too bad.
We can do a little bit with user education or we can have a static analyzer or warnings
in the parser or something.
But there were things where we thought, well, these are really problems that are not going
away.
They are getting worse in the future.
We should do something about that.
But ultimately there is a decision to be made, right?
So was that the toughest decision in the history of Python you had to make as the benevolent
dictator for life?
Or if not, what other decision, maybe even on a smaller scale, were
you really torn up about?
Well, the toughest decision was probably to resign.
All right, let’s go there.
Hold on a second then.
Let me just, because in the interest of time too, because I have a few cool questions for
you and let’s touch a really important one because it was quite dramatic and beautiful
in certain kinds of ways.
In July this year, three months ago, you wrote, now that PEP 572 is done, I don’t ever want
to have to fight so hard for a PEP and find that so many people despise my decisions.
I would like to remove myself entirely from the decision process.
I’ll still be there for a while as an ordinary core developer and I’ll still be available
to mentor people, possibly more available.
But I’m basically giving myself a permanent vacation from being BDFL, benevolent dictator
for life.
And you all will be on your own.
First of all, it’s almost Shakespearean.
I’m not going to appoint a successor.
So what are you all going to do?
Create a democracy, anarchy, a dictatorship, a federation?
So that was a very dramatic and beautiful set of statements.
It’s almost, its open ended nature called on the community to create a future for Python.
It’s just kind of a beautiful aspect to it.
So what, and dramatic, you know, what was making that decision like?
What was on your heart, on your mind, stepping back now a few months later?
I’m glad you liked the writing because it was actually written pretty quickly.
It was literally something like after months and months of going around in circles, I had
finally approved PEP 572, in whose design I had a big hand, although I didn’t initiate
it originally.
I sort of gave it a bunch of nudges in a direction that would be better for the language.
So sorry, just to ask, is async IO, that’s the one or no?
PEP 572 was actually a small feature, which is assignment expressions.
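For readers who have not seen the feature, here is a small sketch of an assignment expression, the `:=` operator added by PEP 572 in Python 3.8. The sample lines and variable names are made up for illustration.

```python
import re

lines = ["name: Guido", "no match here", "name: Lex"]
pattern = re.compile(r"name: (\w+)")

# Without an assignment expression: assign first, then test on the next line.
names_before = []
for line in lines:
    match = pattern.match(line)
    if match:
        names_before.append(match.group(1))

# With an assignment expression: bind the result and test it in one expression.
names_after = []
for line in lines:
    if (match := pattern.match(line)):
        names_after.append(match.group(1))

print(names_after)  # ['Guido', 'Lex']
```

Both loops produce the same list. The second saves a line by naming the match result inside the condition.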
That had been, there was just a lot of debate where a lot of people claimed that they knew
what was Pythonic and what was not Pythonic, and they knew that this was going to destroy
the language.
This was like a violation of Python’s most fundamental design philosophy, and I thought
that was all bullshit because I was in favor of it, and I would think I know something
about Python’s design philosophy.
So I was really tired and also stressed of that thing, and literally after sort of announcing
I was going to accept it, a certain Wednesday evening I had finally sent the email, it’s
accepted.
I can just go implement it.
So I went to bed feeling really relieved, that’s behind me.
And I wake up Thursday morning, 7 a.m., and I think, well, that was the last one that’s
going to be such a terrible debate, and that’s the last time that I let myself be so stressed
out about a PEP decision.
I should just resign.
I’ve been sort of thinking about retirement for half a decade, I’ve been joking and sort
of mentioning retirement, sort of telling the community at some point in the future
I’m going to retire, don’t take that FL part of my title too literally.
And I thought, okay, this is it.
I’m done, I had the day off, I wanted to have a good time with my wife, we were going to
a little beach town nearby, and in I think maybe 15, 20 minutes I wrote that thing that
you just called Shakespearean.
The funny thing is I didn’t even realize what a monumental decision it was, because
five minutes later I read a link to my message back on Twitter, where people were
already discussing on Twitter, Guido resigned as the BDFL.
And I had posted it on an internal forum that I thought was only read by core developers,
so I thought I would at least have one day before the news would sort of get out.
The on your own aspect also had quite a powerful element
of the uncertainty that lies ahead, but can you also just briefly talk about, for example
I play guitar as a hobby for fun, and whenever I play people are super positive, super friendly,
they’re like, this is awesome, this is great.
But sometimes I enter as an outside observer, I enter the programming community and there
seems to sometimes be camps on whatever the topic, and the two camps, the two or more
camps, are often pretty harsh at criticizing the opposing camps.
As an onlooker, I may be totally wrong on this, but what do you think of this?
Yeah, holy wars are sort of a favorite activity in the programming community.
And what is the psychology behind that?
Is that okay for a healthy community to have?
Is that a productive force ultimately for the evolution of a language?
Well, if everybody is patting each other on the back and never telling the truth, it would
not be a good thing.
I think there is a middle ground where sort of being nasty to each other is not okay,
but there is a middle ground where there is healthy ongoing criticism and feedback that
is very productive.
And you mean at every level you see that?
I mean, someone proposes to fix a very small issue in a code base, chances are that some
reviewer will sort of respond by saying, well, actually, you can do it better the other way.
When it comes to deciding on the future of the Python core developer community, we now
have, I think, five or six competing proposals for a constitution.
So that future, do you have a fear of that future, do you have a hope for that future?
I’m very confident about that future.
By and large, I think that the debate has been very healthy and productive.
And I actually, when I wrote that resignation email, I knew that Python was in a very good
spot and that the Python core developer community, the group of 50 or 100 people who sort of
write or review most of the code that goes into Python, those people get along very well
most of the time.
A large number of different areas of expertise are represented, different levels of experience
in the Python core dev community, different levels of experience completely outside it
in software development in general, large systems, small systems, embedded systems.
So I felt okay resigning because I knew that the community can really take care of itself.
And out of a grab bag of future feature developments, let me ask if you can comment, maybe on all
very quickly, concurrent programming, parallel computing, async IO.
These are things that people have expressed hope about, complained about, whatever, have discussed
on Reddit.
Async IO, so the parallelization in general, packaging, I was totally clueless on this.
I just used pip to install stuff, but apparently there’s pipenv, poetry, there’s these dependency
packaging systems that manage dependencies and so on.
They’re emerging and there’s a lot of confusion about what’s the right thing to use.
Then also functional programming, are we going to get more functional programming or not,
this kind of idea.
And of course the GIL connected to the parallelization, I suppose, the global interpreter lock problem.
Can you just comment on whichever you want to comment on?
Well, let’s take the GIL and parallelization and async IO as one topic.
I’m not that hopeful that Python will develop into a sort of high concurrency, high parallelism
language.
That’s sort of the way the language is designed, the way most users use the language, the way
the language is implemented, all make that a pretty unlikely future.
So you think it might not even need to, really the way people use it, it might not be something
that should be of great concern.
I think async IO is a special case because it sort of allows overlapping IO and only
IO and that is a sort of best practice of supporting very high throughput IO, many connections
per second.
I’m not worried about that.
I think async IO will evolve.
There are a couple of competing packages.
We have some very smart people who are sort of pushing us to make async IO better.
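A minimal sketch of what "overlapping IO" means in practice, using only the standard library `asyncio` module. The `fake_request` coroutine is hypothetical: `asyncio.sleep` stands in for waiting on a network connection, so no real I/O happens here.

```python
import asyncio
import time


async def fake_request(name, seconds):
    """Stand-in for a network call; asyncio.sleep simulates waiting on I/O."""
    await asyncio.sleep(seconds)
    return f"{name} done"


async def main():
    start = time.perf_counter()
    # gather() runs the three coroutines concurrently on one thread,
    # so their waiting periods overlap instead of adding up.
    results = await asyncio.gather(
        fake_request("a", 0.2),
        fake_request("b", 0.2),
        fake_request("c", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results)
print(f"elapsed: {elapsed:.2f}s")
```

The three waits of 0.2 seconds each finish in roughly 0.2 seconds total, not 0.6. Only the waiting overlaps; any CPU-bound Python code inside the coroutines would still run one piece at a time, which matches the "IO and only IO" point above.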
Parallel computing, I think that Python is not the language for that.
There are ways to work around it, but you can’t expect to write an algorithm in Python
and have a compiler automatically parallelize that.
What you can do is use a package like NumPy and there are a bunch of other very powerful
packages that sort of use all the CPUs available because you tell the package, here’s the data,
here’s the abstract operation to apply over it, go at it, and then we’re back in the C++
world.
Those packages are themselves implemented usually in C++.
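The "here’s the data, here’s the abstract operation, go at it" pattern can be sketched with NumPy as follows. The function names are made up for illustration. Whether the compiled code uses more than one CPU core depends on how NumPy and its underlying linear algebra library were built, so treat this as a sketch of the programming model, not a claim about speedup.

```python
import numpy as np


def sum_of_squares_loop(data):
    """Pure Python: the interpreter executes every multiply and add itself."""
    total = 0.0
    for x in data:
        total += x * x
    return total


def sum_of_squares_numpy(data):
    """NumPy: a single call hands the whole array to compiled code."""
    return float(np.dot(data, data))


values = np.arange(1000, dtype=np.float64)
print(sum_of_squares_loop(values))
print(sum_of_squares_numpy(values))
```

Both functions compute the same number. The difference is where the loop runs: in the Python interpreter in the first case, inside the package’s compiled implementation in the second.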
That’s where TensorFlow and all these packages come in, where they parallelize across GPUs,
for example, they take care of that for you.
In terms of packaging, can you comment on the future of packaging in Python?
Packaging has always been my least favorite topic.
It’s a really tough problem because the OS and the platform want to own packaging, but
their packaging solution is not specific to a language.
If you take Linux, there are two competing packaging solutions for Linux or for Unix
in general, but they all work across all languages.
Several languages like Node, JavaScript, Ruby, and Python all have their own packaging solutions
that only work within the ecosystem of that language.
What should you use?
That is a tough problem.
My own approach is I use the system packaging system to install Python, and I use the Python
packaging system then to install third party Python packages.
That’s what most people do.
Ten years ago, Python packaging was really a terrible situation.
Nowadays, pip is the future, and there is a separate ecosystem for numerical and scientific Python
based on Anaconda.
Those two can live together.
I don’t think there is a need for more than that.
That’s packaging.
Well, at least for me, that’s where I’ve been extremely happy.
I didn’t even know this was an issue until it was brought up.
In the interest of time, let me sort of skip through a million other questions I have.
So I watched the five and a half hour oral history that you’ve done with the Computer
History Museum, and the nice thing about it, it gave this, because of the linear progression
of the interview, it gave this feeling of a life, you know, a life well lived with interesting
things in it, sort of a pretty, I would say a good spend of this little existence we have
on Earth.
So, outside of your family, looking back, what about this journey are you really proud
of?
Are there moments that stand out, accomplishments, ideas?
Is it the creation of Python itself that stands out as a thing that you look back and say,
damn, I did pretty good there?
Well, I would say that Python is definitely the best thing I’ve ever done, and I wouldn’t
sort of say just the creation of Python, but the way I sort of raised Python, like a baby.
I didn’t just conceive a child, but I raised a child, and now I’m setting the child free
in the world, and I’ve set up the child to sort of be able to take care of himself, and
I’m very proud of that.
And as the announcer of Monty Python’s Flying Circus used to say, and now for something
completely different, do you have a favorite Monty Python moment, or a moment in Hitchhiker’s
Guide, or any other literature, show, or movie that cracks you up when you think about it?
You can always play me the dead parrot sketch.
Oh, that’s brilliant.
That’s my favorite as well.
It’s pushing up the daisies.
Okay, Guido, thank you so much for talking with me today.
Lex, this has been a great conversation.