Lex Fridman Podcast - #151 - Dan Kokotov Speech Recognition with AI and Humans

The following is a conversation with Dan Kokotov, VP of engineering at rev.ai,

which is by many metrics, the best speech to text AI engine in the world.

Rev in general is a company that does captioning and transcription

of audio by humans and by AI.

I’ve been using their services for a couple of years now and I’m planning

to use Rev to add both captions and transcripts to some of the previous

and future episodes of this podcast to make it easier for people to read

through the conversation or reference various parts of the episode, since

that’s something that quite a few people requested.

I’ll probably do a separate video on that with links on the podcast website

so people can provide suggestions and improvements there.

Quick mention of our sponsors, Athletic Greens, All in One Nutrition Drink,

Blinkist app that summarizes books, Business Wars podcast, and Cash App.

So the choice is health, wisdom, or money.

Choose wisely my friends, and if you wish, click the sponsor links

below to get a discount and to support this podcast.

As a side note, let me say that I reached out to Dan and the Rev

team for a conversation because I’ve been using and genuinely loving

their service and really curious about how it works.

I previously talked to the head of Adobe research for the same reason.

For me, there’s a bunch of products, usually software, that comes along

and just makes my life way easier.

Examples are Adobe Premiere for video editing, iZotope RX for cleaning up audio,

AutoHotKey on Windows for automating keyboard and mouse tasks, Emacs as an

IDE for everything, including the universe itself.

I can keep on going, but you get the idea.

I just like talking to people who create things I’m a big fan of.

That said, after doing this conversation, the folks at Rev.ai offered to sponsor

this podcast in the coming months.

This conversation is not sponsored by the guest.

It probably goes without saying, but I should say it anyway, that you

can not buy your way onto this podcast.

I don’t know why you would want to.

I wanted to bring this up to make a specific point that no sponsor

will ever influence what I do on this podcast, or to the best of

my ability, influence what I think.

I wasn’t really thinking about this, for example, when I interviewed Jack Dorsey,

who is the CEO of Square, which happens to be sponsoring this podcast,

but I should really make it explicit.

I will never take money for bringing a guest on.

Every guest on this podcast is someone I genuinely am curious to talk to or just

genuinely love something they’ve created.

As I sometimes get criticized for, I’m just a fan of people.

And that’s who I talk to.

As I also talk about way too much, money is really never a consideration.

In general, no amount of money can buy my integrity.

That’s true for this podcast, and that’s true for anything else I do.

If you enjoy this thing, subscribe on YouTube, review it on Apple Podcasts,

follow on Spotify, support on Patreon,

or connect with me on Twitter at Lex Fridman.

And now here’s my conversation with Dan Kokotov.

You mentioned science fiction on the phone.

So let’s go with the ridiculous first.

What’s the greatest sci fi novel of all time in your view?

And maybe what ideas do you find philosophically fascinating about it?

The greatest sci fi novel of all time is Dune.

And the second greatest is The Children of Dune.

And the third greatest is The God Emperor of Dune.

So I’m a huge fan of the whole series.

I mean, it’s just an incredible world that he created.

And I don’t know if you’ve read the book or not.

No, I have not.

It’s one of my biggest regrets, especially because a new movie is coming out.

Everyone’s super excited about it.

I used to, it’s ridiculous to say, and sorry to interrupt, is that I

used to play the video game.

It used to be Dune.

I guess you would call that real time strategy.

Right.

I think I remember that game.

Yeah, it was kind of awesome.

Nineties or something.

I think I played it actually when I was in Russia.

I definitely remember it.

I was not in Russia anymore.

I think at the time that I used to live in Russia, I think video games

were about like the sophistication of Pong.

I think Pong was pretty much like the greatest game I ever got to play in Russia,

which was still a privilege right in that age.

So you didn’t get color?

You didn’t get like, uh, so I left Russia.

I left Russia in 1991, right?

Okay.

So I was one of the few lucky kids because my mom was a programmer.

So I would go to her work, right?

I would take the, the Metro.

I’d go to her work and play on, I guess, the equivalent of like a

286 PC, you know, nice floppy disks.

Yes.

So, okay.

But back to Dune, let’s get back to Dune.

And by the way, the new movie I’m pretty interested in, but

I’m a little skeptical.

I’m a little skeptical.

I saw the trailer.

Uh, I don’t know.

So there’s, there’s a David Lynch movie, Dune, as you may know. I’m

a huge David Lynch fan, by the way.

So the movie is somewhat controversial, but it’s a little confusing, but it

captures kind of the mood of the book better than I would say like most any

adaptation, and Dune is so much about kind of mood and the world, right.

But back to the philosophical point.

So in the fourth book, God Emperor of Dune, there’s a sort of setting where

Leto, one of the characters, he’s become this weird sort of God emperor.

He’s turned into a gigantic worm.

I mean, you kind of have to read the book to understand what that means.

So the worms are involved, the worms are involved.

You probably saw the worms in the trailer, right.

And he kind of like merges with this worm, um, and becomes

this tyrant of the world and like oppresses the people for a long time.

Right.

But he has a purpose and the purpose is to kind of, uh, break through kind of

a stagnation period in civilization.

Right.

Um, but people have gotten too comfortable, right.

And so he kind of oppresses them so that they explode and like go on to

colonize new worlds and kind of renew the forward momentum of humanity.

Right.

And so to me, that’s kind of fascinating, right.

You need a little bit of pressure and suffering, right.

To kind of like make progress, not, not, not get too comfortable.

Maybe that’s a bit of a cruel philosophy to take away, but that seems to be

the case, unfortunately, obviously, I’m a huge fan of, uh, suffering.

So one of the reasons we’re talking today is that a bunch of people requested

that I do transcripts for this podcast and do captioning.

I used to make all kinds of YouTube videos and I would go on Upwork, I

think, and I would hire folks to do transcription and it was always a pain

in the ass, if I’m being honest, and then I don’t know how I discovered Rev.

But when I did, it was this feeling of like, Holy shit, somebody figured

out how to do it just really easily.

I I’m, I’m such a fan of just when people take a problem and they just make it easy.

Right.

You know, like just, uh, there’s so many, uh, there’s so many,

it’s like, there’s so many things in life that you might not even

be aware of that are painful.

Then Rev, you just like give the audio, give the video, you can

actually give a YouTube link.

And then it comes back like a day later or, uh, two days later, whatever

the hell it is with the captions, you know, all in a standardized format.

It was, I dunno, it was, it was, it was, it was truly a joy.

So I thought I’d, you know, just for the hell of it, uh, talk to you.

It’s one of the products that just made my soul feel good.

One other product that I’ve used like that is, uh, for people who might

be familiar, is called iZotope RX.

It’s for audio editing and like, and that’s another one where it was

like, you just drop it.

I dropped into the audio and it just cleans everything up really nicely.

All the stupid, like the mouth sounds and sometimes there’s a background

like sounds due to the malfunction of the equipment.

It can clean that stuff up.

It can, it has a general voice denoising.

It has like automation capabilities where you can do batch processing

and you can put a bunch of effects.

I mean, it just, I dunno, everything else sucked for like voice based

cleanup that I’ve ever used.

I’ve used Audition, Adobe Audition, and used all kinds of other things

with plugins and you have to kind of figure it all out.

You have to do it manually here.

It’s just, it just worked.

So that’s another one in this whole pipeline.

It just brought joy to my, to my heart.

Anyway, all that to say is, uh, uh, Rev put a smile to my face.

So can you maybe take a step back and say, what is Rev and how does it work?

And Rev or Rev.com?

Rev, Rev.com, the same thing, I guess, uh, that way we do have Rev.ai now as

well, which we can talk about later.

Like, do you have the actual domain or is it just, uh, the actual domain,

but we also use it kind of as a, as a sub brand.

Oh, so we’ve, so we use Rev.ai to denote our ASR services, right?

And Rev.com is kind of our more human and to the end user services.

So it’s like wordpress.com and wordpress.org, they actually have separate

brands that like, I don’t know if you’re familiar with what those are.

Yeah, they provide almost like a separate branch of a little bit.

I think with that, it’s like wordpress.org is kind of their open source, right?

And, uh, wordpress.com is sort of their hosted commercial offering.

Yes.

Um, and with us, the differential is a little bit different,

but maybe a similar idea.

Yep.

Okay.

So what is Rev?

Before I launch into, uh, what is Rev?

I was going to say, you know, like you, you were talking about like

Rev was music to your ears, your, your, your feedback was music to my ears.

To us, the founders of Rev, because, um, Rev was kind of founded to improve

on the model of Upwork that was kind of the original, um, or part of their

original impetus, like our CEO, Jason, was an early employee of Upwork.

So he was very familiar with Upwork, the company.

Um, and so he was very familiar with that model and he wanted to make the whole

experience better because he knew like, when you go at that time, Upwork was

primarily programmers, so the main thing they offered was, if you want to hire,

you know, someone to help you code a little site, right.

Um, you could go on Upwork, um, you could like browse through a list of freelancers,

pick a programmer, you know, have a contract with them and have them do some

work, but it was kind of a difficult experience because, uh, for the, for you,

you would kind of have to browse through all these people, right.

And you have to decide, okay, like, well, is this guy good, um, or is somebody

else better and naturally, you know, you’re going to Upwork because you’re not

an expert, right?

If you’re an expert, you probably wouldn’t be like getting a programmer

from Upwork, uh, so, so how can you really tell?

So there’s a kind of like a lot of potential regret, right?

What if I choose a bad person, they’re like, going to be late on the work.

It’s going to be a painful experience.

And for the freelancer, it was also painful because, you know, half the time

they spent not on actually doing the work, but kind of figuring out how can I make

my profile most attractive to the buyer, right?

And they’re not an expert on that either.

So like Rev’s idea was let’s remove the barrier, right?

Like, let’s make it simple where we’ll pick a few, uh, verticals

that are fairly standardizable.

Now we actually started with translation, um, and then we added

audio transcription a bit later and we’ll just make it a website.

You go give us your files.

We’ll give you back, uh, the results, you know, as soon as possible.

You know, originally maybe it was 48 hours.

Then we made it shorter and shorter and shorter.

Um, yeah, there’s a rush processing too.

There’s a rush processing now, uh, and, uh, we’ll hide all the details from you.

Right.

Yeah.

And like, that’s kind of exactly what you’re experiencing, right?

You don’t, you don’t need to worry about the details of how the sausage is made.

That’s really cool.

The, so you picked like a vertical. By vertical, you mean basically a

service, a service category.

Why translation? Is Rev thinking of potentially going into other verticals

in the future, or is this like the focus now, translation, transcription, like

language? The focus now is language or, uh, speech services, generally speech

to text, language services, you can kind of group them however you want.

Um, so, but we, uh, originally the categorization was work from home.

So when, uh, work that was done by people on a computer, you know, we weren’t trying

to get into, you know, uh, TaskRabbit type of things, and something that could

be relatively standard, not a lot of options.

So we could kind of present the simplified interface, right?

So programming wasn’t like a good fit because each programming

project is kind of unique, right?

We’re looking for something that, uh... Transcription is, you know, you have five

hours of audio, it’s five hours of audio, right?

Translation is somewhat similar.

In that, you know, you can have a five page document, you know, and then you

just can price it by that and then you pick the language you want, and

that’s mostly all there is to it.

So those were a few criteria.

We started with translation because we saw the need, um, and, uh, we picked up

kind of a specialty of translation, um, where we would translate things like

birth certificates, uh, uh, immigration documents, things like that.

And so they were fairly, um, even more well defined and easy to

kind of tell if we did a good job.

So you can literally charge per type of document.

Was that, was, was that the, so what, what is it now?

Is it per word or something like that?

Like, how do you, like, how do you measure the effort

involved in a particular thing?

So now it looks like for audio transcription, right?

It’s per audio minute.

Well, that, that yes, for, for, for our translation, we don’t really,

uh, actually focus on it anymore.

Uh, but you know, back when it was still a main business of Rev, it was per page,

right.

Or per word, depending on the kind of, uh, cause you can also do translation

now on the audio, right?

Mm hmm.

So like subtitles.

So it would be both, uh, transcription and translation.

That’s right.

I wanted to test the system to see how good it is to see like how, how, uh,

well, is Russian supported?

I think so.

Yeah.

And it’d be interesting to try it out.

I mean, one of the, now it’s only in like the one direction, right?

So you start with English and then you can have subtitles in Russian.

Not really, not really the other way.

Got it.

Because it’s, um, I’m deeply curious about this.

Um, when COVID opens up a little bit, when the economy, when the

world opens up a little bit, you want to build your brand in Russia?

No, I don’t.

First of all, I’m allergic to the word brand.

All right, I’m definitely not building, uh, any brands in Russia, but I’m going to

Paris to talk to the translators of Dostoevsky and Tolstoy, there’s this

famous couple that does translation.

And, you know, I’m more and more thinking of how is it possible to have a

conversation with a Russian speaker?

Cause I have just some number of famous Russian speakers that

I’m interested in talking to, and my Russian is not strong

enough to be witty and funny.

I’m already an idiot in English.

I’m an extra level of like awkward idiot in Russian, but I can understand it.

Right.

And I also like wonder how can I create a compelling English Russian

experience for an English speaker?

Like if I, there’s a guy named Grigori Perelman, who’s a mathematician who,

uh, obviously doesn’t speak any English.

So I would probably incorporate like a Russian translator into the picture.

And then it would be like a, not to use a weird term, but like a three, like a

three, three person thing where it’s like a dance of, like, I understand it one way.

They don’t understand the other way, but I’ll be asking questions in English.

I don’t know.

I don’t know the right way.

It’s complicated.

It’s complicated, but I feel like it’s worth the effort for certain kinds of

people, one of whom, I’m confident, is Vladimir Putin, who I’m for sure talking to.

I really want to make it happen.

Cause I think I could do a good job with, but the, the right, you know,

understanding the fundamentals of translation is something I’m really

interested in.

So that’s why I’m starting with, um, the actual translators of like Russian

literature, because they understand the nuance and the beauty of the

language and how it goes back and forth.

But I also want to see, like in speech, how can we do it in real time?

So that’s, that’s like a little bit of a baby project that I hope to push forward.

But anyway, it’s a challenging thing.

So just to share, uh, my dad, um, actually does translation, um, not, not

professional, he’s a, uh, he writes poetry.

That was kind of always his, uh, not a hobby, but he’s, uh, he, you know, he

had a job, like a day job, but his passion was always writing poetry.

Uh, and then when I got to America, like he started also translating, um, first

he was translating English poetry to Russian.

Now he also like goes the other, uh, the other way. He kind of gained some small

fame in that world anyways, because, uh, recently this poet, Louise

Glück, I don’t know if you know of, uh, some American poet, um, she was

awarded the Nobel prize for literature.

Uh, and so my dad had translated, uh, one of her books of poetry

into Russian, and he was like one of the few.

So he kind of like, they asked him, and he gave an interview to Radio Svoboda,

if you know what that is.

And he kind of talked about some of the intricacies of translating poetry.

So that’s like an extra level of difficulty, right?

Because translating poetry is even more challenging than

translating just, you know, interviews.

Do you remember any, any experiences and challenges to having to do the

translation that, that stood out to you, like something he’s talked about?

I mean, a lot of it, I think is word choice, right?

It’s the way Russian is structured is first of all, quite different

than, um, the way English is structured, right?

Just there is inflections in Russian and genders, and they don’t exist in English.

That’s just one of the reasons actually why, um, machine translation is quite

difficult for English to Russian and Russian to English, because they’re

such different languages, but then English has like a huge number of words.

Um, many more than Russian, actually, I think.

So it’s often difficult to find the right word to convey the same emotional

meaning, yeah, Russian language.

They play with words much more.

So you, you’re mentioning that, uh, Rev was kind of born out of, um, trying to

take a vertical on Upwork and then standardize it.

So we’re just trying to make the, the freelancer marketplace idea better, right?

Um, better for both customers and better for the freelancers themselves.

Is there something else to the story of Rev finding the right word?

Rev, finding Rev, like what, what did it take to bring it to actually to life?

Was there any pain points?

Um, plenty of, plenty of pain points.

I mean, uh, as, as is often the case, it’s with scaling it up, right?

Um, and in this case, you know, the scaling is kind of scaling the,

the marketplace, so to speak, right?

Rev is essentially a two sided marketplace, right?

Because, you know, there’s the customers and then there’s the Revers.

Um, if there’s not enough Revers, Revers are what we call our freelancers.

So if there’s not enough Revers, then customers have a bad experience, right?

You know, it takes longer to get your work done.

Um, things like that, you know, if there’s too many, then the Revers have

a bad experience because they might log on to see like what work is available

and there’s not very much work, right?

Uh, so kind of keeping that balance, um, is, is, is a quite challenging problem.

And, you know, that’s, that’s like a problem we’ve been working on for many

years and we’re still like refining our methods, right?

If you can kind of talk to this gig economy idea, I did a bunch of different

psychology experiments on Mechanical Turk, for example, I’ve asked to do

different kinds of very tricky computer vision annotation on Mechanical Turk.

And it’s connecting, connecting people in a more systematized way.

I would say, you know, between task and, and, uh, what would you call that worker

is what Mechanical Turk calls it.

What do you think about this world of gig economies, of there being a service

that connects customers to workers in a way that’s like massively distributed,

like potentially scaling to, it could be, it could be scaled to like

tens of thousands of people, right?

Is there something interesting about that world that you can speak to?

Yeah.

Well, we don’t think of it as kind of gig economy, like to some degree,

I don’t like the word gig that much, right?

Because to some degree it diminishes the work being done, right?

It sounds kind of like almost amateurish.

Well, maybe in like music industry, like gig is a standard term, but in work, it

kind of sounds like, oh, it’s, it’s, it’s frivolous, um, to us it’s, um, improving

the nature of working from home on your own time and on your own terms, right?

And kind of taking away geographical limitations and time limitations, right?

Uh, so, you know, many of our freelancers are maybe work from home moms, right?

And, you know, they don’t want the traditional nine to five job, but they

want to make some income and Rev kind of like allows them to do that and decide

like exactly how much to work and when to work or by the same token, maybe someone

is, you know, someone wants to live the mountain top, you know, life, right?

You know, cabin in the woods, but they still want to make some money.

Um, and like generally that wouldn’t be compatible before, before this new world,

you kind of had to choose, uh, but like with Rev, like, if you like, you don’t

have to choose. Can you speak to like, what’s the demographics like distribution?

Like where do Revers live?

For some years now, we’ve been doing these little meetings

where the management team will go to some place

and we’ll try to meet Revers.

And pretty much wherever we go, it’s

pretty easy to find a large number of Revers.

The most recent one we did is in Utah.

But anyway, really.

Are they from all walks of life?

Are these young folks, older folks?

Yeah, all walks of life, really.

Like I said, one category is the work from home.

Students who want to make some extra income.

There are some people who maybe have some social anxiety,

so they don’t want to be in the office.

And this is one way for them to make a living.

So it’s really pretty wide variety.

But on the flip side, for example,

one Rever we were talking to was a person

who had a fairly high powered career before

and was kind of like taking a break.

And she was almost doing this just to explore and learn

about the gig economy, quote unquote.

So it really is a pretty wide variety of folks.

Yeah, it’s kind of interesting through the captioning

process for me to learn about the Revers

because like some are clearly like weirdly knowledgeable

about technical concepts.

Like you can tell by how good they are

at like capitalizing stuff, like technical terms,

like a machine learning or deep learning.

Like I’ve used Rev to annotate, to caption

the deep learning lectures or machine learning lectures

I did at MIT.

And it’s funny, like a large number of them were like,

I don’t know if they looked it up

or were already knowledgeable,

but they do a really good job at like, I don’t know.

They invest time into these things.

They will like do research, they will Google things,

you know, to kind of make sure that they got it right.

But to some of them, it’s like,

it’s actually part of the enjoyment of the work.

Like they’ll tell us, you know,

I love doing this because I get paid

to watch a documentary on something, right?

And I learn something while I’m transcribing, right?

Pretty cool.

Yeah, so what’s that captioning transcription process

look like for the Revers?

Can you maybe speak to that to give people a sense,

like how much is automated, how much is manual?

What’s the actual interface look like?

All that kind of stuff.

Yeah, so, you know, we’ve invested a pretty good amount

of time to give like our Revers the best tools possible.

So a typical day for a Rever,

they might log into their workspace,

they’ll see a list of audios that need to be transcribed.

And we try to give them tools to pick specifically

the ones they want to do, you know?

So maybe some people like to do longer audios

or shorter audios, people have their preferences.

Some people like to do audios in a particular subject

or from a particular country.

So we try to give people the tools to control,

things like that.

And then when they pick what they want to do,

we’ll launch a specialized editor that we’ve built

to make transcription as efficient as possible.

They’ll start with a speech rec draft.

So, you know, we have our machine learning model

for automated speech recognition, they’ll start with that.

And then our tools are optimized to help them correct that.

So it’s basically a process of correction.

Yeah, it depends on, you know, I would say the audio.

If the audio itself is pretty good,

like probably like our podcast right now

would be quite good.

So the ASR would do a fairly good job.

But if you imagine someone recorded a lecture,

you know, in the back of an auditorium, right?

Where like the speaker is really far away

and there’s maybe a lot of cross talk and things like that,

then maybe the ASR wouldn’t do a good job.

So the person might say like, you know what,

I’m just gonna do it from scratch.

Do it from scratch, yeah.

So it kind of really depends.

What would you say is the speed that you can possibly get?

Like, what’s the fastest?

Can you get, is it possible to get real time or no?

As you’re like listening, can you write as fast as?

Real time would be pretty difficult.

It’s actually a pretty, it’s not an easy job, you know.

We actually encourage everyone at the company

to try to be a transcriber for a day,

transcriptionist for a day.

And it’s way harder than you might think it is, right?

Because people talk fast and people have accents

and all this kind of stuff.

So real time is pretty difficult.

Is it possible?

Like there’s somebody, we’re probably gonna use Rev

to caption this, they’re listening to this right now.

What do you think is the fastest

you could possibly get on this right now?

I think on a good audio, maybe two to three X,

I would say, real time.

Meaning it takes two to three times longer

than the actual audio of the podcast.

This is so meta, I could just imagine the Revvers

working on this right now.

You’re like, you’re way wrong.

You’re way wrong, this takes way longer.

But yeah, it definitely works.

Or you doubted me, I could do real time.

Yeah.

Okay, so you mentioned ASR.

Can you speak to what is ASR, automatic speech recognition?

How much, like what is the gap

between perfect human performance

and perfect or pretty damn good ASR?

Yeah, so ASR, automatic speech recognition,

it’s a class of machine learning problem, right?

So take speech like we’re talking

and transform it into a sequence of words, essentially.

Audio of people talking.

Audio, audio to words.

And there’s a variety of different approaches

and techniques, which we could talk about later if you want.

So, we think we have pretty much the world’s best ASR

for this kind of speech, right?

So there’s different kinds of domains, right, for ASR.

Like one domain might be voice assistants, right?

So Siri, very different than what we’re doing, right?

Because Siri, there’s fairly limited vocabulary.

You might ask Siri to play a song

or order a pizza or whatever.

And it’s very good at doing that.

Very different from when we start talking

in a very unstructured way.

And Siri will also generally adapt to your voice

and stuff like this.

So for this kind of audio, we think we have the best.

And our accuracy, right now I think it’s maybe 14%

word error rate on our test suite

that we generally use to measure.

So word error rate is like one way to measure accuracy

for ASR, right?

So what’s 14% word error rate?

So 14% means across this test suite,

of a variety of different audios,

it would be, it would get in some way 14% of the words wrong.

14% of the words wrong.

So the way you kind of calculate it is,

you might add up insertions, deletions, and substitutions,

right?

So insertions is like extra words.

Deletions are words that we said,

but weren’t in the transcript, right?

Substitutions is, you said “apple,”

but the ASR thought it was “able,” something like this.
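The calculation described here can be sketched in a few lines of Python. This is only an illustration of the standard word error rate formula, (substitutions + deletions + insertions) divided by the number of words in the reference transcript; it is not Rev’s actual scoring code. The function name `word_error_rate`, the `WerResult` class, and the choice to lowercase and split on whitespace are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class WerResult:
    substitutions: int
    deletions: int
    insertions: int
    reference_length: int

    @property
    def wer(self) -> float:
        # WER = (S + D + I) / number of words in the reference transcript.
        errors = self.substitutions + self.deletions + self.insertions
        return errors / self.reference_length


def word_error_rate(reference: str, hypothesis: str) -> WerResult:
    """Align two transcripts word by word and count the three error types.

    Hypothetical helper: normalization here is just lowercasing and
    whitespace splitting, which is an assumption, not a standard.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # cost[i][j] = (total errors, subs, dels, ins) for ref[:i] vs hyp[:j].
    cost = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        cost[i][0] = (i, 0, i, 0)  # every reference word was dropped
    for j in range(1, len(hyp) + 1):
        cost[0][j] = (j, 0, 0, j)  # every hypothesis word is extra

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                cost[i][j] = cost[i - 1][j - 1]  # words match, no new error
                continue
            t, s, d, n = cost[i - 1][j - 1]
            substitution = (t + 1, s + 1, d, n)
            t, s, d, n = cost[i - 1][j]
            deletion = (t + 1, s, d + 1, n)
            t, s, d, n = cost[i][j - 1]
            insertion = (t + 1, s, d, n + 1)
            # Tuples compare on total errors first, so this keeps the
            # alignment with the fewest errors overall.
            cost[i][j] = min(substitution, deletion, insertion)

    _, subs, dels, ins = cost[len(ref)][len(hyp)]
    return WerResult(subs, dels, ins, len(ref))


# The "apple" heard as "able" case: 1 substitution in 5 reference words.
result = word_error_rate("i ate an apple today", "i ate an able today")
print(result.substitutions, result.deletions, result.insertions, result.wer)
```

When two alignments tie on total errors, the split between substitutions, deletions, and insertions can differ, but the overall rate stays the same.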

Human accuracy, most people think realistically,

it’s like 3%, 2%, word error rate would be like

the max achievable.

So there’s still quite a gap, right?

Would you say that, so YouTube, when I upload videos,

often generates automatic captions.

Are you sort of, from a company perspective,

from a tech perspective,

are you trying to beat YouTube, Google?

It’s a hell of a, Google, I mean,

I don’t know how seriously they take this task,

but I imagine it’s quite serious.

And they, you know, Google is probably up there

in terms of their teams on, on ASR or just NLP,

natural language processing, different technologies.

So do you think you can beat Google?

On this kind of stuff, yeah, we think so.

Google just woke up on my phone.

This is hilarious, okay.

Now Google is listening, sending it back to headquarters.

Who are these Rev people?

But that’s the goal?

Yeah, I mean, we measure ourselves against like Google,

Amazon, Microsoft, you know, some of the,

some smaller competitors.

And we use like our internal test suite,

we try to compose it of a pretty representative

set of audios, maybe it’s some podcasts, some videos,

some interviews, some lectures, things like that, right?

And we beat them in our own testing.

And actually Rev offers automated,

like you can actually just do the automated captioning.

So like, I guess it’s like way cheaper, whatever it is,

whatever the rates are.

Yeah, yeah.

So it’s a, by the way, it used to be a dollar per minute

for captioning and transcription,

I think it’s like $1.15 or something like that.

$1.25.

$1.25, yeah.

It’s pretty cool.

That was the other thing that was surprising to me,

it was actually like the cheapest thing you could,

one of the, I mean, I don’t remember it being cheaper.

You could on Upwork get cheaper,

but it was clear to me that this,

that’s gonna be really shitty.

Yeah.

So like, you’re also competing on price.

I think there were services that you can get,

like similar to Rev kind of feel to it,

but it wasn’t as automated.

Like the drag and drop, the entirety of the interface,

it’s like the thing we’re talking about.

I’m such a huge fan of like frictionless,

like Amazon’s single buy button, whatever.

Yeah, yeah.

That one click.

The one click, that’s genius right there.

Like that is so important for services.

Yeah.

And simplicity and I mean, Rev is almost there.

I mean, there’s like some, I’m trying to think.

So I think I’ve, I stopped using this pipeline,

but Rev offers it and I like it,

but it was causing me some issues on my side,

which is you can connect it to like Dropbox

and it generates the files in Dropbox.

So like it closes the loop to where

I don’t have to go to Rev at all and I can download it.

Sorry, I don’t have to go to Rev at all

and to download the files.

It could just like automatically copy them.

Right, you put it in your Dropbox and a day later

or maybe a few hours later.

Yeah, it just shows up.

Just shows up, yeah.

Yeah, I was trying to do it programmatically too.

Is there an API interface you can,

I was trying to, like, through Python

to download stuff automatically,

but then I realized this is the programmer in me.

Like, dude, you don’t need to automate everything

like in life, like flawlessly,

because I wasn’t doing enough captions

to justify to myself the time investment

into automating everything perfectly.

Yeah, I would say if you’re doing so many interviews

that your biggest roadblock is clicking on the Rev download,

but now you’re talking about Elon Musk levels of business.

But for sure, we have like a variety of ways

to make it easy.

You know, there’s the integration.

You mentioned, I think it’s through a company called Zapier,

which kind of can connect Dropbox to Rev and vice versa.

We have an API if you want to really like customize it,

you know, if you want to create the Lex Fridman,

you know, CMS or whatever.

For this whole thing.

Okay, cool.

So can you speak to the ASR a little bit more?

Like, what does it take?

Like approach wise, machine learning wise,

how hard is this problem?

How do you get to the 3% error rate?

Like, what’s your vision of all of this?

Yeah, well, the 3% error rate is definitely,

that’s the grand vision.

We’ll see what it takes to get there.

But we believe, you know, in ASR,

the biggest thing is the data, right?

Like, that’s true of like a lot of

machine learning problems today, right?

The more data you have and the higher the quality of the data,

the better labeled the data.

Yeah, that’s how you get good results.

And we at Rev have kind of like the best data.

Like we have.

Like you’re literally,

your business model is annotating the data.

Our business model is being paid to annotate the data.

Being paid to annotate the data.

So it’s kind of like a pretty magical flywheel.

Yeah.

And so we’ve kind of like ridden this flywheel

to this point.

And we think we’re still kind of in the early stages

of figuring out all the parts of the flywheel to use,

you know, because we have the final transcripts

and we have the audios and we train on that.

But we in principle also have all the edits

that the Revvers make, right?

Oh, that’s interesting.

How can you use that as data?

Yeah, that’s something for us to figure out in the future.

But, you know, we feel like we’re only

in the early stages, right?

So the data is there.
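
As a rough illustration of what "the edits" could look like as data, the sketch below aligns a machine draft with the corrected transcript and lists the word-level changes. It uses Python's standard `difflib`; the function name `extract_edits` and the idea of treating each change as a training example are assumptions for illustration, since the conversation leaves open how Rev would actually use this signal.

```python
import difflib


def extract_edits(draft: str, final: str) -> list[tuple[str, str, str]]:
    """Return (operation, draft_words, final_words) for each changed span."""
    draft_words = draft.split()
    final_words = final.split()
    matcher = difflib.SequenceMatcher(a=draft_words, b=final_words, autojunk=False)

    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged spans carry no correction signal
        edits.append(
            (tag, " ".join(draft_words[i1:i2]), " ".join(final_words[j1:j2]))
        )
    return edits
```

Each tuple pairs what the engine produced with what the human changed it to, which is the kind of record a correction model could in principle learn from.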

That’d be interesting.

Like almost like a recurrent neural net

for fixing transcripts.

I always remember we did a segmentation annotation

for driving data.

So segmenting the scene, like visual data.

And you can get all,

so it was drawing, people were drawing polygons

around different objects and so on.

And it feels like it always felt like

there was a lot of information in the clicking,

the sequence of clicking that people do,

the kind of fixing of the polygons that they do.

Now there’s a few papers written about

how to draw polygons like with recurrent neural nets

to try to learn from the human clicking.

But it was just like experimental,

you know, it was one of those like CVPR type papers

that people do like a really tiny data set.

It didn’t feel like people really tried to do it seriously.

Yeah, I wonder if there’s information in the fixing

that provides a deeper set of signals

than just like the raw data.

The intuition is for sure there must be, right?

There must be.

And in all kinds of signals and how long you took

to make that edit and stuff like that.

It’s gonna be like up to us.

That’s why like the next couple of years

is like super exciting for us, right?

So that’s what like the focus is now.

You mentioned rev.ai, that’s where you want to.

Yeah, so rev.ai is kind of our way of bringing this ASR

to the rest of the world, right?

So when we started, we were human only.

Then we kind of created this Temi service.

I think you might’ve used it,

which was kind of ASR for the consumer, right?

So if you don’t wanna pay $1.25, but you wanna pay,

now it’s 25 cents a minute, I think.

And you get the transcript,

the machine generated transcript and you get an editor

and you can kind of fix it up yourself, right?

Then we started using ASR

for our own human transcriptionists.

And then the kind of rev.ai is the final step

of the journey, which is, you know,

we have this amazing engine.

What can people build with it, right?

What kind of new applications could be enabled

if you have speech-to-text that’s that accurate?

Do you have ideas for this

or is it just providing it as a service

and seeing what people come up with?

It’s providing it as a service

and seeing what people come up with

and kind of learning from what people do with it.

And we have ideas of our own as well, of course,

but it’s a little bit like, you know,

when AWS provided the building blocks, right?

And they saw what people built with it

and they try to make it easier to build those things, right?

And we kind of hope to do the same thing.

Although AWS kind of does a shitty job of like,

I’m continually surprised, like Mechanical Turk,

for example, how shitty the interface is.

We’re talking about like Rev making me feel good.

Like when I first discovered Mechanical Turk,

the initial idea of it was like,

it made me feel like Rev does,

but then the interface is like, come on.

Yeah, it’s horrible.

Why is it so painful?

Does nobody at Amazon want to like seriously invest in it?

It felt like you can make so much money

if you took this effort seriously.

And it feels like they have a committee

of like two people just sitting back,

like a meeting, they meet once a month,

like what are we going to do with Mechanical Turk?

It’s like two websites making me feel like this,

that and craigslist.org, whatever the hell it is.

It feels like it’s designed in the 90s.

Well, Craigslist basically hasn’t been updated

pretty much since the guy originally built it.

Do you seriously think there’s a team,

like how big is the team working on Mechanical Turk?

I don’t know.

There’s some team, right?

I feel like there isn’t.

I’m skeptical.

Yeah.

Well, if nothing else, they benefit from the other teams

like moving things forward in a small way.

But I know what you mean.

We use Mechanical Turk for a couple of things as well.

And yeah, it’s painful UI.

It’s painful, but yeah, it works.

I think most people, the thing is most people

don’t really use the UI, right?

Like we, for example, we use it through the API, right?

But even the API documentation and so on,

like it’s super outdated.

Like I don’t even know what to…

I mean, the same criticism, as long as we’re ranting,

my same criticism goes to the APIs

of most of these companies.

Like Google, for example, the API for the different services

is just the documentation is so shitty.

Like it’s not so shitty.

I should actually be…

I should exhibit some gratitude.

Okay, let’s practice some gratitude.

The documentation is pretty good.

Like most of the things that the API makes available

is pretty good.

It’s just that in the sense that it’s accurate,

sometimes outdated, but like the degree of explanations

with examples is only covering, I would say,

like 50% of what’s possible.

And it just feels a little bit,

like there’s a lot of natural questions

that people would wanna ask that doesn’t get covered.

And it feels like it’s almost there.

Like it’s such a magical thing.

Like the Maps API, YouTube API, there’s a bunch of stuff.

I gotta imagine it’s like, there’s probably some team

at Google responsible for writing this documentation

that’s probably not the engineers, right?

And probably this team is not where you wanna be.

Well, it’s a weird thing.

I sometimes think about this for somebody

who wants to also build a company.

I think about this a lot.

YouTube, the service is one of the most magical,

like I’m so grateful that YouTube exists.

And yet they seem to be quite clueless on so many things

that everybody’s screaming at them about.

Like it feels like whatever the mechanism

that you use to listen to your quote unquote customers,

which is like the creators, is not very good.

Like there’s literally people that are like screaming why,

like their new YouTube studio, for example.

There’s like features that were like begged for

for a really long time.

Like being able to upload multiple videos at the same time.

That was missing for a really, really long time.

Now, like there’s probably things that I don’t know,

which is maybe for that kind of huge infrastructure,

it’s actually very difficult to build

some of these features.

But the fact that that wasn’t communicated

and it felt like you’re not being heard.

Like I remember this experience for me

and it’s not a pleasant experience.

And it feels like the company doesn’t give a damn about you.

And that’s something to think about.

I’m not sure what that is.

That might have to do with just like small groups

working on these small features and these specific features.

And there’s no overarching like dictator type of human

that says like, why the hell are we neglecting

like Steve Jobs type of characters?

Like there’s people that we need to speak

to the people that like wanna love our product

and they don’t.

Let’s fix this shit.

Maybe at some point you just get so fixated

on the numbers, right?

And it’s like, well, the numbers are pretty great, right?

Like people are watching,

doesn’t seem to be a problem, right?

And you’re not like the person that like built this thing,

right?

So you really care about it.

You’re just there, you came in as a product manager, right?

You got hired sometime later,

your mandate is like increase this number,

like 10%, right?

And you just.

That’s brilliantly put.

Like if you, this is, okay, if there’s a lesson in this

is don’t reduce your company into a metric

of like how much, like you said,

how much people watching the videos and so on

and like convince yourself that everything is working

just because the numbers are going up.

There’s something, you have to have a vision.

You have to want people to love your stuff

because love is ultimately the beginning

of like a successful long-term company

is that they always should love your product.

You have to be like a creator

and have that like creator’s love for your own thing, right?

Like, and you’re pained by these comments, right?

And probably like Apple, I think did this generally

like really well.

They’re well known for kind of keeping teams small

even when they were big, right?

And, you know, he was an engineer,

like there’s a book, Creative Selection.

I don’t know if you read it, by an Apple engineer

named Ken Kocienda.

It’s kind of a great book actually

because unlike most of these business books where it’s,

you know, here’s how Steve Jobs ran the company.

It’s more like here’s how life was like for me, you know,

an engineer here, the projects I worked on

and here’s what it was like to pitch Steve Jobs, you know,

on like, you know, I think he was in charge of like

the keyboard and the auto correction, right?

And at Apple, like Steve Jobs reviewed everything.

And so he was like, this is what it was like

to show my demos to Steve Jobs and, you know,

to change them because like Steve Jobs didn’t like how,

you know, the shape of the little key was off

because the rounding of the corner was like not quite right

or something like this, right?

He was famously a stickler for this kind of stuff.

But because the teams were small,

he really owned this stuff, right?

So he really cared.

Yeah, Elon Musk does that similar kind of thing with Tesla,

which is really interesting.

There’s another lesson in leadership in that

is to be obsessed with the details.

And like, he talks to like the lowest level engineers.

Okay, so we’re talking about ASR

and so this is basically where I was saying

we’re gonna take this like ultra seriously.

And then what’s the mission?

To try to keep pushing towards the 3%.

Yeah, and kind of try to build this platform

where all of your, you know, all of your meetings,

you know, they’re as easily accessible as your notes, right?

Like, so, like, imagine all the meetings

a company might have, right?

You know, now that I’m like no longer a programmer, right?

Then I’m a quote unquote manager.

Most of my day is in meetings, right?

Yeah.

And, you know, pretty often I wanna like see

what was said, right?

Who said it, you know?

What’s the context?

But it’s generally not really something

that you can easily retrieve, right?

Like imagine if all of those meetings

were indexed, archived, you know, you could go back,

you could share a clip like really easily, right?

So that might change completely.

It’s like everything that’s said,

converted to text might change completely

the dynamics of what we do in this world,

especially now with remote work, right?

Exactly, exactly.

With Zoom and so on.

That’s fascinating to think about.

I mean, for me, I care about podcasts, right?

And one of the things that was,

you know, I’m torn.

I know a lot of the engineers at Spotify.

So I love them very much because they dream big

in terms of like, they wanna empower creators.

So one of my hopes was with Spotify

that they would use a technology like Rev

or something like that to start converting everything

into text and make it indexable.

Like one of the things that sucks with podcasts

is like, it’s hard to find stuff.

Like the model is basically subscription.

Like you find, it’s similar to what YouTube used to be like,

which is you basically find a creator that you enjoy

and you subscribe to them.

And like, you just kind of follow what they’re doing,

but the search and discovery wasn’t a big part of YouTube

like in the early days,

but that’s what currently with podcasts,

like is the search and discovery is like nonexistent.

You’re basically searching for like

the dumbest possible thing,

which is like keywords in the titles of episodes.

Yeah.

Even aside from a search and discovery, like all the time.

So I listened to like a number of podcasts

and there’s something said,

and I wanna like go back to that later

because I was trying to, I’m trying to remember,

what do you say?

Like maybe like recommended some cool product

that I wanna try out.

And like, it’s basically impossible.

Maybe like some people have pretty good show notes.

So maybe you’ll get lucky and you can find it, right?

But I mean, if everyone had transcripts

and it was all searchable, it would be so much better.
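
To make "all searchable" concrete, here is a minimal sketch of an inverted index over timestamped transcript segments, so that a query returns the episode and the time offset to jump to. The data layout (episode id mapped to a list of start-time and text pairs) and the function names are hypothetical, chosen only for illustration.

```python
import re
from collections import defaultdict

WORD_PATTERN = re.compile(r"[a-z0-9']+")


def build_index(transcripts):
    """Map each word to the set of (episode, start_seconds) where it is spoken.

    `transcripts` is assumed to be {episode_id: [(start_seconds, text), ...]}.
    """
    index = defaultdict(set)
    for episode, segments in transcripts.items():
        for start, text in segments:
            for word in WORD_PATTERN.findall(text.lower()):
                index[word].add((episode, start))
    return index


def search(index, query):
    """Return sorted (episode, start_seconds) segments containing every query word."""
    words = WORD_PATTERN.findall(query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)
```

With something like this, "what was that product they recommended" becomes a lookup that lands on a timestamp instead of a re-listen.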

I mean, that’s one of the things that I wanted to,

I mean, one of the reasons we’re talking today

is I wanted to take this quite seriously.

The Rev thing, I’ve just been lazy.

So, because I’m very fortunate

that a lot of people support this podcast,

that there’s enough money now to do a transcription and so on.

And it seemed clear to me, especially like CEOs

and sort of like PhDs, like people write to me

who are like graduate students in computer science

or graduate students in whatever the heck field,

it’s clear that their mind,

like they enjoy podcasts

when they’re doing laundry or whatever,

but they wanna revisit the conversation

in a much more rigorous way.

And they really want a transcript.

Like it’s clear that they want to analyze conversations.

Like so many people wrote to me

about a transcript for the Joscha Bach conversation.

I had just a bunch of conversations.

And then on the Elon Musk side,

like reporters, they wanna write a blog post

about your conversation.

So they wanna be able to pull stuff.

And it’s like, they’re essentially doing

transcription of your conversation privately.

They’re doing it for themselves and then starting to pick,

but it’s so much easier when you can actually do it

as a reporter, just look at the transcript.

Yeah, and you can like embed a little thing,

like into your article, right?

Here’s what they said, you can go listen

to like this clip from the section.

I’m actually trying to figure out,

I’ll probably on the website create

like a place where the transcript goes,

like as a webpage so that people can reference it,

like reporters can reference it and so on.

I mean, most of the reporters probably want

to write clickbait articles that are completely falsified,

which I’m fine with.

It’s the way of journalism, I don’t care.

Like I’ve had this conversation with a friend of mine,

a mixed martial artist, Ryan Hall.

And we talked about, you know,

as I’ve been reading The Rise and Fall of the Third Reich

and a bunch of books on Hitler and we brought up Hitler

and he made some kind of comment where like,

we should be able to forgive Hitler

and, you know, like we were talking about forgiveness

and we’re bringing that up as like the worst case

possible things, like even, you know,

for people who are Holocaust survivors,

one of the ways to let go of the suffering

they’ve been through is to forgive.

And he brought up like Hitler as somebody

that would potentially be the hardest thing

to possibly forgive, but it might be a worthwhile pursuit

psychologically, so on, blah, blah, blah, it doesn’t matter.

It was very eloquent, very powerful words.

I think people should go back and listen to it.

It’s powerful.

And then all these journalists,

all these articles written about like MMA fighter, UFC fighter.

MMA fighter loves Hitler.

No, like, well, no, they didn’t.

They were somewhat accurate.

They didn’t say like loves Hitler.

They said, thinks that if Hitler came back to life,

we should forgive him.

Like they kind of, it’s kind of accurate-ish,

but the headline made it sound a lot worse

than it was, but I’m fine with it.

That’s the way the world, I wanna almost make it easier

for those journalists and make it easier

for people who actually care about the conversation

to go and look and see.

Right, they can see it for themselves.

For themselves.

There’s the headline, but now you can go.

There’s something about podcasts,

like the audio that makes it difficult

to jump to a spot and to look

for that particular information.

I think some of it, I’m interested in creating,

like myself experimenting with stuff.

So like taking Rev and creating a transcript

and then people can go to it.

I do dream that like, I’m not in the loop anymore,

that like, Spotify does it, right?

Like automatically for everybody,

because ultimately that one click purchase

needs to be there, like, you know.

Like you kind of want support from the entire ecosystem.

Exactly.

Like from the tool makers and the podcast creators,

even clients, right?

I mean, imagine if like most podcast apps,

you know, if it was a standard, right?

Here’s how you include a transcript into a podcast, right?

Like it’s just an RSS feed ultimately.

And actually just yesterday I saw this company

called Buzzsprout, I think they’re called.

So they’re trying to do this.

They proposed a spec, an extension to their RSS format

to reference transcripts in a standard way.

And they’re talking about like,

there’s one client they mention that will support it,

but imagine like more clients support it, right?

So any podcast, you could go and see the transcripts

right in your like normal podcast app.
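
As a sketch of what referencing transcripts "in a standard way" can look like, the example below reads a feed that uses the `podcast:transcript` tag from the Podcast Index namespace. That this is the same spec being described here is an assumption; the sample feed, its URLs, and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Podcast Index "podcast" extension (assumed to be the relevant spec).
PODCAST_NS = "https://podcastindex.org/namespace/1.0"

# Hypothetical feed: one episode with a transcript link, one without.
SAMPLE_FEED = """<rss version="2.0" xmlns:podcast="https://podcastindex.org/namespace/1.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="1234"/>
      <podcast:transcript url="https://example.com/ep1.html" type="text/html"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="https://example.com/ep2.mp3" type="audio/mpeg" length="5678"/>
    </item>
  </channel>
</rss>"""


def transcripts_by_episode(feed_xml: str) -> dict[str, list[tuple[str, str]]]:
    """Map each episode title to its (transcript_url, mime_type) entries."""
    root = ET.fromstring(feed_xml)
    result = {}
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        links = [
            (tag.get("url"), tag.get("type"))
            for tag in item.findall(f"{{{PODCAST_NS}}}transcript")
        ]
        result[title] = links
    return result
```

A podcast app that understands the tag can show the transcript next to the audio, and one that does not simply ignores it, which is what keeps the feed open to every client.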

Yeah.

I mean, somebody, so I have somebody who works with me,

who helps with advertising, Matt, this awesome guy.

He mentioned Buzzsprout to me, but he says,

it’s really annoying because they want exclusive,

they want to host the podcast.

Right.

This is the problem with Spotify too.

This is where I’d like to say, like F Spotify,

there’s a magic to RSS with podcasts.

It can be made available to everyone.

And then there’s all, there’s this ecosystem

of different podcast players that emerge

and they compete freely.

And that’s a beautiful thing,

and that’s why going exclusive,

like Joe Rogan went exclusive,

I’m not sure if you’re familiar with it,

he went to Spotify. As a huge fan of Joe Rogan,

I’ve been kind of nervous about the whole thing,

but let’s see, I hope that Spotify steps up.

They’ve added video, which was very surprising

that they were able to put on.

Exclusive meaning you can’t subscribe

to the RSS feed anymore.

It’s only in Spotify.

For now you can until December 1st.

And December 1st, it’s all, everything disappears

and it’s Spotify only.

I, you know, and Spotify gave him a hundred million dollars

for that.

So it’s an interesting deal, but I, you know,

I did some soul searching and I’m glad he’s doing it.

But if Spotify came to me with a hundred million dollars,

I wouldn’t do it.

I wouldn’t do, well, I have a very different relationship

with money.

I hate money, but I just think I believe

in the pirate radio aspect of podcasting, the freedom.

And that there’s something about.

The open source spirit.

The open source spirit, it just doesn’t seem right.

It doesn’t feel right.

That said, you know, because so many people care

about Joe Rogan’s program,

they’re gonna hold Spotify’s feet to the fire.

Like one of the cool things with what Joe told me

is the reason he likes working with Spotify

is that they, they’re like ride or die together, right?

So they, they want him to succeed.

So that’s why they’re not actually telling him what to do

despite what people think.

They, they don’t tell him,

they don’t give him any notes on anything.

They want him to succeed.

And that’s the cool thing about exclusivity with a platform

is like, you’re kind of wanting each other to succeed.

And that process can actually be very fruitful.

Like YouTube, it goes back to my criticism.

YouTube generally, no matter how big the creator,

maybe for PewDiePie, something like that,

they want you to succeed.

But for the most part, from all the big creators

I’ve spoken with, Veritasium, all of those folks,

you know, they get some basic assistance,

but it’s not like, YouTube doesn’t care

if you succeed or not.

They have so many creators.

Yeah, like a hundred other.

They don’t care.

So, and especially with, with somebody like Joe Rogan,

who YouTube sees Joe Rogan,

not as a person who might revolutionize the nature of news

and idea space and nuanced conversations.

They see him as a potential person

who has racist guests on,

or like, you know, they see him as like a headache,

potentially.

So, you know, a lot of people talk about this.

It’s a hard place to be for YouTube, actually,

is figuring out with the search and discovery process

of how do you filter out conspiracy theories

and which conspiracy theories represent dangerous untruths

and which conspiracy theories are like vanilla untruths.

And then even when you start having meetings

and discussions about what is true or not,

it starts getting weird.

Yeah, it’s difficult these days, right?

I worry more about the other side, right?

Of too much, you know, too much censorship.

Well, maybe censorship isn’t the right word.

I mean, censorship is usually government censorship,

but still, yeah, putting yourself in the position

of arbiter for these kinds of things.

It’s very difficult and people think it’s so easy, right?

Like, cause like, well, you know, like no Nazis, right?

What a simple principle.

But you know, yes, I mean, no one likes Nazis,

but there’s like many shades of gray,

like very soon after that.

Yeah, and then, you know, of course everybody, you know,

there’s some people that call our current president a Nazi

and then there’s like, so you start getting, Sam Harris,

I don’t know if you know who that is, wasted, in my opinion,

his conversation with Jack Dorsey.

Now I’ll also, I spoke with Jack before in this podcast

and we’ll talk again, but Sam brought up,

Sam Harris does not like Donald Trump.

I do listen to his podcast.

I’m familiar with his views on the matter.

And he asked Jack Dorsey, he’s like,

how can you not ban Donald Trump from Twitter?

And so, you know, there’s a set, you have that conversation.

You have a conversation where some number,

some significant number of people think

that the current president of the United States

should not be on your platform.

And it’s like, okay.

So if that’s even on the table as a conversation,

then everything’s on the table for conversation.

And yeah, it’s tough.

I’m not sure where I land on it.

I’m with you, I think that censorship is bad,

but I also think the show…

Ultimately, I just also think, you know,

if you’re the kind of person that’s gonna be convinced,

you know, by some YouTube video, you know,

that, I don’t know, our government’s been taken over

by aliens, it’s unlikely that like, you know,

you’ll be returned to sanity simply because, you know,

that video is not available on YouTube, right?

Yeah, I’m with you.

I tend to believe in the intelligence of people

and we should trust them.

But I also do think it’s the responsibility of platforms

to encourage more love in the world,

more kindness to each other.

And I don’t always think that they’re great

at doing that particular thing.

So that, there’s a nice balance there.

And I think philosophically, I think about that a lot.

Where’s the balance between free speech

and like encouraging people,

even though they have the freedom of speech

to not be an asshole.

Yeah, right.

That’s not a constitutional, like…

So you have the right for free speech,

but like, just don’t be an asshole.

Like you can’t really put that in the constitution

that the Supreme Court can’t be like,

eh, just don’t be a dick.

But I feel like platforms have a role to be like,

just be nicer.

Maybe do the carrot, like encourage people to be nicer

as opposed to the stick of censorship.

But I think it’s an interesting machine learning problem.

Just be nicer.

Machine, yeah, machine learning for niceness.

It is, I mean, that’s…

Responsible, yeah, I mean, it is.

It is a thing, for sure.

Jack Dorsey kind of talks about it

as a vision for Twitter is,

how do we increase the health of conversations?

I don’t know how seriously

they’re actually trying to do that though.

Which is one of the reasons that I’m in part considering

entering that space a little bit.

It’s difficult for them, right?

Because, you know, it’s kind of like well known

that people are kind of driven by rage

and you know, outrage maybe is a better word, right?

Outrage drives engagement.

And well, these companies are judged by engagement, right?

So it’s…

In the short term, but this goes to the metrics thing

that we were talking about earlier.

I do believe, I have a fundamental belief

that if you have a metric of long term happiness

of your users, like not short term engagement,

but long term happiness and growth

and both like intellectual, emotional health of your users,

you’re going to make a lot more money.

You’re going to have long…

Like you should be able to optimize for that.

You don’t need to necessarily optimize for engagement.

And that’ll be good for society too.

Yeah, no, I mean, I generally agree with you,

but it requires a patient person with, you know,

trust from Wall Street to be able

to carry out such a strategy.

This is what I believe the Steve Jobs character

and Elon Musk character is like,

you basically have to be so good at your job.

Right, you got to pass for anything.

That you can hold the board

and all the investors hostage by saying like,

either we do it my way or I leave.

And everyone is too afraid of you leaving

because they believe in your vision.

But that requires being really good at what you do.

It requires being Steve Jobs and Elon Musk.

There’s kind of a reason why like a third name doesn’t

come immediately to mind, right?

Like there’s maybe a handful of other people,

but it’s not that many.

It’s not many.

I mean, people say like, why are you…

Like people say that I’m like a fan of Elon Musk.

I’m not, I’m a fan of anybody

who’s like Steve Jobs and Elon Musk.

And there’s just not many of those folks.

It’s the guy that made us believe

that like we can get to Mars, you know, in 10 years, right?

I mean, that’s kind of awesome.

And it’s kind of making it happen, which is like…

And it’s kind of gone like that kind of like spirit, right?

Like from a lot of our society, right?

You know, like we can get to the moon in 10 years

and like we did it, right?

Yeah.

Especially in this time of so much kind of existential dread

that people are going through because of COVID,

like having rockets that just keep going out there

now with humans.

I don’t know that it, just like you said,

I mean, it gives you a reason to wake up in the morning

and dream, for us engineers too.

It is inspiring as hell, man.

Let me ask you this, the worst possible question,

which is, so you’re like at the core, you’re a programmer,

you’re an engineer, but now you made the unfortunate choice

or maybe that’s the way life goes

of basically moving away from the low level work

and becoming a manager, becoming an executive,

having meetings, what’s that transition been like?

It’s been interesting.

It’s been a journey.

Maybe a couple of things to say about that.

I mean, I got into this, right?

Because as a kid, I just remember this like incredible

amazement at being able to write a program, right?

And something comes to life that kind of didn’t exist before.

I don’t think you have that in like many other fields,

like you have that with some other kinds of engineering,

but you’re maybe a little bit more limited

with what you can do, right?

But with a computer,

you can literally imagine any kind of program, right?

So it’s a little bit godlike what you do

like when you create it.

And so, I mean, that’s why I got into it.

Do you remember like first program you wrote

or maybe the first program that like made you fall in love

with computer science?

I don’t know what was the first program.

It’s probably like trying to write one of those games

in BASIC, you know, like emulate the snake game

or whatever.

I don’t remember to be honest, but I enjoyed like,

that’s what I always loved about, you know,

being a programmer, it’s just the creation process.

And it’s a little bit different

when you’re not the one doing the creating.

And, you know, another aspect to it I would say is,

you know, when you’re a programmer,

when you’re an individual contributor,

it’s kind of very easy to know when you’re doing a good job,

when you’re not doing a good job,

when you’re being productive,

when you’re not being productive, right?

You can kind of see like you’re trying to make something

and it’s like slowly coming together, right?

And when you’re a manager, you know, it’s more diffuse,

right?

Like, well, you hope, you know, you’re motivating your team

and making them more productive and inspiring them, right?

But it’s not like you get some kind of like dopamine signal

because you like completed X lines of code, you know, today.

So kind of like you missed that dopamine rush

a little bit when you first become,

but then, you know, slowly you kind of see,

yes, your teams are doing amazing work, right?

And you can take pride in that.

You can get like, what is it?

Like a ripple effect of somebody else’s dopamine rush.

Yeah, yeah, you live off other people’s dopamine.

So is there pain points and challenges you had to overcome

from becoming, from going to a programmer to becoming

a programmer of humans?

Programmer of humans.

I don’t know, humans are difficult to understand,

you know, it’s like one of those things,

like trying to understand other people’s motivations

and what really drives them.

It’s difficult, maybe like never really know, right?

Do you find that people are different?

Yeah.

Like I, one of the things, like I had a group at MIT

that, you know, I found that like some people

I could like scream at and criticize like hard

and that made them do like much better work

and really push them to their limit.

And there’s some people that I had to nonstop compliment

because like they’re so already self critical,

like about everything they do that I have to be constantly

like, like I cannot criticize them at all

because they’re already criticizing themselves

and you have to kind of encourage

and like celebrate their little victories.

And it’s kind of fascinating that like how that,

the complete difference in people.

Definitely people respond to different motivations

and different modes of feedback

and you kind of have to figure it out.

It was like a pretty good book,

which for some reason now the name escapes me,

about management, first break all the rules.

First break all the rules?

First break all the rules.

It’s a book that we generally like ask a lot of

like first time managers to read at Rev.

And like one of the kind of philosophies

is manage by exception, right?

Which is, you know, don’t like have some standard template

like, you know, here’s how I, you know,

tell this person to do this or the other thing.

Here’s how I get feedback, like manage by exception, right?

Every person is a little bit different.

You have to try to understand what drives them.

And tailor it to them.

Since you mentioned books,

I don’t know if you can answer this question,

but people love it when I ask it, which is,

are there books, technical, fiction, or philosophical

that you enjoyed or had an impact on your life

that you would recommend?

You already mentioned Dune, like all of the Dune.

All of the Dune.

The second one was probably the weakest, but anyway,

so yeah, all of the Dune is good.

I mean, yeah, can you just slow little tangent on that?

Is, how many Dune books are there?

Like, do you recommend people start with the first one

if that was?

Yeah, you gotta have to read them all.

I mean, it is a complete story, right?

So you start with the first one,

you gotta read all of them.

So it’s not like a tree, like a creation of like

the universe that you should go in sequence?

You should go in sequence, yeah.

It’s kind of a chronological storyline.

There’s six books in all.

Then there’s like many kind of books

that were written by Frank Herbert’s son,

but those are not as good.

So you don’t have to bother with those.

Shots fired.

Okay.

But the main sequence is good.

So what are some other books?

Maybe there’s a few.

So I don’t know that like, I would say there’s a book

that kind of, I don’t know, turned my life around

or anything like that, but here’s a couple

that I really love.

So one is Brave New World by Aldous Huxley.

And it’s kind of incredible how prescient he was

about like what a brave new world might be like, right?

You know, you kind of see genetic sorting in this book,

right, where there’s like these alphas and epsilons

and how from like the earliest time of society,

like they’re sort of like, you can kind of see it

in a slightly similar way today where,

well, one of the problems with society is people

are kind of genetically sorting a little bit, right?

Like there’s much less, like most marriages, right,

are between people of similar kind of intellectual level

or socioeconomic status, more so these days than in the past.

And you kind of see some effects of it

in stratifying society and kind of he illustrated

what that could be like in the extreme.

There’s different versions of it on social media as well.

It’s not just like marriages and so on.

Like it’s genetic sorting in terms of what Dawkins called

memes as ideas being put into these bins

of these little echo chambers and so on.

Yeah, I know, so that’s the book

that’s I think a worthwhile read for everyone.

I mean, 1984 is good, of course, as well.

Like if you’re talking about, you know,

dystopian novels of the future.

Yeah, it’s a slightly different view of the future, right?

But I kind of like identify with Brave New World a bit more.

Yeah, speaking of not a book,

but my favorite kind of dystopian science fiction

is a movie called Brazil,

which I don’t know if you’ve heard of.

I’ve heard of and I know I need to watch it,

but yeah, because it’s in, is it in English or no?

It’s an English movie, yeah.

And it’s a sort of like dystopian movie

of authoritarian incompetence, right?

It’s like nothing really works very well, you know,

the system is creaky, you know,

but no one is kind of like willing to challenge it,

you know, just things kind of amble along

and kind of strikes me as like a very plausible future

of like, you know, what authoritarianism might look like.

It’s not like this, you know,

super efficient evil dictatorship of 1984.

It’s just kind of like this badly functioning, you know,

but it’s status quo, so it just goes on.

Yeah, that’s one funny thing that stands out to me

is in whether it’s authoritarian, dystopian stuff,

or just basic like, you know,

if you look at the movie Contagion,

it seems in the movies,

government is almost always exceptionally competent.

Like it’s like used as a storytelling tool

of like extreme competence.

Like, you know, you use it whether it’s good or evil,

but it’s competent.

It’s very interesting to think about

where much more realistically is it’s incompetence

and that incompetence in itself has consequences

that are difficult to predict.

Like bureaucracy has a very boring way of being evil,

of just, you know, if you look at the show,

HBO show Chernobyl,

it’s a really good story of how bureaucracy, you know,

leads to catastrophic events,

but not through any kind of evil

in any one particular place,

but more just like the…

It’s just the system kind of system.

Distorting information as it travels up the chain

that people unwilling to take responsibility for things

and just kind of like this laziness resulting in evil.

There’s a comedic version of this,

I don’t know if you’ve seen this movie,

it’s called The Death of Stalin.

Yeah, I liked that.

I wish it wasn’t so…

There’s a movie called Inglourious Basterds

about, you know, Hitler and so on.

For some reason, those movies pissed me off.

I know a lot of people love them,

but like, I just feel like there’s not enough good movies,

even about Hitler.

There’s good movies about the Holocaust,

but even Hitler, there’s a movie called Downfall

that people should watch.

I think it’s the last few days of Hitler.

That’s a good movie, turned into a meme, but it’s good.

But on Stalin, I feel like I may be wrong on this,

but at least in the English speaking world,

there’s not good movies about the evil of Stalin.

That’s true.

Let’s try to see that.

Actually, so I agree with you on Inglourious Basterds.

I didn’t love the movie

because I felt like kind of the stylizing of it, right?

The whole Tarantino kind of Tarantinoism, if you will,

kind of detracted from it

and made it seem like unserious a little bit.

But Death of Stalin, I felt differently.

Maybe it’s because it’s a comedy to begin with.

This is not like I’m expecting seriousness,

but it kind of depicted the absurdity

of the whole situation in a way, right?

I mean, it was funny, so maybe it does make light of it,

but something goes probably like this, right?

Like a bunch of kind of people,

they’re like, oh shit, right?

You’re right.

But like the thing is,

it was so close to like what probably was reality.

It was caricaturing reality

to where I think an observer might think that this is not,

like they might think it’s a comedy.

Well, in reality, that’s the absurdity

of how people act with dictators.

I mean, that’s, I guess it was too close to reality for me.

The kind of banality of like what were eventually

like fairly evil acts, right?

But like, yeah, they’re just a bunch of people

trying to survive.

Cause I think there’s a good,

I haven’t watched it yet, the good movie on,

the movie on Churchill with Gary Oldman,

I think it’s Gary Oldman.

I may be making that up.

But I think he won,

like he was nominated for an Oscar or something.

So I like, I love these movies about these humans

and Stalin, like Chernobyl made me realize the HBO show

that there’s not enough movies about Russia

that capture that spirit.

I’m sure it might be in Russian there is,

but the fact that some British dude that like did comedy,

I feel like he did like The Hangover or some shit like that.

I don’t know if you’re familiar

with the person who created Chernobyl,

but he was just like some guy

that doesn’t know anything about Russia.

And he just went in and just studied it,

like did a good job of creating it

and then got it so accurate, like poetically.

And the facts that you need to get accurate,

he got accurate, just the spirit of it

down to like the bowls that pets use,

just the whole feel of it.

It was incredible.

It was good, yeah, I saw the series.

Yeah, it’s incredible.

It’s made me wish that somebody did a good,

like 1930s, like starvation that Stalin led to,

like leading up to World War II

and in World War II itself, like Stalingrad and so on.

Like, I feel like that story needs to be told.

Millions of people died.

And to me, it’s so much more fascinating than Hitler

because Hitler is like a caricature of evil almost

that it’s so, especially with the Holocaust,

it’s so difficult to imagine that something like that

is possible ever again.

Stalin to me represents something that is possible.

Like it’s so interesting, like the bureaucracy of it

is so fascinating that it potentially might be happening

in the world now, like that we’re not aware of,

like with North Korea, another one that,

like there should be a good film on.

And like the possible things that could be happening

in China with overreach of government.

I don’t know, there’s a lot of possibilities there.

I suppose.

Yeah, I wonder how much, you know,

I guess the archives should be maybe more open nowadays,

right, I mean, for a long time, they just, we didn’t know,

right, or anyways, no one in the West knew for sure.

Well, there’s a, I don’t know if you know him,

there’s a guy named Stephen Kotkin.

He is a historian of Stalin that I spoke to on this podcast.

I’ll speak to him again.

The guy knows his shit on Stalin.

He like read everything and it’s so fascinating

to talk to somebody, like he knows Stalin better

than Stalin himself, it’s crazy.

Like you have, so he’s, I think he’s at Princeton,

he is basically, his whole life is Stalin.

Fighting Stalin.

Yeah, it’s great.

And in that context, he also talks about

and writes about Putin a little bit.

I’ve also read at this point,

I think every biography of Putin, English biography of Putin,

I need to read some Russians.

Obviously, I’m mentally preparing

for a possible conversation with Putin.

So what is your first question to Putin

when you have him on the podcast?

I, it’s interesting you bring that up.

First of all, I wouldn’t tell you, but.

You can’t give it away now.

But I actually haven’t even thought about that.

So my current approach, and I do this with interviews often,

obviously that’s a special one,

but I try not to think about questions until last minute.

I’m trying to sort of get into the mindset.

And so that’s why I’m soaking in a lot of stuff,

not thinking about questions, just learning about the man.

But in terms of like human to human,

it’s like, I would say it’s,

I don’t know if you’re a fan of mob movies,

but like the mafia, which I am, like Goodfellas and so on,

he’s much closer to like mob morality, which is like.

Mob morality, maybe, I could see that.

But I like your approach anyways of this,

the extreme empathy, right?

It’s a little bit like Hannibal, right?

Like if you ever watched the show Hannibal, right?

They had that guy, well, you know Hannibal of course, like.

Yeah, Silence of the Lambs.

But there were those TV shows as well,

and they focused on this guy, Will Graham,

who’s a character like extreme empath, right?

So in the way he like catches all these killers,

is he pretty much, he can empathize with them, right?

Like he can understand why they’re doing

the things they’re doing, right?

It’s a pretty excruciating thing, right?

Like, because you’re pretty much like spending

half your time in the head of evil people, right?

Like, but.

I mean, I definitely try to do that with others.

So you should do that in moderation,

but I think it’s a pretty safe place, safe place to be.

One of the cool things with this podcast,

and I know you didn’t sign up to hear me

listen to this bullshit, but.

That was interesting.

I, and what’s his name?

Chris Lattner, who’s at Google,

oh, he’s not Google anymore, SiFive.

He’s legit, he’s one of the most legit engineers

I talk with, I talk with him again on this podcast.

And one of the, he gives me private advice a lot.

And he said for this podcast, I should like interview,

like I should widen the range of people

because that gives you much more freedom to do stuff.

Like, so his idea, which I think I agree with Chris

is that you go to the extremes.

You just like cover every extreme base

and then it gives you freedom to then go

to the more nuanced conversations.

And it’s kind of, I think there’s a safe place for that.

There’s certainly a hunger for that nuanced conversation,

I think, amongst people where like on social media,

you get canceled for anything slightly tense,

that there’s a hunger to go full.

Right, you go so far to the opposite side.

And that’s like demystifies it a little bit, right?

Yeah, that’s.

There is a person behind all of these things.

And that’s the cool thing about podcasting,

like three, four hour conversations

that it’s very different than clickbait journalism,

it’s like the opposite, that there’s a hunger for that.

There’s a willingness for that.

Yeah, especially now, I mean,

how many people do you even see face to face anymore?

Right, like this, you know?

It’s like not that many people like in my day to day,

aside from my own family that like I sit across.

It’s sad, but it’s also beautiful.

Like I’ve gotten the chance to like,

like our conversation now, there’s somebody,

I guarantee you there’s somebody in Russia

listening to this now, like jogging.

There’s somebody who has just smoked some weed,

sitting back on a couch and just like enjoying.

I guarantee you they’ll write in the comments right now

that yes, I’m in St. Petersburg, I’m in Moscow, whatever.

And we’re in their head and they have a friendship with us.

I’m the same way, I’m a huge fan of podcasting.

It’s a beautiful thing.

I mean, it’s a weird one way human connection.

Like before I went on Joe Rogan and still,

I’m just a huge fan of his.

So it was like surreal.

I’ve been friends with Joe Rogan for 10 years, but one way.

Yeah, from this way, from the St. Petersburg way.

Yeah, the St. Petersburg way and it’s a real friendship.

I mean, now it’s like two way, but it’s still surreal.

And that’s the magic of podcasting.

I’m not sure what to make of it.

That voice, it’s not even the video part.

It’s the audio that’s magical.

I don’t know what to do with it,

but it’s people listen to three, four hours.

Yeah, we evolved over millions of years, right?

To be very fine tuned to things like that, right?

Oh, expressions as well, of course, right?

But back in the day on the Savannah,

you had to be very attuned to whether

you had a good relationship with the rest of your tribe

or a very bad relationship, right?

Because if you had a very bad relationship,

you were probably gonna be left behind

and eaten by the lions.

Yeah, but it’s weird that the tribe is different now.

Like you could have a one way connection with Joe Rogan

as opposed to the tribe of your physical vicinity.

But that’s why it works with the podcasting,

but it’s the opposite of what happens on Twitter, right?

Because all those nuances are removed, right?

You’re not connecting with the person

because you don’t hear the voice.

You’re connecting with like an abstraction, right?

It’s like some stream of tweets, right?

And it’s very easy to assign to them

any kind of evil intent or dehumanize them,

which it’s much harder to do when it’s a real voice, right?

Because you realize it’s a real person behind the voice.

Let me try this out on you.

I sometimes ask about the meaning of life.

Do you, you’re a father now, an engineer,

you’re building up a company.

Do you ever zoom out and think like,

what the hell is this whole thing for?

Like why are we descendants of apes even on this planet?

What’s the meaning of it all?

That’s a pretty big question.

I think I don’t allow myself to think about it too often,

or maybe like life doesn’t allow me

to think about it too often.

But in some ways, I guess the meaning of life

is kind of contributing to this kind of weird thing

we call humanity, right?

Like it’s in a way, you can think of humanity

as like a living and evolving organism, right?

That like we’re all contributing in a small way,

but just by existing, by having our own unique set

of desires and drives, right?

And maybe that means like creating something great.

And it’s bringing up kids who are unique and different

and seeing, like, the joy in what they do.

But I mean, to me, that’s pretty much it.

I mean, if you’re not a religious person, right?

Which I guess I’m not, that’s the meaning of life.

It’s in the living and in the creation.

Yeah, there’s something magical

about that engine of creation.

Like you said, programming, I would say,

I mean, it’s even just actually what you said

with even just programs.

I don’t care if it’s like some JavaScript thing

on a button on the website.

It’s like magical that you brought that to life.

I don’t know what that is in there, but that seems,

that’s probably some version of like reproduction

and sex, whatever that’s in evolution.

But like creating that HTML button has echoes

of that feeling and it’s magical.

Right, well, I mean, if you’re a religious person,

maybe you could even say, all right,

like we were created in God’s image, right?

Well, I mean, I guess part of that is the drive

to create something ourselves, right?

I mean, that’s part of it.

Yeah, that HTML button is the creation in God’s image.

Maybe hopefully it’ll be something a little more.

So dynamic, maybe some JavaScript.

Yeah, maybe some JavaScript, some React and so on.

But no, I mean, I think that’s what differentiates us

from the apes, so to speak.

Yeah, we did a pretty good job.

Dan, it was an honor to talk to you.

Thank you so much for being part of creating

one of my favorite services and products.

This is actually a little bit of an experiment.

Allow me to sort of fanboy over some of the things I love.

So thanks for wasting your time with me today.

It was really fun.

Well, it was awesome.

Thanks for having me on and giving me a chance

to try this out.

Awesome.

Thanks for listening to this conversation

with Dan Kokotov and thank you to our sponsors,

Athletic Greens, All in One Nutrition Drink,

Blinkist app that summarizes books,

Business Wars podcast and Cash App.

So the choice is health, wisdom or money.

Choose wisely, my friends.

And if you wish, click the sponsor links below

to get a discount and to support this podcast.

And now let me leave you with some words

from Ludwig Wittgenstein.

The limits of my language mean the limits of my world.

Thank you for listening and hope to see you next time.