My guest today, John List, is an economics professor, a friend and a colleague at the University of Chicago. And of all my peers, John is the one who has most greatly influenced the way I think about the world.
John LIST: So I took my knowledge from the baseball card market and said, “Look, I’ve been doing these field experiments, I think we should do that rather than lab experiments.” And every professor at Wyoming basically said, “If you want to use data from outside the lab, do things like what Steve Levitt, Orley Ashenfelter, Alan Krueger, and Josh Angrist are doing. That’s what people do who are doing empiricism.” And I said, “No, I think that we should be doing field experiments.”
Welcome to People I (Mostly) Admire, with Steve Levitt.
The use of randomized experiments in economics has exploded in the last 25 years, and John deserves more credit for that than anyone else. But how he did it, that’s the really remarkable thing. It’s the story of a complete outsider, a renegade thinker whose ideas and personality were so powerful that they couldn’t be defeated.
* * *
Steve LEVITT: I’ve always suspected that the fact that you had such a different set of interests and experiences has contributed to the incredible originality that you’ve brought to economics. And your path has not been an easy one. I mean, we’re in a field in which pedigrees are extremely important. Almost everyone teaching at the top economics departments has a resume filled with places like Harvard, Princeton, Stanford, M.I.T. But when I look at your resume, I find University of Wisconsin, Stevens-Point; University of Wyoming; the University of Central Florida. You really are the Cinderella story of modern economics, aren’t you?
LIST: Well, that’s very kind of you. I take that as a compliment. Sometimes when you see someone, it’s almost like they were born on third base, and they think they hit a triple. It’s probably fair to say in my life that I wasn’t even born in the dugout. Look, my background is, I’m a first-gen kid. I stumbled upon economics because it was always second nature to me, but it wasn’t my original dream at all. My original dream was to be a golfer and the only reason why I went to school as an undergrad at Stevens-Point was to play golf. My backup plan was, I love economics and I think like an economist naturally. For my graduate training, I went to the University of Wyoming. And when you go to a school like University of Wyoming, you do get a fair amount of attention from them. But when you go on the market, it’s probably a little different than if you go on the market from Harvard, M.I.T., Chicago, or Stanford.
LEVITT: How many job applications did you put out when you were coming out of grad school?
LIST: I applied to 150 schools when I went on the original job market in 1996. One of the 150 gave me an interview. Look, I was lucky enough that University of Central Florida even interviewed me because nobody else was interviewing me. You know, of those original 150 applications, 149 people tell you, “You’re not good enough for us to even talk to.” Do you think I saved those letters? Absolutely, I saved those letters. Do I occasionally have to take a look at those letters for a little bit of extra juice or for a little bit of extra energy? Absolutely I do.
LEVITT: Wait, where do you have them? You have them in your office or your home?
LIST: I have them in my home, somewhere now in a big box. Look, University of Central Florida gave me a shot, and for that I will be forever grateful.
LEVITT: So much of your work is built around randomized experiments. Before we talk about your research, I want to ask you a question about randomization that has long puzzled me. My dad is a medical researcher, and he explained the basic idea of randomized experiments to me when I was maybe 8 years old. And my recollection was that it was so obvious and commonsensical that he didn’t need to explain it twice for me to understand. And my guess is that virtually everyone listening to this podcast understands how randomized experiments work. There’s a treatment group and a control group, and because research subjects are randomly assigned to treatment or control, if the experimenter didn’t intervene, the outcomes of the treatment and control group would on average be identical. So therefore, if some treatment is given only to the treatment group, like maybe a vaccination or access to a training program, a lump sum transfer of $10,000 — any difference in outcomes between the treatment group and the control group can be causally attributed to the treatment intervention. Common sense, right?
LEVITT: Okay, so randomized experiments are such a powerful tool, and the idea is so simple that one might’ve expected that randomized experiments would’ve been common in ancient Greece or in Rome, or at least they would’ve been central to the scientific revolution, that brilliant thinkers like DaVinci or Galileo or Newton or Darwin would use them. But no! Somehow randomized experiments didn’t become commonplace until the early 20th century. Don’t you find that almost beyond belief?
LIST: The great thinkers in economics like John Stuart Mill or Milton Friedman or Paul Samuelson, when you look at their writings, by and large, our heroes have lamented that if you want to do experimentation, you should choose another practice. And if you want to engage in empirical economics, you better start looking for naturally occurring data. Like Mill writes, “The moral sciences cannot run experiments.” When you look at Friedman’s work on, say, his ‘53 Positive Economics book, he says, “Economists must rely on evidence cast by the experiments that happen to occur.” When you look at Joan Robinson, she has an interesting Journal of Economic Literature paper in 1977 that frankly just says, “Economists cannot make use of controlled experiments to settle their differences.” So now a good question is: Why did they believe that? Why didn’t they believe that you could randomize? The first book I used was in 1985, it was by Samuelson and Nordhaus. So, Samuelson wrote this great principles book that really revolutionized the way that we think about teaching undergrads. And I still remember looking at this page, I think it was like page eight in their ‘85 book that says something along the lines of: “The economic world is complicated. There are millions of people and firms, thousands of prices. One possible way of figuring out economic laws in such a setting is by using controlled experiments, like chemists, physicists, biologists.” So, when I was reading this, I was thinking, “this is really cool.” But then they go on to say: “Economists have no such luxury when testing economic laws, because they cannot easily control other important factors. Like astronomers or meteorologists, they generally must be content to observe.” I think this is the missing link about why economists didn’t go after using experimentation earlier. They were thinking of it like a chemist. And remember, rule number one in chemistry is you need to have a clean test tube. 
And if you don’t have a clean test tube, then you’re going to be confounded. They didn’t really conceptualize, “Well, wait a second, the world is dirty. We could use randomization to balance the dirt across treatment and control and then just difference it out.” That’s something that never really clicked for them.
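List's point about balancing the dirt across treatment and control can be illustrated with a small simulation. This is a minimal sketch in Python, not anything taken from List's own work: the function name `simulate_experiment`, the baseline outcome of 10, the noise levels, and the true effect of 2.0 are all assumptions chosen for illustration. Each subject carries an unobserved background factor (the "dirt"), a coin flip assigns treatment, and the difference in group means recovers the effect because the dirt averages out about equally on both sides.

```python
import random
import statistics


def simulate_experiment(n=10_000, true_effect=2.0, seed=0):
    """Randomly assign n subjects and estimate the effect by a difference in means.

    All numbers here are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)
    outcomes = {True: [], False: []}   # outcomes by treatment status
    dirt = {True: [], False: []}       # unobserved background factor by status

    for _ in range(n):
        background = rng.gauss(0, 5)       # the "dirt": large, unobserved variation
        treated = rng.random() < 0.5       # a coin flip decides the group
        outcome = 10 + background + rng.gauss(0, 1)
        if treated:
            outcome += true_effect         # only the treatment group gets the effect
        outcomes[treated].append(outcome)
        dirt[treated].append(background)

    # "Difference it out": compare average outcomes across the two groups.
    estimate = statistics.mean(outcomes[True]) - statistics.mean(outcomes[False])
    # Randomization should leave the dirt roughly equal in both groups.
    dirt_gap = statistics.mean(dirt[True]) - statistics.mean(dirt[False])
    return estimate, dirt_gap
```

With the defaults, the estimate should land close to the assumed effect of 2.0 and the dirt gap close to zero, even though the dirt itself varies far more than the treatment effect. The exact values depend on the seed.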
LEVITT: I agree with that 100 percent. But what about outside of economics? That’s what baffles me even more. Science, in general, now would say randomized experiments are the gold standard. How was it that we didn’t see that until the 20th century?
LIST: Oh, I think it’s fair to say that Pasteur saw it in 1882 when he did his famous sheep experiment where he gets 50 sheep, and he uses 25 as controls and 25 were vaccinated. And then he gave them all a lethal dose of Anthrax. And then, of course, all the ones who did not have the vaccination died.
LEVITT: Okay, but still, it’s 1882.
LIST: I agree.
LEVITT: Doesn’t it seem like it should have been 100 A.D.?
LIST: I agree. So, Galileo was doing experiments, but he just wasn’t doing randomization. I think the concept of randomization as a real thing really came up in the early ‘20s, and it is kind of befuddling that, why would it take humanity so long to understand about the value of chance and the value of randomization? That point is exactly right.
LEVITT: Now I probably shouldn’t go around making too much fun, because I had my own lack of vision around randomized experiments as a young economist in the 1990s. At that time, experiments just weren’t very important in economics. There were a few massive, expensive randomized experiments, for instance, testing the impact of a negative income tax. But those were far beyond my reach. I had a tiny research budget, no possible way of launching a $50 million decade-long social experiment. And then there were a bunch of small laboratory experiments done on students, but those were held in such low esteem in my circles. We thought, “We economists, we’re better than the psychologists who do lab experiments.” And plus, like Samuelson and Friedman said, they were contrived and artificial and they weren’t the way to learn about the problems I cared about, like crime or education. So, people like me turned to what we call natural experiments or accidental experiments where for instance, a classic paper in this style, noticed that sometimes school districts are drawn right down the middle of a street so that the west side of the street goes to one school and the east side of the street goes to a different school. Sandy Black, who wrote that paper, she argued that well, except that the school districts are different, you’d expect housing prices to be the same or similar, at least on both sides of the street. And so, if you did see different housing prices on a different side of the street, it might represent how much parents are willing to pay to go to a good school versus a bad school. And that’s the kind of thing that people like me were doing, trying to go out in the world and exploit variation that was given to us. It definitely wasn’t truly randomized, and it definitely wasn’t an experiment, but we thought it was the way to go. That was my basic worldview until one day, I stumbled onto a research paper by some guy I’d literally never heard of. His name was John List. 
And you were running these small, randomized experiments in real-world settings. Now, admittedly, they were almost comically trivial settings, like baseball card conventions and flea markets. But reading those papers was like a revelation. It never occurred to me that you could draw real insights out of experimentation in the real world. Almost instantaneously, how I thought research should be done, changed. And what’s so crazy is, you know, you were a complete nobody in the field, right? And yet, in my view, you launched one of the most important revolutions our profession has ever had. It’s truly stunning, the impact you’ve had. Was it just obvious to you from the very beginning that these field experiments were the way to go? Or did you have some kind of a-ha moment along the way?
LIST: Thanks so much for your kind words. I think it’s fair to say that nobody in the profession has supported me as much as you have. So late ‘80s, I’m doing baseball card show experiments at these big conventions. You know, I’m going to buy, sell, and trade.
LEVITT: But why are you doing experiments?
LIST: Okay, so I’m trying to make money. And to make more money, I was trying new things. I would do things like an auction instead of a price. I would negotiate in different ways. I would bargain in different ways. I would sell my goods in different ways, and I would buy, sell, and trade differently at different moments in the convention. I was simply trying to make more money.
LEVITT: Okay. And this had nothing to do with thinking you were going to revolutionize economics. You were doing this as a baseball card dealer because you had common sense and you thought, “If I try some different things, maybe one of them is going to work well. And I’m going to do it for the rest of my life if it works well.”
LIST: Exactly. So now I begin to learn about economics in the classroom at Stevens-Point. And there are different theories about what should happen when more people demand the good, or preferences change, or there’s a tax, whatever. So I start to think, “Wow, economics has predictions for what should happen in this market when I change things.” So I would change a few things, and then see is economic theory correct or not? I would maybe put a certain kind of auction in use, like a first-price, sealed-bid auction where everyone puts a bid in and the winner is a person who bids the highest and then they pay their bid; versus a second-price auction where everybody puts a bid in and the high bidder pays the second-highest bid. Economic theory has predictions about that. So I would then bring those back to the classroom and I would say to my professor, “Look, I just tried it, and everything you told us wasn’t right.” And I’m sure I was the least favorite student in there. So, I began this kind of symbiotic relationship where I would do some experiments and then learn. And then when I got to Wyoming, I found out that people were actually doing research in the lab. So, a little bit like you, I was taken aback that people would bring, say 30 students into a classroom and have them engage in an experiment, and then make the kind of inference that they would make. For example, they’d talk about colluding firms. And I was a subject in that experiment, as a graduate student thinking, “Am I really a colluding firm?” So I took my knowledge from the baseball card market and said, “Look, I’ve been doing these field experiments, I think we should do that rather than lab experiments.” And every professor at Wyoming basically said, “Go pound sand. If you want to do an experiment, do it in the lab. That’s how everyone does it. If you want to use data from outside the lab, do things like what Steve Levitt, Orley Ashenfelter, Alan Krueger, and Josh Angrist are doing. 
That’s what people do who are doing empiricism.” And I said, “No, I think that we should be doing field experiments.” So go into the real world and use randomization alongside that realism, and that will really help us learn about where economics applies, and can help us with policies, et cetera, et cetera.
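The two sealed-bid formats List describes differ only in what the winner pays. The sketch below, in Python, shows that payment rule under stated assumptions: the function name `run_sealed_bid_auction`, the bidder names, and the tie-breaking choice (the earliest-listed bidder wins a tie) are hypothetical and not from List's experiments.

```python
def run_sealed_bid_auction(bids, rule):
    """Resolve a sealed-bid auction.

    bids: dict mapping bidder name -> bid amount (at least two bidders).
    rule: "first" (winner pays own bid) or "second" (winner pays the
          second-highest bid).
    Returns (winner, price_paid).
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    if rule not in ("first", "second"):
        raise ValueError("rule must be 'first' or 'second'")

    # Highest bid first; sorted() is stable, so ties go to the earliest-listed bidder.
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner, highest_bid = ranked[0]
    second_highest_bid = ranked[1][1]

    price = highest_bid if rule == "first" else second_highest_bid
    return winner, price


bids = {"ann": 50, "bob": 40, "cy": 30}
first_price_result = run_sealed_bid_auction(bids, "first")    # ann wins and pays 50
second_price_result = run_sealed_bid_auction(bids, "second")  # ann wins and pays 40
```

The same bids produce the same winner under both rules. What theory predicts, and what a dealer can test at a convention table, is how bidders change the bids they submit once they know which rule applies.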
LEVITT: Were these field experiments you did, were they part of your dissertation that only got you one interview out of 150?
LIST: They were my one paper that I had submitted. So, it was actually the paper that was the furthest along of my job market materials. And people on the job market were not interested in that field experiment. You might ask, “Why did you do your early field experiments at baseball card conventions?” And the answer’s simple: That is the only place wherein I could afford to do these experiments because I had to fund myself. Nobody was interested in working with me. So, I essentially set up shop and ran experiments in baseball card conventions because that was my only outlet. Did I think that those were important markets per se? Of course not. But did I think they were markets that I could randomize people into treatment groups and a control group and make a causal statement? Yes.
We’ll be right back with more of my conversation with John List.
* * *
LEVITT: I haven’t done the calculations, but I am almost certain that there was a 10- or 20-year period where you published more papers in the top economics journals than anyone else. You’re really good at academics. It didn’t seem to me like what you really cared about was changing the world — solving the big problems. But somewhere along the way I sensed that that changed. And now anyone who spends any time with you at all will recognize that you’re focused on having a real-world impact, on leaving a legacy. Are you able to see that transformation in yourself? Was it a conscious choice?
LIST: The difference came when I was given more opportunities. When you’re funding your own work in a baseball card convention, there are only a certain number of questions that you can take on within that market setting. So, it is true that I’ve become more, let’s change the world, and let’s help people. But when you look at my older work, when I had a chance within my market to explore those types of questions, I did. And one example is discrimination. So this is a paper that’s titled “The Nature and Extent of Discrimination in the Marketplace,” and it was published in 2004 in the Quarterly Journal of Economics. I was trying to measure is there discrimination in this market? Did the dealers in that market treat women differently than men? Or treat older people differently than younger people, or Black and brown people different from white people? So I set up a series of field experiments that allowed me to measure, but then differentiate, what is the underlying motivation of these sellers when they were discriminating against people. And the two major models that economics puts forward on the one hand is what Gary Becker wrote about in his dissertation in 1957, which is, people just get satisfaction out of hurting another person in a different group. That’s one kind of theory. The other kind is what Pigou called “third-degree price discrimination.” Now, what this is, is I discriminate against you, for example, I charge men a higher price than women, not because I dislike men, but because I’m just trying to make more money. In the Becker theory, you’re actually forgoing profits to cater to your prejudice. In the Pigou theory, you don’t really care about hurting another group. All you care about is making more money. So these are the two standard models of why do people discriminate? And I felt that I could not only measure discrimination, but I could set up an apparatus that could differentiate those two theories. And that’s what I did in the 2004 paper.
LEVITT: I suspect that people who are listening are like, “Okay, that sounds great, but how are you going to do that at a baseball card convention?”
LIST: When you think about these conventions, you have maybe a thousand different sellers who are behind these six-foot seller tables, and they’re selling and trading and buying from customers who walk in the front door. So, what I would do is I would recruit customers as they walked through the front door, I would have a separate room reserved. If they were able to be my experimental participants, I would have them go and negotiate to buy goods from dealers all around the marketplace, men and women and old and young, et cetera. And then they would come back and detail what they purchased or what they sold, and they would detail what the negotiations looked like with that particular dealer. So now what I’m doing is, I’m sending different kinds of consumers to each dealer without the dealers knowing that they’re in an experiment. So, this is what’s very different than a lab experiment where people understand that their choices are being scrutinized.
LEVITT: Yeah, if I’m discriminating against women or African Americans, if I know that this is going to be recorded, I’d probably stop. It’s really a key point to what you’re doing. Everybody’s just doing their business, completely unaware that this egghead is tallying all the information and eventually going to publish it in an economics journal.
LIST: I think that’s right. So, these dealers, there will be hundreds and hundreds of people who walk up to their table, maybe thousands of people. So, they have no idea who my two or three confederates are. And they’re negotiating with my confederates in a way that allows me to learn about, first of all, are they treating men differently than women? And then secondly, what is the underlying reason for why? But let’s go back to your general point though. I think this is what psychologists oftentimes refer to as attribution bias. So they look at the actions of a person and they say, “You’re that kind of person. You know, you want to try to save the world,” or “You want to just get papers published.” But they don’t appreciate the background or the properties of the situation that the person is actually in. So, in my case, I think it was always there, but it just couldn’t show itself.
LEVITT: I believe that’s partially true, but I also see a transformation when I compare the two books that you’ve written for a general audience. So the first book you wrote back in 2013 was called The Why Axis, with “why” spelled W-H-Y. And your most recent book is The Voltage Effect. The Why Axis — it’s a great book. I wrote the foreword to it. It does a fantastic job of explaining many of your early experimental results and those of your co-author Uri Gneezy. It’s breezy though. It’s like Freakonomics. You and your personality are mostly absent, and it feels like a lot of other economics books in that way. Not in a bad way, but just it feels like other books. But now with The Voltage Effect, it’s a totally different book. It’s a very personal, impassioned call to arms. It’s clear you deeply believe in what you’re doing and its importance for the world. Don’t you agree?
LIST: There, you’re exactly right. We wrote The Why Axis in part to showcase my early field experiments and why the world should use field experiments. The Voltage Effect is really what I would call the back nine of my career. I, frankly, became a little bit angry at myself for what I did in the front nine. I think we got a little bit astray in the last 30 years in terms of what are we trying to do with field experiments? Are we actually being serious about wanting to change the world? Or are we just writing about, “I want to change the world,” and then doing an experiment in a Petri dish that is basically concocted in the best ways possible to give you the largest treatment effects. And then we write about it, and we forget to tell everyone else that it was an efficacy test. And then when the lucky programs or interventions that are chosen are scaled, they actually have a huge, what I call “voltage drop” because it looked great in the Petri dish, but then when we scaled it in the large, it ends up only being a really small fraction in terms of the benefits that everyone’s receiving. I get angry at my past self because we were in part responding to the incentives that the profession has given us. The way you get tenure is to have publications in the top journals. But the issue is to have publications in the top journals, you need to create surprising results that have big treatment effects. So, in a way, the whole profession is set up to find a Petri dish, give your idea its best shot, and then publish it and move on to the next idea. And I started to think, that’s really not how we should go about changing the world. We should think about, does the science go from the Petri dish to the large? And when I started working on scaling, I realized that we have so many great interventions that work in the Petri dish, and we have no idea whether those will actually scale because we’ve never even tried to test it. 
That’s when I really did change in what you’re talking about, in thinking about implementation. I think that piece is just entirely missing right now in our profession. And when I started writing The Voltage Effect, I think you probably see that passion, trying to take field experiments and make them more valuable to society rather than simply stopping at testing an economic theory and leaving it at that. It’s testing an economic theory, but also creating a program that we can have confidence in when we scale it.
LEVITT: It’s easy to run an experiment and to look at the treatment group and the control group and subtract off the means and see what the effect is. How do you learn about scaling, about taking it from your Petri dish to the big scale when that’s not an exercise we do very often?
LIST: And let’s be clear, it’s still a developing exercise. I think it’s fair to say that our work and the book are really about introducing the problem and summarizing what I’ve learned so far. The most obvious one is sometimes we try to scale ideas that were simply false positives. They were false positives because it was simply bad luck, or the researcher was fraudulent, or just made mistakes that they didn’t know they were making.
LEVITT: So obviously, it makes no sense if something’s not true in the first place to go and invest $500 million in government money to bring it to scale. So, the answer there, I think is pretty straightforward. There should be more replication, right? Because if something’s a false positive and you do replications of it in the same kind of Petri dish, you don’t think it’ll come up positive again. Is that the right approach to getting rid of false positives?
LIST: Yeah, I think that’s part of it. But I also think that it’s about — did you run the experiment correctly? Did you analyze the data correctly? Are you making the correct inference from the data that you have? This is a little bit like what happens in the Council of Economic Advisers. So, when I worked there, my biggest contribution was stopping bad ideas from happening. And that’s exactly what we can do in organizations: having a scaling team that always tests new ideas that other groups in the organization come up with.
LEVITT: So avoiding incompetence.
LIST: That’s right.
LEVITT: I also heard you saying, separate out the idea generation and the idea testing, because it’s a dangerous combination to put the person whose baby is the idea in charge of determining whether that baby is going to win the beauty contest or not.
LIST: Yeah, that’s a great way to put it. Psychologists talk about confirmation bias. If we have a general feeling or a prior that something is true and it’s our idea, we tend to look at data and generate data in a way that will get back evidence that, wow, you know, that’s true. When I used to be chief economist at Uber, I would always tell people that I’m going to be the person who stands outside of Travis Kalanick’s door — the founder, and back then, the C.E.O. of Uber — and any idea you have has to come through my team before it’s going to get on Travis’s desk because in a firm or in an organization, the incentives are also set up for confirmation bias. Because think about it — if you have an idea, you test it, if you then ship it, you end up getting the bonus or the promotion at the end of the year because you had an idea that shipped. Forget about psychology. It’s also the pecuniary and non-pecuniary incentives within the workplace that might be set up for you to advance what’s simply a false positive.
LEVITT: Okay. So, what’s another reason why projects fail to scale?
LIST: Another consideration is whether you have the same inputs at scale that you used in the Petri dish. So here, let’s take us back, Steve, to when you and I and Roland Fryer, our old postdoc who’s now at Harvard, opened the preschool in Chicago Heights. So, we start building it in 2008 and we had to hire teachers for the classrooms. We all knew that. And we also knew that an important input was to have a good teacher in every classroom. When we started that pre-K, we hired 30 teachers, and it wasn’t that difficult to find 30 good teachers. Now, if we wanted to scale our Chicago Heights program around Chicago, and let’s say we wanted a thousand or 10,000 preschool programs, we would have to maybe hire 30,000 teachers. That’s an altogether different proposition because it would be nearly impossible to find 30,000 good teachers around Chicago who were going to be of the same quality as the 30 that we hired in the Petri dish of Chicago Heights.
LEVITT: So, let me just be clear, too. I am the absolute worst offender on this dimension. You can recount from every project we’ve ever done. Whenever anybody says, “Hey, shouldn’t we do this in a way that’s scalable?” I would always say, “My God, we get zero every time we do anything. We never get any effects. I don’t care about scaling. Just once, I want to get a big result. Let’s throw every single thing we have at it and just pray that once we have a big impact on kids because we never have a big impact on kids.” But it’s funny, in retrospect, it’s exactly what you said. It’s a combination of we’re putting in so much effort to try to do this model school, and how awful is it when it turns out that you’re just okay? Plus our private incentives — we want to be heroes, we want to do something amazing. We want people to say, “Oh my God, this school was the best school ever.” And for me at least, it’s so far from my mind whether in the end this will be adopted. Obviously, we hope that we do something great, and it gets adopted, but I’m the worst. I’m absolutely the worst on this dimension.
LIST: No, I don’t think you’re the worst. You’re the modal economist, right? We just raised $20 million; we spent a lot of time building this preschool. Why do we want to set it up to fail? It just doesn’t make any sense in terms of every incentive that we face as an academic, or even as a person, who’s trying to help kids. So I say, “Look, do your A/B test,” which is really what we’re talking about. A is a control group. B is a group that we’re going to throw everything at it and do the best we can. Let’s just call that the normal course of business for academics. And I’m saying, “Add option C.” And option C is the treatment arm that essentially brings in all of the constraints that you’ll be facing at scale. So, for this particular example, in option C, we would actually hire the kinds of teachers that we would have to hire at scale. And we would put them in the classroom in treatment arm C because now we not only have the efficacy test that you want, but we also have the scaling test that I want. You might say, “Well, John, we can do that later. The pioneers of the idea should get away with just doing an A/B test. And then the next group can add the boundary conditions and the scaling ideas, et cetera.” That works in medicine because the first round is an efficacy test, and you do phase one, phase two, phase three. That’s a mistake in economics. It doesn’t work for us because of the fixed cost. We had to build a preschool from scratch, and we had to spend a lot of money and time. There’s no way you can count on another research group coming in and replicating ours and then adding option C. We really need to add it ourselves because in the types of experiments that economists or social scientists run, they have a huge fixed cost and we just need to add option C from the very beginning.
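List's three-arm design can be written down as a simple comparison of means. The following Python sketch is only an illustration: the function name `arm_effects`, the test scores, and the definition of the voltage drop as the efficacy effect minus the at-scale effect are assumptions made here, not figures or formulas from the Chicago Heights study.

```python
from statistics import mean


def arm_effects(control, best_case, at_scale):
    """Compare a three-arm experiment.

    control:   outcomes in arm A (no program)
    best_case: outcomes in arm B (program with the best available inputs)
    at_scale:  outcomes in arm C (program under the constraints expected at scale)
    Returns (efficacy_effect, scalable_effect, voltage_drop).
    """
    baseline = mean(control)
    efficacy_effect = mean(best_case) - baseline   # the usual A/B comparison
    scalable_effect = mean(at_scale) - baseline    # what arm C adds: the scaling test
    voltage_drop = efficacy_effect - scalable_effect
    return efficacy_effect, scalable_effect, voltage_drop


# Hypothetical test scores, invented purely for illustration.
arm_a = [50, 52, 48]
arm_b = [60, 62, 58]
arm_c = [53, 55, 51]
efficacy, scalable, drop = arm_effects(arm_a, arm_b, arm_c)
# efficacy is 10 points, scalable is 3 points, so the voltage drop is 7 points
```

Running all three arms in the same experiment shares the fixed cost List mentions: the at-scale estimate comes out of the same study rather than depending on a later replication.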
LEVITT: Do you have an example from your time as the chief economist at Uber when scaling didn’t go as planned?
LIST: I think a good example is what happened when we rolled out tipping in the app on Uber. The background story is: In late January, I think it was January 27th, 2017, President Trump issued an executive order on immigration. And the taxicab drivers around J.F.K. decided to go on strike because of the Trump executive order. Okay, that’s all well and fine, but here comes Uber. Back then, whenever there was a market interruption, what Uber would do is they would turn off surge. So, people who don’t use Uber, surge is something that when demand is much greater than supply, the price goes up to clear the market. That’s the beauty behind Uber and Lyft’s business model. So, the taxicab drivers around J.F.K. noticed that Uber turned off surge and they thought Uber did that to break up their strike. Someone went on Twitter and ranted about Uber and then ended with #DeleteUber. And that tweet ended up going viral. It was amazing because when you look at the data, Lyft’s market share was about five or 10 percent, and Uber’s was like 90 percent. But overnight, Sunday and Monday and Tuesday, Lyft shot up to like 30-percent market share.
LIST: Yeah, and a bunch of drivers were leaving Uber. A bunch of riders were leaving Uber. So, Travis Kalanick came to my team, and you might remember my team there was called Ubernomics as an ode to Freakonomics. Travis came to my team and said, “John, your job is to get the drivers back.” And Steve, can you guess what a typical economist would do to get drivers back?
LEVITT: Raise wages.
LIST: Raise wages. It’s exactly right. But I wanted to do it through tipping. So, I went door to door to the executive team and said, “We need to add tipping in the Uber app.” And they said, “No, no, no, no, no.” And I said, “Look, I have survey evidence that shows the riders hate it when a driver puts a tin can in the back seat and says, ‘Please give me a tip.’ And drivers really want tipping. So, drivers will come in droves if we add tipping.” So we fought, and we ended up winning. When you win a battle like that, the booty is that you get to roll out tipping as a nationwide field experiment. So that’s essentially what we did. But in doing it, what you can show is that if you only pull out 5 percent of the drivers in a market — say you choose Chicago, and pull out 5 percent of the drivers and say, “You 5 percent get to receive tips and the rest of the drivers in the market, they cannot receive tips.” What you find is that those drivers who can receive tips make more money and they work more. Okay, so win-win.
LEVITT: So, you did that.
LIST: We did that.
LEVITT: And from that, you said, “Wow, this is —”
LIST: This is great stuff. But then as you scale it up, 20 percent of drivers, 50, 100 — by the time you get to a hundred, which we did in October. So, three or four months later, we roll it out to all drivers in Chicago. What happens is there are so many drivers who increase their labor supply and new drivers come online that they undo the entire tip effect.
LEVITT: So, what you’re saying is, because each individual driver is so excited about the opportunity to receive tips, some of those drivers work more hours because they’re making more money. They tell their friends, and their friends are like, “Oh my God, I want to make tips.” And by the time they’re done, there are so many drivers driving for so many hours that their cars are much emptier than they otherwise would be, and their total wage has fallen back down to whatever it would’ve been without the tips.
LIST: Yeah, that’s exactly what happens. They do drive a little bit more, but since Uber only pays you when you have somebody in your backseat, it happens that you’re driving around with an empty car more often. And in the end, you earn exactly the same amount whether you had tipping or whether you didn’t have tipping.
LEVITT: So, Travis had given you a mission to make the drivers happy. You came up with an approach which was tipping. You did a bunch of testing, but it sounds to me, given that tipping is a part of Uber, that you did not find out until it was too late that this was not actually accomplishing what you had hoped it would accomplish.
LIST: Yeah, we had seen when you add more and more drivers what happens. So, you could predict where it was going, but we didn’t know exactly where it would end up until we actually rolled it out. And when you roll it out, that’s when the data show you that having the tipping feature does not increase hourly wages.
LEVITT: Were people angry at you when that was learned?
LIST: No, I don’t think it’s really learned. I don’t think the drivers really know that. And it’s important to point out that this is an average effect. So, some drivers are making more, some drivers are making less.
LEVITT: Right. The good drivers, the drivers who are doing great now have a way to distinguish themselves from the people who are unpleasant. In that way it’s good, right? That Uber would like to attract good drivers and have those good drivers drive more no matter what.
LIST: Yeah. In that sense it helps because if you’re moving to drivers who are more conscientious or giving a better trip or trying harder in some way, then that’s good stuff. The issue there however, is that tipping ends up being so small that it doesn’t lead to a significant increase in service quality. That’s what a lot of people argue is that the reason we have tipping is because it leads to much higher service quality. But in our data, we just couldn’t find that.
You’re listening to People I (Mostly) Admire with Steve Levitt and his conversation with economist John List. After this short break, they’ll return to talk about John’s switch to Walmart.
* * *
Before the break, we began talking about John’s work with Uber. I’m curious to hear more about what pushed someone who loves academics into the world of business. And as you might have noticed, John is a bit of a character. I wonder if he tones it down when he puts on his business hat.
LEVITT: As long as I’ve known you, John, you’ve been as devoted to academics as anyone around. You publish dozens of research papers a year. You poured your heart into being department chair for years, and you seemingly accept any academic invitation to give a lecture anywhere in the world just to spread your message because you believe in it. So, when you told me you were taking the job as chief economist at Uber, I was flabbergasted. I knew you weren’t doing it for the money because you just don’t care about money. Why did you make that leap into the corporate world?
LIST: I think it was to continue to do science. When you look at how field experiments have advanced over the last 30 years, they’ve advanced to the point where if John List would go and do a baseball card convention experiment, that would get desk rejected. And what that means for all the people who aren’t in academia, the editor would not even consider it seriously for publication because field experiments have advanced so much that people are looking for real-world field experiments that have a really tight theory, mounds and mounds of data, millions of observations in important markets. So, I could see that if I wanted to continue to be a scientist, which I do, that to stay current, I had to do these other activities such as be the chief economist at Uber, because that would give me an entirely new sandbox to use field experiments to not only help Uber, which is fun, but really, I’m not there to help Uber per se. I’m there to do better science and write about the scientific findings. So, for me, it was a progression that had to occur otherwise, I’m not even sure if I’d be on the frontier of field experimental research anymore.
LEVITT: Uber’s not hiring you to do good science, though. Uber’s hiring you to make them more money. Did you find a conflict there?
LIST: I think there are a fair number of questions wherein you can do good science and it ends up helping Uber as well. One example is what we call Uber Apologies. The background story is, I received a bad trip, and I went to Travis Kalanick, and said, “I’m very angry about this bad trip and I’m going to start taking Lyft.”
LEVITT: Wait. This is true or are you making this up?
LIST: This really happened. Uber Apologies is a product that currently Uber has and so does Uber Eats and also Lyft by the way, that really started when I received a bad trip back in 2017 in an Uber. I was going to downtown Chicago from Hyde Park, to present at the American Economic Association meetings in a debate against a fellow that some of your listeners might know, who won the Nobel Prize in development economics, named Angus Deaton. Angus commonly criticizes field experiments. So, I was going down there to engage in a debate, my car got turned around, and I ended up back in front of my home in Hyde Park by mistake. So, I looked at the driver and I said, “No! What were you doing?” They said, “Well, you look so busy,” because I was working on my slides like I always do at the last minute. So I said, “I was working on my slides, but I still need to go down to the hotel. So turn around and go.” And I was a few minutes late and then we did our debate. So then that night I was so angry that I called Travis. And I let Travis know what I thought about his blankity-blank app and that I would never take it again, and that I was going to become a Lyft loyalist.
LEVITT: Wait, you were chief economist at Uber at the time, right?
LIST: Yes. So this is a chief economist at Uber calling the C.E.O. and founder, telling him, “Take your app and put it where the sun doesn’t shine and I’m going to be a Lyft loyalist.” And I said, “Travis, the worst part about this is that I’ve never received an apology from anyone at Uber.” And he said, “Look, John, we haven’t gotten to that yet.” And I said, “We have now.” So, when you have an idea like this, the first thing you have to do is you have to make the business case. So the business case of course is, are these bad trips detrimental to revenue and profits for Uber? But as you know, nobody at Uber is going to let me run a field experiment where I give people bad trips randomly, so now I have to do the Steve Levitt trick. I need to look at mounds and mounds of data, and I need to find statistical twins in the data — two identical people in terms of their gender, their race, how many trips they’ve taken in the past month, how much money they spend on Uber, where they live, et cetera, et cetera. And I need to find these twins in cases where one had a good trip and one had a bad trip, and then I can compare after you had a bad trip, or good trip, how much money do you spend in the next, say, week or 90 days, et cetera. So that’s what we did, and what we find is that Uber loses between 5 and 10 percent of their revenues because of these bad trips.
LIST: So that case, Travis is like, “Okay, have at it.” Here’s where the science now comes in. So I reach out to one of your friends, Ben Ho, a Stanford Ph.D. student who did work on the economics of apologies. So we end up designing a field experiment whereby it was simply after people had a bad trip, what kinds of apologies work well? And then we see the apologies that worked well, of course, keep people spending on Uber. The apologies that didn’t work so well, those are the kinds of people who say, “I don’t want to take Uber as often anymore.” And we wrote a paper on this, it’s called “The Economics of Apologies,” but we also created a product that Uber still finds useful today. Because what we found in that experiment is we can undo some of the bad stuff that comes along with bad trips by using apologies and promotions.
LEVITT: From Uber’s perspective, why don’t they want to keep it secret? Because as soon as you make it public and publish it, now Lyft is doing the same thing.
LIST: Well, it’s the price of doing business with John List. You know the story back when Jeff Bezos offered me the chief economist position at Amazon, I think it was around 2009 or 2010. That offer was a great offer. The share price of Amazon was like $7 or $9 a share. And Jeff Bezos said, “Look, John, come on in and do great stuff, but nobody outside the walls of Amazon will learn about it because we want to keep the secrets and make Amazon a stronger company,” which I totally get. But I ended up making the decision that any firm I work with, we have to have the ability to publish some of the science that we do.
LEVITT: You eventually left Uber, and you did some time at Lyft. But your current gig is as the chief economist at Walmart, the biggest company in the world by revenue — over $500 billion a year in revenue, over 2 million employees. That must be a dream come true for you — talk about scale.
LIST: Exactly right. Look, Uber and Lyft are great companies, but when I had the chance to go and play in the Walmart sandbox — 4,700 stores, and kind of a little-known fact is that 90 percent of Americans live within 10 miles of a Walmart. That’s the kind of scale that you have a real shot to make a difference, right? I had been in ride-share for six years, and I think the science around those questions had become less interesting over time. And Walmart just presents an entirely new set of really interesting questions to work on.
LEVITT: I certainly wasn’t clever enough to have the idea to do field experiments before anyone else, but to my credit, I instantly appreciated the value of what you were doing. And I’m sure you still remember the conversation where I told you I wanted to try to get you a job offer at the University of Chicago.
LIST: So you handled an early market paper of mine that was published in 2004, and you approached me and said, “Look, John, I love these field experiments and I would love to attract you to Chicago, but it’s not the easiest place to get a job offer, but would you seriously consider it?” I said, “Look, if you get me a job offer, I will join you at the University of Chicago.”
LEVITT: Yeah, you gave me your word that if I got you an offer, you would come. It was not easy getting you an offer. It took many, many months, but I finally pulled it off. In the meantime, though, other top departments started to notice you as well.
LIST: Yeah, I think I had 10 or 12 offers, because 2004 was kind of a watershed moment for me. I think I had like 20 papers that were printed, and I think five or six of them were in top four journals. So I became let’s say a more attractive person for the econ higher-ups. And Princeton essentially did everything right. Whereas, at Chicago, it felt like nobody in the department wanted me, but maybe you and a handful of others.
LEVITT: Not far from the truth.
LIST: Not far — it was like University of Chicago was doing everything to send me signals that they didn’t want me. Beyond that, the offer from the University of Chicago in pecuniary terms was by far the worst of all my offers. So I still remember the day when you said, “Look, I know that this has gone down in a different way than what you anticipated. I will not have any issue if you go ahead and go somewhere else.” That was never in my vernacular, honestly. That was never a consideration. When you give somebody your word, you give them your word. And I followed through on that promise. And I’ve never looked back. I’ve loved my days at the University of Chicago.
LEVITT: There are very few economists who would’ve done what you did. It still makes no sense to me. You must value your word more than I value my word. But you left out one key issue. You said, “I’m only going to come to Chicago if you agree to play 18 holes of golf with me.” And honestly, that was the last thing in the world I wanted to do. I had played golf once in the previous decade, and that had gone very poorly. Why did you want to play golf with me so badly?
LIST: Well, I knew that you were somebody who would be a very important partner for me and friend. I think you had played in the departmental golf outing and people said, “The guy’s got this great swing. He looks like Jack Nicklaus, but then he hits it like 80 yards, and we don’t know what’s wrong with the guy.”
LEVITT: Yeah, it was awful. And it was enough for me to never want to play again. So I remember, like it was yesterday, sitting in the bar after the round and I looked over the scorecard and I just couldn’t believe how well you’d played. And in complete seriousness, I said, “John, you should forget about economics. You should be trying to make the senior professional golf tour, the Champions Tour.” And you laughed and said, “Oh, I gave it my shot. I tried hard in college. I wasn’t good enough.”
LIST: But I said, “Your swing looked much more like a professional swing than mine.” So I said, “Why don’t you give it a go? Get some strength — that’s key. Get a little bit of strength. And get after it, and you’ll have a shot.” You and I are 38 and 39 years old. You have to be 50 years old to play on the senior tour. So I said, “Give it a go.” I didn’t think there was any way you were going to take me up on that.
LEVITT: So I laughed because it was such a joke. But weirdly, over the next couple of weeks, I just became captivated by the idea, and you had a front-row seat as I did spend an unthinkable amount of time and effort in pursuit of that absurd dream to play professional golf over the next decade. And we all know I never succeeded, that realistically I never had any chance of succeeding. But I’m curious, I’ve never asked you, what were you thinking as you watched me trying to become a pro golfer?
LIST: You have a dream and you’re going after it. And it gave you a goal, right? I did nothing but respect that choice and your effort. Now, honestly, I was like you. Did I think you could do it? No, but did I think you had a shot? Absolutely. Because you have the basic mechanics to be good.
LEVITT: You’re right that the pursuit of that goal — the tournaments we played, and the time we spent on the golf course, honestly, those are some of the most enjoyable moments of my entire life. I want to thank you for getting me on that ridiculous path. It’s not often that you fall way short of your goal, but you still say, “God, I’m so glad that I went down that path anyway.” But that’s definitely the case with golf for me.
LIST: No, but let’s be clear, this was a path of friendship and a path of co-authorship and working on papers together. I can still remember some rounds where you and I would go around the entire course and not say a word to each other, and then we would look at each other and say, “That was the best round that I’ve ever had.” But then there were other cases where we would debate and argue and say, “You know, I don’t think you’re viewing that in the right way. And this could be a nice scientific contribution.” So I actually viewed it as much more than golf. I viewed it as a flowering friendship and years and years of co-authorship where, you know, we’re trying to change the world.
When I fought to get John List a job in the University of Chicago Department of Economics so many years ago, I knew he was a great economist. But I never would have imagined that he would come to have such a profound impact on my life, both personally and professionally. I’m really lucky that John’s a man of his word, that he came to Chicago just because of the promise he made me. And thank God, also, that he’s such a tough bargainer, because if he hadn’t forced me to play that first round of golf, my life would have turned out very differently. And now, a question from our listener.
Morgan LEVEY: Hi Steve. So, our listener Farhad had a question for you about education, which, obvious to anyone who has listened to this podcast, is a priority for you. Farhad says success in education is reliant on doing well on exams. Exams usually test the ability of the student to answer questions correctly. However, given most people’s easy access to the internet, the ability to answer questions seems not to be as critical or as valuable as it used to be. So he thinks the relative value of a good question to a good answer has gone up. He believes that asking good questions is just as important as answering them, but the education system tends not to teach this or test it in any meaningful way. Do you agree that the value of a good question has gone up compared to a good answer?
LEVITT: I think the value of a good question has always been really high. I think what’s really been diminished over time has been being able to answer the kinds of questions we ask in exams. Those are simple questions. Those are factual questions. Those are computational questions. And it’s true, the internet has made it really easy to get answers to those kinds of questions. But I think there’s a whole other class of questions where getting the answers is just as important or even more important than before. And those are open-ended questions like, how do you solve the violence problem in Chicago? That’s a case where everyone agrees on the question, how do you reduce violence? But getting to the answer, that’s just really hard and the internet’s not going to help you very much. There’s a hundred possible solutions out there, and which one or two might work, it’s just difficult to figure that out.
LEVEY: Historically, exams ask questions that ask students to sort of regurgitate facts because it’s easy to evaluate if that is right or wrong. Do you have methods in your classroom for how you evaluate more open-ended questions?
LEVITT: Well, I tell you, I have abandoned exams completely. I haven’t given an exam in a class in four or five years.
LEVEY: Really? I’d like to take one of your classes.
LEVITT: Well, no, you wouldn’t because actually my classes have gotten much harder. The only way I test my students is through analysis of data sets where I will provide them with data, and I try to make the problems as realistic as possible. So one of the things I do is I usually put mistakes in the data, because when you get data in the real world, it’s always rife with errors. And I want my students to understand that if you don’t go through your data carefully to understand where mistakes might be in it, you’ll get ridiculous answers. Then, instead of asking them very specific problems, I say, “Provide me with a three-page PowerPoint deck that discusses the three most interesting and important relationships that you can find in the data.” And so it really challenges them to think. It’s not just a matter of going through the motions. They have to really look at the data and ponder what makes something interesting, what makes it important? And then I conclude, usually, with something that’s a little more tangible, say a prediction exercise. Where it isn’t just judgment about what’s interesting and important, but actually, who can make the best predictions out of sample using the data. And my motivation for doing it this way is that’s what people do in the real world. And I thought it would be useful to give students a chance to see what real world projects look like. And it turns out there’s a really great side benefit to doing problem sets like this, which is that I get a great signal on which students I would want to hire to be my data scientists. I always used to hire the best students based on exams. And I would say on average, they were less good employees than the ones I’ve hired since I started doing these problem sets. And I think that’s some proof that what I’m doing is actually capturing something far more useful than what I used to do, which was hire the students who had the best grades on my exams.
LEVEY: I think another way to learn how to ask and answer open-ended questions is practice. And one of the best ways to practice that is to start a podcast. I think anyone would agree that from the beginning of PIMA to now, your ability to ask open-ended questions has greatly improved. What do you think?
LEVITT: Really? I feel that I’m the exact same interviewer that I was when I started. I can’t see any improvement at all from my perspective.
LEVEY: Oh my gosh. I think you’ve improved so much. I don’t think there’s anyone besides me and Jasmin, our engineer, who has listened to you ask more questions. And you’ve learned to trust your natural curiosity a little bit more. I think you used to go in very steadfast to a plan. And while I still know that you prepare a lot and do have a plan going into interviews, I think you’re more willing to be flexible in that plan. I think you’ve just become more comfortable, and you’ve learned how to structure a question so that it’s going to lead to a more interesting answer.
LEVITT: It’s certainly true I had very little experience asking questions when I started this podcast because in everyday life, I just don’t ask very many questions. It hasn’t been my style of conversation. I hope it’s true. It’s always hardest I think for the individual to see how they’ve changed because I really just feel like me. I feel like I’ve always felt, but that would be great news for everyone if I’ve actually gotten a lot better.
LEVEY: And if our listeners are interested in more good questions, they can check out our sibling podcast No Stupid Questions hosted by Stephen Dubner and Angela Duckworth. If you have a question for us, our email is PIMA@freakonomics.com. That’s P-I-M-A@freakonomics.com. It’s an acronym for our show. Steve and I read every email that’s sent and we look forward to reading yours.
And in two weeks we’re back with a brand-new episode featuring yet another guest who’s had a profound impact on my life, none other than my Freakonomics coauthor, Stephen Dubner.
Stephen DUBNER: I just wanted to write. I just wanted to go interview people. I had a million ideas. I had another year of teaching lined up and I said, “You know, I’m going to go do journalism.” I quit so much. Levitt, you’re like one of the few things I’ve never quit.
LEVITT: Not yet.
DUBNER: You’re like my cigarettes. I just can’t shake them.
* * *
People I (Mostly) Admire is part of the Freakonomics Radio Network, which also includes Freakonomics Radio, No Stupid Questions, and Freakonomics M.D. All our shows are produced by Stitcher and Renbud Radio. This episode was produced by Morgan Levey and mixed by Jasmin Klinger. Lyric Bowditch is our production associate. Our executive team is Neal Carruth, Gabriel Roth, and Stephen Dubner. Our theme music was composed by Luis Guerra. To listen ad-free, subscribe to Stitcher Premium. We can be reached at PIMA@freakonomics.com, that’s P-I-M-A@freakonomics.com. Thanks for listening.
LIST: You and I helped coach a baseball team.
LEVITT: I did not coach a baseball team.
LIST: You absolutely did. You stood on the sidelines with the book.
- John List, Walmart’s Chief Economist and professor of economics at the University of Chicago.
- The Voltage Effect: How to Make Good Ideas Great and Great Ideas Scale, by John List (2022).
- “Toward An Understanding of the Economics of Apologies: Evidence from a Large-Scale Natural Field Experiment,” by Basil Halperin, Benjamin Ho, John A List, and Ian Muir (The Economic Journal, 2022).
- “Introducing CogX: A New Preschool Education Program Combining Parent and Child Interventions,” by Roland Fryer, Steve Levitt, John List, and Anya Samek (BFI Working Paper, 2020).
- “The First Live Attenuated Vaccines,” by Caroline Barranco (Nature, 2020).
- “What You Need to Know About #DeleteUber,” by Mike Isaac (The New York Times, 2017).
- “Protecting the Nation From Foreign Terrorist Entry Into the United States,” by the Executive Office of the President (Federal Register, 2017).
- The Why Axis: Hidden Motives and the Undiscovered Economics of Everyday Life, by Uri Gneezy and John List (2013).
- “The Nature and Extent of Discrimination in the Marketplace: Evidence from the Field,” by John A. List (The Quarterly Journal of Economics, 2004).
- “Do Better Schools Matter? Parental Valuation of Elementary Education,” by Sandra E. Black (The Quarterly Journal of Economics, 1999).
- “What Are the Questions?” by Joan Robinson (Journal of Economic Literature, 1977).
- The Economics of Discrimination, by Gary S. Becker (1957).
- Essays in Positive Economics, by Milton Friedman (1953).
- Economics, by Paul Samuelson and William Nordhaus (1948).
- A System of Logic: Ratiocinative and Inductive, by John Stuart Mill (1843).
- “Why Do Most Ideas Fail to Scale?” by Freakonomics Radio (2022).
- “Policymaking Is Not a Science (Yet),” by Freakonomics Radio (2020).
- “Why Does Tipping Still Exist?” by Freakonomics Radio (2019).
- “What Can Uber Teach Us About the Gender Pay Gap?” by Freakonomics Radio (2018).
- “Why You Should Bribe Your Kids,” by Freakonomics Radio (2014).
- “What Makes a Donor Donate?” by Freakonomics Radio (2011).
- No Stupid Questions.