OBERMEYER: There’s a set of decisions that you have to make right off the bat, which is, am I really worried about this person or not? Sometimes it’s obvious, if they’re unconscious, if their vital signs are completely abnormal. But a lot of the things that we worry about the most in the emergency department are things that can actually look pretty subtle at first.
Dr. Ziad Obermeyer is an emergency medicine physician and a professor at U.C. Berkeley.
OBERMEYER: So everyone’s first image of a heart attack is a middle-aged man clutching his chest and saying something about an elephant. But most heart attacks don’t look like that. Most heart attacks are a little bit of nausea or a tinge of chest pressure.
The stakes are high in the emergency department. People arrive with symptoms that could be life-threatening, or benign, or anywhere in between.
OBERMEYER: There’s approximately billions of tests and combinations of tests that you could order for any given patient. And so you need to narrow down to the ones that are gonna help you figure out what to do with this person right now.
A lot of Ziad’s research has focused on tests, and how to make sure we’re testing the right people for the right things at the right time to get the right information. In emergency medicine, that means making choices that help you answer one big question:
OBERMEYER: What do I do with this person next? Do I bring them into the hospital or do I send them home? And that’s a really hard one.
In 2020, there were more than 33 million hospital admissions in the U.S. That number is down slightly from prior years, probably because of the pandemic.
But that’s still a lot of hospital stays. And a lot of decisions by Ziad and his fellow emergency medicine physicians. When they send a patient home, it’s usually fine. But there’s always a risk that something bad could happen. Something that could have been avoided, if only that patient had been admitted to the hospital.
OBERMEYER: We do in fact send most people home, even though it certainly doesn’t feel that way. But every one of those decisions weighed on me so much.
Not long ago, Ziad wondered if there might be a way to help doctors make those decisions — and, maybe, get better outcomes.
From the Freakonomics Radio Network, this is Freakonomics, M.D. I’m Bapu Jena. Today on the show, we’re going to talk about why it can be so hard to figure out which patients need to be in the hospital and which patients can safely be sent home.
OBERMEYER: It didn’t look like the doctors really had appreciated that this was a person who had a high risk of dying in the next week.
Doctors are only human, though. They make mistakes. Who — or what — could help them get it right?
OBERMEYER: There’s a bunch of high-risk people that the algorithm says, boy, this person looks very risky, that the doctors were not testing.
* * *
JENA: What should I call you? Dr. Obermeyer? Ziad? What do you prefer?
OBERMEYER: We’ve known each other for a long time, Dr. Jena. So you pick, but Ziad is just fine.
JENA: I’ll call you Ziad then if that’s okay.
As an emergency medicine physician, my friend Dr. Ziad Obermeyer gets to dabble in a lot of different specialties.
OBERMEYER: One of the things I loved about the job during my training and I still practice now, is that you just get to see every part of medicine: cardiology, OBGYN, infectious disease, you kind of see it all in one place. And you’re not necessarily the best at any one thing, but you’re there.
Being there also means being the one to make some really tough calls.
OBERMEYER: I used to go home from my shifts in the emergency department and I would just lay awake, agonizing over people I’d sent home and wondering if I should have done a different test or done more tests or brought them into the hospital. And giving up that control that you have over the situation when they’re in front of you is incredibly hard and incredibly stressful. And it was actually what motivated a lot of the early research that I did on people who had bad outcomes after they were sent home.
Ziad is also a prolific researcher, and in 2017 he and some colleagues, including me, published a study that tried to capture just how often otherwise healthy patients died soon after being discharged from the emergency department. E.D. visits are incredibly common. In 2020, there were 131 million emergency department visits in the U.S. And as Ziad said earlier, we do send most people home. Only around 14 percent of those visits resulted in the patient being admitted to the hospital. If you show up at the emergency room with a problem, the most likely outcome is that you’ll be evaluated, treated, and then… sent home.
Are those physicians getting it right, for the most part? What happens when they don’t?
OBERMEYER: What we tried to do was look at all of the emergency visits by Medicare patients across the country to emergency departments. And we tried to take out all the patients where if there were something seriously wrong you might actually hesitate about bringing them into the hospital. So people who were over 80 years old, people who were frail for some other reason because they had metastatic cancer or end stage heart failure, or maybe they were in a nursing home. So, we tried to focus in on a population of people that were basically healthy people living at home who had normal lives that were not limited by some serious illness. And then among the people who were sent home, which is the majority of people, we looked at what happened in the week after and we calculated how many people died. What we found was that, glass half full, it’s only about 0.1 percent of people who die. Glass half empty, that’s still a lot of people because that’s every emergency visit in the country. So it ends up being about 10,000 people per year just in the Medicare population. And when we looked at the diagnoses that they were given in the emergency department, it didn’t look like the doctors really had appreciated that this was a person who had a high risk of dying in the next week.
JENA: In that study, the presumption is that if these doctors had instead kept the patients in the hospital, admitted them to the hospital instead of discharging them home, that their outcomes might have been different. Is that the intuition behind why the 0.1 percent of people dying within seven days of being seen in the emergency department and sent home — why that’s a bad thing?
OBERMEYER: I think it’s really difficult to say whether or not it’s a bad thing or, what the right number is. I think we might say, well, the right number is zero, but of course, you can’t admit everyone to the hospital. In a lot of ways that project asked a lot more questions than it answered. My assumption is that some fraction of those people who were sent home with a pretty benign diagnosis and then died a few days later, we could have done something differently for some of those people that would’ve helped them. That was what set me off on a research agenda to understand who those people were a little bit better, and more importantly, to figure out how we could improve the decision making in a way that might help reduce the rate at which bad things happen to people.
JENA: One of the things that was really interesting to me in that study was that you showed a relationship between the likelihood of people dying within seven days of being seen in the emergency department, and also the probability that patients in that given hospital tended to be hospitalized versus not. What did you find in that analysis?
OBERMEYER: We looked at hospitals in a few different groups. There are the hospitals that are incredibly cautious and conservative. Probably like the hospitals where we trained, you and I. And we found those are the hospitals that admit a lot of people. And then there are the hospitals that send a lot of people home. And they’re actually often hospitals that are in rural areas. And they’re under-resourced hospitals where they probably can’t admit as many people as they’d want to. We looked at the hospitals that were sending a lot of people home relative to other hospitals in the country. And we found a pretty striking pattern where those were the hospitals where the near-term death rates really spiked. If you looked at the week after people got sent home, compared to weeks two and three and out to the next year, there was a huge mass in that first week of people who were dying relative to the later weeks. The other hospitals did not have that pattern. And it wasn’t like those hospitals were just serving very, very sick people because the patients that they admitted to the hospital were actually doing much better in terms of their near term mortality than all the other hospitals. Here, the trend was very different. It was really the patients that were being sent home that were popping out as a high mortality group.
JENA: One way to interpret that finding would be that part of the reason why we might see a higher mortality rate for people who are seen in the E.D. and then discharged home and they die within seven days is that those hospitals simply were not hospitalizing enough patients.
OBERMEYER: There are two explanations. One is that these hospitals are not doing that triage properly. Your job as the emergency doctor is to get the sick patients in the hospital and the healthy patients out of the hospital. So one hypothesis is that that decision is not going well. The other hypothesis is that those hospitals just see really sick patients, and even if they’re doing their job well, you’re gonna end up with some sick patients that die, and there’s nothing anyone could have done about it. And so I think that, at least in these very low admission rate hospitals, there’s something going wrong in the first case. The triage is not working as well as it is in the other hospitals.
JENA: And of course, hospitals and the doctors who work in them, they’re concerned about two things. One is they’re concerned about making sure that anybody who’s at risk of something bad happening to them gets hospitalized. But they’re also very concerned about hospitalizing patients who don’t need to be hospitalized, and not only because of cost, but because they worry about things that can happen to you in the hospital. You know, hospitals are not always the safest place to be if you don’t actually need to be there. And that really speaks to this broader issue in medicine of how do you distinguish between overuse, over hospitalization, or underuse, or in this particular case, under hospitalization. You’ve done a lot of work trying to parse out these issues. Can you tell me a little bit about it?
OBERMEYER: Despite being an economist, Bapu, you didn’t mention incentives. And I think everyone’s view of when doctors are deciding whom to hospitalize and whom to send home, if you asked a bunch of people, especially in health policy circles, they’d say, well, of course, the doctors are gonna over admit people to the hospital because they don’t wanna get sued. And because the hospitals are incentivized to admit more patients, cause they get more reimbursement. But there has to be something beyond incentives going on here because no emergency doctor in her right mind if you told her, oh yeah, this patient has a pretty high probability of dropping dead after you send them home, there is no way that doctor is gonna decide to send them home. The first order thing that every doctor cares about is, I don’t want bad things to happen to my patients. But the other thing you think about is that you do mess up. And so I got really, really interested in error as a way to explain this pattern that we see where a lot of people get admitted to the hospital that don’t need to be admitted to the hospital — overuse. But then I was also finding these signals that there was underuse too. At around the same time that I was getting these results from this study, I was also getting interested in how machine learning or artificial intelligence could help. For example, if a doctor knew, oh, this person would do poorly outside of the hospital, that could be very valuable when she’s making the decision to admit the patient to the hospital. It’s really hard to answer the question, “Who needs to be in the hospital?” And it’s a lot easier to focus on subparts of that question. Like who’s having a heart attack?
Chest pain is a classic sign of a heart attack — and it’s the third most common reason people visit the emergency department in the U.S. So, when Ziad started to think about how he could use artificial intelligence, or A.I., to help emergency medicine physicians figure out who needs to be hospitalized and who doesn’t, heart attack seemed like a good place to start. One test doctors perform on these patients is cardiac catheterization. We’ve talked about it on the show before: It’s invasive, and designed to see if the heart’s arteries are blocked. If so, they can often be reopened. But some suspected heart attack patients may not get this test at all. Others may get it and have totally normal looking arteries because they didn’t actually have a heart attack. Whether a cardiac catheterization is positive or negative tells you something about how often a doctor’s initial instincts were right.
OBERMEYER: We trained an algorithm that could predict if I tested a person with these advanced invasive tests like catheterization, would that test be positive? And using that prediction as a way to evaluate the doctor’s decision making process, what we found is that there’s a bunch of people that the doctors decide to test for heart attack because they’re worried enough about heart attack that it’s worth the risk and the cost and the inconvenience of these tests. About two-thirds of the time the algorithm said, don’t test this person. This person’s never gonna have a positive test. You’re just gonna spend a lot of money and you’re gonna be no further along than when you started. And when the algorithm said that, the algorithm was right, and those people overwhelmingly went on to have negative tests, suggesting that the doctor’s just over testing. It’s an error. Now that’s not that surprising, I think, to most people who know anything about the healthcare system. Yeah, there’s a lot of overuse. The more surprising thing we found is that there’s a bunch of high-risk people that the algorithm says, boy, this person looks very risky, that the doctors were not testing.
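[Editor’s sketch] The check Ziad describes, whether tested patients the algorithm scored as low risk really did go on to have negative tests, can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the `Patient` record, the `yield_by_group` helper, the 0.05 cutoff, and every number below are hypothetical and made up for illustration; none of them come from the study.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical model output: predicted chance the invasive test is positive.
    predicted_risk: float
    # What the test actually showed for this (tested) patient.
    test_positive: bool


def yield_by_group(patients, threshold):
    """Return (yield_low, yield_high): the share of positive tests among
    tested patients scored below, and at or above, the risk threshold."""
    low = [p.test_positive for p in patients if p.predicted_risk < threshold]
    high = [p.test_positive for p in patients if p.predicted_risk >= threshold]

    def rate(flags):
        return sum(flags) / len(flags) if flags else float("nan")

    return rate(low), rate(high)


# Made-up records for patients a doctor chose to test (not study data).
tested = [
    Patient(0.01, False),
    Patient(0.02, False),
    Patient(0.03, False),
    Patient(0.04, True),
    Patient(0.30, True),
    Patient(0.45, False),
]

# The 0.05 cutoff is an arbitrary assumption for the example.
yield_low, yield_high = yield_by_group(tested, threshold=0.05)
print(f"yield when the model said low risk:  {yield_low:.2f}")
print(f"yield when the model said high risk: {yield_high:.2f}")
```

A low yield in the first group is what over-testing would look like in data of this shape: the tests were ordered, but they rarely came back positive.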
After the break: why are doctors missing these high-risk patients?
OBERMEYER: We were able to use the artificial intelligence algorithm to try to get inside the head of the physician.
I’m Bapu Jena, and this is Freakonomics, M.D.
* * *
OBERMEYER: Doctors are testing these low-risk patients that they shouldn’t, they’re not testing these high-risk patients that they should. And the question is why.
In 2021, Dr. Ziad Obermeyer and the economist Sendhil Mullainathan trained an A.I. algorithm to evaluate the way emergency medicine physicians make decisions when they encounter a patient with heart attack symptoms. Their work was published in the Quarterly Journal of Economics. As Ziad told us just before the break, he and Sendhil weren’t surprised to find that doctors over-test patients who were at low risk for a cardiac event. But they were surprised at how many high-risk patients they weren’t testing.
OBERMEYER: It was like 1 to 2 percent of people.
In some cases, it was clear the doctors weren’t even considering heart attack in those patients.
OBERMEYER: Sometimes they don’t even do an E.C.G. or a troponin test. These most basic screening tests that if you have the slightest suspicion for heart attack you do. When we follow those patients over the month after they were not tested for heart attack, they had really bad outcomes. About 15 percent of them had a major adverse cardiac event. About a third of those were death. So these were some of the people that would’ve been included in our first study about people who die shortly after being sent home from the E.D.
JENA: So how much of this is a function of doctors either not seeing what we might think are clear patterns in the data versus doctors not even knowing the data in the first place?
OBERMEYER: In our particular study, the algorithm and the physician had access to exactly the same information up until the triage desk. There’s no information asymmetry. The doctor has the benefit of the whole emergency department visit to gain additional information over the algorithm. So the doctor should be doing better if it were just an information question.
Ziad and his co-author designed an algorithm that would not only find high- and low-risk heart attack patients, but that could also do something else:
OBERMEYER: We were able to try to get inside the head of the physician. We built these artificial intelligence models that predicted not what was gonna happen to the patient, but what the physician was gonna do, whom they were gonna test. And what we found was that doctors were actually very, very good at honing in on the risk factors that we all learn about in medical school. They tested older people more, they tested people with known risk factors more. And in fact, they were indistinguishable from the algorithm if they only looked at a small number of variables. The algorithm’s advantage came from the fact that the algorithm could pay attention to thousands and thousands of variables, whereas only about 20 or 30 can fit inside of our head. So the algorithm was able to aggregate small amounts of signal from lots and lots of variables in the electronic health record. And that turned out to be why the algorithm did better. Now, the other reason the algorithm did better is because even though doctors are pretty good, doctors are also vulnerable to all of the biases that we know from the psychology literature. So for example, doctors test patients who are stereotypical for heart attack much more than they should. Even though chest pain is a great signal of heart attack, doctors go even further than they should in interpreting that as a signal of risk. Same with age, same with male sex. So doctors are paying attention to genuinely useful risk information in these variables, but they’re going too far and they’re falling victim to these kinds of stereotypes that they shouldn’t, and that the algorithm doesn’t.
JENA: Do they miss people too by also paying too much attention to stereotypes?
OBERMEYER: Yeah, absolutely. The errors go both ways because there’s a fixed budget of tests, and so if you’re testing some of these people too much, you’re testing other people too little. It’s just that the people you’re testing too little don’t have a stereotype. There’s no way for doctors to really key in on that information, and I think that’s one of the reasons that using algorithms for this kind of task is so exciting.
JENA: If you were to think about doctors who are just higher testing versus lower testing, what would you expect to see in terms of outcomes from those doctors?
OBERMEYER: Yeah, it’s a great question and I think if you asked an economist what was going on here, what the economist would say is well, the low testing doctors are just testing the very high-risk people. And then the medium testing doctors are starting to test lower and lower risk people. And the really high testing doctors are going off the deep end. They’re dipping down into the very low risk part of the distribution. And that’s why we get this flat of the curve medicine, this diminishing returns problem of the low testing doctors are already testing the high risk patients and everything else is just unnecessary. What we found was actually the exact opposite of that, which is that if you compare the high-testing doctors to the low-testing doctors, the low testing doctors were testing the low risk patients less, so that’s good. The problem is that those frugal doctors are also being frugal with the high-risk patients. They’re testing everyone less, and the high testing doctors are testing everyone more.
JENA: So is one prediction of your work then, that you could take a bunch of the tests that were being performed in low-risk patients, and reallocate them to patients who are high risk, where the doctor was informed that this is a high-risk patient that you may have missed?
OBERMEYER: I think that’s the dream. In our preferred scenario we’d cut about two-thirds of tests that doctors are currently doing, but we’d also add back in about 10 to 15 percent of tests that doctors are not currently doing in these high risk patients that are untested. So overall, we’d cut testing by quite a bit if we were listening to the algorithm. But of the tests that we do, a pretty big fraction would actually be new tests. If you just went with the cost effectiveness of testing, you’d cut testing by quite a bit, but you’d really change the composition of who’s getting tested. And so it’s not overuse or underuse. It’s not, we should be doing more, we should be doing less. We should be doing the test on the right people and testing according to predicted risk.
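[Editor’s sketch] The reallocation Ziad describes, testing according to predicted risk rather than simply testing more or less, can be made concrete with a small Python example. It is a minimal sketch under stated assumptions: the `reallocate` helper, the 0.05 risk cutoff, and the eight patient records are hypothetical and invented for illustration; they are not figures from the paper.

```python
def reallocate(patients, threshold):
    """Compare doctors' testing decisions with a test-by-predicted-risk rule.

    patients: list of (predicted_risk, doctor_tested) pairs.
    Returns counts of tests the rule would cut, keep, and add.
    """
    cut = sum(1 for risk, tested in patients if tested and risk < threshold)
    kept = sum(1 for risk, tested in patients if tested and risk >= threshold)
    added = sum(1 for risk, tested in patients if not tested and risk >= threshold)
    return {"cut": cut, "kept": kept, "added": added}


# Made-up patients: (hypothetical predicted risk, did the doctor test them?)
patients = [
    (0.02, True),
    (0.03, True),
    (0.01, True),
    (0.40, True),
    (0.35, False),  # high predicted risk, but not tested
    (0.02, False),
    (0.01, False),
    (0.01, False),
]

result = reallocate(patients, threshold=0.05)  # cutoff is an assumption
tests_before = result["cut"] + result["kept"]
tests_after = result["kept"] + result["added"]
share_new = result["added"] / tests_after

print(result)
print(f"tests before: {tests_before}, tests after: {tests_after}")
print(f"share of remaining tests that are new: {share_new:.0%}")
```

In this toy data the total number of tests falls, yet half of the tests that remain go to a patient who was not being tested before, which is the change in composition the passage points to.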
JENA: If you were to go back to the question that we started with, which is, does this person need to be in the hospital? Yes or no? Do you think that the tools can exist to help doctors make that decision to figure out who are the patients who we don’t need to be hospitalizing and who are the patients who we do need to be hospitalizing that were clearly falling through the cracks?
OBERMEYER: I think there’s a lot of potential for algorithms to help with that particular decision, but I think where I would start is realizing that that particular decision does not have an algorithmic answer. As an internist, if I asked you to write down explicitly the list of reasons that someone needed to be in the hospital, that list would go on forever. But when you’re trying to get an algorithm to solve a problem you need to give an algorithm a super well-defined question, like, what is the probability that if I test this person for heart attack, they will be shown to have a heart attack by that test? That’s a great question for an algorithm. The question about hospitalization is a lot more fluid, and I think the temptation of a lot of people who build algorithms is to say, well it’s a lot of work to enumerate and to build one algorithm for each thing. Why don’t we just predict what the human would do? So instead of trying to get the algorithm to predict heart attack, let’s just get the algorithm to predict who are doctors currently admitting today? Let’s just admit those people and those are the ones who quote unquote, “need to be in the hospital” and the other ones don’t. The problem there is that you’ve substituted the algorithm learning from nature and from the patient and from whether they had a heart attack to learning from human judgment. The doctor’s judgment about whether this person needs to be in the hospital or not. And that’s where a lot of problems get into algorithms because along with that human judgment, you’re also automating a lot of structural biases, like who can afford to be in the hospital and who can’t? Who has resources to help them with childcare when they’re in the hospital? All those things get bundled up when you build algorithms that learn from humans rather than learning from patients and from outcomes, and from nature. And that’s how a lot of racial bias gets into algorithms.
JENA: So let’s exclude people who we know who are at very high risk. I’m thinking about the 0.1 percent of people who were admitted to the E.D., but then discharged home from the E.D. The doctors felt that they did not need to be hospitalized. Is there anything that we could do to identify who those people were? Because clearly the doctors missed them. And there’s a separate question of whether or not by identifying them, we could improve their outcomes.
OBERMEYER: As a practicing doc, I would actually love to just have a little dashboard of things like, well, if you sent this patient home, what’s the probability that they die in the next week? If you tested this person for heart attack, what’s the probability that they have a heart attack? That is how I would probably design a system that’s helping human doctors make decisions is just processing a bunch of data in the background and giving them these little, very high value pieces of information one by one in a way that helps them make this very important decision. I think we’re very far from a world where algorithms can take those decisions themselves, and I don’t think we’ll ever be in that world, but I think there’s a lot of help that algorithms can give human decision makers when it comes to processing this vast amount of data. You could spend hours reading about any one of the patients that you see in the E.D., but you need to see 30 in an eight-hour period. Dealing with that information overload is a huge, huge promise of algorithms and processing them down into these little snippets of information that can help decision makers.
JENA: We’ve talked about what doctors can do. We’ve talked about what computers can do to assess doctors. We haven’t really talked much about what patients can do in any of these scenarios.
OBERMEYER: We’re seeing this incredible development of low-cost wearables, E.C.G.s you can do on your iPhone or on your watch. And I think a really appealing part of those technologies is that they almost cut the doctor and the medical system out of the loop. It’s not perfect right now but I think that our current decision making in medicine is based on seeing a patient in the office or in the emergency department and getting that very, very thin slice of their life and their health on which we have to make a bunch of really important decisions. As we start to accumulate more and more data from outside of the hospital and process it through algorithms, I think that’s the kind of thing that can really help make better decisions in the hospital or anywhere else, because we’re going to be getting a lot more data on patients so that we can help them by getting them to the right treatments and tests at the right time.
The idea of using smartwatches and similar devices to track patients before or after a hospital encounter reminds me of one of Ziad’s other studies. In 2019, he co-authored a paper that found when subtle vital sign abnormalities were detected in E.D. patients — for example, an ever so slightly elevated heart rate — they were significant predictors of poor outcomes, like death. Even small changes increased mortality risk. I suggested to Ziad that perhaps giving these types of patients a wearable device at discharge could begin to address this problem.
OBERMEYER: I think what you’re talking about is blurring the lines between the home and the hospital.
That is what I’m talking about. When I started my conversation with Ziad I had a very specific question in mind: who needs to be in the hospital to get hospital care? And, who could maybe be at home? But, as Ziad says, what if that line was blurred?
LEVINE: Many folks have said, if home hospital were a pill, everyone would want to take it.
What if your home was just like a hospital? That’s coming up next week on Freakonomics, M.D.
That’s it for today. I’d like to thank my guest this week, Dr. Ziad Obermeyer. And thanks to you, of course, for listening.
Here’s an idea to leave you with based on my discussion with Ziad. We talked about the front end of an important decision, whether to admit someone to the hospital. But once someone is hospitalized, doctors have a different problem: figuring out when it’s safe to go home. So, here’s my idea. What if we used the arrival of extreme weather events, like snowstorms, as a natural experiment to see what happens when we discharge people sooner than we otherwise would? When snowstorms are expected, like any week now in Boston, doctors may discharge patients home earlier, to get them out before the storm. Are those patients more likely to be re-hospitalized? If so, maybe we’re keeping patients in the hospital the right amount of time. But if readmissions don’t go up, maybe it means we could be safely discharging some people sooner. Think about it and in the meantime, let us know what you thought about the show. I’m at bapu@freakonomics.com. That’s B-A-P-U at freakonomics.com. Have a great week.
* * *
Freakonomics, M.D. is part of the Freakonomics Radio Network, which also includes Freakonomics Radio, No Stupid Questions, and People I (Mostly) Admire. All our shows are produced by Stitcher and Renbud Radio. You can find us on Twitter at @drbapupod. This episode was produced by Julie Kanfer and mixed by Eleanor Osborne, with help from Jasmin Klinger. Lyric Bowditch is our production associate. Our executive team is Neal Carruth, Gabriel Roth, and Stephen Dubner. Original music composed by Luis Guerra. If you like this show, or any other show in the Freakonomics Radio Network, please recommend it to your family and friends. That’s the best way to support the podcasts you love. As always, thanks for listening.
JENA: Is it your official medical advice that we should have economists staffing the emergency department?
OBERMEYER: I, um, I think I’ll plead the fifth on that one.
- Ziad Obermeyer, emergency medicine physician and professor of health policy and management at the University of California, Berkeley.
- “Diagnosing Physician Error: A Machine Learning Approach to Low-Value Health Care,” by Sendhil Mullainathan and Ziad Obermeyer (The Quarterly Journal of Economics, 2022).
- “Are Vital Sign Abnormalities Associated With Poor Outcomes After Emergency Department Discharge?” by C. Y. Chang, S. Abujaber, M. J. Pany, and Z. Obermeyer (Acute Medicine, 2019).
- “Early Death After Discharge From Emergency Departments: Analysis of National U.S. Insurance Claims Data,” by Ziad Obermeyer, Brent Cohn, Michael Wilson, Anupam B. Jena, and David M. Cutler (BMJ, 2017).
- “Who Gets a Heart Disease Test?” by Freakonomics, M.D. (2022).