Bad Medicine, Part 2: (Drug) Trials and Tribulations
Our latest Freakonomics Radio episode is called “Bad Medicine, Part 2: (Drug) Trials and Tribulations.” (You can subscribe to the podcast at iTunes or elsewhere, get the RSS feed, or listen via the media player above.)
How do so many ineffective and even dangerous drugs make it to market? One reason is that clinical trials are often run on “dream patients” who aren’t representative of a larger population. On the other hand, sometimes the only thing worse than being excluded from a drug trial is being included.
Below is a transcript of the episode, modified for your reading pleasure. For more information on the people and ideas in the episode, see the links at the bottom of this post. And you’ll find credits for the music in the episode noted within the transcript.
* * *
In the mid-20th century, an exciting new drug hit the market.
TERESA WOODRUFF: It’s a small molecule that was produced in West Germany in the late 50s and early 60s.
It was a sedative, but not a barbiturate. So it wasn’t addictive, it didn’t clash with alcohol or other drugs and, according to its manufacturer, was entirely safe. They based this claim on the fact that no matter how much of it they fed the lab rats, the rats did not die. Once this new sleeping pill was made available, doctors discovered it did more than help people sleep.
WOODRUFF: It would combat for pregnant women morning sickness.
And so pregnant women all over the world were given the drug. It was called thalidomide.
WOODRUFF: The problem was the thalidomide would actually cross the placenta and impact the baby. And it would cause a whole series of malformations and probably a lot of fetal death.
Fetal deaths were thought to number at least 10,000. Among the babies who survived, there were serious birth defects:
WOODRUFF: Children that survived were deaf and blind, had a number of disabilities; they had shortened or lacked limbs.
Babies born with horribly malformed limbs, with missing or malfunctioning organs. Because of the putatively super-safe drug their mothers took to prevent morning sickness. Thalidomide was on the market for roughly five years before it was banned. Its German manufacturer, Chemie Grünenthal, first denied the disastrous side effects before ultimately accepting blame. The history of medicine is full of tragic missteps. But thalidomide, coming as it did during a boom in global mass media, made more noise than most:
NEWS CLIP: The problem of tighter controls to prevent the distribution of dangerous drugs such as thalidomide is a matter of concern to the president at his news conference …
NEWS CLIP: Concern over the tragic effects of the new sedative thalidomide prompts President Kennedy …
NEWS CLIP: Already more than 7000 children have been born with some or all of their arms and legs missing …
JOHN F. KENNEDY: Every doctor, every hospital, every nurse has been notified …
Although a few million thalidomide tablets had been distributed to doctors in the U.S. for trial use, it was never approved for sale here. That was thanks to a doctor at the Food and Drug Administration named Frances Oldham Kelsey. She didn’t believe the application from the American distributor offered complete and compelling evidence of the drug’s safety. President Kennedy later hailed Dr. Kelsey as a hero.
KENNEDY: The alert work of our Food and Drug Administration, in particular Frances Kelsey, prevented this particular drug from being distributed commercially in this country.
Even though the U.S. was an outlier in blocking thalidomide, the disaster had a number of lasting effects on American drug regulation. For one, the FDA established much more stringent rules for drug approval. It also rewrote the rules on what kind of people should be included in clinical trials.
WOODRUFF: Because of the effects on young women and on the fetus, it suggested that women shouldn’t be included in clinical trials because of the potential adverse events to the fetus.
Meaning: women were summarily excluded from early clinical trials for new drugs. On one level, this might make sense; it’s a protective impulse. But this impulse had a downside.
WOODRUFF: The study of women in general became part of the collateral damage of that pregnancy conversation. So there certainly are young women who are not pregnant who could be included in clinical trials and women in general could be included in clinical trials, to really understand some of the effects of drugs on their own health. And they were labeled as broadly vulnerable because of the potential to become pregnant. And I think that was part of a very rapid response to a very, very visible tragedy.
We see this all the time. Something terrible happens and we rush to introduce laws or regulations or just mores that respond to the terrible thing but, often, wind up overcorrecting. Think about the Three Mile Island nuclear-reactor accident in 1979. No one was killed, and the lasting health and environmental effects were negligible. But it was so frightening that it essentially killed off the nuclear-power expansion in the U.S. Even as other countries embraced nuclear as a relatively clean and safe way to make electricity — often using American technology, by the way — the U.S. retreated.
What’d we do instead? We burned more and more coal to make electricity. From an environmental and health perspective, coal is almost indisputably worse than nuclear. But that’s where the correction took us; that’s where the fear took us.
And the fear of another thalidomide took us to exclude most women from early-stage drug trials, and also to underrepresent women for a time in Phase 2 and 3 trials, even if the drug’s market included women. And, as you’ll hear today on Freakonomics Radio, that had some severe unintended consequences:
WOODRUFF: It’s just heartbreaking to know that so many women had to wake up in the morning and drive into the side of a mailbox because we didn’t have sex as one of the variables that we would study.
Also, when the only thing worse than being excluded from a medical trial was being included.
EVELYNN HAMMONDS: The use of vulnerable populations of African-Americans, people in prison, children in orphanages, vulnerable populations like these had been used for medical experimentation a fairly long time.
And: what happens when a new class of drugs comes to market with great clinical-trial results …
BEN GOLDACRE: But none of them have got evidence showing that they reduce your risk of heart attack or renal failure or any of the actual, real stuff that patients actually care about.
* * *
This is the second episode in a three-part series we’re calling “Bad Medicine.” It’s about the many ways in which the medical establishment — for all the obvious good they’ve done in the world — has also failed us. Last episode, we talked about how much we still don’t know, from a medical perspective, about the human body:
ANUPAM JENA: I would say maybe 30, 40 percent that we don’t know.
We talked about the fact that medicine hasn’t always been — and, often, still isn’t — as empirical as you might think.
VINAY PRASAD: You know, medical practice was based on bits and scraps of evidence, anecdotes, bias, preconceived notions, and probably a lot psychological traps.
We went over some of medicine’s greatest hits and its worst failures.
JEREMY GREENE: You take a sick person, slice open a vein, take a few pints of blood out of them …
On any list of medical failures, thalidomide is near the top. Although we should point out that long after it was found to have disastrous side effects on pregnant women, it’s had a productive renaissance. Thalidomide and its derivatives have been used to successfully treat leprosy, AIDS, and multiple myeloma. That said, its effect on pregnant women, as we heard, contributed to women being excluded from many drug trials. Thalidomide and another good-seeming drug that went bad, called DES.
WOODRUFF: Diethylstilbestrol, or DES, was manufactured in the early part of the 1900s.
That’s Teresa Woodruff, who’s been telling us the thalidomide story.
WOODRUFF: I am the Watkins professor of obstetrics and gynecology at Northwestern University.
WOODRUFF: It’s a word [oncofertility] that was coined only about 10 years ago. So what we did was to bring together both oncologists and fertility specialists. So many young people are surviving that initial diagnosis of cancer that we’ve really converted it over the last 20 years from a death sentence to a critical illness. Many of the young people will actually survive that initial diagnosis and live long lives. And so when they returned from that cancer experience many of them were sterilized by those same life-preserving treatments. And so we want to provide fertility options to both males and females. So we developed not only kind of the corridors of communication between oncology and fertility, but we also created new technologies that could provide new options for young women and for pediatric males and females.
So for Teresa Woodruff, as for many in the medical community, the future holds great promise. But so many decisions are informed by mistakes of the past. Like thalidomide and DES, which first became available in the 1930s.
WOODRUFF: So DES, it’s an estrogenic compound, was being prescribed to pregnant women to prevent miscarriage. Miscarriage was thought at the time medically to be caused at some level by low estrogen. And so supplying this estrogenic-like factor was thought to correct a really difficult problem.
DUBNER: Makes perfect sense. Was miscarriage in fact caused by an estrogen shortage?
WOODRUFF: It’s probably not. It’s multi-factorial. There may be some cases where low estrogen would have modest effect, but in general that’s not the case.
DES, as it turned out, wasn’t very effective in preventing miscarriage. Worse yet, it sometimes produced side effects that would become manifest only years later — in the offspring of women who’d taken DES. It affected boys and, especially, girls.
WOODRUFF: Well, the physicians just started reviewing the medical records of these young women who were now coming up with this very, very rare vaginal cancer. The onset of that disease is clearly estrogen-dependent and probably a very narrow window during pregnancy when estrogen would have that effect. You know, DES and thalidomide are both tragedies, but it wasn’t that the physicians were going out trying to create an adverse problem for women who are pregnant. But as you look back across medicine, across science, we’re always learning.
In 1977, because of the tragic consequences of DES and thalidomide, the FDA made a big change. It recommended excluding from early clinical trials all “premenopausal females capable of becoming pregnant” unless they had life-threatening diseases. Which meant that many of the drugs that later came to market had been tested only on male subjects. Which could cause some real trouble for women.
WOODRUFF: A great example of this is the drug Ambien, which was just the latest of the large number of drugs that had adverse events in females.
Ambien is a sleeping pill whose main ingredient is a drug called zolpidem. Americans love their sleeping pills — about 60 million prescriptions are written each year for roughly 9 million people. Some two-thirds of these medications contain zolpidem, which was approved by the FDA in 1992. But as it turned out, men and women metabolize the drug differently.
WOODRUFF: The drug maker actually have in the FDA filing the metabolism of this drug in males and females. And in fact knew that it cleared the circulation of males faster than it did females. But they only studied the efficacy on males, had no females in that efficacy study.
DUBNER: And when you say the clearance, it means how quickly the body is metabolizing, yes?
WOODRUFF: That’s right. How long that drug is available in the body.
DUBNER: Can you just explain that a little bit? Let’s say it’s a 150-pound male and a 150-pound female. I assume those will be different clearance rates. Can you explain why that is?
WOODRUFF: Right. It’s going to depend on individuals and so some drugs will go into the fat and will be available for longer, so how much fat exists and what kind of fat can take up some of the drugs. But probably the most important part of drug metabolism is the liver. And so males and females have different enzymes and different P450s that are on the liver. And so that can alter the way drugs get cleared. For example, women wake faster from sedation with anesthetics, but they recover much more slowly and have more reported pain events in hospital.
DUBNER: Talk for a few moments about the differences between females and males in medicine and/or medical science.
WOODRUFF: Well so I think one is hormones, and that’s what we often think about. Males have testosterone, and females have estrogen and progesterone. And so those hormones influence a lot of the biology of males and females in a very distinct and different way. But the fundamental way males and females differ is that every cell in a male’s body is XY and every female cell is XX. And the sex chromosomes actually also inform — just like the other chromosomes within the cell — the overall function of that particular cell. And so understanding how chromosomal sex informs the biology of kidney cells or of eye cells or of muscle cells is really important. In addition there are anatomical differences between males and females. So heart size might differ, and that’s relevant to cardiovascular disease. And then the environment, the microbiome. We now know from a variety of studies that there is a sex to the gut microbiome that inhabits all of us.
DUBNER: So I would think therefore if I want to be a doctor or medical researcher, or running the FDA, or anywhere up and down the ladder, I would like to think that for the past 100 if not for the past 1000 years, I’ve been very careful to consider any treatment and how different people would accept it differently based on their biology.
WOODRUFF: Right, I think that’s the surprise to everyone. That, in fact, sex has not been a fundamental part of the way we look at biological systems. And at some level this is just the way biology has always been done, and then science keeps building on what was done in the past. And I think that’s the really critical question. Are there real adverse events that occur when you only use one sex? And the answer is: of course, yes. Something like eight out of the last ten drugs pulled from market by the FDA were because of this profound sex difference.
In the case of Ambien, the FDA was getting complaints for years from users who were sleepwalking, even sleep driving.
WOODRUFF: It’s just heartbreaking to know that so many women had to wake up in the morning — and they still got up — but they went out and drove into the side of a mailbox, because we didn’t have sex as part of one of the variables that we would study.
Eight hours after taking an Ambien, 10 to 15 percent of women still had enough zolpidem in their system to impair daily function, compared to 3 percent of men. The FDA’s ultimate recommendation: women should take a smaller dose than men. The federal government had acknowledged for years the problem of excluding women from medical trials. In 1993, Congress required that women be included in all late-stage clinical trials funded by the National Institutes of Health unless it was a drug taken only by men. But what that didn’t do …
WOODRUFF: What that didn’t do was include males and females in the animal studies and the cell studies that are the precursor to all — it’s the engine to all of medicine.
Meaning: drugs that might be useful for women, but not for men, might not even get to the earliest stages of testing.
WOODRUFF: It wasn’t that they were thinking, “Well, let’s make it hard on women to have this drug down the line.” I think they were thinking of trying to do the most clean study they could imagine. And the study group that they imagined were the simplest were the males.
Males were considered “simple” because they don’t have menstrual cycles that change hormone levels; they don’t get pregnant; they don’t go through menopause. As one researcher puts it, studying only men “reduces variability, and makes it easier to detect the effect that you’re studying.” But ultimately, the exclusion of women was deemed inappropriate. In 2014, the NIH spent $10 million to include more women in studies, and in 2016, they decreed that all studies had to include sex as part of the equation:
WOODRUFF: And that date January 25, 2016 — to me there is a before and there is an after. And before that time, sex wasn’t a variable in the way time or temperature or dose has always been. And I think we’re going to see an enormous number of new discoveries simply because science now has an entirely new toolbox to work with.
So that’s progress. But, as we’ll hear later, drug companies still like to use very narrow populations for their drug trials — the better to prove efficacy, of course. So exclusion still exists. On the other hand, it wasn’t so long ago that exclusion from a certain kind of medical trial would have been a blessing.
HAMMONDS: The use of vulnerable populations — of African-Americans, people in prison, children in orphanages, vulnerable populations like these had been used for medical experimentation a fairly long time.
KEITH WAILOO: You see it in the era when the birth-control pill is being tested in Puerto Rico in the 1950s, and you see it in things like the Tuskegee syphilis study, which extended from the 30s into the 1970s.
“The Tuskegee Study of Untreated Syphilis in the Negro Male,” as it was called, is one of the most infamous cases in U.S. medical history. Its goal was …
WAILOO: … trying to understand the long-term effects of venereal disease as it developed through its various stages. And the study was being conducted on a group of really poor African-American men.
HAMMONDS: White government doctors working for the U.S. Public Health Service found approximately 400 African-American men presumed to all have syphilis.
WAILOO: The problems emerge after penicillin is discovered and more widely used. And the question that should’ve been asked is now that we have a series of effective treatments for venereal disease, ought we to continue a study of untreated syphilis, or ought we to provide treatment?
So even though a syphilis treatment became available, it was withheld from the men in the study. Put aside for a moment the short-term elements of this maneuver — the cruelty, the ethical failure. Consider the long-term implications: what happens when one segment of the population is so willfully exploited by the mainstream medical establishment? Well, that part of the population might develop a deep mistrust of said establishment.
A recent study by two economists found that the Tuskegee revelation seriously diminished African-Americans’ participation in the healthcare system. They were simply less willing to go to a doctor or a hospital. The result? A decrease in male African-American life expectancy of about 1.4 years. Which, at the time, accounted for roughly one-third of the life-expectancy gap between blacks and whites. Coming up on Freakonomics Radio: with such a fraught history of inclusion and exclusion in medical studies, who does end up in clinical trials?
GOLDACRE: When you look at the evidence, what you often find is that trials are often conducted in absolutely perfect dream patients.
Also: how good are the new drugs that typically make it to market?
PRASAD: I think if we’re honest with ourselves, we’ll have to admit that the majority of new cancer drugs offer sort of very small gains at tremendous prices.
And: what happens if you write about conflicts of interest among oncology researchers, and then you go to an oncology conference?
PRASAD: I always wear a bulletproof vest.
* * *
My name is Stephen Dubner. This is Freakonomics Radio, and this is the second of a three-part series we’re calling “Bad Medicine.” We don’t mean to be ungrateful for the many marvels that medicine has bestowed upon us; nor do we mean to pile on, or to point out the avalanche of obvious flaws and perverse incentives — but, well, it’s just so easy.
PRASAD: Doctors do something for decades and it’s widely done, it’s widely believed to be beneficial and then one day, a very seminal study contradicts that practice.
That’s Vinay Prasad. He’s an oncologist and an assistant professor of medicine at Oregon Health and Science University. He also co-authored a book about what are called medical reversals — when an established treatment is overturned. Which happens how often?
PRASAD: It’s widespread, and it’s resoundingly contradicted. It isn’t just that it had side effects we didn’t think about, it was that the benefits that we had postulated, turned out to be not true or not present.
How can it be that so many smart, motivated people — physicians and medical researchers — come up with so many treatments that go all the way through the approval process and then turn out to be ineffective, or even harmful? A lot of it simply comes down to the incentives.
PRASAD: So much of the research agenda, even the randomized-trial research agenda, is driven by the biopharmaceutical industry. And that’s not necessarily a bad thing. I think there’s many good things about that, that really drives many, many trials; it drives a lot of good products. It also drives a lot of marginal products, or products that don’t work. And the people who designed those trials are I think very clever. You can sort of tilt the playing field a little bit to favor your drug and the incentive to do so is often tremendous — billions of dollars hinge on one of these pivotal trials. And to some degree that’s because it’s a human pursuit. But to some degree we could have policy changes that could more align the medical research agenda with what really matters to patients and doctors.
DUBNER: Let me ask you: in your own field, in oncology and in the particular cancers that you treat, how much more effective generally would you say the new cancer drugs are, than the ones that they are replacing or augmenting?
PRASAD: Let me say that there are a few cancer drugs that have come out in the last two decades that are really wonderful drugs, great drugs. One drug came out of work here, in the Oregon Health and Science University, by Dr. Druker, Gleevec, and that’s a drug that transformed a condition where maybe 50 or 60 percent of people are alive at three years to one where people more or less have a normal life expectancy. So that’s a really wonderful drug. But I think if we’re honest with ourselves, we’ll have to admit that the majority of new cancer drugs are marginal, that they offer sort of very small gains at tremendous prices, and to give you an example of that, among 71 drugs approved for the solid cancers, the median improvement in overall survival or how long people lived was just 2.1 months. And those drugs routinely cost over $100,000 per year of treatment or course of treatment.
DUBNER: But that points to one of the tricks that works so well — which is if it’s 2.1 months extra, and if the expected lifespan was, just let’s pretend for a moment, it was six months, then on a percentage basis that’s a massive improvement. So as the patient or as the pharma representative, I’m not talking about that length of time — which might be lived under physical duress and financial duress — but rather I’m thinking about goodness gracious, 33 percent life expectancy extension!
PRASAD: Right, a new drug improves lifespan 33 percent longer.
DUBNER: And who doesn’t want that, especially when you’re sitting there with your loved one in a horrible situation, facing the end?
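The framing described in this exchange comes down to a single division. Here is a minimal sketch, assuming the hypothetical six-month baseline from the "let's pretend" example and the 2.1-month median gain quoted earlier; the function name `relative_gain` is invented for illustration.

```python
def relative_gain(baseline_months: float, extra_months: float) -> float:
    """Return the extra survival as a fraction of the baseline survival."""
    return extra_months / baseline_months


baseline = 6.0  # hypothetical expected survival in months (the "let's pretend" figure)
extra = 2.1     # median overall-survival gain in months quoted in the episode

# The same result, stated two ways.
print(f"absolute gain: {extra} months")
print(f"relative gain: {relative_gain(baseline, extra):.0%}")
```

Under those assumed numbers the relative figure works out to roughly 35 percent, close to the round "33 percent" used in the conversation, while the absolute figure stays at about two months.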
PRASAD: The other thing that I’d point out is, those 2.1 months, these clinical trials that are often conducted by the biopharmaceutical industry, they really choose sort of the healthiest patient, the people who are the fittest of the patients. On average, the age is almost 10 years younger in pivotal trials for the FDA drug approval than in the real world. And then when you start to extrapolate drugs that have real side effects in very carefully selected populations, and small benefits, in carefully selected populations, to the average patient that walks into my clinic, who is older, who has other problems, who is taking heart medicine. There was a paper that came out about one of those costly expensive drugs for liver cancer, and in the pivotal trial it had the benefit of about two, three months, something like that. But in the real world, in the Medicare data set, it had no improvement in survival over just giving somebody good nursing care and good supportive care. And I think that’s the reality for many of these marginal drugs, when you actually use them in the real world, they start to not work so well, and maybe not work at all.
DUBNER: You’ve written and spoken about cronyism and conflicts of interest between drug makers and the doctors who prescribe drugs. I’m curious what happens when you go to an oncology conference. Are you an unpopular person there?
PRASAD: Stephen, I always wear a bulletproof vest when I go. No, but this has really been sort of the way medicine has operated for many years. To some degree, practicing doctors in the community having ties to the drug makers — that’s one thing — but increasingly, we see that the leaders in the field, the ones who design the clinical trials, who write up the manuscripts, who write the review articles, who sort of guide everyone in how to practice in those fields, they have heavy financial ties to drug makers. And there’s a large body of evidence suggesting that biases the literature towards finding benefits, where benefits may not exist, towards more favorable cost-effective analyses when drugs are really not cost-effective. It’s a bias.
LISA BERO: Yes, well, we have a great deal of empirical data showing that funding sources and author financial conflicts of interest are associated with over-optimistic data.
That’s Lisa Bero. She’s a professor of medicine. She’s also co-chair of the Cochrane Collaboration, a global consortium of medical professionals and statisticians. Cochrane promotes evidence-based medicine by performing systematic reviews of medical research.
BERO: And in fact we have a Cochrane review on this very question. And this finding shows that if a drug study is funded by a pharmaceutical company whose drug is being examined, they’re much more likely to find that the drug is effective or safe.
How much more likely?
BERO: It’s about thirty percent.
Did you catch that? An industry-funded study is thirty percent more likely to find the drug is effective or safe than a study with non-industry funding.
BERO: And they’re likely to find this even if they control for other biases in the study. So by that what I mean, it could be a really well-done study, it could be randomized, it could be blinded, but if it’s industry-funded it’s still more likely to find that the drug works.
But if a study is well-done, how can the results be so skewed?
BERO: So it’s everything from, I mean, the question they’re actually asking, to how they frame the question, the comparators they use, how they design the study, how it’s conducted behind the scenes.
GOLDACRE: Trials are very often flawed by design in such a way that they are no longer the gold standard, no longer a fair test of which treatment is best.
That’s Ben Goldacre.
GOLDACRE: I’m an academic in Oxford working in evidence-based medicine, and I also write books about how people misuse statistics.
He’s also a doctor.
GOLDACRE: Yeah, that’s right. So I qualified in medicine in 2000, and I’ve been seeing patients on and off in the NHS for the past 15 years now.
One of Goldacre’s books is called Bad Pharma. He echoes what Vinay Prasad was telling us about the people who are chosen for clinical trials.
GOLDACRE: When you look at the evidence, what you’ll often find is that trials are often conducted in absolutely perfect dream patients. People who are, by definition, much more likely to get better quickly. Now that’s very useful for a company that are trying to make their treatment look like it’s effective. But actually, for my real-world treatment decisions, that kind of evidence can be really very uninformative.
Imagine you’re a doctor who’s treating a patient with asthma. Not hard at all to imagine:
GOLDACRE: Now asthma is obviously a very common condition, it’s about one in 12 adults.
With such strong demand for asthma treatment, there’s been a bountiful supply from drug makers, with dozens of clinical trials. A 2007 review of these studies looked at the characteristics of real-world asthma patients and how they compared to the people who’d been included in the trials.
GOLDACRE: They said, “OK, let’s have a look and see, on average, what proportion of those real-world asthma patients would have been eligible to participate in the randomized trials that are used to create the treatment guidelines which are then in turn used to make treatment decisions for those asthma patients.” And the answer was, overall, on average, six percent. So 94 percent of everyday real-world patients with asthma would have been completely ineligible to participate in the trials used to make decisions about those very patients.
Of course it isn’t only with asthma patients where this happens.
GOLDACRE: It’s very common for randomized trials of antidepressants, for example, to reject people if they drink alcohol. Now that sounds superficially sensible. But actually I can tell you as somebody who has prescribed antidepressants to patients in everyday clinical practice, it’s almost unheard of to have somebody who is depressed and warrants antidepressants who doesn’t also drink alcohol. So you need trials to be done in people who are like people who you actually treat.
If you look at the overall efficacy rate of most antidepressants, you’ll find it to be very, very low, if there’s any efficacy at all. And of course there’s the opportunity cost to consider:
GOLDACRE: Because you tend to prescribe one antidepressant at a time.
Which means while a patient is on one drug that may not be working, they can’t try another that might. Plus which, there are the side effects to consider. So a lot of drugs that look great on paper don’t do very well in the real world. Why? Part of it is what Ben Goldacre and Vinay Prasad were talking about — cherry-picking subjects for clinical trials. But Goldacre says there are plenty of other ways to manipulate trial numbers in the drug maker’s favor. What do you do, for instance, when research subjects quit a trial because of the treatment’s side effects?
GOLDACRE: What you see is people inappropriately using a statistical technique like Last-Observation-Carried-Forward to account for missing data from patients who dropped out of a study because of side effects.
“Last-Observation-Carried-Forward” is a statistical extrapolation — pretty much what it sounds like, and worth looking up if you’re interested in that kind of thing. You can see how an inappropriate use of such a technique would tilt things in the drug maker’s favor.
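For readers who do want to look it up, here is a minimal sketch of what Last-Observation-Carried-Forward does. The symptom scores are hypothetical, invented only for illustration, and the function name `locf` is an assumption rather than anything from a particular trial or statistics package.

```python
from typing import List, Optional


def locf(visits: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing visit with the most recent observed value.

    Gaps before the first observation stay as None, because there is
    nothing earlier to carry forward.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical symptom scores (lower is better) across four visits.
# The second patient drops out after visit 2; None marks the missed visits.
completer = [20.0, 16.0, 14.0, 13.0]
dropout = [20.0, 12.0, None, None]

print(locf(completer))  # [20.0, 16.0, 14.0, 13.0]
print(locf(dropout))    # [20.0, 12.0, 12.0, 12.0]
```

In this made-up case the patient who quit is recorded with a final score of 12, a value nobody measured at the last visit. If that patient left because of side effects, carrying the early reading forward treats it as the end result, which is one way the technique could flatter a drug when it is used inappropriately.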
There’s also the widespread use of what are called surrogate outcomes, as opposed to real-world outcomes. Consider many of the drugs recently approved by the FDA to treat diabetes.
GOLDACRE: All of those drugs have been approved onto the market with only evidence showing that they improve your blood sugar. But none of them have got evidence showing that they reduce your risk of heart attack or renal failure or eye problems or any of the actual, real stuff that patients with diabetes care about.
Since all of those outcomes would be hard to test for in a clinical trial — and by “hard,” what I really mean is time-consuming and expensive — instead the researchers go for the simple surrogate outcome of whether their pill lowers blood sugar.
GOLDACRE: But actually it’s not correlated as well as you might hope. And the history of medicine is absolutely littered with examples of where we have been given false reassurance by a treatment having a good impact on a surrogate outcome, a laboratory measure, and then discovering that actually it had completely the opposite effect on real-world outcomes.
As in the case of the infamous CAST trial that we covered in Part 1 of this series, in which the drug that suppressed aberrant heart rhythms actually worsened survival outcomes. Now, we should point out that Ben Goldacre — and everyone we’ve been speaking with for our “Bad Medicine” episodes — fully appreciates that medicine is science and that failure is part of science. The human body is an extremely complex organism, with lots to go wrong. Diagnosing and treating even a simple problem can be very difficult. It’s easy to take potshots from the sideline at good ideas that went bad; it’s even easier to criticize pharmaceutical companies who seem much more intent on making money than on making good medicines.
But, as Goldacre points out, those companies are simply responding to the incentives that are placed before them. Incentives that don’t necessarily encourage them to do the right thing. Goldacre points to a massive eight-year study, called the ALLHAT trial, in which academic researchers compared various drugs, from a number of drug makers, that were intended to lower blood pressure and cholesterol. Two of these drugs were made by the American pharmaceutical company Pfizer.
GOLDACRE: Pfizer came along and they said, “Look, we’ve got this fantastic new blood-pressure-lowering drug, and we’ve got various grounds for believing it’s going to be better than old-fashioned blood-pressure-lowering drugs, but at the moment all we can tell you is that it’s roughly as good at lowering blood pressure.”
So Pfizer asked the ALLHAT researchers to test whether their drug actually reduced the real-world outcomes that really matter: heart attack, stroke, and death.
GOLDACRE: So the researchers said what all academic researchers have said to drug companies since the dawn of time, which was, “Thank you very much, that sounds like a fabulous idea, that will be about $175M please.”
Actually, Goldacre misspoke: it was only $125 million, and Pfizer’s share was just $40 million. But still, $40 million!
GOLDACRE: And it’s so expensive simply because measuring real-world outcomes like that, especially before the era of electronic health records, was extraordinarily expensive.
So Pfizer pays in, and the trial begins.
GOLDACRE: It was timetabled to run for a very, very long time: many, many, many years. But it was stopped early because the Pfizer treatment, which was just as good at lowering blood pressure, was so much worse at preventing heart attack, stroke, and death that it was regarded as unethical to continue exposing patients to it.
The Pfizer drug we’re talking about was called Cardura. So where does Pfizer come out in all this?
GOLDACRE: It’s really important, I think, to recognize that Pfizer did nothing wrong here. Pfizer did exactly what we would hope all companies should do. They didn’t just say, “Oh, that’s fine, we’ve got some surrogate end-point data, we’ve got laboratory data showing it lowers blood pressure and that’s all we need.” Instead, they went out and they did the right thing. They exposed themselves to a fair test. They said, “We want to see if this treatment improves real-world outcomes that matter to patients” — heart attack, stroke and death — and they were unlucky. And it flopped.
The real problem, Goldacre says, is when drugs aren’t subjected to the real-world test.
GOLDACRE: The real bad guys here are the people who continue to accept weak surrogate end-point data. Like, for example, on the new diabetes drugs. It may well be that they lower the laboratory measure on a blood test but that doesn’t necessarily mean that they reduce your risk of heart attack, stroke, and death. And to find that out, we need to do proper randomized trials, which are admittedly longer and more expensive.
But there’s yet another problem. What happens if a proper randomized trial doesn’t show the efficacy a drug-maker was hoping to show? Well, there’s a good chance the world will never know about it, because of …
CHALMERS: Publication bias.
Iain Chalmers, a co-founder of the Cochrane Collaboration, is a major player in the evidence-based medicine movement.
CHALMERS: About half of the clinical trials that are done never see the light of day. They don’t get published. Isn’t that outrageous?
Which trials do get published?
CHALMERS: Trials that show results that are so-called “statistically significant” are more likely to get published than those that don’t have those results.
Now, you might think: Well yes, it makes sense to publish trials where a medicine seems to work; and if it doesn’t seem to work, why is that important to publish? Ben Goldacre again:
GOLDACRE: So if you cherry-pick the results, if you only publish or promote the results of trials which show your favored treatments in a good light, then you can exaggerate the apparent benefits of that treatment.
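Goldacre's point can be illustrated with a small simulation. The sketch below is a deliberate caricature: it assumes a drug with no true effect, draws each trial's estimated effect directly from a normal distribution, and "publishes" only the trials that come out statistically significant in the drug's favor. The numbers (`TRUE_EFFECT`, `STANDARD_ERROR`, `N_TRIALS`) are made up, and real publication bias is a matter of degree rather than an all-or-nothing rule.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_EFFECT = 0.0     # assumption: the drug truly does nothing
STANDARD_ERROR = 0.2  # assumption: sampling noise in each trial's estimate
N_TRIALS = 1000       # assumption: number of trials run

# Each trial reports an estimated effect = truth + sampling noise.
estimates = [random.gauss(TRUE_EFFECT, STANDARD_ERROR) for _ in range(N_TRIALS)]

# Caricature of publication bias: only trials that are "statistically
# significant" in the drug's favor (z > 1.96) ever get published.
published = [e for e in estimates if e / STANDARD_ERROR > 1.96]

all_mean = mean(estimates)
published_mean = mean(published)

print(f"trials run: {N_TRIALS}, trials published: {len(published)}")
print(f"average effect, all trials:     {all_mean:+.3f}")
print(f"average effect, published only: {published_mean:+.3f}")
```

Averaged over every trial, the estimated effect sits close to zero, as it should. A reader who sees only the published trials would conclude the drug works, because every published estimate had to clear the significance bar to get into print.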
As Chalmers tells us, there are all kinds of reasons why the results of an unsuccessful trial might not get published.
CHALMERS: It may threaten a commercial enterprise’s interests to publish a trial which is disappointing. It may be something which someone who has had a favorite hypothesis and been known for writing and speaking about it for years finds out that the first really good study to test the hypothesis doesn’t find any support for it. There’s laziness.
GOLDACRE: And that’s the real scandal here, is that you are allowed to legally withhold the results of these trials, and so people do. The results of trials are routinely and legally withheld, from doctors, researchers, and patients. The people who need this information the most. That is a systematic, structural failure.
The structure, that is, imposed by government regulators, by funding sources, by the markets themselves. All of which can be very hard to change. So: how does Ben Goldacre see the situation improving?
GOLDACRE: We set up something called the AllTrials Campaign a couple of years ago. And the AllTrials Campaign is a global campaign to try and stop this problem from happening. So we’re asking companies, research institutes, academic and medical professional bodies, patient groups, and all of the rest to sign up and to say all trials should be registered, so you post on a publicly accessible register the fact that you’ve started a trial. Because that means we know which trials are actually happening, so we can see if some of them aren’t being published.
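The logic of registration is simple enough to sketch: once every trial is on a public list at the moment it starts, finding the missing ones becomes a set difference. The trial IDs below are hypothetical placeholders, not identifiers from any real registry.

```python
# Hypothetical IDs recorded on a public register when each trial started.
registered = {"TRIAL-001", "TRIAL-002", "TRIAL-003", "TRIAL-004"}

# Hypothetical IDs of trials whose results later appeared in print.
published = {"TRIAL-001", "TRIAL-003"}

# Trials we know were started but whose results never surfaced.
unpublished = sorted(registered - published)
share_missing = len(unpublished) / len(registered)

print("registered but never published:", unpublished)
print(f"share of registered trials with no results: {share_missing:.0%}")
```

Without the register, the two unpublished trials in this example would be invisible: there would be no list to compare the published results against.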
The AllTrials Campaign also urges the publication of what’s called a clinical study report:
GOLDACRE: And a clinical study report is a very long, very detailed document — hundreds, sometimes thousands of pages long, that describes in great detail the design of the study and the results of the study. And that’s really important because often a trial can be flawed by design in a way that is sufficiently technical that it is glossed over in the brief report that you get in an academic journal article about a trial. And those flaws can only be seen in the full-length clinical study report.
There’s also a growing momentum to curb conflicts of interest in medical research.
BERO: Well, I think we’ve already had great improvements in transparency and what’s really pushed the disclosure of funding sources has been the journals.
That’s Lisa Bero again, from the Cochrane Collaboration.
BERO: So if you publish something, you are required to disclose the funding source. And you know this is still not 100 percent enforced but it’s getting pretty close.
On the other hand, Bero says, a given researcher or investigator may have undisclosed biases or conflicts of interest.
BERO: And one sort of loophole is that the investigator themselves has to decide if something is relevant to the particular study. So they may say, “Well, I just don’t think it’s relevant.”
There’s another quirk in the medical industry that probably doesn’t serve the public good:
BERO: Drug companies evaluate their own products, whereas in software you usually get somebody external to check the quality of your project. Engineers get people to do the earthquake checks for them, who are independent from the people who built the bridge. So it’s a very odd system that we have, where the companies that have an interest in the product, or stand to gain financially from it, are the ones testing it themselves. So I think we need to change that.
And finally, there are the doctors themselves — the endpoint in this complicated, conflicted infrastructure that’s meant to deliver better medicine. Ben Goldacre — the gadfly physician who knows so much about bad pharma and bad medicine — acknowledges the entire system is due for reform.
GOLDACRE: And it’s a structural failure that persists because of inaction by regulators, by policy makers, by doctors and researchers, as much as because of industry, and none of us can let ourselves off the hook.
So next time on Freakonomics Radio, in our third and final episode of “Bad Medicine” — what’s a doctor to do?
WAILOO: I see the opioid story as part of the recurring sense of hope and despair associated with these drugs that are supposed to solve problems, but they end up being problems in themselves.
What to do about the troubling finding that more experienced doctors have worse outcomes than young doctors:
DUBNER: So I would think that you are a downright danger to your patients. How is it that you’re not?
JENA: Haha! No comment.
And finally — yes, finally — lots of reasons to be optimistic, at least cautiously so, about the future of medicine:
WOODRUFF: Where science and medicine are going in the future is toward more and more precision medicine, so that we can get closer to an autonomous and individualized diagnosis.
That’s next time, on Freakonomics Radio.
* * *
Freakonomics Radio is produced by WNYC Studios and Dubner Productions. Today’s episode was produced by Stephanie Tam, with help from Arwa Gunja. Our staff includes Shelley Lewis, Jay Cowit, Merritt Jacob, Christopher Werth, Greg Rosalsky, Alison Hockenberry, Emma Morgenstern, Harry Huggins and Brian Gutierrez. Special thanks to Andy Lanset at WNYC Archives. If you want more Freakonomics Radio, you can also find us on Twitter and Facebook and don’t forget to subscribe to this podcast on iTunes or wherever else you get your free, weekly podcasts.
Here’s where you can learn more about the people and ideas in this episode:
- Teresa Woodruff, professor of obstetrics and gynecology and director of Women’s Health Research Institute at Northwestern University
- Evelynn Hammonds, professor of the history of science and African-American studies at Harvard University
- Keith Wailoo, health policy historian at Princeton University
- Vinay Prasad, assistant professor of medicine at Oregon Health & Science University
- Ben Goldacre, physician and senior clinical research fellow at University of Oxford
- Lisa Bero, pharmacologist and co-chair of the Cochrane Collaboration
- Sir Iain Chalmers, co-founder of the Cochrane Collaboration
- Office of the Commissioner. “Women’s Health Research – Regulations, Guidance, and Reports Related to Women’s Health.”
- ORWH. “Sex as a Biological Variable | ORWH Director’s Corner.” https://orwh.od.nih.gov/about/director/messages/sex-biological-variable/
- “Tuskegee Study – Timeline – CDC – NCHHSTP.”
- Alsan, Marcella, and Marianne Wanamaker. “Tuskegee and the Health of Black Men.” Working Paper. National Bureau of Economic Research, June 2016.
- Prasad, Vinay. Ending Medical Reversal. Johns Hopkins University Press, 2015.
- Sanoff, Hanna K., YunKyung Chang, Jennifer L. Lund, Bert H. O’Neil, and Stacie B. Dusetzina. “Sorafenib Effectiveness in Advanced Hepatocellular Carcinoma.” The Oncologist, May 16, 2016, theoncologist.2015-0478. doi:10.1634/theoncologist.2015-0478
- Lundh, Andreas, Sergio Sismondo, Joel Lexchin, Octavian A Busuioc, and Lisa Bero. “Industry Sponsorship and Research Outcome.” In Cochrane Database of Systematic Reviews. John Wiley & Sons, Ltd, 2012.
- Goldacre, Ben. Bad Pharma: How Drug Companies Mislead Doctors and Harm Patients. Reprint edition. New York: Farrar, Straus and Giroux, 2013.
- Travers, Justin, Suzanne Marsh, Mathew Williams, Mark Weatherall, Brent Caldwell, Philippa Shirtcliffe, Sarah Aldington, and Richard Beasley. “External Validity of Randomised Controlled Trials in Asthma: To Whom Do the Results of the Trials Apply?” Thorax 62, no. 3 (March 2007): 219–23. doi:10.1136/thx.2006.066837.
- Turner, Erick H., Annette M. Matthews, Eftihia Linardatos, Robert A. Tell, and Robert Rosenthal. “Selective Publication of Antidepressant Trials and Its Influence on Apparent Efficacy.” New England Journal of Medicine 358, no. 3 (January 17, 2008): 252–60. doi:10.1056/NEJMsa065779.
- Yu, Tsung, Yea-Jen Hsu, Kevin M. Fain, Cynthia M. Boyd, Janet T. Holbrook, and Milo A. Puhan. “Use of Surrogate Outcomes in US FDA Drug Approvals, 2003–2012: A Survey.” BMJ Open 5, no. 11 (November 1, 2015): e007960. doi:10.1136/bmjopen-2015-007960.
- The ALLHAT Officers and Coordinators for the ALLHAT Collaborative Research Group. “Major Cardiovascular Events in Hypertensive Patients Randomized to Doxazosin vs Chlorthalidone: The Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT).” JAMA 283, no. 15 (April 19, 2000): 1967–75. doi:10.1001/jama.283.15.1967.
- Dolgin, Elie. “Publication Bias Continues despite Clinical-Trial Registration.” Nature News, September 11, 2009. doi:10.1038/news.2009.902.
- Hopewell, Sally, Kirsty Loudon, Mike J Clarke, Andrew D Oxman, and Kay Dickersin. “Publication Bias in Clinical Trials due to Statistical Significance or Direction of Trial Results.” In Cochrane Database of Systematic Reviews. John Wiley & Sons, Ltd, 2009.
- The Cochrane Collaboration
- AllTrials Campaign
- Retro Report. “The Shadow of Thalidomide.” The New York Times, September 23, 2013.
- Rudy Pusateri, “Hot Springy Bass”
- Jack Miele, “Otis Theme” (from Jack Miele)
- Little Invisibles, “Headrush”
- Blindfold, “Ambiharmina” (from Blindfold)
- Jetty Rae, “Take Me To The Mountain” (from Jetty Rae)
- Mike Barresi, “It’s All Good” (from All of Me)
- Kero One, “When the Sunshine Comes” (from When the Sunshine Comes)
- Jetty Rae, “Still Gotta Fight It” (from Jetty Rae)
- Debbie Miller, “It’s Been a Day” (from Measures and Weights)
- Beckah Shae, “Me First” (from Champions)