Complementary and alternative medicine: The New York Times and the elephant in the room

When I first started blogging, I liked to refer to myself as a booster of evidence-based medicine (EBM). These days, I’m not nearly as likely to refer to myself this way. It’s not because I’ve become a woo-meister of course. Even a cursory reading of this blog would show that that is most definitely not the case.

So what’s changed? Basically, I’ve come to the realization that EBM is an imperfect tool. Don’t get me wrong, EBM goes a long way towards systematizing how we approach clinical data, but there’s one huge flaw in it. (I can just see a quack somewhere quote-mining that sentence: “Orac says EBM has a huge flaw!”) That flaw is that it devalues basic science. In any hierarchy of evidence in the commonly used EBM systems, at the very top, as they should be, are randomized, double-blind studies. Such studies control for the most potential confounding variables and try to rigorously isolate the difference between experimental groups to just the study drug or treatment. Thus, level 1a evidence is evidence from multiple randomized controlled trials with homogeneity of the trials. From there, the strength of studies falls by study type all the way down to the least powerful forms of evidence, such as case series. At the very bottom is the following:

Expert opinion without explicit critical appraisal, or based on physiology, bench research or “first principles”

In other words, EBM devalues basic science.

So what, you ask? Why is that a problem? After all, why should basic science matter? Here’s why. In the absence of a basic scientific basis to think that a treatment is at least plausible on a physiological basis, all sorts of mischief happens. That mischief takes the form of “complementary and alternative medicine,” much of which has zero plausibility from a scientific viewpoint. I’m not just talking about mild improbability, either. I’m talking about relying on first principles that flout everything we know about physics and chemistry, so much so that, for them to be true, major, well-established theories would have to be overthrown. Think homeopathy, for instance. For homeopathy to be true, not only would our entire understanding of the chemistry and physics of water have to be seriously in error, but so would our understanding of how cells respond to chemical compounds. Think reiki. For reiki to be true, not only would there have to be a life “energy” (which, by the way, no scientist has yet been able to detect or characterize), but “healers” would have to be able to manipulate it. And not just manipulate it, either; they’d have to be able to make it do their bidding to heal. Or think acupuncture. For acupuncture’s mechanism to be true, much of what we know about anatomy and physiology would have to be wrong. We know, for instance, that there are no such things as meridians, the mythical pathways along the body through which that magical, mystical life energy known as qi is claimed to flow and into which needles need to be inserted in order to “unblock” or “redirect” that flow in order to heal. True, it’s marginally possible that acupuncture could “work” by another mechanism consistent with modern science (endorphin or opioid release, for example), but classical acupuncture is clearly pseudoscience. On the basis of enormous amounts of data gathered over literally centuries, we can say with confidence that all of the above CAM therapies are incredibly implausible, bordering on being downright impossible.

None of this matters in the EBM paradigm as formalized by the Cochrane Collaboration and others. First principles based on basic science, no matter how well supported that basic science is, fall under level 5 (the lowest level) evidence, and for purposes of EBM prior probability doesn’t matter. Even a poorly designed, badly carried out case series ranks higher on the scale of evidence than hundreds of years of science saying that homeopathy can’t work. Worse, as John Ioannidis has shown us, clinical trials are prone to producing “wrong” results far more often than the 5% of the time expected by random chance alone using the conventional 95% confidence interval, with the probability of such a result increasing with the implausibility of the treatment being tested. In other words, the lower the prior probability of a positive result based on scientific principles, the less likely it is that a positive result obtained is a “true positive” or that it can be explained by a real therapeutic effect. When the result is equivocal or only weakly “positive” (as virtually all of the few well designed studies of CAM modalities that report “positive” results are), it’s even less likely that the positive result is due to a real effect.
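For the numerically inclined, here is a rough sketch of that last point. It is not Ioannidis's full model (which also accounts for bias and multiple testing), just the textbook Bayes' theorem calculation of how often a "statistically significant" trial reflects a real effect. The function name, the 80% power, the 5% false positive rate, and all three prior probabilities are assumptions chosen purely for illustration.

```python
def positive_predictive_value(prior, power=0.8, alpha=0.05):
    """Chance that a 'positive' trial reflects a real effect.

    prior: assumed probability, before the trial, that the treatment works
    power: assumed chance the trial detects a real effect
    alpha: assumed chance of a false positive when there is no effect
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical priors, picked only to show the trend.
examples = [
    ("plausible treatment", 0.5),
    ("long shot", 0.1),
    ("wildly implausible", 0.001),
]

for label, prior in examples:
    ppv = positive_predictive_value(prior)
    print(f"{label:20s} prior={prior:<6} P(real | positive trial)={ppv:.3f}")
```

Under these made-up numbers, a positive trial of the plausible treatment is real about 94% of the time, the long shot about 64% of the time, and the wildly implausible treatment less than 2% of the time, even though all three trials cleared exactly the same statistical threshold.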

What this blind spot in the EBM paradigm leads to is the appearance of enough “positive” trials of even incredibly improbable CAM modalities such as homeopathy to make them look promising, when on the basis of prior probability alone it is not unreasonable from a scientific (not to mention ethical) viewpoint to reject these improbable remedies a priori while awaiting evidence that approaches being as compelling as the scientific evidence that says they can’t work. That’s a lot of evidence. It’s also why I’ve started referring more to “science-based” medicine rather than EBM, or “science- and evidence-based” medicine (SEBM). To me SEBM takes into account clinical trials, other forms of evidence, and, most of all, science in the form of estimating plausibility and prior probability. Rarely do I see anyone writing about such topics in literature directed at a lay audience. (Actually, come to think of it, I rarely see articles directed at professionals discussing such matters.)

That’s why I was pleased to see an article in the New York Times yesterday that does about the best job I’ve seen at actually discussing the issue of how to rank the “believability” of clinical trials. Have you ever heard of the Avalon effect? The article uses it to demonstrate how three large, well-designed randomized trials that failed to find that beta carotene protected against cancer were trumped in the public consciousness by lots of lesser-quality studies. As Frankie Avalon, while shilling for supplement manufacturers, put it:

There were laboratory studies showing how beta carotene would work. There were animal studies confirming that it was protective against cancer. There were observational studies showing that the more fruit and vegetables people ate, the lower their cancer risk. So convinced were some scientists that they themselves were taking beta carotene supplements.

Then came three large, rigorous clinical trials that randomly assigned people to take beta carotene pills or a placebo. And the beta carotene hypothesis crumbled. The trials concluded that not only did beta carotene fail to protect against cancer and heart disease, but it might increase the risk of developing cancer.

It was “the biggest disappointment of my career,” said one of the study researchers, Dr. Charles Hennekens, then at Brigham and Women’s Hospital.

But Frankie Avalon, a ’50s singer and actor turned supplement marketer, had another view. When the bad news was released, he appeared in an infomercial. On one side of him was a huge stack of papers. At his other side were a few lonely pages. What are you going to believe, he asked, all these studies saying beta carotene works or these saying it doesn’t?

That, of course, is the question about medical evidence. What are you going to believe, and why? Why should a few clinical trials trump dozens of studies involving laboratory tests, animal studies and observations of human populations?

Why indeed. Too bad the average lay person doesn’t understand that quality matters far more than quantity when it comes to deciding which clinical trials to believe. It is true that two or three well-designed studies do trump hundreds of weaker studies.

What follows is a fairly standard and well-described discourse on what makes a medical study convincing: comparing populations that are as similar as possible except for the study intervention, randomization, controlling for confounding variables, and as large a sample size as possible. What shocked me is that Bayes’ theorem was described next:

The third principle, Dr. Goodman says, “is often off the radar of even many scientists.” But it can be a deciding factor in whether a result can be believed. It’s a principle that comes from statistics, called Bayes’ theorem. As Dr. Goodman explains it,

“What is the strength of all the supporting evidence separate from the study at hand?”

A clinical trial that randomly assigns groups to an intervention, like beta carotene or a placebo, Dr. Goodman notes, “is typically at the top of a pyramid of research.” Large and definitive clinical trials can be hugely expensive and take years, so they usually are undertaken only after a large body of evidence indicates that a claim is plausible enough to be worth the investment. Supporting evidence can include laboratory studies indicating a biological reason for the effect, animal studies, observational studies of human populations and even other clinical trials.

That’s science-based medicine he’s talking about. Normally, the way in which treatments are developed and found to be effective in humans begins with a clinical observation or a scientific finding in the laboratory. It is then studied further using in vitro models, animal models, and all sorts of other forms of evidence. All these studies, known as “pre-clinical” studies, form the supporting basis that justifies small pilot studies in humans and ultimately larger randomized clinical trials. Dr. Goodman puts it very well when he says that the guiding principle in interpreting clinical trials is that “things that have a good reason to be true and that have good supporting evidence are likely to be true.”

The article then does something I’ve never seen in such an article before in a major newspaper. It gives an example:

To teach students the power of that reasoning, Dr. Goodman shows them a paper on outcomes of patients in an intensive care unit, with every mention of the intervention blacked out. The study showed that the intervention helped, but that the result was barely statistically significant, just beyond the threshold of chance.

He asks the students to raise their hands if they believe the result. Most indicate that they do. Then Dr. Goodman reveals that the intervention was prayer for the patient by others. Most of the hands go down.

The reason for the skepticism, Dr. Goodman says, is not that the students are enemies of religion. It is that there is no plausible scientific explanation of why prayer should have that effect. When no such explanation or evidence exists, the bar is higher. It takes more clinical trial evidence to make a result credible.

And that, my friends, is what science-based medicine is: EBM with science taken into account. It takes into account all the other preclinical evidence and evidence from other sources, such as basic science, that bear on the believability and plausibility of a therapy under study. That’s all. It really is that simple.

Nor is it being close-minded or rejecting out of hand the possibility that a therapy might work. It is simply weighing all the evidence, rather than pretending that any therapy under study is as likely to work as any other. It is using what we already know to decide where to set the bar for evidence. For a therapy that is highly plausible, the bar is relatively low: A couple of convincing randomized trials might be enough. For something as improbable as homeopathy, whose principles go against so much of what is known about chemistry and physics, the bar would be much, much higher. There would have to be multiple well-designed randomized clinical trials with very clear, compelling, and undeniable results to make it reasonable to start to conclude that there may be something wrong with our understanding of chemistry and physics rather than something wrong with the clinical trials. Of course, in the case of homeopathy there are no such trials. The “positive” ones are almost invariably small and/or poorly designed, and even the occasional randomized trial that appears “positive” generally demonstrates an “effect” that barely reaches statistical significance. Meanwhile, the better designed and more rigorous a clinical trial of homeopathy is, the less likely it is to show a “positive” result.
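To put a rough number on "where to set the bar," here is a back-of-the-envelope sketch using the odds form of Bayes' theorem: each positive trial multiplies the odds that a therapy works by a likelihood ratio of power divided by false positive rate. Everything in it is an assumption made for illustration: the function name, the 80% power, the 5% false positive rate, the 95% target, and especially the priors, which are not measured quantities. It also pretends that every trial is independent and free of bias, which real trials are not, so if anything it understates how high the bar really is.

```python
def positive_trials_needed(prior, target=0.95, power=0.8, alpha=0.05):
    """Count the positive trials needed to push belief past `target`.

    Assumes idealized, independent, unbiased trials, each of which
    multiplies the odds by the likelihood ratio power / alpha.
    """
    odds = prior / (1.0 - prior)             # prior odds that the therapy works
    target_odds = target / (1.0 - target)    # odds equivalent of the target
    likelihood_ratio = power / alpha         # evidence carried by one positive trial
    trials = 0
    while odds < target_odds:
        odds *= likelihood_ratio
        trials += 1
    return trials


# Hypothetical priors: a coin-flip plausible therapy versus a
# one-in-a-million long shot.
print("plausible therapy:", positive_trials_needed(0.5))
print("one-in-a-million therapy:", positive_trials_needed(1e-6))
```

With these assumed values, the plausible therapy needs two clean positive trials and the one-in-a-million therapy needs seven, with no negative trials mixed in to pull the odds back down.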

Of course, that’s the elephant in the room in discussions like these: How CAM research utterly ignores the issue of prior probability. True, Gina Kolata, the writer of the Times article, discusses prior probability and even interviews Dr. Goodman. In that, she goes further than virtually any other science or medical writer I’ve seen before in understanding how EBM should be applied. However, there’s still that damned elephant that can’t be avoided in CAM studies. It’s an obvious connection, but she doesn’t make it, and rarely does anyone else, it seems. In fact, we tend to pretend that it isn’t there. Sometimes we bump into it but pretend it’s something else. But it won’t go away, and it’s the reason that the vast majority of CAM research done under the auspices of the National Center for Complementary and Alternative Medicine and promoted by wealthy private foundations such as the Bravewell Collaborative is so often a huge waste of resources and an abuse of science.

Too bad someone didn’t tell another Times writer, William J. Broad, who clearly didn’t get the idea. On the very same day Kolata’s article appeared, he published an article called Applying Science to Alternative Medicine. Although there are some reasonable bits, there’s a lot of the double-talk used to justify wasting resources studying highly implausible CAM treatments:

Dr. Briggs said such investments would be likely to pay off in the future by documenting real benefits from at least some of the unorthodox treatments. “I believe that as the sensitivities of our measures improve, we’ll do a better job at detecting these modest but important effects” for disease prevention and healing, she said.

If the effects are so modest, why is it justified to spend so much money and, in the process, twist the very process of scientific medicine, I ask? On the other hand, tight funding may eventually bring some sense to the endeavor of studying CAM treatments by forcing a more rigorous form of triage:

An open question is how far the new wave will go. The high costs of good clinical trials, which can run to millions of dollars, means relatively few are done in the field of alternative therapies and relatively few of the extravagant claims are closely examined.

“In tight funding times, that’s going to get worse,” said Dr. Khalsa of Harvard, who is doing a clinical trial on whether yoga can fight insomnia. “It’s a big problem. These grants are still very hard to get and the emphasis is still on conventional medicine, on the magic pill or procedure that’s going to take away all these diseases.”

I hate that “magic pill” straw man. If there is a “magic pill” in scientific medicine, its effects are not magic; they’re documented by rigorous science and clinical trials. It’s CAM that is looking for magic in the form of that “magic supplement” or extravagant magical thinking in the form of modalities like homeopathy, craniosacral therapy, much of chiropractic, “detoxification,” and so many others. If there’s a silver lining that might come out of the current dire NIH funding situation, it’s that it might force some rigor in our thinking about CAM and in how we decide what CAM modalities to study.

A guy can dream, can’t he? Or, if you will, think magically.