Acupuncture for xerostomia: Spin, spin, spin a negative study!

I’m going to start this post by asking you to think a moment about spit. Saliva serves several functions, from being a first step in the digestive process, to adding moisture to food to aid swallowing, to lubricating the mouth and aiding in detecting the taste of food. You really begin to appreciate just how important saliva is when you see patients who no longer make it or make too little of it. This condition is known as xerostomia, or dry mouth resulting from reduced or absent saliva flow. Although xerostomia can be caused by medications and disease, one of the most common causes is cancer treatment, specifically radiation for head and neck cancer that encompasses the salivary glands. It turns out that the salivary glands are very sensitive to radiation, which can damage the glands, resulting first in a reduction in salivary output and increased viscosity of the saliva, and ultimately permanent cessation of gland function if the total dose is above a certain level.

The complications of xerostomia can range from minor to quite serious. Common problems include constant sore throat, difficulty speaking and swallowing, and hoarseness. Because xerostomia decreases the pH in the mouth, it contributes to dental caries, gum disease, and tooth decay. Ulcers can form in the mouth, and candida infections can set in. Treatment for xerostomia due to radiation damage to the salivary glands is not easy, either. Although there are drugs that can stimulate saliva production, they only work if some gland function remains. There are saliva substitutes, but their duration of action tends to be short, necessitating frequent use.

Because it’s such a common complication of the treatment of head and neck cancers, xerostomia is a major concern at cancer centers, where it can have a negative effect on the quality of life of cancer patients. As a result, a lot of research goes on at cancer centers looking for ways either to protect the salivary glands from the effects of radiation or to treat xerostomia after radiation therapy. This leads me to a study from M.D. Anderson that hit the news a week ago, thanks to a press release and some news stories.

From the press release:

After receiving acupuncture treatment three days a week during the course of radiation treatment, head and neck cancer patients experienced less dry mouth, according to study results from researchers at The University of Texas MD Anderson Cancer Center.

The trial results, published in JAMA Network Open, is the first randomized, placebo-controlled, Phase III trial to evaluate the use of acupuncture during radiation therapy to reduce the incidence and severity of radiation-induced xerostomia, or dry mouth. Acupuncture treatment has very few side effects and is relatively low cost compared to standard treatments such as medication and saliva substitutes. These results support a 2011 study that found symptoms improved up to six months after radiation treatment with concurrent acupuncture sessions.

“Dry mouth is a serious concern for head and neck cancer patients undergoing radiation therapy. The condition can affect up to 80% of patients by the end of radiation treatment,” said the study’s principal investigator, Lorenzo Cohen, Ph.D., professor of Palliative, Rehabilitation, and Integrative Medicine and director of the Integrative Medicine Program. “The symptoms severely impact quality of life and oral health, and current treatments have limited benefits.”

This passage from the press release caught my attention:

A secondary analysis showed a significant difference between sites in response to placebo. The Chinese patients had little to no placebo response to sham acupuncture whereas the MD Anderson patients had a large placebo response, showing both forms of acupuncture worked. More studies are needed to understand these site differences, but it has been suggested that it could be due to the environment in which the acupuncture is delivered, cultural influences or the relationship between patient and practitioner.

So I went to the original study, published in JAMA Network Open, and noticed that it was accompanied by a rather odd invited commentary by Matthias Karst and Changwei Li entitled “Acupuncture—A Question of Culture” that tries to explain the observation above. It was upon reading this invited commentary that I knew I had to write about the study, because it shows how acupuncture enthusiasts try to spin what was in essence a negative study and how acupuncture is one form of alternative medicine that all too many physicians seem far too open to.

A negative acupuncture trial for xerostomia

The acupuncture trial by Cohen’s group at U.T. M.D. Anderson Cancer Center is the result of a collaboration between multiple investigators at M.D. Anderson and the Department of Integrative Oncology at Fudan University’s Shanghai Cancer Center. The study is a two-center phase 3 randomized clinical trial of acupuncture in cancer patients with treatment-associated xerostomia. There were three groups: a standard care control (SCC) group, which was compared with a true acupuncture (TA) group and a sham acupuncture (SA) group. The subjects enrolled in the trial were patients with oropharyngeal or nasopharyngeal carcinoma who were undergoing radiation therapy in comprehensive cancer centers in the United States and China. There was one unusual thing about this trial, namely that it used a mixture of “sham acupuncture” methods as placebo controls:

Treatments were given by 6 qualified, hospital-credentialed acupuncturists with a mean (range) of 10 (5-25) years of experience. Quality control was maintained by having members of the study team, including acupuncturists, from MD Anderson visit Fudan, and vice versa, approximately every 6 months during the study period. Patients were treated in a comfortable supine or semisupine position on the day of radiation therapy, either before or after radiation therapy. Upon insertion, needles were manipulated until the de qi sensation was elicited at the appropriate points. They were not manipulated further during the needle retention period. The specific acupuncture points and needling methods used are reported in detail elsewhere12,15 and are provided in eTable 3 in Supplement 2. Patients were told that the purpose of the study was to test 2 different acupuncture approaches but that 1 approach might not target their dry mouth symptoms. This language was used to avoid deception while maintaining naiveté as to the existence of a sham group. The sham procedure in this randomized clinical trial involved a real needle at a real point not indicated for xerostomia, real needles at sham points, and placebo needles at sham points. The mixture of real and sham points and needles used is defined as SA. The Park system, a validated, nonpenetrating, telescoping needle with a separate device that attaches it to the skin, was used for the placebo needles.16,17

I also can’t help but quote from the supplemental material how specific acupuncture points were chosen, because it demonstrates better than anything else how much magical thinking goes into traditional Chinese medicine and acupuncture:

Although many point combinations could be used, the investigators attempted to identify a set of acupuncture points that integrated TCM and biomedicine, with a focus on using a minimal number of sites. The active acupuncture points for this study were selected on the basis of successful experience from our pilot studies as well as on previously published trials. Points were also selected on the basis of their indications according to the classical theory of TCM 1,2 and current understanding of the various anatomical locations and neurovascular tissues associated with each point.16

The body acupuncture points selected for this protocol were Ren 24, Lung 7 (LU 7), and Kidney 6 (K6). One placebo needle was placed at Gallbladder 32 (Gb32) on the right side. This was intended to provide participants in the active treatment group with a stimulus that did not elicit De Qi sensation. Ear points selected were Shenmen, Point Zero, Salivary Gland 2′ (SG 2-prime), and Larynx. Except for Ren 24, which is located in the midline and the placebo needle at Gb32, all points will be treated bilaterally. The only facial point used in this study was Ren 24. We chose not to use points on the face that were selected in some prior studies (i.e., Stomach 4-7). Even though problems are rare, we preferred to err on the side of precaution and avoid needling tissues that could still be friable or easily injured after radiation. Furthermore, Johnstone et al.4,12,13 obtained good results without using facial points. Ren 24 is located in the midline and did not pose a problem. Patients were excluded from the study, if there was any indication of skin irritation or infection at this location.

So many hoops to jump through to take a treatment based on a prescientific understanding of the human body. It’s also a treatment with considerable—shall we say?—reinvention throughout history, or even, as I would put it, retconning and, today, promotion by the Chinese government. Indeed, acupuncture as practiced now is not even ancient, but rather a practice that reached its current form roughly 90 years ago. Before that, it was a brutal practice, more akin to “ancient Western” medicine involving bloodletting. (Indeed, if you want to read what acupuncture was like over 100 years ago, read Harriet Hall’s review of a book by a Scottish surgeon Dugald Christie, who was stationed in China from 1883 to 1913. The descriptions feature children with needles left plunged deep in their bodies for days, including one who died.) Around 90 years ago a Chinese pediatrician named Cheng Dan’an (承淡安, 1899-1957) proposed that needling therapy be resurrected because (he thought then) its actions could be explained through neurology, which is why he also proposed moving the needling points away from blood vessels. He also replaced the previously used coarse needles with the fine filiform needles in use today in part because he wanted to do acupuncture on children and babies.

I will give the investigators credit for the elaborateness of their placebo control group and how they managed to deceive the patients (stating that two different acupuncture approaches were being tested and that one might not work) while claiming that they weren’t deceiving them. Here’s what I mean by “elaborateness”:

Non-penetrating needles with the Park device17,18 were placed at inactive points as follows: Sham Location 1 – placebo needle at inactive point located 0.5 cun below and 0.5 cun lateral to CV 24 on the chin (for participants with beards this point was omitted and indicated on treatment forms); Sham Location 2 – placebo needle at inactive point located 0.5 cun radial and 0.5 cun proximal to SJ 6 between SJ and LI Channels (bilateral upper extremities); Sham Location 3 – placebo needle at inactive point located 1.0 cun below and 0.5 cun lateral to St 36, between St and Gb Channels (bilateral lower extremities). In order to elicit De Qi in the control group, one 0.25 x 40mm acupuncture needle was used at Gb 32 above the right knee. This point is not indicated for dry mouth. Finally, four 0.16 x 15mm acupuncture needles on the helix of each ear (8 ear points) was included.

The sham treatment was given according to the same schedule as the true acupuncture treatment. A total of 14 points was used in both groups. As the active acupuncture group received treatment using a placebo needle and the placebo acupuncture group received active treatment at a real acupuncture point (both at Gb 32) and with acupuncture needles inserted at inactive points on the ear, the blinding of the two groups was maintained.

Patients were randomized to one of the three groups: TA, SA, or SCC. An adaptive randomization plan was used to ensure equal distribution of all factors across all groups and balance at each site. Patients were thus stratified by stage of disease, age (running mean), sex, the mean planned parotid dose (left and right calculated separately and balanced between groups, <10, 10 to <20, 20 to <26, 26 to <30, 30 to <35, 35 to <40, 40 to <50, 50 to <60, or ≥60 Gy), induction therapy (yes or no), and concurrent chemotherapy (yes or no). Patients assigned to either TA or SA received acupuncture 3 days per week (same day as radiation treatment) during a 6- to 7-week course of radiation therapy. Patients in the SCC group received standard care for xerostomia, including information about oral hygiene (brushing with fluoride toothpaste, flossing, and daily use of fluoride tray applications).

The study endpoints included scores on validated patient-reported Xerostomia Questionnaires (XQs). Sialometry data (measuring saliva output) were collected at baseline, at the end of radiation therapy (week 7), and 3, 6, and 12 months after the end of radiation therapy. The primary hypothesis being tested was whether TA is more effective than SA or SCC for reducing the severity and incidence of radiation-induced xerostomia among patients with cancer at MD Anderson and Fudan 1 year after the end of radiation therapy.

Overall, 680 patients were screened (300 patients at Fudan and 380 patients at MD Anderson), and 60 patients from Fudan and 221 patients from MD Anderson were excluded, either because they did not meet eligibility criteria (eg, having previously received acupuncture, not having intact parotids), were part of a different randomized clinical trial, were unwilling to consent, or could not accommodate the study schedule. The remaining 399 eligible participants consented to participate and were randomized. There was a total of 132 patients in the TA group, including 79 patients at Fudan and 53 patients at MD Anderson; 134 patients in the SA group, including 81 patients at Fudan and 53 patients at MD Anderson; and 133 patients in the SCC group, including 80 patients at Fudan and 53 patients at MD Anderson. Ultimately 60 patients dropped out at various stages during the one-year follow-up period, leaving 339 for the final analysis.
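As a quick sanity check on those enrollment figures, here is a minimal sketch in Python. The counts are copied straight from the paragraph above; the variable names are mine and purely illustrative, not anything from the paper.

```python
# Enrollment counts as reported in the text; all names here are illustrative.
screened = {"Fudan": 300, "MD Anderson": 380}
excluded = {"Fudan": 60, "MD Anderson": 221}

# Randomized patients per arm and site, as reported.
arms = {
    "TA": {"Fudan": 79, "MD Anderson": 53},
    "SA": {"Fudan": 81, "MD Anderson": 53},
    "SCC": {"Fudan": 80, "MD Anderson": 53},
}

# Patients left at each site after exclusions.
eligible_by_site = {site: screened[site] - excluded[site] for site in screened}

# Patients randomized at each site, summed over the three arms.
randomized_by_site = {
    site: sum(arm[site] for arm in arms.values()) for site in screened
}

total_randomized = sum(randomized_by_site.values())
dropouts = 60  # reported dropouts during the one-year follow-up
final_analysis = total_randomized - dropouts

print(eligible_by_site)
print(randomized_by_site)
print(total_randomized, final_analysis)
```

The per-site totals after exclusion match the per-site totals across the three arms, so the reported numbers are at least internally consistent.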

Now, before I present the final results, I’m going to pause and ask you to predict the result of this trial. Go on. Take a second. I bet you can if you’ve been reading about acupuncture studies on this blog. Here’s a large picture of a puppy that you’ll have to scroll past in order to read on. True fact: this is a puppy from a litter that my wife and I fostered about a year ago, along with their mother, who’s in the background of this picture:

Daphne doesn't have xerostomia

In fact, here’s another picture of puppies that we’ve fostered:

These puppies don't have xerostomia either.

OK, here we go.

For the primary aim, the adjusted XQ xerostomia score in the TA group was significantly lower than in the SCC group (26.8 versus 34.8, p=0.001) and marginally lower than in the SA group, but not statistically significantly different from it (26.8 versus 31.3, p=0.06). The SA and SCC groups were not statistically distinguishable (p=0.16). In other words, this is a negative study. I also note that the effect sizes reported were not particularly impressive, very much consistent with placebo effects. “True” acupuncture is indistinguishable from sham acupuncture. Did that stop these intrepid researchers? Of course not. So what did they do next?

This:

In secondary analysis, the acupuncture groups (TA and SA) were combined and compared with SCC, revealing significantly lower adjusted least square mean (SD) XQ scores for acupuncture (28.3 [18.7] vs 34.0 [18.9]; P = .008). All analyses were also conducted controlling for baseline XQ, age, sex, stage, treatment type (induction and concurrent chemotherapy), and radiation therapy dose, and the results remained the same. There were no group differences when comparing dose sparing to both sides, 1 side, or neither side.

And this:

Although sham-controlled clinical trials impart important information toward understanding putative mechanisms and a validated approach was used in this study, the choice of sham comparators in acupuncture trials is still highly debated. Thus, as other large, 3-arm acupuncture trials24 have demonstrated, the most relevant comparison is between TA and SCC. However, combining the 2 acupuncture groups also revealed significant differences vs SCC.

Now imagine, if you will, that a drug company, finding that its drug did no better than placebo on the primary endpoint of a clinical trial that included a no-treatment group for estimation of placebo effects, decided to combine the placebo and drug groups and do an analysis like this. It would be strongly criticized, and rightly so! Yet in clinical trials of acupuncture, I see this sort of intellectually dishonest analysis being done all the time. This is a negative trial, period. That doesn’t stop the authors from writing in the discussion:

This is the first phase 3 randomized clinical trial to evaluate the use of acupuncture to reduce the incidence and severity of RIX [radiation-induced xerostomia] in patients with head and neck cancer undergoing radiation therapy, to our knowledge. These results support previous findings from several smaller trials.10-13,15,22,23 As current methods for treating established RIX have shown little benefit, our findings indicate acupuncture may be a compelling adjunct to standard treatment for patients at risk of developing RIX, particularly since acupuncture has a low adverse effect profile and relatively low cost.

And:

On the basis of these findings, acupuncture may be considered an adjunct to standard care for patients who are interested in receiving acupuncture and at risk of developing RIX.

The acupuncture treatment was indistinguishable from sham treatment, but whatever.

There is a wrinkle, though.

Culture and the efficacy of acupuncture

The aforementioned wrinkle comes in the form of a post hoc analysis that the authors did. I know, I know, post hoc analyses are always to be suspected, but this one is particularly revealing. To drive the point home, here are Tables 2 and 3 from the paper, with Table 2 breaking out the results by institution for XQ scores at one year and Table 3 doing the same for the incidence of clinically significant xerostomia:

Xerostomia, symptoms
Xerostomia, clinically significant

Notice anything? The results for the combined group from both institutions are as described above. However, look at the results from M.D. Anderson! The study there was very negative, with TA not even close to statistically significantly different from SA for XQ scores or incidence of xerostomia. Now look at the Fudan Cancer Center results. There, the difference in XQ score between TA and SA is statistically significant (p=0.004), and there is no difference between SA and SCC (p=0.92), which is to say no apparent placebo effects. There’s also a huge difference in the incidence of xerostomia between TA (22.7%) and the SCC and SA groups (48.7% and 46.4%, respectively; p=0.003). How can we explain this? The authors certainly do some major handwaving to try:

This study had some limitations. Importantly, participants in China were treated as inpatients, whereas US participants were treated as outpatients. Owing to logistics at Fudan, the acupuncture sessions were delivered in a busy, loud, semiprivate clinical space. At MD Anderson, treatments were delivered in a quiet, private room with dimmed lighting. Although it is unclear how this may have affected the study results, it could have influenced the sham response.

And:

We did not monitor verbal interactions or document factors related to relaxation during the clinical encounter. This may explain some of the differences between centers and the lack of a placebo effect in Fudan.

And:

It is also possible that Chinese patients in this trial became unblinded, which could partially explain the significant group by institution effect. As there is greater cultural awareness of acupuncture in China, the patients may have noticed the use of more sham needle devices in the SA group, which could have changed their perception of the procedure.

Interestingly, although the authors reported the use of the Acupuncture Expectancy Scale during the treatment, there did not appear to be a questionnaire to evaluate adequacy of blinding, a major flaw in a study like this. The Acupuncture Expectancy Scale evaluates only the patient’s expectancy that acupuncture will have a beneficial effect.

Enter the accompanying editorial:

Findings in the study by Garcia et al3 support the idea that acupuncture exerts its effects not only or not mainly by needle site activity and specific neurophysiological mechanisms but also by expectations, conditioning, and suggestibility of clinicians and patients.5 The effects of these unspecific factors may be quite large. Together with many other 3-arm acupuncture trials in Western countries, results of the study by Garcia et al3 has disclosed what is referred to in the literature as the efficacy paradox,6 that is, even though TA and SA were similarly effective, the size of overall effect of any acupuncture was superior to standard therapy.

In a previous randomized, single-blind, placebo-controlled, multifactorial, mixed-methods clinical trial on chronic pain, the personality of individual practitioners (not the empathic behavior) and patient’s beliefs about treatment veracity independently had significant effects on outcomes.7 However, patients and acupuncturists are embedded in a larger cultural context in which acupuncture appears to support the therapeutic ritual of the patient in a unique way and plays a crucial role in the therapeutic outcome of the patient. In support of this, recent research has shown that these complex, ritual-induced biochemical and cellular changes in a patient’s brain are very similar to those induced by drugs.8

With these ideas in clinical acupuncture trials in mind, the cultural background should increasingly move to the center of attention. What was predicted in a small interview among patients with back pain came true: “In China, outcomes of active acupuncture will be still better than the outcomes of sham acupuncture.”9

The authors are quoting this paper in that last sentence.

The “efficacy paradox” is no paradox at all. It’s exactly what you would expect if acupuncture for xerostomia is pure placebo: both interventions result in some nonspecific symptom relief compared to no-intervention controls, and there is no actual effect from “true” acupuncture versus “sham” acupuncture. Both are theatrical placebos. So let me help the authors out here and propose a couple of far more likely explanations for the difference in results between subjects at M.D. Anderson and those at the Fudan site than increased cultural awareness of acupuncture in China resulting in more inadvertent unblinding, which is pure speculation, given that the investigators didn’t assess adequacy of blinding. I understand why they didn’t, given that they were trying to claim that they weren’t deceiving patients while still hiding the existence of a sham group in the study from the participants, but it’s still a flaw.

One potential explanation is that the acupuncturists in China inadvertently gave the game away, not because “cultural awareness” let their patients identify the “sham” acupuncture but because they strongly believed in acupuncture (they are, after all, acupuncturists) and their bias showed. This was, after all, a single-blind trial, not a double-blind one. It’s possible that acupuncturists at the Fudan site, knowing they were administering “sham” acupuncture, interacted with subjects in subtly different ways, producing different cues that patients could pick up on. We already know that placebo effects are highly dependent upon the interaction between practitioner and patient, so this explanation is hardly implausible.

There’s another explanation, though, one that the investigators won’t like hearing. All it takes is to read Prof. Edzard Ernst to know why. Basically, as he and others have reported multiple times, with rare exceptions, all acupuncture studies published by Chinese investigators are positive studies. Indeed, a 1998 meta-analysis of clinical trials concluded that for acupuncture “all trials originating in China, Japan, Hong Kong, and Taiwan were positive.” A more recent systematic review from 2014 looked at 840 randomized controlled clinical trials of acupuncture from China and found that 838 studies (99.8%) reported positive results for their primary outcomes and only two trials (0.2%) reported negative results. It’s also important to note that the authors of the 2014 review were Chinese investigators sympathetic to integrative medicine who published in The Journal of Alternative and Complementary Medicine.

The authors concluded that “publication bias might be major issue in RCTs on acupuncture published in Chinese journals reported, which is related to high risk of bias. We suggest that all trials should be prospectively registered in international trial registry in future.”
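To get a feel for how lopsided 838 positive trials out of 840 is, here is a rough back-of-the-envelope sketch in Python. It rests on assumptions that are mine, not the review’s: that the trials are independent, and that each one has the same chance of coming out positive. The 80% figure is simply a conventional target for statistical power, used here as a generous stand-in for “acupuncture really works and every trial was well designed.”

```python
from math import comb

def prob_at_least(k_min: int, n: int, p: float) -> float:
    """Binomial tail: chance of k_min or more positives in n independent
    trials, each positive with probability p (a simplifying assumption)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

n_trials, n_positive = 840, 838  # figures reported in the 2014 review

# Share of trials reporting positive primary outcomes, in percent.
share_positive = round(100 * n_positive / n_trials, 1)

# Hypothetical scenario: a truly effective treatment, every trial at 80% power.
p_generous = prob_at_least(n_positive, n_trials, 0.80)

# For comparison: the per-trial success rate it would take for the
# observed record to be unsurprising.
p_near_certain = prob_at_least(n_positive, n_trials, 0.998)

print(share_positive)
print(p_generous)
print(p_near_certain)
```

Under these assumptions, the first probability is vanishingly small, while the second is better than even. In other words, a record like this is only unremarkable if nearly every trial is all but guaranteed to be positive before it starts, which is not how honest clinical research on any treatment behaves.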

Prof. Ernst concludes:

The question why all Chinese acupuncture trials are positive has puzzled me since many years, and I have quizzed numerous Chinese colleagues why this might be so. The answer I received was uniformly that it would be very offensive for Chinese researchers to conceive a study that does not confirm the views held by their peers. In other words, acupuncture research in China is conducted to confirm the prior assumption that this treatment is effective. It seems obvious that this is an abuse of science which must cause confusion.

Whatever the reasons for the phenomenon, and we can only speculate about them, the fact has been independently confirmed several times and is now quite undeniable: acupuncture trials from China – and these constitute the majority of the evidence-base in this area – cannot be trusted. The only way to adequately deal with this problem that I can think of is to discard them outright.

It’s also possible that fraud was involved. As Prof. Ernst and Mark Crislip have discussed, a 2016 survey of clinical trials in China found that more than 80% of clinical data was fabricated:

The scandal came as no surprise to industry insiders, however.

“Clinical data fabrication was an open secret even before the inspection,” the paper quoted an unnamed hospital chief as saying.

Steve Novella also wrote about the issue on his own blog just last month, discussing concerns about the published research of Xuetao Cao, a Chinese immunologist, President of Nankai University, and chairman of research integrity in all Chinese research. In brief, a pattern of research fraud involving falsified and doctored figures was discovered by data integrity advocates in many papers spanning decades. Steve made the argument that this was clearly a systemic problem, and observed, after correctly noting that when “your head of research integrity is exposed for massive scientific fraud, you have a problem”:

There is also strong cultural pressure in China to prove that the core beliefs of their culture (traditional Chinese medicine and Qi) are real and powerful. This is ideological science. As powerful evidence for this effect is acupuncture research. Two reviews, in 1998 and again in 2014, found that 100% of acupuncture studies coming out of China were positive. This is statistically impossible, even if acupuncture worked (and it almost certainly doesn’t). This is also a bit much even for just publication bias. It could all be incredible researcher bias, but bias blurs imperceptibly into fraud. When you are fudging your research methods, and you know this isn’t pristine, how much of that is bias and how much fraud? In the end, it doesn’t really matter.

I can’t help but note that Cao’s career was first launched when he was 26 because of a paper he published in which he claimed to cure melanoma metastases in mice through energy healing with Qigong.

Whether the divergence in results between the M.D. Anderson site and the Fudan Cancer Center site was due to extreme bias leading the acupuncturists at the Fudan site to inadvertently give off cues that led patients to suspect they weren’t getting the “good acupuncture” or to outright fraud, I can’t know, although I can already sense any M.D. Anderson researcher who might come across this post starting to bristle with indignation and anger that I would even bring up the possibility of fraud among their Chinese collaborators. I understand the reaction, but it’s hard not to speculate about the possibility, given what we know based on the Chinese government’s own inspection and report. At the very least there is a lot of bias in favor of traditional Chinese medicine among Chinese researchers, as evidenced by the near 100% positive results reported from their clinical trials of acupuncture, and, even without outright fraud, this sort of bias can affect the reporting of results in a number of ways that are very hard to pin down even if you are looking for them.

What I can know is that this result, which seemed to surprise Dr. Cohen and his M.D. Anderson colleagues so much that they had a hard time explaining it, should not have come as a surprise at all. (Of course, one does wonder why they did the post hoc analysis to begin with; they must have suspected something was…odd…about the results from the Fudan site.) The divergence between the results reported out of M.D. Anderson and out of Fudan is entirely consistent with what we have known for over 20 years about the unreliability of clinical trials of acupuncture from China, a finding that has been replicated several times over the last two decades. Indeed, having briefly brought it up, for now I’m going to reject the possibility of fraud at the Fudan site, barring evidence. As Steve wrote, it almost doesn’t matter if it’s bias, fraud, or a combination of the two, and outright fraud isn’t necessary to explain the different results from the US and Chinese sites. The end result is the same. So, instead, let me conclude by asking: How could anyone doing a major multinational clinical trial of acupuncture have been unaware that more than 99% of all acupuncture clinical trials from China are positive and that, because of that, many “integrative medicine” researchers consider Chinese trials of acupuncture to be unreliable? How could the investigators at M.D. Anderson have thought that it would be a good idea to partner with an institution in China to do a clinical trial of a modality like acupuncture, for which the track record of such institutions is terribly biased?

In the end, there’s no need to wrap oneself into all manner of contortions to “explain” the results of this post hoc analysis as being due to various cultural differences that enhance the “sham acupuncture” effect in Westerners and/or tip off Asians that they’re not getting the “real acupuncture” when a much simpler explanation is more likely to be correct.