No, torturing colicky infants by sticking them with acupuncture needles won’t calm them

So I was distracted yesterday from what I had intended to write about by an irresistible target provided me courtesy of Toby Cosgrove, MD, CEO of The Cleveland Clinic, who bemoaned all those nasty pro-science advocates who had had the temerity to link the antivaccine rant by the director of the Clinic’s Wellness Institute to the quackery practiced there, quackery whose affinity for antivaccine views Cosgrove appears to be oblivious to. So I took care of that target, and now I’m back to the topic I had wanted to apply some Insolence to. Yes, there was no way I was going to allow this pseudoscientific and utterly unethical study to pass uncommented on. No way. This study was nothing less than infant torture in the name of pseudoscience.

I’m referring to a study that started popping up on the news two days ago with headlines like this one, “Colic study stirs prickly debate on acupuncture.” (Gotta love those acupuncture puns and jokes.) Then there was this gem, “The Soothing Benefit of Acupuncture for Babies,” which was credulous to the point that I consider it embarrassing. Seriously, TIME and Amanda McMillan? You couldn’t even include a token skeptic viewpoint? Yours wasn’t false balance; it wasn’t even close to balance at all. You even ended the story with a quote from an “integrative medicine” advocate recommending that parents only take their babies to acupuncturists or traditional Chinese medicine practitioners who are “trained, licensed, and experienced.”

The study being referred to was published in a journal that falls under the BMJ group of journals, namely Acupuncture in Medicine (AIM). It’s a journal we’ve seen before, a journal that is a vast repository of pseudoscience or, as Steve Salzberg referred to it, a fake medical journal. The study that provoked these stories was published by Kajsa Landgren and Inger Hallström in the Department of Health Sciences, Lund University, Sweden, and is entitled “Effect of minimal acupuncture for infantile colic: a multicentre, three-armed, single-blind, randomised controlled trial (ACU-COL).” This study is the very epitome of quackademic medicine and, as Harriet Hall likes to call it, Tooth Fairy science.

First of all, there is no prior plausibility. The authors claim that there is:

Standardised acupuncture grounded in neurophysiology is a Western approach, while the hypothesis in Traditional Chinese Medicine (TCM) is that the effect of acupuncture depends on which points are used. While some argue that needling sensation (de qi) is necessary, others have demonstrated beneficial effects with minimal stimulation. There is a need for further research to compare standardised and individualised treatment, as well as to determine the optimal stimulation parameters and treatment intervals.

It is plausible that acupuncture may have positive effects in infantile colic as it is recognised to reduce pain, restore gastrointestinal function and have a calming effect. No serious side effects have been reported to date.

However, there really isn’t a plausible (or even semi-plausible) physiological mechanism that would lead a scientist to believe that sticking needles into infants could relieve colic, particularly in locations defined by ancient, prescientific, vitalistic beliefs, locations that have no anatomic structures associated with them that would be likely to be positively impacted by sticking needles into them. Moreover, the thought of sticking needles into babies makes my skin crawl. This is not vaccination, where there is a definite and demonstrable benefit from the jab, but rather quackery rooted in pseudoscience.

Before I get into the details of the study itself, I can’t help but point out that the study itself is highly unethical. All the documents that govern the ethics of human subjects in research, such as the Belmont Report, the Common Rule, and the Declaration of Helsinki, emphasize that children are considered a vulnerable population because they cannot provide consent and have to rely on their parents to look out for their best interests, and because of how fragile they can be. That means that the protections for them in human subjects research must be even more rigorous than they are for adults. Similarly, the Declaration of Helsinki requires that there be compelling preclinical evidence from basic science and animal studies to justify human subjects research. There is no such body of evidence supporting using acupuncture to treat infants for colic, or to treat anything, for that matter. Yet, here we have a study subjecting babies to needles being poked into them in the service of trying to “prove,” against all physiology, that ancient mystical vitalistic medicine “works.” Having known parents with children with colic, which is excessive crying, I have some idea how desperate such parents can be, so much so that they might be willing to have a stranger stick needles into their children if they’re told that this will stop the crying. I can’t help but wonder whether the investigators to some extent took advantage of that desperation to sign up patients.

On to the study itself. First, infantile colic is not a minor problem. Basically, when a baby cries more than three hours a day for more than three days a week, it’s considered colic, and colic is the cause of 10-20% of early pediatric visits and usually peaks at around six weeks of age. Its natural history is to get better, which makes it a perfect condition for quacks because almost anything they do to intervene will appear to cause improvement in any given infant.

This study was basically designed as a multicenter, three-arm, randomized, single-blind clinical trial:

Alongside usual care at their regular CHCs, recruited infants visited a study CHC [child health center] twice a week for 2 weeks. Infants were randomly allocated to one of three groups: group A received standardised MA at LI4; group B received semi-standardised individualised acupuncture inspired by TCM; and group C received no acupuncture. The CHC nurse and the parents were blinded to group allocation.

LI4 is an acupuncture point located here:

[Image: LI4 acupuncture point]

So basically, in Group A, babies were being needled in their hand, roughly where the thumb and forefinger meet because supposedly it has something to do with the large intestine. But get a load of the indications for this point:

Aversion to Cold. Fever. Sneeze. Regulates transpiration. Analgesia. Spasms in the channel, stomach, intestines and uterus. Toothache, head, eyes, arm and ear. Trismus. Bi syndrome. Hemiplegia. Contraction of the fingers. Tinnitus. Deafness. Redness and swelling of the eyes. Blurred vision. Nosebleed. Congestion and runny nose. Aphtha. Tension on the lips. Amenorrhea. Promotes labour. Retention of dead fetus. Stiff neck. Cold. Flu. Rhinitis. Conjunctivitis. Stye. Sinusitis. Epistaxis. Trigeminal neuralgia. Facial paralysis. Anxiety.

Well, that’s clear.

Group B received “individualized” acupuncture, which means that the acupuncturist could basically do whatever he wanted. The authors describe it as follows:

Following a manual, the acupuncturists were able to choose one point, or any combination of Sifeng, LI4 and ST36, depending on the infant’s symptoms, as reported in the diary. A maximum of five insertions were allowed per treatment.

ST36 is on the lateral aspect of the leg just under the knee and is supposed to be used for gastric pain, vomiting, dysphagia, abdominal distention, borborygmus, diarrhea, indigestion, dysentery, constipation, abdominal pain, acute mastitis, emaciation due to general deficiency, palpitation, shortness of breath, poor appetite, lassitude, dizziness, insomnia, cough and asthma, pain in the knee joint, apoplexy, hemiplegia, beriberi, edema, depressive psychosis, and madness. In any case, whichever group they were in, infants receiving acupuncture received usual care and their parents usual advice plus treatment twice a week for two weeks, while infants in Group C received usual care and advice. Parents filled out questionnaires at four points.

I initially was tempted to criticize this study over the lack of a true “sham” acupuncture group, and technically the authors should have included one. However, I can’t help but note that, if there’s one thing acupuncture studies have shown us, it’s that it doesn’t matter where you stick the needles and it doesn’t even matter if you stick the needles in. Besides, there are so many other things to criticize it for. Be that as it may, as far as the subjects went, there were 426 infants screened, of which 157 were randomized and ultimately 147 started the intervention. There were 49 in each group, of which two dropped out in group A and one in group C. All infants were between two and eight weeks of age and had to be healthy with appropriate weight gain. Also, before inclusion they were required to have tried a diet excluding cow’s milk protein from breastfeeding mothers and/or appropriate formula for at least 5 days.

So what were the results? Let’s compare. First, here is what the authors say in the abstract:

The effect of the two types of acupuncture was similar and both were superior to gold standard care alone. Relative to baseline, there was a greater relative reduction in time spent crying and colicky crying by the second intervention week (p=0.050) and follow-up period (p=0.031), respectively, in infants receiving either type of acupuncture. More infants receiving acupuncture cried <3 hours/day, and thereby no longer fulfilled criteria for colic, in the first (p=0.040) and second (p=0.006) intervention weeks.

I could have predicted this result based just on the design of the study. First, I fully expected that both acupuncture groups would be superior to the control group, because that’s how acupuncture studies always turn out because it really doesn’t matter where you stick the needles. Next, notice the p-values. Now, I realize that p-values are not the be-all and end-all of clinical research, but these are some truly sad p-values, barely statistically significant, other than the last one.

However, these results are even wimpier than that if you look at the full dataset presented in Tables 2, 3, and 4 (which you can do if you like because this is an open access paper). Notice how many comparisons there are. Notice how the authors use either the Kruskal-Wallis H test, Mann-Whitney U test, or Fisher’s exact test. The Kruskal-Wallis H test is a test that is used for more than two groups, while the Mann-Whitney U test is only used for two groups, an alternative to the t-test when the data is not normally distributed. Basically, the authors break the crying down into crying, fussing, colicky crying, and total crying and then measure percent decrease from baseline. They then use the Kruskal-Wallis H test for the three groups, A, B, and C. They also do the same analysis combining groups A and B (acupuncture groups) and comparing to group C. This latter move raised an eyebrow because it appears to have been done post hoc. In fairness, the authors did this because the trial stopped “when acupuncture became available without randomisation, before such time as a sufficient number of infants had been recruited to test the original hypothesis,” but that doesn’t make it less dicey.

In any case, there are a lot of comparisons, and only a handful of “statistically significant” p-values. David Colquhoun and Edzard Ernst were not impressed, either. I disagree with Colquhoun that “no correction has been used for multiple comparisons” (after all, the Kruskal-Wallis H test is basically an ANOVA test for non-normal distributions), but I do agree that the control for multiple comparisons was inadequate. The authors might have controlled adequately for multiple comparisons for each individual time point for each measure (e.g., crying or fussing), but they didn’t control for the multiple comparisons over the four time points for all the variables measured. I also agree that the multiple comparisons with so few significant results (for example, 24 p-values reported in Table 2, with only three just below 0.05) raise numerous red flags and almost certainly mean little or nothing. I also agree with Colquhoun’s other assessments, particularly the likelihood of Type 1 error (false positives).
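To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes the 24 p-values in Table 2 behave like independent tests, which they certainly don't (crying, fussing, and colicky crying are correlated), so treat the output as a ballpark and not a reanalysis. The function names are mine, and Bonferroni is simply the easiest correction to show, not one the authors used. I also don't know that every p-value quoted in the abstract comes from Table 2, so holding them to the same threshold is only a rough yardstick.

```python
# Illustrative sketch only. ASSUMPTION: the 24 p-values in Table 2 are
# treated as independent tests, which overstates things because the
# crying measures are correlated with one another.
ALPHA = 0.05
N_TESTS = 24  # p-values reported in Table 2, per the text above


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test cutoff that keeps the familywise error rate at or below alpha."""
    return alpha / n_tests


expected_false_positives = ALPHA * N_TESTS
fwer = familywise_error_rate(ALPHA, N_TESTS)
threshold = bonferroni_threshold(ALPHA, N_TESTS)

# p-values quoted in the abstract (used here only as a yardstick)
abstract_p_values = [0.050, 0.031, 0.040, 0.006]
survivors = [p for p in abstract_p_values if p < threshold]

print(f"Expected false positives if nothing is going on: {expected_false_positives:.1f}")
print(f"Chance of at least one false positive: {fwer:.0%}")
print(f"Bonferroni per-test threshold: {threshold:.4f}")
print(f"Abstract p-values below that threshold: {survivors}")
```

Under those assumptions, you would expect about one "significant" result out of 24 from noise alone, and even the best p-value in the abstract (0.006) doesn't clear the corrected cutoff.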

But what about blinding? This is a single-blind test, in which the parents were supposed to be blinded to the treatment group but the acupuncturists knew the allocation. However, in the experimental design, the acupuncturists never interacted with the parents. A nurse would take the baby from the parents to the acupuncturists, who did their thing (or not in the case of the control group) in a back room. This is actually not bad. The authors also assessed blinding:

On four occasions the parents answered the question ‘Do you think your infant received acupuncture?’ with ‘yes’, ‘no’ or ‘don’t know’. In group C, 21–29% of the parents guessed correctly (no acupuncture) on all four occasions, while the rest (71–79%) either thought that their infant had received acupuncture or didn’t know (table 5). The percentage of parents in merged group A+B who correctly guessed that their infant had received acupuncture increased from 45% to 72% between the first and last occasions, while 28–55% were incorrect or didn’t know.

By random chance alone, you’d expect a 50-50 mix of parents thinking their children had acupuncture and thinking their children didn’t have acupuncture. In the control group, only 21–29% of parents correctly guessed that their children didn’t receive acupuncture, while the rest either thought their infants had been needled or didn’t know. In the acupuncture groups, however, the proportion of parents guessing correctly climbed from 45% to 72% over the course of the study, way more than 50% by the end. This alone suggests to me a problem with blinding.
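For the curious, here is a quick sketch of how unlikely a 72% correct-guess rate would be if parents in the acupuncture groups really were just flipping a coin. Several things here are my assumptions, not the paper's numbers: I take 96 responding parents in the merged A+B group (49 + 49 minus the two dropouts in group A), I ignore the "don't know" option, and I take the post's 50-50 premise at face value. The real counts are in the paper's Table 5.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# ASSUMPTION: 96 parents answered in the merged A+B group (49 + 49 - 2 dropouts).
N_PARENTS = 96
# From the quoted passage: 72% guessed correctly on the last occasion.
PCT_CORRECT = 72

# Convert the percentage into a whole number of parents (69.12 rounds to 69).
n_correct = round(N_PARENTS * PCT_CORRECT / 100)

# Probability of at least that many correct guesses under pure coin-flipping.
tail = binomial_upper_tail(n_correct, N_PARENTS)

print(f"{n_correct} of {N_PARENTS} correct; P(at least that many by chance) = {tail:.2g}")
```

Under those assumptions the probability comes out far below 0.001, which is what you'd expect if parents were picking up on cues rather than guessing blind.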

Finally, Ernst noted:

There was a (presumably random) imbalance in breastfeeding between the acupuncture group (62%) and the control group (49%) (Table 1). As the first paragraph implies that diet is an important risk factor for colic it is surprising that this imbalance is not discussed more in the paper or controlled for in the analysis.

This alone probably wasn’t enough to explain the results, but when you add it to all the other problems with the study, the reality is that this is almost certainly a negative study. The handful of “statistically significant” results represent nothing more than noise, particularly given that the study size is pretty small. Let’s just put it this way. An editor at AIM basically admitted that this is a negative study because its primary outcome prespecified in the protocol showed no difference.

So what we have here is a negative study that doesn’t provide anything more than the weakest suggestion of any effect on colic due to acupuncture. Unfortunately, that hasn’t stopped the acupuncture believers from being all over the press to point to the results as evidence that it is acceptable to stick needles into babies for any reason other than vaccination. Unfortunately, as the authors note, there are several of these studies already published that were “equivocal” and, taken together, are “conflicting.” These are, when examined critically, actually negative studies. Acupuncture studies in children are child abuse, and this infant torture needs to stop.