Scientific fraud and journal article retractions

A week ago, I took someone who has normally been a hero of mine, Brian Deer, to task for what I considered to be a seriously cheap shot at scientists based on no hard data, at least no hard data that he bothered to present. To make a long, Orac-ian magnum opus short, Deer advocated increased governmental regulation of science in the U.K. based apparently on anecdotes like that of Andrew Wakefield. Worse, rather than presenting even the limited data that exist regarding the prevalence of scientific fraud, he chose instead to devote too much of his limited word count to characterizing scientists as “screeching” and likening their “silence” on the issue to that of the Roman Catholic Church over pedophile priests. Not cool.

I hadn’t planned on writing about this again for a while, but then I saw a couple of posts and articles that might bear on the issue. First, there was an article in the Wall Street Journal entitled “Mistakes in Scientific Studies Surge.” In addition, Derek Lowe weighed in, as did Pharmalot. Let’s take a look at the WSJ article first:

Since 2001, while the number of papers published in research journals has risen 44%, the number retracted has leapt more than 15-fold, data compiled for The Wall Street Journal by Thomson Reuters reveal.

Just 22 retraction notices appeared in 2001, but 139 in 2006 and 339 last year. Through seven months of this year, there have been 210, according to Thomson Reuters Web of Science, an index of 11,600 peer-reviewed journals world-wide.

In a sign of the times, a blog called “Retraction Watch” has popped up to monitor the flow.

Science is based on trust, and most researchers accept findings published in peer-reviewed journals. The studies spur others to embark on related avenues of research, so if one paper is later found to be tainted, an entire edifice of work comes into doubt. Millions of dollars’ worth of private and government funding may go to waste, and, in the case of medical science, patients can be put at risk.

The WSJ article has several graphs and some analysis arguing that the rise in retractions is real and not just a consequence of the increase in papers published in scientific research journals. For instance, as noted above, the number of papers has increased less than 50% since 2001, but the number of retractions has climbed 15-fold. Data like these raise two questions:

  1. Is the incidence of scientific fraud truly increasing, or is this primarily due to better policing?
  2. Is the number of retracted papers a good gauge of the prevalence of scientific misconduct?

The first question is probably the easier of the two to answer in that we can at least look at numbers. The problem is that the answer to the first question depends a lot on the answer to the second, because papers can be retracted for a variety of reasons, most of which don’t have anything to do with fraud. For instance, there is this study, which found that 73% of retracted papers are retracted for error or for an undisclosed reason. A common story behind undisclosed reasons is that another investigator couldn’t reproduce the results, leading the original investigators to try to reproduce their results again, at which point they find they can’t. Whatever the reasons covering that 73%, however, the remainder, approximately 27%, were retracted for fraud. Interesting trends noted in this paper included that total retractions have increased sharply and that retractions specifically for fraud have also increased. This graph tells the story in that it indicates that the number of retractions per 100,000 published papers is rising. On the other hand, even though the number has increased several-fold, it’s hard not to note that over the last three or four years the rate has ranged from 30 to 35 retractions per 100,000 scientific papers. That’s 0.03% to 0.035% of scientific papers. Certainly, the sharp increase in retractions is a reason to be concerned, but even today the number is quite low.
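For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope sketch in Python. It uses only the figures quoted above (22 retraction notices in 2001, 339 last year, a 44% rise in papers published, and 30 to 35 retractions per 100,000 papers). The variable and function names are my own, and I'm assuming the 44% growth figure covers the same period as the retraction counts, which the WSJ excerpt doesn't spell out exactly.

```python
# Figures quoted from the WSJ / Thomson Reuters data above.
retractions_2001 = 22
retractions_last_year = 339
paper_growth = 1.44  # assumption: papers published rose 44% over the same span


def per_100k_to_percent(rate_per_100k: float) -> float:
    """Convert a rate per 100,000 papers into a percentage of papers."""
    return rate_per_100k * 100 / 100_000


# Raw increase in the number of retraction notices.
raw_fold = retractions_last_year / retractions_2001

# Increase after dividing out the growth in the number of papers published.
adjusted_fold = raw_fold / paper_growth

low = per_100k_to_percent(30)
high = per_100k_to_percent(35)

print(f"raw increase in retractions: {raw_fold:.1f}-fold")
print(f"increase per paper published: {adjusted_fold:.1f}-fold")
print(f"30 to 35 per 100,000 papers = {low:.3f}% to {high:.3f}%")
```

On these numbers, the raw increase is about 15.4-fold and the increase per paper published is still about 10.7-fold, so growth in the literature explains only a small part of the rise. At the same time, the absolute rate stays at a few hundredths of a percent.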

Also telling the story is a series of graphs in the WSJ article. Of note, the vast majority of retractions are in medicine, biology, and chemistry. One reason for this could be that these sciences tend to impact most on health, medicine, and materials science, meaning that the stakes tend to be higher. That’s not to say that the stakes aren’t high in other sciences, but in medicine and materials science, there is more than the incentive of fame and respect; there are real financial rewards added to the prestige, potential grants, and fame. As the WSJ article notes:

Why the backpedaling on more and more scientific research? Some scientific journals argue that the increase could indicate the journals have become better at detecting errors. They point to how software has made it easier to uncover plagiarism.

Others claim to find the cause in a more competitive landscape, both for the growing numbers of working scientific researchers who want to publish to advance their careers, and for research journals themselves.

“The stakes are so high,” said the Lancet’s editor, Richard Horton. “A single paper in Lancet and you get your chair and you get your money. It’s your passport to success.”

While this is true, I’m not sure whether it’s any more true today than it was 10 years ago. Science has always been competitive, and getting published in high-profile journals like The Lancet, Nature, Science, Cell, and the like has always had the potential to make an investigator’s career. Also, in the WSJ article itself, it looks as though the very top tier journals, such as Science, Nature, and the New England Journal of Medicine, haven’t had a significant increase in retractions. In fact, it looks as though the journals on the next tier down (still high-ranking journals but just not top tier) are the ones having the problem. On the other hand, over the last five years, in the U.S. at least, the funding situation has deteriorated to the point where it hasn’t been this difficult to obtain N.I.H. funding in 20 years. Indeed, it’s become two to three times harder to earn a grant than it was just five or six years ago, to the point where at the National Cancer Institute, the pay lines are around 7% right now. In such an environment, the pressure to get that high-impact publication might well be considerably higher than it was ten years ago.

There’s another factor to be considered as well: the proliferation of journals produces competition to publish the most groundbreaking science, science that will get the journal noticed and encourage the heavy hitters in the field to submit papers to it. Horton dismisses this as a likely explanation. (His survival as editor of The Lancet after his behavior in the wake of the Wakefield scandal still puzzles me, and his being cited as some sort of oracle about scientific fraud puzzles me even more, given his demonstrated rank incompetence in dealing with it.)

The Lancet’s Dr. Horton dismisses that notion. He says journals hit by fraud and error are becoming more conservative about publishing provocative research. But he also says journals and research institutions don’t have adequate systems in place to properly investigate misconduct.

The apparent rise in scientific fraud, said Dr. Horton “is a scar on the moral body of science.”

So what is going on? In summary, it is apparent that retractions are on the rise, and that they are on the rise for scientific fraud as well as error. The numbers appear to be quite clear on that. The question, however, is what this means. Is it because the incidence of scientific fraud is increasing, because the scientific community is getting better at catching scientific fraud, or a combination of both? Although I don’t have much hard data to back it up, I’d say that it’s probably both. Policing is definitely better, as there is now software that allows the detection of some of the lazier sorts of scientific misconduct, such as plagiarism or image manipulation, and even of data fabrication, for example by flagging non-random clustering in data that should be randomly distributed about a mean. However, it also wouldn’t surprise me if scientific misconduct in the form of research fraud is also on the rise, although this is a much harder conclusion to solidify because we don’t have a firm grasp of how common research misconduct was before; i.e., we don’t have a baseline to compare to. One reason is that surveillance for scientific fraud was much laxer, which means that the much lower retraction numbers from ten years ago and before might be due more to not bothering to look hard enough than to a real, lower rate of scientific misconduct. Even so, one thing that is clear is that, even given the rise in retractions, the overall number still appears to be very low compared to the hundreds of thousands of scientific papers published every year, again roughly 0.035%.
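To give a flavor of what a statistical check for fabricated data can look like, here is a toy sketch in Python. I'm not claiming this is what any journal's or institution's software actually does. It illustrates a different but related and simpler idea, terminal-digit analysis: in many kinds of measured data the last digit should be roughly uniform, so a strong preference for certain final digits is a red flag worth a closer look (not proof of anything). The function name `last_digit_chi2` and both sample data sets are made up for illustration; the threshold is the standard chi-square critical value for 9 degrees of freedom at the 5% level.

```python
from collections import Counter

# Standard chi-square critical value for 9 degrees of freedom, alpha = 0.05.
CHI2_CRITICAL_DF9_05 = 16.919


def last_digit_chi2(values):
    """Chi-square statistic comparing last-digit counts to a uniform spread.

    Hypothetical helper for illustration; expects a non-empty list of integers.
    """
    digits = [abs(int(v)) % 10 for v in values]
    counts = Counter(digits)
    expected = len(digits) / 10  # uniform: each digit equally likely
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


# Made-up data: last digits spread perfectly evenly across 0-9.
evenly_spread = list(range(100))

# Made-up data: every value ends in 0 or 5, as if "rounded" by hand.
rounded_by_hand = [10 * i for i in range(50)] + [10 * i + 5 for i in range(50)]

for label, data in [("evenly spread", evenly_spread), ("rounded by hand", rounded_by_hand)]:
    stat = last_digit_chi2(data)
    flagged = stat > CHI2_CRITICAL_DF9_05
    print(f"{label}: chi-square = {stat:.1f}, flagged = {flagged}")
```

A flag from a check like this only says the digits look unusual. Instruments that round, or data recorded to the nearest five, will trip it for perfectly innocent reasons, which is one reason detection software can support an investigation but can't replace one.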

The final question is whether the number of retracted papers is a good surrogate for the prevalence of scientific fraud. Certainly, at best it’s a very crude measure. Also, it tends to catch a specific kind of fraud, namely fraud that is occurring in science that’s interesting enough for other investigators to try to replicate it. Seemingly unimportant or uninteresting results are likely never to be found out because no one will bother to check them. Of course, interesting and important research tends to be where the most incentive is for fraud, given that it has the most potential to bring fame, glory, and, above all, grant money to the person committing it; so it could be argued that this is exactly the sort of fraud that we as scientists and physicians most want to catch.

Whatever the true prevalence of scientific fraud, perhaps the two most disturbing pieces of information are that (1) the time between publication and retraction appears to be increasing, and (2) more importantly, journals often fail to alert the naïve reader to a retraction. This latter observation is particularly disturbing; as this study reports, 31.8% of retracted papers are not noted as retracted in any way. Clearly, we as a scientific community need to do better. Fraud, once detected, needs to lead to real consequences, beginning with retraction of the involved papers, but most importantly it needs to be absolutely clear what papers have been retracted, so that an unwary investigator doesn’t inadvertently take a retracted paper as being a useful basis for further research.

Overall, it would appear that we are doing a better job at detecting fraud. It would also appear that the prevalence of fraud, at least as measured by retractions of journal articles, remains low, which makes me wonder about all this press about this somehow being a “crisis.” Unfortunately, that being said, there is also clearly considerable room for improvement, particularly in detecting scientific fraud.