If there’s one thing that lay people (and, indeed, many physicians) don’t understand about screening for cancer, it is that screening is anything but a simple matter. Intuitively, it seems that earlier detection should always be better, and it can be. However, as I explained in two lengthy posts last year, such is not always the case. Understanding why requires an understanding of cancer biology, specifically the extreme heterogeneity of tumor behavior and prognosis. This variability was well described in a study from about a month ago, in which it was observed that the doubling time of breast cancers of approximately the same size can range from 1.2 months to 6.3 years. I’ve said before how much it irritates me that cancer is so often taken to be just one disease when in fact it is many, particularly when purveyors of quackery use that misconception to suggest a “cure for cancer.” Indeed, based on its extreme variability in behavior, I could even argue that breast cancer is not even one disease, and there is evidence to support such a view.
One of the key questions when developing a screening program for any cancer is whether earlier detection actually improves prognosis and survival for that cancer. It’s not as easy as it might seem to design and carry out studies that demonstrate whether a given screening test can in fact produce this desired outcome. When that test is mammography, there is no doubt that tumors picked up on mammography have a better prognosis than those detected by symptoms (for example, feeling a lump). The question that has not been so clear is how much of this benefit is due to the screening itself and how much is due to other factors. Indeed, some critics of mammography even argue that there is no survival benefit at all from mammography and that the reported benefit is instead all due to lead time bias. A recent study out of the U.K. published in the British Journal of Cancer (1) seeks to answer this very question; i.e., whether detection of a cancer by mammographic screening confers an additional survival benefit over and above that caused by the downward shift in tumor stage due to earlier detection.
This particular study examined women between the ages of 50 and 70 in eastern England. Most likely the reason for this age range is that, unlike the case in the U.S., in most European countries mammographic screening does not begin before the age of 50. In any case, this study examined the records of 5,604 women diagnosed with invasive breast cancer between 1998 and 2003 and identified by the Eastern Cancer Registration and Information Centre (ECRIC). Using multivariate analysis, the investigators examined the effect of age and mammographic screening status (whether the tumor was detected by mammography or by symptoms), along with standard clinical parameters, such as tumor stage and the presence or absence of positive lymph nodes. They also examined the Nottingham prognostic index (NPI) of each tumor. This is an index based on the size of the primary tumor, the presence or absence of positive lymph nodes, and the grade of the tumor, which is a measure that pathologists make by looking at the tumor cells under the microscope and estimating how “bad” or undifferentiated they look. High grade is bad, and low grade is better. Similarly, the NPI is divided into five different prognostic groups, excellent (NPI<2.4), very good (2.4
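For readers who want to see how such an index is put together, here is a minimal sketch of the NPI as it is commonly published: 0.2 times the tumor size in centimeters, plus a nodal score from 1 to 3, plus the histological grade from 1 to 3. That formula and the nodal cutoffs (0 nodes, 1 to 3 nodes, 4 or more) are the standard textbook definition rather than something spelled out in this post, and the function names and example tumors are hypothetical.

```python
def nodal_score(positive_nodes: int) -> int:
    """Map the number of positive lymph nodes to the NPI nodal score (1-3)."""
    if positive_nodes < 0:
        raise ValueError("positive_nodes must be non-negative")
    if positive_nodes == 0:
        return 1
    if positive_nodes <= 3:
        return 2
    return 3


def nottingham_prognostic_index(size_cm: float, positive_nodes: int, grade: int) -> float:
    """NPI = 0.2 * tumour size (cm) + nodal score (1-3) + histological grade (1-3)."""
    if grade not in (1, 2, 3):
        raise ValueError("grade must be 1, 2, or 3")
    return 0.2 * size_cm + nodal_score(positive_nodes) + grade


# Hypothetical tumours for illustration only, not data from the study.
small_low_grade = nottingham_prognostic_index(size_cm=1.0, positive_nodes=0, grade=1)
large_high_grade = nottingham_prognostic_index(size_cm=3.0, positive_nodes=5, grade=3)
print(round(small_low_grade, 2), round(large_high_grade, 2))  # 2.2 6.6
```

A lower score means a better prognosis, which is why a small, node-negative, low-grade tumor lands at the favorable end of the scale.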
In the initial analysis, consistent with previous studies, there was a strong effect of screening on prognosis overall, and the effect was more pronounced in the worst NPI classes. Next, the investigators adjusted for NPI and other prognostic factors. When this was done, screening status remained an independent prognostic factor for survival. However, when the investigators quantified the relative contributions to this difference, they found that adjusting for size, nodal status, age, and NPI took away 72% of this effect, but left a real effect, with tumors detected by mammographic screening being only 79% as likely to result in death as tumors detected by symptoms. When the authors graphed survival as a function of NPI, they obtained this graph:
The natural first question on seeing this graph is: Why does detection by screening mammography apparently confer such a small advantage in survival? What this likely means is that most of the benefits of screening come from factors I’ve discussed before: stage migration and lead time bias. In lead time bias, screening detects the biological process at an earlier point in its evolution, leading to an apparent and artificial increase in survival time from diagnosis even if treatments of the disease in question have zero effect. In essence, the time between when the tumor was detected by screening and when it would have been detected by physical examination is added to the apparent survival. So, let’s say that screening allows a tumor to be detected two years before it would have become symptomatic. Even if treatment has no effect whatsoever on patient survival, such early detection would produce an apparent increase in survival of two years. That’s of course the “Cliff Notes” version; I’ve discussed the concept of lead time bias in great detail before. The other relevant concept is length bias. Put in its “Cliff Notes” version, length bias is a phenomenon in which slower growing cancers remain in a preclinical detectable phase for a longer period of time and thus are more likely to be detected by screening programs. What this means is that by their very nature screening programs tend to detect proportionally more slow-growing, good prognosis tumors. Most likely, therefore, the majority of the difference in prognosis between tumors detected by mammography versus tumors detected by symptoms comes down to lead time bias and length bias.
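The two-year example above can be written out as a few lines of arithmetic. This is only a sketch with a single hypothetical patient (the ages are made up for illustration), and it assumes, as the example does, that treatment changes nothing about when the patient dies.

```python
def survival_from_diagnosis(death_age: float, diagnosis_age: float) -> float:
    """Years lived after diagnosis, which is what survival statistics measure."""
    return death_age - diagnosis_age


# Hypothetical patient: all ages below are illustrative assumptions.
death_age = 65.0            # unchanged by screening, since treatment has no effect here
symptomatic_dx_age = 60.0   # age at which a lump would have been felt
lead_time = 2.0             # screening finds the tumour two years earlier
screen_dx_age = symptomatic_dx_age - lead_time

survival_symptomatic = survival_from_diagnosis(death_age, symptomatic_dx_age)
survival_screened = survival_from_diagnosis(death_age, screen_dx_age)
apparent_gain = survival_screened - survival_symptomatic

print(survival_symptomatic, survival_screened, apparent_gain)  # 5.0 7.0 2.0
```

The patient dies at the same age either way; only the starting point of the clock has moved, and the "gain" is exactly the lead time.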
Most, but clearly not all.
One thing that the data show is that there is definitely a small but very real survival benefit when a tumor is detected mammographically. Indeed, one point that the authors make is that whether a tumor was detected by mammography rather than by symptoms should perhaps be a prognostic factor taken into account in treatment decisions:
These data confirm the known survival advantage for patients with screen-detected cancers. They show that although most of this advantage is due to a shift in NPI, the mode of detection does impact on survival in patients with equivalent NPI scores. This residual survival benefit is small but significant, and is likely to be due to differences in tumour biology between screen-detected and symptomatic cancers. Current prognostication tools that do not include known biological markers may overestimate the benefit of systemic treatments in screen-detected cancers and lead to overtreatment of these patients. A prognostic tool combining clinical, pathological and biological factors might allow more accurate prognostication, and more appropriate systemic therapy, for all patients with breast cancer regardless of their mode of detection.
The point that the mode of detection of a breast cancer does indeed appear to have prognostic significance, aside from differences in NPI or differences solely due to lead time bias or length bias, was also driven home in an accompanying editorial by Dr. Berry from the M.D. Anderson Cancer Center (2), who further explains the issue of lead time and length bias:
Length bias is more important than lead-time bias, at least in breast cancer. But neither its importance nor the concept itself is easy to understand. ‘Length’ refers to the tumour’s presymptomatic period when the tumour is mammographically detectable. The length of this period is the tumour’s sojourn time. Sojourn time varies from one tumour to another. (There is an obvious relationship between lead time and sojourn time; lead time is shorter because it requires actually finding the tumour during the presymptomatic period.) Sojourn time is typically positive, but it is negative for tumours that become symptomatic without being detectable on a mammogram. Breast tumours are heterogeneous, even after accounting for stage and other known clinical and biological characteristics. Aggressive tumours have shorter sojourn times because they grow faster. Indolent tumours have longer sojourn times. Screening finds tumours in proportion to their sojourn times, and therefore longer times and slower growing tumours are preferentially selected. This is length bias. (There are many analogues: when you look in the sky and see a shooting star, it is more likely to be one with a longer arc; when you reach into a newly opened bag of potato chips and select one, it is more likely to be big.) A special case of length bias is overdiagnosis, when screening finds a tumour with a sojourn time so long that the tumour would not kill the woman even if it was never found.
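To make the proportionality in Dr. Berry’s description concrete, here is a minimal sketch of length bias. Everything in it is an assumption chosen for illustration: two hypothetical tumor types with sojourn times of one and four years, equal incidence, a single screen, and a mammogram that never misses a detectable tumor. It ignores tumors with negative sojourn times.

```python
import random

# Hypothetical tumour types: sojourn time in years and share of all new tumours.
SOJOURN_YEARS = {"aggressive": 1.0, "indolent": 4.0}
INCIDENCE_SHARE = {"aggressive": 0.5, "indolent": 0.5}


def expected_screen_detected_share(sojourn, incidence):
    """Share of screen-detected tumours by type when the chance of being
    caught by a screen is proportional to sojourn time."""
    weights = {kind: incidence[kind] * sojourn[kind] for kind in sojourn}
    total = sum(weights.values())
    return {kind: weight / total for kind, weight in weights.items()}


def simulate_single_screen(n_tumours=100_000, window_years=20.0, screen_time=10.0, seed=0):
    """Scatter tumour onsets uniformly in time, screen once, and count which
    tumours are inside their detectable presymptomatic period at that moment."""
    rng = random.Random(seed)
    kinds = list(SOJOURN_YEARS)
    probs = [INCIDENCE_SHARE[kind] for kind in kinds]
    detected = {kind: 0 for kind in kinds}
    for _ in range(n_tumours):
        kind = rng.choices(kinds, weights=probs)[0]
        onset = rng.uniform(0.0, window_years)
        if onset <= screen_time < onset + SOJOURN_YEARS[kind]:
            detected[kind] += 1
    total = sum(detected.values())
    return {kind: count / total for kind, count in detected.items()}


expected = expected_screen_detected_share(SOJOURN_YEARS, INCIDENCE_SHARE)
simulated = simulate_single_screen()
print(expected)
print(simulated)
```

Under these assumptions, indolent tumors make up half of all tumors but about four fifths of those found by the screen, which is the same effect as the bigger potato chip being the one you are more likely to grab.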
As Dr. Berry further explains, accounting for NPI partially removes lead time bias but does not remove length bias. Some tumors will naturally grow faster than others, and by their natures such tumors will have worse prognoses than tumors that grow slowly. One consequence of not taking mode of diagnosis into account could be overtreatment if tumors detected mammographically are indeed less aggressive than those detected by symptoms. My take on this issue is that there is a lot of biology that is not well understood underlying the differences in aggressiveness of different tumors. New technology that can look at the gene expression profile of tumors or profile the levels of large numbers of proteins will lead us to understand what gene “signatures” control aggressiveness and metastasis in tumors. However, as Dr. Berry explains, there is a paradox:
Therefore, although the authors are correct in worrying that screen-detected cancers may be overtreated, a greater concern is that some screen-detected cancers should never have been detected in the first place! The rub is that just as with treatment, we do not yet have a good understanding regarding which cancers we do not want to detect. Mammography is too crude a tool to make this distinction.
Indeed it is. The paradox of breast cancer screening is that there are indeed some tumors whose sojourn time is so long that they will never harm the patient, and it is these tumors that we tend to detect more with intense screening. The price of detecting tumors early and realizing the benefit of screening is that some tumors that would never progress far enough to cause harm will be detected and, because we do not have reliable tests to differentiate highly aggressive from indolent tumors, treated. Despite its proven ability to decrease mortality from breast cancer in women over 50, mammography does remain a pretty crude tool. The reason it persists is that it is inexpensive, at least compared to newer modalities. Unfortunately, the major problem that was not mentioned in either the article or the editorial is that newer, more sensitive modalities like MRI suffer from the same problem in spades, as I’ve discussed before. Indeed, because of the sensitivity of MRI, it is even less able to distinguish between tumors that need treatment and tumors that do not. That’s why I tend to believe that ever more sensitive detection modalities are not the answer. Rather, the development of better molecular diagnostic tests that more accurately distinguish between aggressive tumors and tumors that are unlikely ever to trouble the patient will be far more likely to improve the “signal-to-noise” ratio and decrease the unwanted phenomenon of overtreatment.
REFERENCES:
1. Wishart, G.C., Greenberg, D.C., Britton, P.D., Chou, P., Brown, C.H., Purushotham, A.D., Duffy, S.W. (2008). Screen-detected vs symptomatic breast cancer: is improved survival due to stage migration alone? British Journal of Cancer, 98(11), 1741-1744. DOI: 10.1038/sj.bjc.6604368
2. Berry, D.A. (2008). The screening mammography paradox: better when found, perhaps better not to find. British Journal of Cancer, 98(11), 1729-1730. DOI: 10.1038/sj.bjc.6604349