Abstract
There has been a tremendous evolution in our thinking about cancer since the 1880s. Breast cancer is a particularly good example to evaluate the progress that has been made and the new challenges that have arisen due to screening that inadvertently identifies indolent lesions. The degree to which overdiagnosis is a problem depends on the reservoir of indolent disease, the disease heterogeneity, and the fraction of the tumors that have aggressive biology. Cancers span the spectrum of biological behavior, and population-wide screening increases the detection of tumors that may not cause harm within the patient's lifetime or may never metastasize or result in death. Our approach to early detection will be vastly improved if we understand, address, and adjust to tumor heterogeneity. In this article, we use breast cancer as a case study to demonstrate how the approach to biological characterization, diagnostics, and therapeutics can inform our approach to screening, early detection, and prevention. Overdiagnosis can be mitigated by developing diagnostics to identify indolent disease, incorporating biology and risk assessment in screening strategies, changing the pathology rules for tumor classification, and refining the way we classify precancerous lesions. The more these patterns are seen across other cancers, the clearer it becomes that our approach should transcend organ of origin. This will be particularly helpful in advancing the field, both by changing our terminology for what is cancer and by helping us learn how best to mitigate the risk of the most aggressive cancers.
See all articles in this CEBP Focus section, “NCI Early Detection Research Network: Making Cancer Detection Possible.”
Introduction
The initial management of breast cancer emphasized a highly aggressive surgical approach epitomized by the Halsted radical mastectomy. This approach was championed as the most effective means of reducing breast cancer mortality. However, despite aggressive local control, patients continued to die from metastatic disease. This led to the understanding that breast cancer is a systemic disease. Bernard Fisher, a surgeon, was one of the first to promulgate the concept that surgery alone was not sufficient to cure women with breast cancer (1). Medical oncology subsequently became a standard component of breast cancer treatment. The assumption that all breast cancer was at risk for systemic spread led to treatment guidelines that encouraged chemotherapy for all tumors greater than 1 cm in size. Chemotherapy, however, represented a one-size-fits-all approach that delivered indiscriminate harsh toxicity in the pursuit of cancer eradication. Advances in our understanding of biology and the rise of molecular markers have led to paradigm shifts in treatment. We now assign treatment based on risk, the timing of risk, and the chance that a given therapy will affect outcome. Tumor heterogeneity should also affect how we screen and prevent disease. The parallel in prostate cancer, which we will briefly discuss at the end of this review, highlights the generalizability of these concepts. Multidisciplinary investigation helps create a better framework for understanding the broader concepts that arise in cancer treatment and screening.
Breast Tumor Biomarkers
The identification of tumor markers such as estrogen and progesterone receptors permitted the development of differential treatment algorithms based on tumor biology. The risk-reducing role of endocrine therapy was established through large multi-institutional clinical trials. Initially, cooperative groups like the National Surgical Adjuvant Breast and Bowel Project (NSABP) treated all women with endocrine therapy. However, treatments began to be tailored once it became clear that endocrine therapy only reduced risk of recurrence and death in hormone-positive tumors.
The discovery of the HER2-neu oncogene was another major advance in our ability to differentially treat cancers based upon their molecular profile. Tumors that overexpressed HER2-neu were found to be associated with worse prognosis in terms of local and metastatic recurrence and breast cancer–specific mortality. Targeting the overexpression of this protein with a monoclonal antibody led to longer progression-free and overall survival in the metastatic setting and then to a significant decrease in breast cancer mortality when applied in the early/adjuvant setting (2, 3).
Over time, the challenge has become identifying which patients to treat with which combinations of agents and how to reduce morbidity. In each of the disciplines (surgery, radiotherapy, and medical oncology), a targeted reduction in treatment based on biology has led to a reduction in morbidity without sacrificing recurrence-free survival. Partial and skin-sparing mastectomies have replaced radical mastectomies, and sentinel lymph node dissection has largely replaced axillary dissections (4–6). In trials, older women with early stage cancers treated with breast conservation did well with endocrine treatment regardless of whether they had radiation therapy, with no difference in survival or total mastectomy rates. This has led to de-escalation in the field of radiation oncology where shorter, more targeted, or no radiotherapy can be used to optimize outcomes and avoid overtreatment (7–9).
In the field of medical oncology, the ability to target therapy has been advanced with the development of multigene assays such as OncoType DX and MammaPrint. OncoType DX is a 21-gene recurrence score that provides predictive information in hormone receptor–positive breast cancer about the benefit chemotherapy adds to endocrine therapy with high scores predicting substantial benefit and low-to-intermediate scores predicting little added benefit (10, 11). MammaPrint, a 70-gene assay, provides an index that is prognostic for early recurrence in the absence of any systemic therapy (12).
Over time, we have come to understand that patients defined as low risk based upon molecular profile have late recurrence risk. Furthermore, within the low-risk population, there is a group of women with ultralow risk disease who do not carry the same kind of systemic risk. Again, biology transcends stage and dictates both the extent and the timing of risk as well as what interventions lower that risk. The MINDACT study demonstrated that women with molecular low risk scores, even for stage 2 and 3 tumors (up to 5 cm and with up to 3 positive nodes), do not benefit from chemotherapy (13). In an update presented at the American Society of Clinical Oncology in 2020, the estimated distant metastasis-free survival (DMFS) gain from chemotherapy in clinical-high, genomic-low risk patients was 2.6%, a gain that must be balanced against the harmful side effects of chemotherapy (14). This small benefit was almost exclusively seen in women in their 40s, potentially reflecting the impact of ovarian suppression, which was also suggested in the node-negative intermediate risk group in the TAILORx Trial; for postmenopausal women, the DMFS gain is low (11, 14). Neoadjuvant clinical trials such as I-SPY have also demonstrated that even in the setting of molecular high-risk disease, response to chemotherapy is not uniform; there is heterogeneity, and novel treatments differentially affect the outcomes of these tumors. Clinical trials that use a novel adaptive trial design incorporating patients' clinical and molecular subtype to further personalize treatment and identify innovative neoadjuvant regimens (15, 16) have helped us to better understand the underlying set of diseases that comprise “breast cancer” (17).
Breast Cancer Screening
As our breast cancer treatments have evolved, so has our method of cancer detection. In the 1980s, Ingvar Andersson started the first trial of mammographic screening in an attempt to reduce the incidence of late stage disease and find cancers at an early stage where they could be cured (18). This was based on the epidemiologic observation that cancers that presented at an earlier stage had a better outcome than those presenting at more advanced stages and the hypothesis that earlier detection would lead to reduced mortality. Multiple screening trials were conducted demonstrating that screening and early detection could reduce the relative risk of mortality from breast cancer by 20% to 30% with an absolute risk reduction of about 4% to 6% (19, 20). Based upon these results, most countries embarked upon national screening programs and instituted aggressive campaigns based on the concept that “Early Detection Saves Lives.”
These screening trials were first conducted when we assumed that cancer was one disease, and these studies predated our knowledge of biomarkers. Figure 1 shows the progress made with screening, treatment, and prevention across the various disciplines involved with the management of breast cancer. In the background is the change in incidence over the corresponding time period. Changes in the prevalence of risk factors such as alcohol consumption, later age of child-bearing, earlier onset of menses, and use of hormone therapy after menopause contribute to an increase in incidence as well; however, the most significant change, and what we will focus on, is screening. Over the course of several decades, we have learned that screening is a complicated story. With screening, there was a shift in the proportion of early stage cancers detected. Initially, the higher fraction of cancers detected at a lower stage was regarded as a success that would drive down mortality from breast cancer. However, the identification of more early stage cancers was not accompanied by a concomitant decline in the rates of stage 2 and 3 cancers as shown in Fig. 2A, which illustrates the rates of breast cancer by stage from 1983 to 1997. In fact, there has only been a modest decrease in stage 2B cancers, while stage 2A cancers remained relatively constant, stage 3 cancers increased slightly, and de novo metastatic disease has remained constant over time since screening was introduced. This has led us to understand that there are different types of breast cancer and that not all lead to metastatic spread and subsequent death.
Overdiagnosis
Overdiagnosis is defined as the detection of indolent disease that would otherwise not pose a risk to patients, and overtreatment exists when we approach these indolent lesions with the same therapeutic strategies used to manage aggressive cancers. The challenge is to define and, therefore, recognize three separate classes of tumor at the time of diagnosis: (i) indolent, (ii) slower growing but lethal, and (iii) rapidly growing and aggressive. Breast cancer mortality has decreased, in part due to screening and in part due to the development of adjuvant systemic therapies. Although the exact proportion is still debated, likely about one third of this decrease can be attributed to screening (21). The types of cancer that are most likely to benefit from screening are those with slow to moderate growth. Unfortunately, more aggressive cancers are faster growing and more likely to present between screening events as so-called interval cancers. Screening has not eliminated stage 2 and 3 cancers. In fact, in a trial of women with stage 2 and 3 cancers, many were younger than the recommended screening age, and of those with more advanced disease who had undergone screening, over 80% presented with interval cancers (22). This suggests that later stage tumors have different biology than those that are small and low grade. Figure 3 illustrates the characteristics of various tumors with differing rates of progression and the role of screening in their detection and treatment. The comparison of breast and prostate cancer in the 2009 JAMA viewpoint “Rethinking Screening” (23) originated through a collaboration in the Early Detection Research Network (EDRN) and set the course for a new approach to early detection (24, 25).
What is not properly appreciated, even today, is how the nature of breast cancer has been affected by the increase in screening (Fig. 1). Clinical trials progressed within an evolving screening landscape that changed the incidence and distribution of the type of cancers detected. Meanwhile, studies from an earlier period with a different population of tumors were informing treatment guidelines. No one knew that the underlying biology of tumors in the population was changing. Screening inadvertently led to the unanticipated sequence of events of both overdiagnosis and overtreatment. Importantly, overdiagnosis also leads to a dilution of the more consequential cancers in a population and affects the ability to detect differences in therapeutic effect. As event rates decreased, adjuvant trials got larger and therapeutic effect differences became narrower. This is what fueled, in part, the revolution brought about by the molecular subtyping of tumors. There was a need to separate out who had what type of disease. This mattered for prognosis, prediction of treatment benefit, and establishment of trial eligibility. The understanding of variable progression depending upon tumor subtype is key when interpreting results of screening and deciding among treatment options (7, 13, 25).
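The dilution argument can be made concrete with a short calculation. The sketch below is illustrative only: the event rates, the indolent fractions, and the function names are hypothetical assumptions, not figures from any trial cited here, and the sample-size formula is the standard normal approximation for comparing two proportions rather than the design method of any particular study. It assumes that indolent tumors contribute no events in either arm, so every observed rate is scaled by the fraction of consequential tumors.

```python
from math import ceil
from statistics import NormalDist


def diluted_rate(rate, indolent_fraction):
    """Observed event rate when a fraction of enrolled tumors never has an event.

    Assumes (hypothetically) that indolent tumors contribute zero events.
    """
    return (1.0 - indolent_fraction) * rate


def per_arm_sample_size(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate patients per arm to detect p_control vs p_treated.

    Standard normal-approximation formula for two proportions,
    two-sided alpha, equal allocation.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1.0 - p_control) + p_treated * (1.0 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# Hypothetical event rates among consequential tumors only.
CONTROL_RATE = 0.30
TREATED_RATE = 0.21

for indolent_fraction in (0.0, 0.25, 0.50):
    p_c = diluted_rate(CONTROL_RATE, indolent_fraction)
    p_t = diluted_rate(TREATED_RATE, indolent_fraction)
    n = per_arm_sample_size(p_c, p_t)
    print(f"indolent={indolent_fraction:.0%}  "
          f"absolute difference={p_c - p_t:.3f}  patients per arm={n}")
```

Under these assumptions the relative effect of therapy is unchanged, but the absolute difference shrinks in proportion to the indolent fraction and the required trial size grows, which is the pattern described above.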
The estimated burden of “over-diagnosed breast cancers” ranges from 25% to 50%. In 2014, the Canadian National Breast Cancer Screening Study results suggested that 50% of nonpalpable screen-detected lesions were likely indolent in nature (26). Reviews of the Surveillance Epidemiology and End Results (SEER) database as well as Canadian data put the rate of overdiagnosis at around 20% to 30% of all screen-detected breast cancers (26). Screening mammography allowed for smaller tumors to be detected. Thus, women were more likely to have breast cancer that was overdiagnosed than to have earlier detection of a tumor that was destined to become large (27). A recent study from the Netherlands, where there is 85% compliance with screening recommendations to screen every other year starting at 49, demonstrates that, over time, the incidence of in situ and stage 1 cancers rises sharply at age 49 and stops at age 74, corresponding to the screening ages and further supporting the notion that screening leads to overdiagnosis (28).
Although our methods and periodicity of screening continue to evolve, the scientific underpinning of a shift in the approach to early detection raises a few key questions. First, who is at risk for systemic recurrence? We need to identify patients who may be safely treated with less intervention. And, if we can identify these patients, how can this inform our screening and prevention approaches? For treatment decisions, we need to elucidate who is at risk for early versus late recurrence. In higher risk patients, we need to figure out early in their course which treatments will be optimal.
The solutions to improving our approach to early detection and avoiding overdiagnosis and overtreatment are fourfold. First, we must definitively demonstrate that indolent tumors exist and that they can be recognized at the time of diagnosis. Second, we must improve our approach to screening by tailoring it to those at higher risk and, specifically, by improving the ability to identify those who are at risk for developing aggressive tumors. Third, we must adjust the pathologic classification of tumors. Fourth, we must adjust the approach to in situ disease, just as we have with invasive cancer, through improved classification and tailored interventions. We will take each of these topics in turn. And, finally, we will reflect on how this approach to overdiagnosis and overtreatment is mirrored in other cancers.
Ultralow-Risk Indolent Lesions of Epithelial Origin (IDLE Conditions)
If screening has increased the detection of more biologically indolent lesions, there should be a shift over time in the biology of the types of tumors detected. An important opportunity to test that prediction was presented by the development of a molecular diagnostic using tumors banked prior to the use of adjuvant therapy in the Netherlands.
The MammaPrint 70-gene assay, as described above, was first created as a prognostic tool for women with early stage breast cancer who did not receive adjuvant therapy. The low- and high-risk thresholds were initially described in 2002 and then further validated with data from patients across five European centers through the TRANSBIG consortium (12, 29). A demonstration project, called RASTER, used MammaPrint to characterize all tumors detected in the Netherlands from 2004 to 2008, a period when screening was routine in 85% of the population. A comparison of TRANSBIG and RASTER found that for women under the age of 49, where there was no screening, there was no shift in the fraction of high- and low-risk tumors (75% were high risk in both datasets). However, in the population ages 50 to 70, the fraction of women with high-risk/low-risk disease shifted from 60%/40% in TRANSBIG to 40%/60% in RASTER (30). We further analyzed the original dataset and set a threshold above which no metastatic events occurred, designating this as the ultralow or IDLE threshold. In the prescreening era, 10% of all tumors met this threshold, and in the postscreening era, 30% met this threshold. This represents a 200% relative increase (a threefold rise) in ultralow-risk tumors in the screening era.
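The 200% figure is a relative change, which is easy to confuse with the change in percentage points. The minimal sketch below works through the arithmetic using only the two proportions quoted above (10% and 30%); the function names are illustrative and not part of any published analysis.

```python
def relative_change_percent(before, after):
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


def percentage_point_change(before, after):
    """Absolute change between two percentages, in percentage points."""
    return after - before


# Share of tumors meeting the ultralow threshold, as quoted in the text.
ULTRALOW_PRESCREENING = 10.0   # percent of all tumors
ULTRALOW_POSTSCREENING = 30.0  # percent of all tumors

print(relative_change_percent(ULTRALOW_PRESCREENING, ULTRALOW_POSTSCREENING))  # 200.0
print(percentage_point_change(ULTRALOW_PRESCREENING, ULTRALOW_POSTSCREENING))  # 20.0
```

In other words, the share of ultralow-risk tumors rose by 20 percentage points, which is a tripling of the prescreening share.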
One of the challenges in identifying a truly indolent or ultralow risk tumor is that hormone-positive, especially lower-grade, tumors can recur 10 to 20 years after diagnosis. We need to be able to differentiate those cancers that do not possess malignant potential from those that merely recur much later from the time of initial diagnosis. By 2010, we had collected much longer follow-up from patients included in the initial 70-gene MammaPrint assay dataset. An ultralow-risk threshold was defined by setting a cutoff above which there were no late recurrences or deaths with 18.5 years of median follow-up from the Netherlands Cancer Institute NKI295 series (12, 31, 32). This threshold was further validated with data from the Stockholm 3 (STO) randomized trial conducted in Sweden with rigorous 20-year follow-up. In this study, postmenopausal patients with early stage node-negative breast cancer were randomized to no adjuvant treatment or adjuvant endocrine therapy. The ultralow-risk threshold was associated with 100% breast cancer–specific survival at 15 years and 97% at 20 years (33). Ultralow-risk tumors are hormone positive, HER2 negative, generally luminal A subtype, and have a lower proliferative rate as represented by a Ki67 less than 15%. However, only 25% of luminal A tumors were ultralow risk. Using recursive partitioning, a tool that rank orders the conditional probabilities associated with outcome, we analyzed factors associated with 20-year survival in these breast cancer patients and found that MammaPrint ultralow-risk was the most significant prognostic indicator of overall survival (33). Interestingly, the second most important factor was tumor size. Figure 4 shows the partitioning of the survival tree that best predicts 20-year survival. If these ultralow-risk lesions can be identified and removed from the denominator, then the impact of screening and early detection on patient outcomes would likely be much improved. 
For tumors that are not ultralow risk, early detection does, in fact, save lives. Ultralow-risk/indolent lesions can be identified at the time of diagnosis, using molecular diagnostics, and this allows a real change in clinical care by reducing the incidence of overtreatment. The next step of this research is to determine if this ultralow signature can identify tumors, regardless of organ of origin, that are “indolent.” This objective is a major focus of the work of the MCL (Molecular and Cellular Characterization of Screen Detected Lesions) consortium funded by the NCI (34).
Precision Screening
It is now well accepted that cancer is a heterogeneous disease that ranges in severity from more indolent lesions to more aggressive tumors, with risk varying among individuals. Therefore, it is no longer logical to simply continue to screen as if everyone were at the same risk for the same kind of cancer.
If we can identify those at risk for the most aggressive cancers, we can enable targeted prevention, more frequent screening, and/or screening with more sensitive modalities such as MRI. This is currently how we screen mutation carriers who choose intensive surveillance over prophylactic surgery and whose lifetime risk is thought to be in the range of 40% to 85%, with a 5-year risk of 6% (35). These patients undergo annual mammography and annual MRI, staggered at 6-month intervals.
The U.S. Preventive Services Task Force recommends screening every other year starting at age 50. However, the Task Force also recognizes the need to generate new and better data about how to screen smarter and generate the most substantial benefit with the least harm. New ways of screening, and our current understanding, must be put to the test.
The Women Informed to Screen Depending on Measures of Risk (WISDOM) trial is a randomized clinical trial comparing a comprehensive risk-based or personalized approach to traditional annual breast cancer screening. WISDOM seeks to define an innovative, personalized, and dynamic approach to breast cancer screening that is safe and reduces morbidity (36). A companion study in Europe, where screening is already every other year starting at age 50, My Personalized Breast Cancer Screening (MyPeBS), is evaluating ways to both increase the frequency of screening for those at highest risk and decrease screening for those at lowest risk, including eliminating screening for those with the lowest 20% of risk in the population by age (ClinicalTrials.gov Identifier: NCT03672331).
Does Diagnostic Terminology Need to Change?
If we are able to identify ultralow-risk and indolent cancers, this lower risk disease needs to be differentiated from its higher risk counterparts. This not only allows patients to better understand their disease but also serves to sharpen our research questions, parameters, and treatment populations.
Fifty years ago, breast cancers were categorized by pattern without correlation to outcomes or treatment choices. Thirty to 40 years ago, evidence-based terminology was proposed with most breast cancers lumped into the category of “invasive mammary carcinoma, of no special type” to distinguish the minority of special types of cancer with specific clinical implications (37). Invasive lobular carcinoma, for example, was known to be associated with a good response to antiestrogen therapy and no worse prognosis despite an infiltrative and hard to detect growth pattern (38). Tubular carcinomas were associated with no systemic spread despite occasional axillary lymph node involvement (39), and medullary carcinomas were associated with a good prognosis despite highly dysplastic cytologic features (40). Almost simultaneously, we began testing all breast cancers for estrogen receptor (ER) status to predict response to antiestrogen therapies. The progesterone receptor was added to increase sensitivity of detection of ER signaling based on an understanding that progesterone receptor gene expression is regulated in part by estrogen-responsive elements in the promoter (41). Soon after, we developed assays for evaluation of ERBB2/HER2-neu overexpression to predict response to HER2-directed therapies (42). This effectively divides invasive mammary carcinomas into three categories: (i) ER+, (ii) HER2 amplified, and (iii) “Triple Negative.”
Although these are practical divisions, they are clearly an oversimplification of the variation in cancer biology. Combined histologic grade, which summarizes important tumor features of proliferative rate, architecture, and cytologic differentiation (or dedifferentiation), can be low, intermediate, or high in any of the categories (43, 44). Notably, high-grade “special type” cancers do not exist as they are excluded by diagnostic criteria to retain the “special properties” of the type, despite molecularly defined categories that might be included. For example, invasive lobular carcinoma might be defined by loss of expression of E-cadherin, a cell adhesion molecule, and we might include the high-grade variants in the invasive lobular carcinoma category. Of the grade components, proliferative rate is the most informative feature with regard to prognosis and benefit of cytotoxic therapy (45), but the tumor mitotic count is difficult to reproducibly apply (46). This led to the development of multigene assays, such as OncoType DX and MammaPrint, mentioned above, that use RNA expression profiling of either 21 genes (OncoType DX) or 70 genes (MammaPrint) as a more objective measure of proliferative rate intended to inform clinical decision-making (11, 13, 47).
So, today, we are left with a matrix of breast cancer diagnoses with grade/oncotype risk stratification, hormonal receptor expression, and HER2 amplification status. But this is an oversimplification as indicated by surveys of gene expression signatures that show at least six intrinsic subtypes with nonoverlapping associations to this matrix (48–50). The ER+ cancers, for example, seem to fall primarily into two categories based on proliferative rate (Luminal A = low-risk and Luminal B = high-risk), but this is not exclusive with some ER+ cancers falling into basal, normal-like, claudin-low, and even HER2 subtypes. Similarly, a HER2+ tumor does not necessarily fall into the HER2-intrinsic subtype, and “triple negative” does not necessarily equate to basal subtype.
Meanwhile, the diagnostic categories are soon to be augmented by new biomarker testing to accompany new therapies. Germline or somatic BRCA mutations are already being used to predict response to PARP inhibitors (51), PIK3CA mutations are being used to predict response to specific targeted molecular inhibitors (52), and immune checkpoint markers are being used to predict prognosis and response to various immunotherapeutic agents (53). Despite the importance of these new therapies, these tests have not yet become routine for all breast cancers.
The evolution of diagnostic terminology has followed, piecemeal, behind the development of new therapies, as outlined above. Today, more is known about the differences in breast lesion biology than ever before. Most of this knowledge has not been incorporated into routine diagnostics, however, apparently waiting for actionable consequences of each additive assessment. We are, thus, left asking the question of whether there is a better, more complete way to classify breast cancers with a more precise diagnostic categorization. As a secondary question, how can pathology diagnostics contribute to lower overdiagnosis and less overtreatment? New gene expression analyses show that some of the traditionally overdiagnosed “cancers” have gene expression signatures that define an “ultralow” risk of mortality, essentially equal to the risk for a woman with no cancer diagnosis at all (33). These “ultralow” lesions might properly be downgraded from “carcinoma” or “cancer” to something better reflecting their nonlethal potential such as “indolent lesion” or “mammary neoplasm” (24). Meanwhile, by combining proliferative rate, oncogenic mechanisms, tumor–host interactions, and specific drug companion biomarkers at initial diagnosis using gene expression and/or sequence analyses, breast cancers can be properly categorized for the purpose of making better therapy decisions, including the choice not to treat.
Understanding Ductal Carcinoma In Situ in the Context of Overdiagnosis and Overtreatment
As is true for invasive breast cancer, improved understanding of the biology of ductal carcinoma in situ (DCIS) is critical to develop treatment algorithms that prioritize targeted therapy for high-risk lesions while de-escalating surveillance and intervention for indolent disease. The detection of DCIS increased 500% after the implementation of routine screening mammography (34). DCIS consists of multiple patterns of intraductal neoplasia with varying proliferative rates, cytologic dysplasia, necrosis, and calcification (54). Both by microscopic and molecular analysis, the cells themselves look malignant. Yet these neoplasms remain in situ, noninvasive, and retained within layers of myoepithelium, basement membrane, and a nonpermissive microenvironment. The major difference between DCIS and invasive carcinomas appears to be a tumor-permissive microenvironment in the latter. DCIS lesions are largely detected because of the appearance of clustered calcifications on mammography but also with clumped nonmass enhancement on breast MRI and rarely as radiographic or clinical mass lesions. DCIS is regarded as premalignant based on the definition that includes all patterns associated with an increased risk of subsequent breast cancer at the site/region of the diagnostic biopsy (55, 56). This is in direct distinction from other forms of proliferative breast disease such as lobular carcinoma in situ (LCIS) and atypical ductal hyperplasia (ADH), which confer increased risk of invasive breast cancer throughout the ipsilateral and contralateral breast—not associated with the site of the biopsy (57, 58). The assumption with DCIS, even after excision, is that clonally contiguous areas of the ductal tree that are not excised can lead to recurrence, either as DCIS or as invasive carcinoma.
Several confounding problems with the definitions and assumptions surrounding DCIS have arisen. Interobserver diagnostic distinctions between ADH and DCIS, in particular, but also between LCIS and nonatypical hyperplasia are not reproducible, and this may contribute to the upgrading of these lesions when initial biopsy is followed by excision (59). Despite high recurrence risk, the disease-specific mortality after initial diagnosis of DCIS is extremely low (60). Finally, in addition to the site-specific risk, it is clear that DCIS is also a risk marker for other sites including the contralateral breast. In fact, in contrast to the original data derived from an era of more limited excisional biopsies, the recurrence rate of DCIS among more recent cohorts of patients who underwent lumpectomy and adjuvant radiation therapy is nearly equal across both breasts at 10 years.
As our understanding and treatment of invasive cancer continue to evolve, so must our understanding and treatment of DCIS. Most women with DCIS receive therapy similar to women with invasive cancer (61). But, after several decades of removing DCIS lesions from women, the invasive cancer rates have not decreased, calling into question whether the majority of these lesions are real precursors that should be treated surgically. Without treatment, it is estimated that only 20% of DCIS progresses to invasive cancer (62), suggesting that the majority are overtreated with the attendant physical and psychologic consequences (61). Given that the mortality of DCIS only amounts to 3.3% over 20 years, identifying lower risk or indolent disease and then reducing overtreatment is imperative (63). Low-risk lesions may be better characterized as risk marker lesions and treated with active surveillance alone or with risk-reducing endocrine therapy (64).
Several studies suggest that low-risk DCIS, defined as hormone-positive, low grade, and found in women over 40 years of age, represents a substantial fraction of all DCIS cases (65). There are four clinical trials currently underway comparing observation with and without endocrine therapy to surgical therapy for low-risk DCIS: COMET (NCT02926911); LORD (NCT02494607); LORIS (ISRCTN27544579); and LORETTA (UMIN000028298) as well as active surveillance studies to help refine our approach in low-risk patients.
On the other end of the spectrum, high-risk features may distinguish the small fraction of DCIS lesions that are destined to develop into invasive breast cancer over a relatively short period of time. A recent meta-analysis suggested that African American race, premenopausal status, involved margins, high p16 expression, detection by palpation, and high grade were prognostic markers for poor outcomes, and a different study showed HER2 positivity with high COX2 expression increased risk for the development of an ipsilateral breast cancer (66, 67).
Clearly there is a need to identify DCIS lesions that possess invasive potential (68). The multigene assay, Oncotype DCIS, was developed to determine the benefit of adjuvant radiation (69, 70), based on the Eastern Cooperative Oncology Group E5194 study that evaluated treatment using surgical excision alone without radiation (7). This test suggests that radiation should be given to only a fraction of DCIS.
For women with DCIS with high-risk features, our approach also needs to change. Very large DCIS lesions have a greater potential to evolve into aggressive invasive cancer, and some even have metastatic potential (71, 72). We are currently conducting a clinical trial of neoadjuvant intralesional pembrolizumab in women with high-risk DCIS based on observation of an altered immune environment in high-risk DCIS lesions, especially those that are hormone negative and HER2 positive. Combinations of immune cell populations, specifically those that included low numbers of activated CD8 T-lymphocytes and high numbers of CD115 macrophages, were associated with a high risk for recurrence. Importantly, all cases with metastatic recurrence were correctly predicted using the combined immune cell profiles (71). Pembrolizumab, FDA-approved in its intravenous formulation for treatment of a variety of solid tumors, is an antibody designed to directly block the interaction between PD-1 and its ligands, used by tumors to suppress immune control. Intralesional pembrolizumab eliminates systemic toxicity and may have the capacity to boost immunity and reduce progression. In our phase I study, intralesional pembrolizumab was well-tolerated and generated a robust increase in total T and cytotoxic CD8+ T-cell populations but not an antitumor response to date. The study is ongoing and will test both longer exposure and other intralesional drug combinations.
We are also working to understand the innate mechanisms employed to prevent DCIS from developing into invasive cancer. Even among patients with biologically aggressive DCIS, the risk of dying from metastatic breast cancer, after surgery alone, is only 3.3% compared with 30% to 50% among patients with biologically aggressive invasive cancer (72). Investigators from UC Davis, UC San Diego, UC San Francisco, Stanford, and the University of Vermont are investigating differences in the biology of large (>5 cm) high-risk DCIS lesions compared with molecularly and age-matched high-risk invasive breast cancers from patients enrolled in the I-SPY2 trial. A comprehensive analysis, including whole-exome DNA sequencing, SMART-3SEQ RNA sequencing, multiplex immunohistochemistry, and stromal profiling, is being employed to determine if we can identify unique ways in which DCIS is held in check. In order to improve our management of DCIS, we must better understand its variable biology, de-escalate treatment of low-risk disease, and find new approaches to reduce risk in the more aggressive lesions.
Parallel Lessons in Prostate and Other Cancers
The NCI recognizes that overdiagnosis occurs across a variety of cancers, such as melanoma, thyroid, lung, and prostate; that it increases when populations are screened; and that, when not recognized, it leads to overtreatment (Fig. 5; refs. 73–76). Even abdominal CT scans, used for the diagnosis of other conditions, lead to the incidental identification of pancreatic intraductal papillary mucinous neoplasms and to their aggressive treatment (77). On the other hand, cervical and colon cancers are examples where efforts to de-escalate screening and the management of early lesions have maintained screening benefits.
Perhaps the most recognizable example of overdiagnosis is prostate cancer. Like breast cancer, it represents a disease with variable biological risk that has been subject to both overdiagnosis and overtreatment. Unlike breast cancer, the recognition of overdiagnosis started two decades ago, and now, active surveillance is the standard of care in approaching low-risk invasive prostate cancer. There are lessons from the history of prostate cancer management that can inform our approach to breast cancer screening.
The evolution of our understanding of prostate cancer biology and epidemiology over the past few decades parallels in many ways the history of breast cancer, with the greatest differences driven by the development and promulgation of PSA, a blood test with few parallels anywhere in oncology in terms of its organ—if not cancer—specificity. PSA is remarkably accurate at predicting the risk of potentially lethal prostate cancer up to 30 years in the future when men are tested at young ages (roughly 45–55), before benign prostatic hyperplasia becomes a prevalent alternate explanation for PSA elevation (78, 79). Unfortunately, when PSA became broadly available in the 1990s, testing occurred most commonly among men ages 65 to 80 (80), among whom the test is far less specific and less able to identify lethal disease within the window of opportunity for cure.
Prostate cancer can be risk stratified with respect to potential lethality with over 80% accuracy based on standard parameters (PSA, cancer grade and stage, patient age, and extent of biopsy involvement; ref. 81). However, as incidence rates soared at the dawn of the PSA era (82), treatment was inconsistently based on risk, such that for the next two decades low-risk cancers were pervasively overtreated and high-risk cancers often undertreated (82). The rates of prostate cancer metastasis and mortality fell sharply in the PSA era, more sharply, in fact, than those of breast cancer (82, 83), but at the cost of a great deal of avoidable surgery, radiation, and other treatments associated with substantial risks of long-term adverse effects on quality of life (84).
In more recent years, it has become increasingly clear that prostate cancer reflects tremendously heterogeneous biology. At least a plurality of prostate cancers are indolent lesions of epithelial origin (IDLE) and would never cause any threat to life or health if undetected. Many others reflect a “Halstedian” biology and can be identified and cured with local or locoregional treatment before the onset of metastasis. Finally, some are “Fisherian” and establish very early, albeit usually slow-growing, metastases that are often inapparent clinically for years after diagnosis. We are increasingly able to distinguish the IDLEs, and in the last decade, active surveillance (monitoring with PSA and periodic biopsies) has finally been endorsed as standard of care for most low-risk disease (85).
Differentiating the “Halstedian” from “Fisherian” tumors remains more challenging. Prostate cancer includes molecular subtypes based on gene expression similar in many respects to breast cancer (86–88), but these are not the basis for differential treatment decisions. An Oncotype DX test analogous to the breast assay (89) has been developed for prostate cancer along with other expression-based assays (90, 91) to offer prognostic information. However, trials analogous to TAILORx (10) have not been completed, and the markers have not been shown to be predictive of response to treatment. Overall, although use of surveillance for low-risk disease in prostate cancer is likely outpacing breast cancer, use of multimodal treatment for high-risk disease continues to lag behind similar standards for breast cancer, and prostate cancer has a longer way to go in establishing initial treatment selection based on molecular profiles.
Overdiagnosis and overtreatment in prostate cancer led to a 2012 recommendation by the USPSTF that no men should routinely be offered screening for prostate cancer (92), despite prostate cancer mortality falling (82, 93). In 2018, this recommendation was modulated to a “C,” suggesting shared decision-making around risks and benefits of screening, and a consensus is finally growing around a smarter approach to screening and treatment based on: (i) early baseline PSA testing for men in their late 40s or early 50s with earlier starting ages for those at higher risk (e.g., African American men or those with strong family history); (ii) deferred or infrequent follow-up testing for the large majority of the population found to have low baseline values; (iii) judicious use of secondary blood, urine, and imaging tests to help make decisions about biopsy; and (iv) risk-stratified treatment decision-making. Overdiagnosis was recognized first in prostate cancer, and the lessons learned over the last decade will inform every other cancer where screening is routine.
New Directions
The improved ability to identify circulating tumor DNA has led to studies to reduce false positives and improve specificity of screening. It is not clear whether such tests will be able to avoid indolent tumor detection or be sufficiently organ specific. Biomarker assays have been developed, but not adopted, to reduce false-positive breast biopsies (94). It remains to be seen whether these tests will have a place in the early detection armamentarium and whether they can be sufficiently sensitive, specific, and low cost to be applied to the entire population, or even a very high-risk population.
Summary
Overdiagnosis, when unrecognized as a consequence of screening, can lead to overtreatment. In the pursuit of the development of tests for early detection, it is critical to ask whether a diagnostic test can determine not only cancer versus not, but what type. Increasingly, this concept has been incorporated into programs like the EDRN. The need to systematically study the phenomenon of screen detected versus interval cancers led to the creation of the Molecular and Cellular Laboratory for the Study of Indolent and Aggressive Tumors, an NCI initiative started in 2015 (24, 25, 76) with the goal of finding commonalities among the aggressive and indolent conditions across organ of origin. The time has come to apply precision medicine to screening, doing more for those at high risk and less for those who have less to gain (36). Avoiding overdiagnosis and overtreatment is a worthy goal and will provide significant benefits to the population (95). A systematic approach that integrates diagnostic data, molecular classification of tumors, and outcomes into feedback that directly affects screening policies would accelerate our ability to improve early detection efforts.
Disclosure of Potential Conflicts of Interest
L.J. Esserman reports grants from Merck/Moderna [grant for testing the ability of pembrolizumab to alter the Tumor Immune MicroEnvironment (TIME) of high-risk DCIS] outside the submitted work, is an unpaid member of the board of directors for Quantum Leap Healthcare Collaborative, has received grant funding from QLHC for the I-SPY TRIAL, and is a member of the Blue Cross/Blue Shield Medical Advisory Panel. M.R. Cooperberg reports personal fees from Astellas (registry steering committee), Bayer (advisory board), Mdx Health (advisory board), Myriad Genetics (consultant), Dendreon (registry steering committee), Steba Biotech (advisory board), AstraZeneca (data and safety monitoring board), and AbbVie (consultant) outside the submitted work. No potential conflicts of interest were disclosed by the other authors.
Acknowledgments
The authors thank Mamta Shah for her help in assembling Fig. 1. Their work has been supported by both the Early Detection Research Network and the Molecular and Cellular Laboratory U-01 grants (U01CA196406 and U01CA111234).