Abstract
Using biomarkers to select the population most at risk and to detect the disease while it is measurable yet not clinically apparent has been the goal of many investigations. Recent advances in molecular strategies and analytic platforms, including genomics, epigenomics, proteomics, and metabolomics, have identified increasing numbers of potential biomarkers in the blood, urine, exhaled breath condensate, bronchial specimens, saliva, and sputum, but none has yet moved to the clinical setting. There is therefore a recognized gap between promise and product delivery in the cancer biomarker field. In this review, we define clinical contexts where risk and diagnostic biomarkers may be useful in the management of lung cancer, identify the most relevant candidate biomarkers of early detection, describe their state of development, and finally discuss critical aspects of study design in molecular biomarkers for early detection of lung cancer. Cancer Prev Res; 5(8); 992–1006. ©2012 AACR.
Introduction
Lung cancer is the leading cause of cancer-related death in the United States (1). More than 60% of patients are diagnosed at advanced stages, when a cure is unlikely (2). The 5-year survival rate for patients with advanced disease is less than 10%, whereas the 5-year survival rate for patients with stage I disease is greater than 70% (3). The annual mortality rate for lung cancer exceeds the annual rate for breast, prostate, and colon cancer combined, all of which have successful clinical screening tools for the detection of early-stage disease (4). For this reason, the search for diagnostic strategies for early lung cancer detection has intensified.
Until recently, the case for early detection in lung cancer was extrapolated from other cancers such as colon and breast. Clinicians and scientists continued to hypothesize that the earlier lung cancer is diagnosed, the greater the opportunity for improved survival. Historically, lung cancer has been difficult to detect early, and thus survival advantages were difficult to ascertain. In the 1970s and 1980s, chest x-ray and sputum cytology were tested in screening trials for lung cancer. Although this approach increased the number of lung cancers diagnosed, it did not improve lung cancer–specific mortality (5, 6). The Early Lung Cancer Action Program (ELCAP) was a large lung cancer screening trial started in the 1990s using chest computed tomographic (CT) imaging (7). It showed an improved detection rate and survival of early-stage lung cancers, which prompted the design of the large randomized National Lung Screening Trial (NLST). Results of this recently completed study showed a 20% reduction in lung cancer–specific mortality with low-dose CT screening, compared with chest x-ray, in patients at high risk for lung cancer after a median follow-up of 6.5 years (8, 9). This is the first large randomized screening study of lung cancer by low-dose chest CT to show an improvement in overall survival, thus giving new hope for survival from this cancer. Extrapolating from the NLST results, a screening method that reduces lung cancer–specific mortality by 20% could save an estimated 11,074 lives annually in the United States, far more than the 2,303 lives currently estimated to be saved with adjuvant chemotherapy (see Supplementary Data). This provides a strong rationale to pursue efforts in early detection.
How do we define early detection?
Early detection involves a high-risk population, a screening test, and a testing schedule. Within this context, one must distinguish populations of individuals at risk before the disease becomes measurable from those at risk after it becomes measurable (Fig. 1).
Figure 1. Clinical contexts for biomarker development in early detection of lung cancer. This diagram illustrates 4 clinical contexts within 4 windows of time. The period during which lung cancer is nonmeasurable and precedes the diagnosis characterizes the context of risk assessment. It represents a long window of time during which the disease develops and corresponds to an opportunity for chemoprevention. When the disease becomes measurable but remains asymptomatic, we enter the context of early diagnosis. Two other clinical contexts relate to clinical diagnosis, that is, when the disease is measurable and patients are symptomatic, and to detection of recurrence. These windows of time correspond to the different contexts for which different biomarker targets can be developed (109). Reprinted with permission of the American Thoracic Society. Copyright © 2012 American Thoracic Society.
What clinical endpoints do the biomarker candidates of early detection address?
A distinction is made between risk biomarkers, which assess the risk of developing lung cancer (individuals at risk but with no measurable disease), and diagnostic biomarkers, which determine whether cancer is present (individuals at risk with measurable asymptomatic disease such as lung nodules). Prognostic biomarkers in patients with early-stage disease can identify individuals with an aggressive phenotype and shorter survival regardless of the type of treatment provided and may help select populations who may benefit from adjuvant therapy. Biomarkers of risk of developing lung cancer in the absence of measurable disease are discussed only when originally developed as diagnostic biomarkers. The literature on biomarkers of risk of developing lung cancer based on proteins or single-nucleotide polymorphisms (SNP), including genome-wide association studies (GWAS), is beyond the scope of this review. Likewise, prognostic biomarkers will not be discussed further.
What are the benchmarks for clinical utility?
To be useful in the clinical setting, biomarkers go through careful phases of development, as discussed below, and should meet specific criteria. The biomarkers should (i) be quantifiable and reproducible; (ii) have good testing performance [with good positive predictive value (PPV) and negative predictive value (NPV)]; (iii) be measurable in accessible material, in small amounts and with little preparation; (iv) indicate a disease state; (v) have proven clinical use; (vi) be adopted by the community-at-large to take advantage of the benefits that testing affords; (vii) be cost-effective; and (viii) be reimbursed by health insurers.
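Criterion (ii) can be made concrete with a short calculation. The sketch below is illustrative only: it applies the standard definitions of PPV and NPV to a hypothetical test with 80% sensitivity and 90% specificity, and the prevalence values are assumptions chosen for illustration, not figures from the studies reviewed here. It shows why the same test can have a very different PPV in a screening population than in a population with indeterminate nodules.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test at a given disease prevalence.

    All arguments are fractions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics and assumed prevalences (not from the source).
for prevalence in (0.01, 0.05, 0.30):
    ppv, npv = predictive_values(0.80, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Under these assumptions, PPV rises sharply with prevalence while NPV falls only modestly, which is one reason the intended clinical context matters when judging reported sensitivity and specificity.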
The Clinical Context of Early Detection
To be successful at improving lung cancer detection, biomarkers must address a specific clinical question. Two pressing clinical needs are identified: biomarkers that address the risk of developing lung cancer, and biomarkers that are diagnostic in nature and distinguish malignant from benign nodules.
The risk of developing lung cancer
Biomarkers of risk for lung cancer have the potential to improve early detection beyond the use of CT scans, which suffer from limited sensitivity (particularly among never-smokers), limited specificity (a high false-positive rate), and high cost. Several published models exist that predict an individual's risk of developing lung cancer (10–15). These models were developed to select patients who may benefit from additional radiographic screening. Identifying a risk biomarker for developing lung cancer would further define the at-risk population, decrease the overall number of screening CTs conducted, and ultimately limit the downstream consequences of discovering false-positive nodules.
Distinguishing benign from malignant lung nodules
Diagnostic biomarkers that help distinguish a benign nodule from a malignant one would be invaluable. Depending on geographic location, up to 30% of indeterminate pulmonary nodules are ultimately found to have benign pathology when surgically resected (16). In the NLST, 24% of patients who underwent a diagnostic operation (mediastinoscopy, thoracoscopy, or thoracotomy) had benign disease (17). Thus, additional testing with a biomarker could decrease the number of surgical resections for ultimately benign disease. Current guidelines recommend that providers use models to estimate the likelihood that a particular nodule identified by CT scan is malignant and thus should be resected (18). However, as the use of low-dose CT for lung cancer screening evolves, better predictive models that incorporate biomarkers would assist the clinical provider in determining which patients have lung cancer. We recently validated a blood-based proteomic signature for lung cancer diagnosis and showed that it may provide added value to the clinical and imaging assessment of indeterminate lung nodules (19). As biomarkers are developed, we hope they will help identify not only individuals without malignancy but also those whose nodules are malignant and amenable to surgical resection.
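One simple way to picture how a biomarker could add to a clinical nodule model is a Bayesian update with likelihood ratios. The sketch below is a generic illustration, not the model used in reference 19: the function name, the pretest probability of 30%, and the test characteristics are all hypothetical, and the calculation assumes the biomarker result is independent of the information already captured by the clinical model.

```python
def post_test_probability(pretest, sensitivity, specificity, positive):
    """Update a pretest probability of malignancy with a binary test result.

    Uses the positive or negative likelihood ratio on the odds scale.
    Assumes the test is conditionally independent of the pretest estimate.
    """
    if positive:
        likelihood_ratio = sensitivity / (1 - specificity)
    else:
        likelihood_ratio = (1 - sensitivity) / specificity
    pre_odds = pretest / (1 - pretest)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical nodule with a 30% pretest probability from a clinical model,
# and a hypothetical biomarker with 80% sensitivity and 90% specificity.
pretest = 0.30
for result in (True, False):
    p = post_test_probability(pretest, 0.80, 0.90, positive=result)
    label = "positive" if result else "negative"
    print(f"{label} biomarker: probability of malignancy = {p:.3f}")
```

In practice a biomarker is more likely to be entered as a covariate in a multivariable model, because the independence assumption rarely holds exactly; the odds-based update is shown only because it makes the arithmetic transparent.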
Current Status of Early Detection Biomarkers for Lung Cancer
We review the most recent advances in the field of molecular biomarkers for risk assessment and diagnosis of lung cancer and discuss the clinical use and limitations of the different approaches.
In an effort to identify the most relevant lung cancer biomarkers of early detection, we selected published reports from PubMed on the basis of the keywords biomarkers, risk, diagnosis, early detection, and lung cancer. To narrow our search, we further applied the following 2 filters. First, the proposed marker, or panel of markers, must be quantitatively measurable and its performance tested in at least one sample set of clinically relevant specimens. Second, the report must adhere to the PRoBE biomarker validation guidelines discussed later. The original reports selected in this way are summarized in Tables 1 to 3. We recognize the limitations imposed by selection, outcome reporting, and publication biases.
Table 1. Characteristics and performance of most recent tissue-based candidate biomarkers for the early detection of lung cancer
References | Specimens | Type of marker | Analyte | Clinical purpose | No. of markers | Pathologic subtype | Assay platform | Preclinical samples | BM dev. phase | Training set | Validation set | Sensitivity (%) | Specificity (%) | AUC (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Halling (26) | Bronchial specimens | DNA | 5p15, 7p12 (EGFR), 8q24 (C-MYC), CEP6 | Diagnosis | 4 | NSCLC | FISH + cytology | n/a | I | n/a | 137 | 61–75a | 83–100a | n/a |
Massion (27) | Bronchial biopsies | DNA | TP63, MYC, CEP3, CEP6 + sputum cytology + demographics | Diagnosis | 4 | NSCLC | FISH | n/a | II | n/a | 70 | n/a | n/a | 92.6 |
Feng (38) | Tumors and normal tissues | DNA methylation | RARB, BVES, CDKN2A, KCNH5, RASSF1, CDH13, RUNX, CDH1 | Diagnosis | 8 | NSCLC | Methylation array | n/a | I | 49 | n/a | n/a | n/a | n/a |
Anglim (22) | Tumors and normal tissues | DNA methylation | GDNF, MTHFR, OPCML, TNFRSF25, TCF21, PAX8, PTPRN2, and PITX2 | Diagnosis | 8 | SCC | Methylation array | n/a | I | 43 | n/a | 95.6a | 95.6a | n/a |
Schmidt (25) | Bronchial aspirates | DNA methylation | SHOX2 | Diagnosis | 1 | NSCLC | PCR | n/a | II | n/a | 523 | 68 | 95 | 86 |
Richards (24) | Tumors and normal tissues | DNA methylation | TCF21 | Diagnosis | 1 | NSCLC | PCR | n/a | II | 42 | 63 | 76 | 98a | n/a |
Spira (37) | Airway epithelium | mRNA | Gene expression signature | Diagnosis | 80 | NSCLC | Affy array | n/a | II | 77 | 52 | 80 | 84 | n/a |
Beane (28) | Airway epithelium | mRNA | Gene expression signature + clinical factors | Diagnosis | 80 | NSCLC and SCLC | Affy array | n/a | II | 76 | 62 | 100 | 91 | 97 |
Kim (111) | Tumors and normal tissues | mRNA | CBLC, CYP24A1, ALDH3A1, AKR1B10, S100P, PLUNC, LOC147 | Diagnosis | 7 | NSCLC | qRT-PCR | n/a | II | 32 | 36b | n/a | n/a | n/a |
Blomquist (30) | Tumors and normal tissues | mRNA | CAT, CEBPG, E2F1, ERCC4, ERCC5, GPX1, GPX3, GSTM3, GSTP1, GSTT1, GSTZ1, MGST1, SOD1, XRCC1 | Diagnosis | 14 | NSCLC | RT-PCR | n/a | II | n/a | 49, 40 | n/a | n/a | 82–87 |
Rahman (36) | Bronchial biopsies | MALDI signature | TMLS4, ACBP, CSTA, cyto C, MIF, ubiquitin, ACBP, Des-ubiquitin | Diagnosis | 9 | NSCLC | MALDI/MS | n/a | II | 51 | 60 | 66 | 88 | 77 |
NOTE: Data organized by year of publication and type of marker considered.
Abbreviations: AUC, area under the curve; BM dev. phase, biomarker development phase; n/a, not available; qPCR, quantitative real-time PCR; RT-PCR, reverse transcriptase PCR; SCC, squamous cell carcinoma; SCLC, small cell lung cancer.
aValues derived from training set only.
bValidation and training sets overlap.
Table 2. Characteristics and performance of most recent blood-based candidate biomarkers for the early detection of lung cancer
References | Specimens | Type of marker | Analyte | Clinical purpose | No. of markers | Pathologic subtype | Assay platform | Preclinical samples | BM dev. phase | Training set | Validation set | Sensitivity (%) | Specificity (%) | AUC (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Zhong (60) | Serum | AutoAB | Phage peptide clones | Diagnosis | 5 | Lung cancer | ELISA | n/a | II | 46 | 56 | 91a | 91a | 99a |
Chapman (64) | Serum | AutoAB | p53, cmyc, HER2, NY-ESO-1, CAGE, MUC1, GBU4-5 | Diagnosis | 7 | Lung cancer | ELISA | n/a | I | 154 | n/a | n/a | n/a | n/a |
Qiu (65) | Serum | AutoAB | Annexin I, 14-3-3 theta, LAMR1 | Diagnosis | 3 | NSCLC | Protein array | 170 | III | 170 | 51 | 82 | 73 | |
Wu (63) | Serum | AutoAB | Phage peptide clones | Diagnosis | 6 | NSCLC | ELISA | n/a | II | 20 | 180 | 92 | 92 | 96 |
Farlow (112) | Serum | AutoAB | IMPDH, PGAM1, ubiquilin, ANXA1, ANXA2, HSP70-9B | Diagnosis | 6 | NSCLC | ELISA | n/a | II | 196 | n/a | 94.8a | 91.1a | 96.4a |
Boyle (113) | Serum | AutoAB | p53, NY-ESO-1, CAGE, GBU4-5, Annexin 1, and SOX2 | Diagnosis | 6 | NSCLC | ELISA | n/a | II | 241 | 255 | 32 | 91 | 64 |
Greenberg (48) | Serum | DNA methylation | S-Adenosylmethionine | Diagnosis | 1 | Lung cancer | HPLC | n/a | I | 68 | n/a | 92–100a | 91–97a | 94–99a |
Begum (49) | Serum | DNA methylation | APC, CDH1, MGMT, DCC, RASSF1A, AIM | Diagnosis | 6 | NSCLC | qPCR | n/a | II | 32–639 | 106 | 84 | 57 | n/a |
Chen (114) | Serum | miRNA | miRNA signature | Diagnosis | 10 | NSCLC | qRT-PCR | n/a | II | 310 | 310 | 93 | 90 | 97 |
Bianchi (54) | Serum | miRNA | miRNA signature | Diagnosis | 34 | NSCLC | qRT-PCR | n/a | II | 64 | 64 | 71 | 90 | 89 |
Kulpa (115) | Serum | Protein | CEA, CYFRA 21-1, SCC-Ag, NSE | Diagnosis | 4 | SCC | ELISA | n/a | II | 420 | 20–62 | 95 | 71–90 | |
Patz (55) | Serum | Protein | CEA, RBP4, hAAT, SCCA | Diagnosis | 4 | Lung cancer | ELISA | n/a | II | 100 | 97 | 78 | 75 | n/a |
Takano (116) | Serum | Protein | Nectin-4 | Diagnosis | 1 | NSCLC | ELISA | n/a | II | 295 | 54 | 98 | n/a | |
Yildiz (56) | Serum | Protein | MALDI/MS signature | Diagnosis | 7 | NSCLC | MALDI/MS | n/a | II | 185 | 106 | 58 | 85.7 | 82 |
Pecot (19) | Serum | Protein | Model: MALDI/MS signature + clinical and imaging data | Diagnosis | 7 | Indeterm. lung nodule | MALDI/MS | n/a | II | 100 | n/a | n/a | 72 | |
Diamandis (117) | Serum | Protein | Pentraxin-3 | Diagnosis | 1 | Lung cancer | ELISA | n/a | I | 426 | 37–48 | 80–90 | 60–74 | |
Ostroff (118) | Serum | Aptamers | cadherin-1, CD30 ligand, endostatin, HSP90a, LRIG3, MIP-4, pleiotrophin, PRKCI, RGM-C, SCF-sR, sL-selectin, and YES | Diagnosis | 6 | NSCLC | Aptamers | n/a | II | 985 | 341 | 89 | 83 | 90 |
Zhong (119) | Plasma | AutoAB | TAA signature | Diagnosis | 5 | NSCLC | Protein microarray | 5 | I | 81 | n/a | 90a | 95a | n/a |
Kneip (120) | Plasma | DNA methylation | SHOX2 | Diagnosis | 1 | NSCLC | qPCR | n/a | II | 40 | 371 | 60 | 90 | 78 |
Shen (121) | Plasma | miRNA | miRNA-21, -126, -210, -486-5p | Diagnosis | 4 | NSCLC | qRT-PCR | n/a | II | 28 | 87 | 86 | 97 | 93 |
Wei (122) | Plasma | miRNA | miR-21 | Diagnosis | 1 | NSCLC | qRT-PCR | n/a | I | 93 | n/a | 76a | 70a | 77.5a |
Taguchi (123) | Plasma | Protein, 2 panels | EGFR, SFTPB, WFDC2, ANGPTL3, ANXA1, YWHAQ, Lmr1 | Diagnosis | 7 | NSCLC | ELISA | 52 | III | n/a | n/a | n/a | 89 | |
Boeri (53) | Plasma/tissues | miRNA | miRNA signature | Diagnosis | 13 | Lung cancer | miRNA array and RT-PCR | n/a | II | 19 | 22 | 75 | 100 | 88 |
Boeri (53) | Plasma/tissues | miRNA | miRNA signature | Diagnosis | 15 | Lung cancer | miRNA array and RT-PCR | 25 | III | 20 | 25 | 80 | 90 | 85 |
NOTE: Data organized by year of publication, specimen type, and type of marker considered.
Abbreviations: ADC, adenocarcinoma; AUC, area under the curve; AutoAB, autoantibody; BM dev. phase, biomarker development phase; HPLC, high-performance liquid chromatography; n/a, not available; qPCR, quantitative real-time PCR; RT-PCR, reverse transcriptase PCR; SCC, squamous cell carcinoma.
aValues derived from training set only.
Table 3. Characteristics and performance of most recent sputum, EBC, and peripheral blood cells candidate biomarkers for the early detection of lung cancer
References | Specimens | Type of marker | Analyte | Clinical purpose | No. of markers | Pathologic subtype | Assay platform | Preclinical samples | BM dev. phase | Training set | Validation set | Sensitivity (%) | Specificity (%) | AUC (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Palmisano (124) | Sputum | DNA methylation | p16, MGMT | Diagnosis | 2 | SCC | PCR | 11 | III | 144 | n/a | n/a | n/a | n/a |
Belinsky (91) | Sputum | DNA methylation | p16, MGMT, DAP, RASSF1A | Diagnosis | 4 | NSCLC | PCR | n/a | I | 141 | n/a | n/a | n/a | n/a |
Varella-Garcia (125) | Sputum | DNA | Chromosomal aneusomy + cytology | Diagnosis | 4 | Lung cancer | FISH + cytology | 36 | III | 66 | n/a | 83a | 80a | n/a |
Li (87) | Sputum | DNA | HYAL2, FHIT, SFTPC | Diagnosis | 3 | NSCLC | FISH | n/a | I | 102 | n/a | 76a | 92a | n/a |
Yu (92) | Sputum | miRNA | miR-21, miR-486, miR-375, miR-200b | Diagnosis | 4 | NSCLC | qRT-PCR | n/a | II | 72 | 122 | 70 | 80 | 84 |
Showe (50) | Peripheral blood cells | mRNA | Gene expression signature | Diagnosis | 29 | NSCLC | cDNA array | n/a | II | 228 | 55 | 76 | 82 | n/a |
Tanaka (126) | Peripheral blood cells | CTC | CTCs | Diagnosis | 1 | NSCLC | Cell search system | n/a | I | 150 | n/a | 30a | 88a | 60a |
Phillips (77) | EBC | VOCs | VOCs signature | Diagnosis | 22 | NSCLC | GC-MS | n/a | I | 108 | n/a | 100a | 81a | n/a |
Bajtarevic (79) | EBC | VOCs | VOCs signature | Diagnosis | 50 | Lung cancer | GC-MS | n/a | I | 96 | 52–80a | 100a | n/a | |
Gessner (84) | EBC | Protein | VEGF, bFGF, ANG, TNF-α, IL-8 | Diagnosis | 5 | NSCLC | ELISA | n/a | I | 74 | n/a | n/a | 99–100 |
NOTE: Data organized by year of publication, specimen type, and type of marker considered.
Abbreviations: ADC, adenocarcinoma; AUC, area under the curve; AutoAB, autoantibody; BM dev. phase, biomarker development phase; n/a, not available; qPCR, quantitative real-time PCR; SCC, squamous cell carcinoma; SCLC, small cell lung cancer.
aValues derived from training set only.
We have organized our report on the basis of specimen type, either tissue-based (Table 1) or biofluids-based markers. We have further subcategorized biofluids-based biomarkers into blood-based markers (Table 2) and markers from sputum, exhaled breath condensate (EBC), and peripheral blood cells (Table 3). The phases of biomarker development are assessed following the Early Detection Research Network (EDRN) classification (20). This classification was designed for biomarkers of early detection in the context of screening and therefore may not directly address phases of development of diagnostic biomarkers. These tables also include efforts from investigators to integrate biomarkers in models of risk prediction or diagnosis. The validation sets reported in the tables correspond to an attempt to test the biomarker (or signature) in a truly independent population, also described as clinical validation (4), to evaluate the performance of the test.
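The performance columns in the tables (sensitivity, specificity, and AUC, reported there as percentages) can be computed from per-subject biomarker scores as sketched below. This is a minimal illustration with made-up scores and labels: the decision threshold is assumed to have been fixed in a training set and then applied unchanged to an independent validation set, which is the distinction the table footnotes draw between training-only and validated estimates.

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """Area under the ROC curve as the probability that a randomly chosen
    case scores higher than a randomly chosen control (ties count one half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical validation set: 1 = lung cancer, 0 = control.
validation_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]
validation_labels = [1, 1, 1, 0, 0, 0, 1, 0]
threshold_from_training = 0.5  # assumed to be fixed before validation

print(sens_spec(validation_scores, validation_labels, threshold_from_training))
# (0.75, 0.75)
print(auc(validation_scores, validation_labels))
# 0.875
```

Choosing the threshold on the same samples used to report performance tends to give optimistic estimates, which is why values derived from training sets only are flagged in the tables.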
Tissue-based candidate biomarkers
Numerous studies have adapted large-scale analytic approaches to profile the full spectrum of molecular aberrations associated with lung cancer malignancy in tumors. These studies have yielded valuable information that has unraveled several key molecular events of lung cancer tumorigenesis, including mapping the genomic loci associated with high risk of developing lung cancer, hypermethylation of a number of tumor suppressor genes (21–25), regions of chromosomal amplification (26, 27), mRNA expression variation (28–31), the differential expression of several microRNAs (32), and the proteomic signature of invasive (33, 34) and preinvasive lesions in lung tissues (35, 36).
Several research groups, including ours, have been testing genetic and proteomic alterations in surrogate tissues such as bronchial brushings and biopsies to determine the probability of having lung cancer (25–27, 36, 37; Table 1). While this approach requires bronchoscopy, molecular markers obtained from the airways may pair with the recently proven CT screening to provide additional benefit when evaluating individuals at high risk for lung cancer. Although much of the early biomarker discovery effort has used fresh-frozen samples as a primary source, acquiring these specimens is costly and laborious. Because surgical pathology specimens stored as formalin-fixed, paraffin-embedded (FFPE) blocks are widely available, many researchers are attempting to profile genomic and proteomic aberrations in such specimens. These aberrations include hypermethylation of genes (38) and microRNA (miRNA) expression (39), both of which can be successfully extracted as candidate biomarkers.
miRNAs are a class of small noncoding RNAs that are thought to regulate gene expression. They are abnormally expressed in several types of cancer (40, 41), are involved in a variety of biologic and pathologic processes with tissue specificity (41, 42), and have potential for clinical application (43). Another advantage of miRNA is that it is well preserved in formalin-fixed tissue, making it ideal for use in routinely processed material (44). Previous studies have identified differences in miRNA expression between squamous cell carcinoma and adenocarcinoma in lung cancer (32), as well as in other cancers (42, 43, 45). The current trend toward using FFPE samples will make a greater number of samples available and thus will increase statistical power and the generalizability of results.
Tissue-based biomarkers that reflect the molecular changes associated with specific histologic subtypes of non–small cell lung cancer (NSCLC) may provide the means to differentiate tumors originating in the lung from metastases from other organ sites. Furthermore, using immunohistochemical profiling of lung cancer tissue markers in conjunction with well-established histologic examination can provide more accurate subclassification of lung malignancies and thus may directly impact the clinical decision making of antitumor therapy. The molecular changes associated with progression from normal to malignant tissue may lead to the discovery of novel markers that can be detected in circulation or other biofluids. Limited access to early-stage tumor tissue samples, tumor heterogeneity combined with the complexity of the genome and the proteome, and the low abundance of potential biomarkers represent some of the challenges that translational researchers face when attempting to bring these biomarker candidates from the bench to the bedside.
Therefore, the potential use of tissue-based biomarkers is highly dependent on the accessibility of the specimens and the robustness of the assay offered. FFPE samples are preferred by scientists because of their availability, but their molecular analyses remain more challenging. Although noninvasive diagnostic approaches are also preferred, it may take additional time to refine an airway epithelium–based biomarker versus one that can derive the same information from a less invasive sample. For example, developing surrogate biomarkers of early-stage disease from tissues in the field of cancerization (bronchial brushings or biopsies) may require testing in more proximal and less invasive samples (e.g., nasal epithelium). This problem may be less acute for prognostic biomarkers and biomarkers predictive of response to therapy because tumor samples will be generally available for analysis. Although molecular analysis of lung tumor tissues holds great promise to revolutionize our understanding of the disease development and progression, tissue-based biomarkers from the bronchial airway have significant limitations related to tissue acquisition that may be overcome by the translation of that knowledge to more accessible specimens and by guiding the development of biofluids-based early detection strategies.
Biofluids-based markers
The underlying premise of biofluids-based biomarker research is that molecular alterations of tumor cells lead to the synthesis of distinct molecular species that can be detected in biofluids. These alterations can generate disease-specific molecular species, such as altered or methylated DNA, overexpressed mRNA, miRNA, or proteins, that can potentially be released into the extracellular microenvironment. Biofluids-based detection strategies are an attractive approach for screening, mainly because of their ease of acquisition. Biofluids including peripheral blood and its components (circulating cells, plasma, and serum), exhaled breath condensate (EBC), urine, and sputum offer noninvasive access to large quantities of samples available for analysis. Therefore, molecular analyses of biofluids from patients with early-stage lung cancer represent an attractive choice for the discovery and validation of diagnostic biomarkers (46, 47).
Blood-based markers.
Blood is a complex and dynamic medium whose components can reflect various physiologic or pathologic states such as the presence of some cancers. Detectable moieties of the blood are currently the subject of many investigations and include cellular elements such as circulating tumor cells (CTC), cell-free DNA and RNA, proteins, peptides, and metabolites. Changes of the cell-free genomic components of the blood, including DNA methylation (48, 49), DNA amplification, and gene expression (50), have been reported in the circulation of patients with lung cancer. These candidates are reported in Table 2.
microRNAs.
More recently, miRNAs have also been identified in the blood of patients with lung cancer (51, 52). In an effort to test the validity of miRNAs as biomarkers able to predict lung tumor development, diagnosis, and prognosis, extensive miRNA profiling was conducted in paired lung tumor and normal lung tissue and in plasma collected at the time of diagnosis by spiral CT. A signature of 15 miRNAs present in the blood was able to identify subjects at high risk of developing lung cancer in 2 independent cohorts of patients with 80% sensitivity and 90% specificity (53). These results suggest that miRNA expression ratios may be molecular predictors of lung cancer development and aggressiveness and may have clinical implications for lung cancer management in the future. In a separate study, a test comprising 34 serum miRNAs identified patients with early-stage NSCLCs in a population of asymptomatic high-risk individuals with 80% accuracy (54). These provocative results will have to be validated in independent cohorts.
Proteomic profiles.
Recent proteomic studies have focused on rapid proteomic profiling of blood with minimal sample preparation. One of these approaches uses matrix-assisted laser desorption/ionization time-of-flight mass spectrometric (MALDI-TOF/MS) patterns of abundant proteins or peptide fragments that correlate with early disease stage. Several other studies used MALDI/MS to identify proteins and peptides in serum. For example, Patz and colleagues were able to identify 4 differentially expressed serum proteins (transferrin, retinol-binding protein, antitrypsin, and haptoglobin) that discriminate between NSCLCs and controls (55). Using the same MALDI/MS approaches, several other groups including ours (56) have reported serum protein expression profiles that distinguish patients with various cancers from control subjects (57). Recently, we validated the proteomic signature in 2 prospective cohorts of patients with lung nodules and showed that it may provide added value to the clinical and imaging assessment of indeterminate lung nodules (19).
Autoantibodies.
Another promising development in the blood biomarkers field is the discovery of autoantibodies directed against tumor proteins. Alterations of protein production in cancer cells by overexpression, mutations, misfolding, truncation, or proteolysis break immunologic tolerance and generate tumor-specific antigens, which in turn elicit a host immune response (58). Autoantibodies generated against these tumor-associated antigens (TAA) during the course of disease progression can be further amplified by the immune network, making them attractive candidates for the early detection and diagnosis of cancer. Many TAA targets have been identified in patient sera in several immunologic diseases and malignancies using high-throughput screening platforms, such as cDNA expression libraries, phage display, and protein microarrays (59). For example, 2 separate groups identified several potential immunoreactive peptides for autoantibodies using a T-7 cDNA-based phage library to screen the sera of patients with NSCLCs (60, 61). Using similar techniques, Chen and colleagues also identified and validated ubiquilin-1 peptides as a potential autoantibody target in lung adenocarcinoma from sera of patients with early-stage lung cancer (62). More recently, Wu and colleagues reported the identification of 6 peptide clones discriminatory of NSCLCs using phage display techniques, but only one protein has been confirmed (63). Recent improvements in blood fractionation techniques and liquid chromatography have led to the identification of several other autoantibodies (47). Autoantibodies against known lung cancer–associated proteins, such as p53, c-Myc, HER2, MUC1, CAGE, GBU4-5, and NY-ESO-1 (64); annexin I, PGP9.5, 14-3-3 theta, and LAMAR1 (65); or IMPDH, PGAM1, and ANXA2 (66), have also been reported as independent signatures.
These recent autoantibody studies are particularly provocative because some allow the detection of cancer-specific markers in the preclinical phase of lung cancer progression. Similar to other circulating protein markers, the low abundance of autoantibodies and the complexity of the blood proteome are still substantial challenges facing these discovery efforts.
Circulating tumor cells.
The ability to capture and study CTCs is an emerging development in the field that carries the potential to become a noninvasive tool for early detection and diagnosis of cancer, for measuring response to therapy, and for understanding the basic biology of cancer progression and metastasis (67–71). CTCs are rare cells that originate from a malignancy and circulate freely in the peripheral blood. CTCs are usually captured by immobilized antibodies against the epithelial cell adhesion molecule (EpCAM, also known as TACSTD1) in either chip or bead platforms (72, 73). The technology for capturing rare CTCs is still in an early phase of development and requires more specific surface markers to increase its specificity for circulating lung cancer cells.
In summary, peripheral blood is a rich medium for cancer-specific markers from small molecules such as miRNAs to whole cells, all of which represent a great opportunity for developing a minimally invasive diagnostic test of lung cancer. Significant challenges are still preventing the clinical success of blood markers, including the extreme complexity of the blood matrix, the scarce quantity of any given marker, and the lack of sensitive, reproducible, and high-throughput verification modalities, in particular in proteomics research. New and innovative fractionation techniques, more sensitive and specific detection reagents, and well-validated assays will increase our chances of capturing blood-based biomarkers.
Exhaled breath condensate.
The analysis of EBC represents another noninvasive method of diagnosing lung cancer. The analysis of volatile organic compounds (VOC) that are linked to cancer is likely to provide a novel opportunity for the identification of diagnostic cancer biomarkers because a large volume of sample can be collected easily and inexpensively (74–76). The underlying rationale of this approach is based on the observation that tumor cell growth is accompanied by alterations in protein expression patterns that may lead to peroxidation of the cell membrane and thus to the emission of VOCs (76). Several recent studies have used gas chromatography combined with mass spectrometric analysis (GC-MS) of VOCs as both discovery and validation platforms (77–81). Other groups have combined the analytic power of GC-MS with the sensitivity of custom-designed nanosensors, which detect and record changes in electrical resistance caused by organic compounds contained in the exhaled breath of patients. For example, Peng and colleagues recently identified, from exhaled alveolar breath, a VOC signature that distinguished patients with lung, colorectal, and breast cancers from healthy individuals (82). These candidates are reported in Table 3. Other studies attempted to identify volatile proteins and peptides present in EBC and used them as potential markers for the early detection of lung cancer (83–85). The results of these studies provide evidence for the feasibility of this strategy to isolate and identify proteins useful for early detection of lung cancer. Further studies are still needed to standardize a collection device, to further show the specificity of any test, and to determine the use of this approach in clinical practice.
Sputum and urine.
Cigarette smoking leads to the increased production of sputum with glycoproteins, inflammatory cells, and exfoliated cells from the bronchial tree. Because sputum is so readily available, particularly in current and ex-smokers, its molecular analysis has been an active area of research for lung cancer biomarkers (21). Although detecting lung cancer using sputum cytology alone has low sensitivity (86), several studies showed that combining cytology with analysis of genetic abnormalities improves diagnostic accuracy. Several types of genetic abnormalities have been detected in the sputum of patients with lung cancer, such as deletions of HYAL2, FHIT, and SFTPC (87), chromosomal aneusomy (88, 89), DNA methylation (90, 91), and miRNA (92, 93). These candidates are reported in Table 3. Measurements of genomic aneuploidy, when combined with pulmonary function, can also significantly improve lung cancer risk prediction (94). The performance of most of these potential markers has not been tested in large-scale validation studies, and whether these markers will add value to standardized sputum cytology remains to be seen.
Urine, much like blood, EBC, and sputum, is another easily accessible biofluid that could be an important source of cancer-specific markers. Some recent proof-of-principle studies have attempted to profile the molecular changes of urine using mass spectrometric analysis. The molecular species that were detected in urine include VOCs previously identified in an animal model of lung cancer (95), and their investigation in patients with lung cancer is just beginning (96).
Study Design for Early Detection Biomarkers Validation
Appropriate study design is crucial for the successful validation of a promising biomarker for clinical use. Validation of a biomarker useful for lung cancer screening should be conducted using a nested case–control study design within a prospective longitudinal cohort following the PRoBE design (97). Specifically, random sampling of cases and controls identified from within a well-defined cohort population allows both cases and controls to be sampled from the same source population, thus providing validity to the case–control design. Matching strategies may be considered, such as using incidence density sampling to sample controls at the same time each case occurs, so that cases and controls are matched on time. While there are advantages to matching, the potential pitfalls of matching should be carefully considered before implementation (97, 98).
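The incidence density sampling described above can be illustrated with a minimal sketch. It assumes a toy cohort in which each subject has a single exit time (diagnosis or end of follow-up); the `Subject` record, the function name, the tie-handling rule, and all data values are hypothetical and serve only to show how controls are drawn from the subjects still at risk when each case occurs.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    subject_id: int
    exit_time: float  # years from enrollment to diagnosis or end of follow-up
    is_case: bool     # True if follow-up ended with a lung cancer diagnosis


def incidence_density_sample(cohort, controls_per_case=2, seed=0):
    """For each case, draw controls from subjects still at risk at the case's diagnosis time."""
    rng = random.Random(seed)  # fixed seed so the illustration is reproducible
    matched_sets = []
    cases = sorted((s for s in cohort if s.is_case), key=lambda s: s.exit_time)
    for case in cases:
        # Risk set: everyone still under follow-up and undiagnosed at this time.
        # Subjects who become cases later remain eligible as controls now.
        # Assumption: a subject censored at exactly the same time counts as at risk.
        risk_set = [
            s for s in cohort
            if s.subject_id != case.subject_id
            and (s.exit_time > case.exit_time
                 or (s.exit_time == case.exit_time and not s.is_case))
        ]
        k = min(controls_per_case, len(risk_set))
        matched_sets.append((case, rng.sample(risk_set, k)))
    return matched_sets


# Hypothetical cohort: two cases, four subjects without a diagnosis.
cohort = [
    Subject(1, 2.0, True),
    Subject(2, 5.0, False),
    Subject(3, 3.5, True),
    Subject(4, 1.0, False),  # left follow-up before any case occurred
    Subject(5, 6.0, False),
    Subject(6, 4.0, False),
]

matched_sets = incidence_density_sample(cohort)
for case, controls in matched_sets:
    ids = sorted(c.subject_id for c in controls)
    print(f"case {case.subject_id} at {case.exit_time} y -> controls {ids}")
```

In this sketch, subject 4 can never serve as a control because follow-up ended before the first case, whereas subject 3 is eligible as a control for subject 1 because the diagnosis came later. This is what matching on time means in practice.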
Generalizability of biomarkers to the appropriate clinical setting and populations is requisite for a clinically useful biomarker. The prospective cohort from whom the cases and controls are sampled must be representative of the targeted clinical population to which the biomarker will be applied. Thus, the cohort study population should comprise individuals with conditions found in the target population, such as inflammatory disease, granulomas, or benign tumors, so that false-positives can be minimized and individuals developing lung cancer can be differentiated from those not developing the disease (99).
Biospecimens necessary for biomarker development should be collected at the initiation of the prospective cohort study, before ascertainment of lung cancer status (97), and potentially over multiple time points if the biomarker changes with age and with progression to disease (99). Evaluating these stored biospecimens in patients who develop biopsy-proven cancer (cases) and in those who do not (controls) is a cost-efficient approach to developing a biomarker for clinical use. Importantly, the outcome should be clearly defined (100), and the biomarker assay development should be blinded to case–control status to avoid information bias (97). To validate the usefulness of a biomarker for early detection of lung cancer, diagnostic validation of the biomarker should be conducted in a different population than the one in which the biomarker was developed. Finally, this should be followed by early diagnosis validation using a screening trial with lung cancer mortality as the endpoint (101).
Assessing whether a biomarker has clinical validity requires estimation of sensitivity and specificity, which can be summarized with the receiver operating characteristic (ROC) curve (102). Two additional clinically relevant measures are the positive predictive value (PPV) and the negative predictive value (NPV), which are estimated from sensitivity, specificity, and disease prevalence rather than read from the ROC curve. These clinically important indices describe the probability of developing or having disease given a positive test (PPV) and the probability of being free of disease given a negative test (NPV). Estimates of PPV and NPV are influenced by the prevalence of the disease and consequently will vary by patient age, target population, and disease stage. Merely targeting the screening to a high-risk population based on demographic factors can alter the screening test performance characteristics (103). Thus, for a biomarker to be clinically valid and generalizable, the validation process must be applied to multiple populations with different demographic characteristics.
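The dependence of PPV and NPV on prevalence follows directly from Bayes' theorem and can be shown with a short worked sketch. The sensitivity and specificity used below (80% and 90%) echo the figures reported above for the plasma miRNA signature purely as an illustration; the three prevalence values and the function name are assumptions chosen for this example, not estimates from any study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence via Bayes' theorem."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive test)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative test)
    return ppv, npv


# Illustrative prevalences only: a broad screening population, a higher-risk
# group, and a referred population with a high pretest probability.
for prevalence in (0.01, 0.05, 0.30):
    ppv, npv = predictive_values(0.80, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With the test characteristics held fixed, the PPV in this sketch rises from roughly 7% at 1% prevalence to roughly 77% at 30% prevalence, which is why the same biomarker can perform very differently across target populations.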
The use of lung cancer diagnosis prediction models will grow as the models' accuracy improves. When a patient presents in the clinic and undergoes imaging (e.g., CT) that reveals a pulmonary nodule, current predictive models for assessing the likelihood of malignancy include those developed by Cummings and colleagues (104), Gurney and colleagues (105), Swensen and colleagues (106), and Gould and colleagues (18). However, these models suffer from relatively poor accuracy, in particular for indeterminate pulmonary nodules, and do not provide accurate prediction of malignancy among patients referred for surgery (16). While these models may include predictors such as patient age, smoking status, duration and intensity of smoking, cancer history, gender, race, asbestos exposure, chronic obstructive pulmonary disease (COPD)/emphysema, and pulmonary nodule characteristics by CT (size, shape, density, and location), the addition of biomarkers may improve the classification accuracy of the current lung cancer malignancy prediction models. Recent interest has focused on the potential for molecular tools to improve models predicting lung cancer diagnosis, yet most studies have shown little improvement from adding a gene expression profile in cytologically normal large airway epithelium obtained via bronchoscopic brushings (28) or a serum proteomic profile in patients presenting with pulmonary nodules (19). While molecular markers are not yet fully incorporated into lung cancer malignancy prediction models, it is likely that a profile of molecular markers, rather than a single marker, will be necessary for clinically useful early detection of lung cancer (107). Future development of predictive models should incorporate previously identified predictors and newly identified biomarkers (100).
Current Challenges in Lung Cancer Biomarker Development and Implementation
One of the main objectives of molecular medicine in lung cancer is to identify biomarkers that discriminate between low- versus high-risk individuals and between benign and malignant lung tumors. Ultimately, these biomarkers can potentially be translated to noninvasive, simple, and reliable diagnostic tests for early detection of the disease. The underlying assumption behind these efforts is that tumor-specific or overexpressed proteins can be detected simply and accurately in complex clinical samples such as surrogate tissues and biofluids. The intensive research in genomics and proteomics aimed at identifying these biomarkers has yielded a large number of potential diagnostic biomarkers, although few have progressed to the level of U.S. Food and Drug Administration (FDA) approval for diagnostics (108).
This disappointingly slow pace of lung cancer biomarker discovery and validation is attributed to a host of technologic and methodologic factors. The gap between promise and product can partially be explained by the fact that current discovery methods are neither reliable nor efficient. One reason is that current analytic technologies still have limited power to detect low-abundance cancer markers against a background of high-abundance molecular species, such as proteins, in very complex matrices such as plasma or serum. These low-abundance markers in biofluids may be the most promising cancer biomarkers. Consequently, many of the best candidates may be missed during the discovery phase.
Another quandary is the limited capacity to analytically verify and validate existing candidate markers in a high-throughput manner. This is particularly true in proteomics research. The lack of quality reagents, such as antibodies, and of methodologies to translate the discovery of candidates in tissue specimens into measurements of their concentration in the circulation remains an enormous challenge. Therefore, it is possible that biomarkers have already been “discovered” but not yet validated. Furthermore, once a long list of candidate biomarkers is compiled, no standardized method currently exists for selecting those that are most promising for systematic validation. In addition, the reproducibility of biomarker data has been undermined by poor study design [e.g., underrepresentation of studies using a nested case–control design (ref. 97)], model overfitting, and the lack of cross-validation and independent validation. Changing technology, low signal concentrations, very few prospective studies, and a low-incidence disease make the area of biomarker research challenging.
Conclusions and Future Clinical Implications
The molecular analysis of a variety of biospecimens has allowed the discovery of relevant candidate biomarkers and consequently the identification of novel proteins that may have a role in the development of lung cancer. A high volume of data from multiple high-throughput biochemical analyses of clinical material from “-omics” sources has been accumulating at an exponential rate in the last few years, generating a large number of biomarker candidates. None of the published candidate biomarkers of risk or of lung cancer diagnosis are ready for clinical use, and few have moved to phase III of biomarker development. Lung cancer is recognized as a complex and heterogeneous disease, not only at the biochemical level (genes, proteins, metabolites) but also at the tissue, organism, and population levels. There is a need to incorporate findings from multiple discovery platforms into a mathematical framework that can improve our understanding of the disease process. A biofluids-based molecular test may improve the selection of high-risk individuals for CT screening, distinguish those with malignant nodules from those with benign lesions, and identify patients with particularly aggressive cancer. Clinical benefits could include further reductions in mortality and significant cost savings to the health care system.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Acknowledgments
The authors thank Mr. William Alborn for his insightful comments and suggestions and Judith Roberts from the Vanderbilt Tumor Registry for her help in providing data to support the estimates of lives saved from adjuvant chemotherapy in the United States.
Grant Support
The authors thank the following funding sources for support of this work: NIH (U01CA152662 and R01 CA102353) awarded to P.P. Massion, Veterans Health Administration (VA-CDA2) awarded to E.L. Grogan, and a Vanderbilt Clinical and Translational Research Scholars award to M.C. Aldrich.