Lung cancer is the leading cause of cancer death due, in part, to lack of early diagnostic tools. Bronchoscopy represents a relatively noninvasive initial diagnostic test in smokers with suspect disease, but it has low sensitivity. We have reported a gene expression profile in cytologically normal large airway epithelium obtained via bronchoscopic brushings, which is a sensitive and specific biomarker for lung cancer. Here, we evaluate the independence of the biomarker from other clinical risk factors and determine the performance of a clinicogenomic model that combines clinical factors and gene expression.

Training (n = 76) and test (n = 62) sets consisted of smokers undergoing bronchoscopy for suspicion of lung cancer at five medical centers. Logistic regression models describing the likelihood of having lung cancer using the biomarker, clinical factors, and these data combined were tested using the independent set of patients with nondiagnostic bronchoscopies. The model predictions were also compared with physicians' clinical assessment.

The gene expression biomarker is associated with cancer status in the combined clinicogenomic model (P < 0.005). There is a significant difference in performance of the clinicogenomic relative to the clinical model (P < 0.05). In the test set, the clinicogenomic model increases sensitivity and negative predictive value to 100% and results in higher specificity (91%) and positive predictive value (81%) compared with other models. The clinicogenomic model has high accuracy where physician assessment is most uncertain.

The airway gene expression biomarker provides information about the likelihood of lung cancer not captured by clinical factors, and the clinicogenomic model has the highest prediction accuracy. These findings suggest that use of the clinicogenomic model may expedite more invasive testing and definitive therapy for smokers with lung cancer and reduce invasive diagnostic procedures for individuals without lung cancer.

Lung cancer is the leading cause of cancer death in the United States and the world, with more than 1 million deaths worldwide annually (1). Eighty-five to ninety percent of subjects with lung cancer in the United States are current or former smokers, with 10% to 20% of heavy smokers developing this disease (2). Lack of effective tools to diagnose lung cancer at an early stage before it has spread to regional lymph nodes or metastasized beyond the lung has resulted in a 5-year mortality rate of 80% to 85% (3).

Smokers are often suspected of having lung cancer based on abnormal radiographic findings and/or symptoms that are not specific for lung cancer. Fiberoptic bronchoscopy represents a relatively noninvasive initial diagnostic test used in this setting, with cytologic examination of materials obtained via endobronchial brushings, bronchoalveolar lavage, and endobronchial and transbronchial biopsies of the suspect area (4, 5). Whereas cytopathology is 100% specific for lung cancer, the sensitivity of cytologic examination of materials obtained at bronchoscopy ranges from 30% for small peripheral lesions to 80% for centrally located endobronchial tumors (6). Given the relatively low sensitivity of bronchoscopy, additional and more invasive diagnostic tests are routinely needed, which are costly, incur risk, and prolong the diagnostic evaluation of patients with suspect lung cancer. Determining which suspect lung cancer patients with lung cancer-negative bronchoscopies should undergo these additional diagnostic tests is currently a matter of clinical judgment. We have recently reported a gene expression profile in cytologically normal large airway epithelial cells obtained via brushing at the time of bronchoscopy, which serves as a diagnostic biomarker for lung cancer (7). This biomarker is an accurate predictor of lung cancer at an early and potentially curable stage, and the sensitivity of the biomarker could substantially reduce the number of individuals requiring additional invasive diagnostic testing following a lung cancer–negative bronchoscopy.

Many groups have developed gene expression profiles that can be used to distinguish between different diagnostic and prognostic subgroups in a variety of cancers. An unexplored issue for many of these biomarkers is whether the gene expression patterns are independent of other clinical risk factors. If they are, this presents an opportunity to create clinicogenomic models that incorporate both clinical and gene expression predictors of disease likelihood. There are several examples of such clinicogenomic approaches. Pittman et al. (8) have shown improved prediction accuracy for breast cancer recurrence through an integrative clinicogenomic model. Similarly, Li (9) combined genomic and clinical data in a survival model to predict the outcome of patients with diffuse large B-cell lymphoma after chemotherapy. Stephenson et al. (10) integrated gene expression and clinical data using logistic regression modeling to predict prostate carcinoma recurrence after radical prostatectomy, and showed that a combined model had the highest predictive accuracy. In the near future, diverse sources of data such as gene expression, genetic, proteomic, and clinical data will likely be integrated to make accurate diagnoses or prognostic predictions for complex diseases such as cancer (11).

With ∼90 million former and current smokers in the United States (12) and the emergence of sensitive but nonspecific chest imaging technologies, patients increasingly present to clinicians with abnormal radiographic findings that are suspicious for lung cancer. Whereas no definitive predictive model for lung cancer exists for use in this setting, numerous clinical and radiographic variables have been associated with the likelihood of lung malignancy: age (13), smoking history (ref. 14; including number of pack-years, age started, intensity of smoking and years since quitting), history of asbestos exposure, clinical symptoms including hemoptysis and weight loss (15), size of the nodule or mass and radiographic appearance on chest imaging (15, 16), presence of lymphadenopathy, clinical or radiographic evidence for metastatic disease, evidence of airflow obstruction on spirometry (16), and uptake of fluorodeoxyglucose on positron emission tomography scan (17, 18). Several groups have developed predictive models using combinations of the above variables in the setting of solitary pulmonary nodules (15, 19, 20). Swensen et al. (21) compared such a model for the presence of solitary pulmonary nodules with predictions made by physicians and found that there was no significant difference, although they suggested that the model had potential in the management of patients with benign nodules. In addition, risk prediction models for lung cancer, including a recent large case-control study of never, former, and current smokers, have been reported (22).

In this study, we sought to evaluate whether the lung cancer predictions made by our large airway gene expression biomarker are independent of other clinical risk factors and, if so, to determine the relative performance of a clinicogenomic model that combines clinical risk factors with the biomarker. We show that the biomarker provides information about the likelihood of a patient having lung cancer beyond that which is contained in the available clinical data, despite the clinical model predictions being highly associated with the subjective clinical assessment of patient risk made by pulmonary physicians. Furthermore, we find that the clinicogenomic model has better diagnostic accuracy than either the clinical model or the gene expression biomarker alone. Our data suggest that the clinicogenomic model could be efficacious in predicting the likelihood of lung cancer in those patients where physicians are most uncertain about the likelihood of disease.

Patient population

The present study cohort consists of patients who participated in our study to develop the large airway gene expression biomarker (7). In that study, we recruited current and former smokers undergoing flexible bronchoscopy for clinical suspicion of lung cancer at four tertiary medical centers between January 2003 and April 2005 as previously described (7). All subjects were >21 y of age and had no contraindications to flexible bronchoscopy. Never smokers and subjects who only smoked cigars were excluded from the study. All subjects were followed after bronchoscopy until a final diagnosis of lung cancer or an alternative diagnosis was made (mean follow-up time, 52 d). One hundred twenty-nine subjects (60 smokers with lung cancer and 69 smokers without lung cancer) who achieved final diagnoses as of May 2005 and had high quality microarray data were included in the primary sample set. Seventy-seven of these samples were randomly assigned to the training set. The training set for the current study (n = 76) excluded one of these training set samples due to incomplete smoking history (Fig. 1). After completion of the primary study, a second set of samples (n = 35) was collected prospectively from smokers undergoing flexible bronchoscopy for clinical suspicion of lung cancer at five medical centers between June 2005 and January 2006. Inclusion and exclusion criteria were identical to the primary sample set. The test set samples in the current study (n = 87) combined both the remaining samples from the primary sample set (n = 52) and this prospective test set (n = 35), but we chose to limit the test set to the subset of patients that did not have a definitive diagnosis following bronchoscopy (n = 62), as is shown in Fig. 1 and described in more detail below. Demographic information on all subjects is detailed in Table 1 and information about the cell type, stage, and location of the lung tumors (n = 78) in the study cohort is shown in Table 2. 
The study was approved by the Institutional Review Boards of the five medical centers at which patients were recruited (Boston University Medical Center, Boston, MA; Boston Veterans Administration, West Roxbury, MA; Lahey Clinic, Burlington, MA; St. James's Hospital, Dublin, Ireland; and St. Elizabeth's Medical Center, Boston, MA) and all participants provided written informed consent.

Fig. 1

Training and test sample sets. The training and test samples were derived from a previously published study assaying airway epithelial gene expression from current and former smokers undergoing bronchoscopy for the clinical suspicion of lung cancer. A, we previously constructed a gene expression biomarker that predicts the presence of lung cancer using a training set of 77 patients. For the current study, one of these samples was removed due to incomplete smoking history, resulting in the logistic regression models being trained with data from 76 patients. The models were subsequently tested on the subset of training samples (n = 56) that had cytopathology that was nondiagnostic of lung cancer. B, the biomarker was also tested on the subset of independent samples with nondiagnostic cytopathology (n = 62) from the combined test and prospective validation sample sets (n = 87) used in our previous study.

Table 1

Demographic, clinical, and biomarker characteristics stratified by cancer status or membership in the training and test sets

| Factor | Overall (n = 163) | Cancer (n = 78) | No cancer (n = 85) | P* | Train (n = 76) | Test (n = 62) | P* |
|---|---|---|---|---|---|---|---|
| Age | 58.1 ± 14.3 | 64.5 ± 9.6 | 52.3 ± 15.4 | <0.001 | 57.3 ± 14.0 | 57.5 ± 15.3 | 0.91 |
| Male | 122/163 (74.8) | 60/78 (76.9) | 62/85 (72.9) | 0.59 | 59/76 (77.6) | 42/62 (67.7) | 0.25 |
| Caucasian | 110/163 (67.5) | 67/78 (85.9) | 43/85 (50.6) | <0.001 | 52/76 (68.4) | 36/62 (58.1) | 0.22 |
| Smoked within 10 y | 130/163 (79.8) | 60/78 (76.9) | 70/85 (82.4) | 0.44 | 62/76 (81.6) | 47/62 (75.8) | 0.53 |
| Pack-years | 44.9 ± 32.0 | 54.9 ± 26.8 | 35.7 ± 33.7 | <0.001 | 45.8 ± 30.2 | 39.9 ± 35.4 | 0.3 |
| Diagnostic bronchoscopy | 45/163 (27.6) | 40/78 (51.3) | 5/85 (5.9) | <0.001 | 20/76 (26.3) | 0/62 (0) | <0.001 |
| Cancer | 78/163 (47.9) | 78/78 (100.0) | 0/85 (0.0) | <0.001 | 40/76 (52.6) | 17/62 (27.4) | 0.003 |
| Lymphadenopathy | 43/163 (26.4) | 36/78 (46.2) | 7/85 (8.2) | <0.001 | 17/76 (22.4) | 10/62 (16.1) | 0.4 |
| Hemoptysis | 15/163 (9.2) | 6/78 (7.7) | 9/85 (10.6) | 0.6 | 10/76 (13.2) | 2/62 (3.2) | 0.07 |
| Mass size >3 cm | 48/163 (29.4) | 43/78 (55.1) | 5/85 (5.9) | <0.001 | 24/76 (31.6) | 10/62 (16.1) | 0.047 |
| Biomarker | −0.35 ± 8.93 | 4.65 ± 7.04 | −4.94 ± 7.98 | <0.001 | 0.34 ± 8.97 | −2.72 ± 9.12 | 0.05 |

NOTE: Data are means ± SDs for continuous variables and proportions with percentages for dichotomous variables.

*P values are for the comparison of patients with and without cancer and for the comparison of the training and test sets. Two-sample t tests with unequal variances were used for continuous variables; Fisher's exact test was used for dichotomous variables.

Table 2

Cell type, stage, and location information for lung cancer samples (n = 78)

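| | n | % Samples with diagnostic bronchoscopy |
|---|---|---|
| Cell Type | | |
| SCLC | 14 | 64.3 |
| NSCLC (unknown subtype) | 15 | 60.0 |
| Squamous | 27 | 55.6 |
| Adenocarcinoma | 18 | 33.3 |
| Large cell carcinoma | | 25.0 |
| Stage | | |
| Unknown | | 0.0 |
| I | 14 | 35.7 |
| II | | 50.0 |
| III | 25 | 52.0 |
| IV | 22 | 54.5 |
| Location | | |
| Central | 28 | 71.4 |
| Peripheral | 49 | 40.8 |
| Other* | | 0.0 |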

NOTE: The percentage of samples in each grouping where bronchoscopy yielded diagnostic cytopathology for lung cancer is reported.

Abbreviations: SCLC, small cell lung cancer; NSCLC, non–small cell lung cancer.

*Radiographic findings that cannot be characterized as central versus peripheral.

Large airway gene expression biomarker for lung cancer

Using the Affymetrix HG-U133A microarray, we have previously developed a gene expression biomarker for lung cancer using gene expression profiles in cytologically normal large airway epithelial cells collected from brushing the right mainstem bronchus of smokers undergoing bronchoscopy for suspicion of lung cancer (Gene Expression Omnibus accession no. GSE4115; ref. 7). The biomarker was developed using the training set of the current study (n = 76) with the addition of one sample that did not have a complete smoking history (Fig. 1). The biomarker was constructed from the expression levels of 80 probe sets (72 unique genes, 7 unannotated transcripts, and 1 redundant probe set) using the weighted-voting algorithm (23) that combines these expression levels into a biomarker score. A positive score is predictive of cancer and a negative score is predictive of no cancer. In this study, we use the biomarker score as a starting point for the following statistical analyses: (a) building three logistic regression models to determine the likelihood of lung cancer using the clinical risk factors alone, the biomarker alone, or the clinical risk factors and biomarker combined; (b) comparison of the predictive values of these three models on a test set of patients not used in the initial model building phase; and (c) comparison of the clinical models with assessments made by expert clinicians.
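
As a rough illustration of how a weighted-voting classifier can collapse many probe-set expression levels into one signed score, the sketch below assumes the commonly used signal-to-noise formulation of the method (each gene votes with weight (mean in cancer − mean in controls) / (SD in cancer + SD in controls), relative to the midpoint of the two class means). The function names and the toy expression values are hypothetical; the 80 probe sets and their actual weights are not reproduced here.

```python
from statistics import mean, stdev

def fit_weighted_voting(cancer_samples, control_samples):
    """Return one (weight, boundary) pair per gene from two lists of expression vectors."""
    params = []
    n_genes = len(cancer_samples[0])
    for g in range(n_genes):
        x1 = [s[g] for s in cancer_samples]
        x0 = [s[g] for s in control_samples]
        mu1, mu0 = mean(x1), mean(x0)
        # Signal-to-noise weight: positive when the gene is higher in cancer
        weight = (mu1 - mu0) / (stdev(x1) + stdev(x0))
        # Decision boundary: midpoint between the two class means
        boundary = (mu1 + mu0) / 2
        params.append((weight, boundary))
    return params

def biomarker_score(sample, params):
    """Sum of per-gene votes; positive favors cancer, negative favors no cancer."""
    return sum(w * (x - b) for x, (w, b) in zip(sample, params))

# Toy data (invented): two genes, three samples per class
toy_cancer = [[5.0, 1.0], [6.0, 2.0], [7.0, 1.5]]
toy_control = [[2.0, 3.0], [3.0, 4.0], [2.5, 3.5]]
toy_params = fit_weighted_voting(toy_cancer, toy_control)

score_a = biomarker_score([6.5, 1.0], toy_params)  # resembles the cancer class
score_b = biomarker_score([2.0, 4.0], toy_params)  # resembles the control class
```

In this toy setup the first sample receives a positive score and the second a negative one, mirroring the sign convention described above.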

Construction of logistic regression models

Logistic regression models to quantify the probability of a patient having lung cancer were generated using the training set samples (n = 76). This training set included patients who had cytopathology findings that confirmed a diagnosis of either lung cancer or alternate noncancer pathology. Patients with diagnostic bronchoscopies were included in the training set to maximize the number of samples and because exclusion of these samples was unnecessary to develop models capable of accurately predicting the lung cancer status of patients with nondiagnostic bronchoscopies (data not shown). For the clinical and clinicogenomic models, the available clinical variables (Table 1) included age, pack-years of smoking, and the following dichotomous variables: gender (male, 1; female, 0), race (1, Caucasian; 0, otherwise), smoking status (1, former smokers that quit ≥10 y ago; 0, otherwise), hemoptysis (1, presence; 0, otherwise), lymphadenopathy (1, mediastinal or hilar lymph nodes >1 cm on computed tomography chest scan; 0, otherwise), and mass size (1, having a mass size >3 cm; 0, otherwise). Positron emission tomography scan information was only available for 15 patients and was not included in the model. Backward stepwise model selection using Akaike's information criterion (24) was used to select the optimal clinical model for the probability of a patient having lung cancer.
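
The selection loop described above can be sketched generically. The analyses in this study were run in R; the Python below is only an illustration of backward elimination by AIC, and `aic_of` is a hypothetical callback standing in for "fit a logistic regression on these variables and return its AIC". The toy AIC values are invented solely to exercise the loop and are not derived from the study data.

```python
def backward_stepwise(variables, aic_of):
    """Drop one variable at a time while doing so lowers the AIC."""
    current = frozenset(variables)
    current_aic = aic_of(current)
    while current:
        # Score every model obtained by removing exactly one variable
        candidates = [(aic_of(current - {v}), v) for v in sorted(current)]
        best_aic, drop = min(candidates)
        if best_aic >= current_aic:
            break  # no single removal improves the AIC
        current = current - {drop}
        current_aic = best_aic
    return current, current_aic

# Invented "fit improvement" per variable, used only to build a toy AIC
TOY_GAIN = {"age": 10.0, "mass_size": 12.0, "lymphadenopathy": 8.0,
            "pack_years": 0.5, "hemoptysis": 0.2}

def toy_aic(model):
    # AIC = 2k - 2 ln L; here the deviance term is faked as 100 - total gain
    return 100.0 - sum(TOY_GAIN[v] for v in model) + 2 * len(model)

selected, selected_aic = backward_stepwise(TOY_GAIN.keys(), toy_aic)
```

With these invented gains, the two weakly informative variables are removed because their contribution does not offset the 2-per-parameter penalty.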

To create an integrated clinicogenomic model and determine the independence and magnitude of the contribution of the gene expression biomarker after adjusting for the effects of the clinical variables, we first added the biomarker to the optimal clinical model. The biomarker scores and all of the available clinical variables were then used with backward stepwise model selection by Akaike's information criterion to select the optimal model. Both approaches yielded the same combined model. To verify that the biomarker score performs similarly in logistic regression as in the weighted-voting prediction algorithm used in our previous work (7), the accuracy, sensitivity, specificity, positive predictive value, and negative predictive value were compared for the weighted-voting predictions and the predictions made by a logistic regression model that included only the biomarker score across the independent test samples.

Comparison of model performance on independent patients

The performance of the logistic regression models (clinical, biomarker, and clinicogenomic) was initially evaluated on the subset of patients in the training set (n = 76) in which the cytopathology of materials obtained at bronchoscopy was nondiagnostic (n = 56; Fig. 1). We chose to focus on nondiagnostic bronchoscopies to specifically assess the utility of the gene expression biomarker and clinical parameters in the setting of patients that require further diagnostic evaluation for lung cancer. More importantly, we also tested the models in the nondiagnostic bronchoscopy test set (n = 62; Fig. 1). For each of the models, patients that had a probability of lung cancer ≥0.5 were classified as having lung cancer, and patients with a probability <0.5 were classified as not having lung cancer. Receiver operating characteristic (ROC) curves were also used to compare the clinical model with the clinicogenomic model in the training set patients with nondiagnostic bronchoscopies, the independent test set, and combined set of all patients with nondiagnostic bronchoscopies (n = 118). To assess whether or not two ROC curves based on the same set of samples were significantly different, methods developed for comparing ROC curves derived from the same cases were used (25, 26). To compare ROC curves based on different sample sets, we used a two-sample z test. The ROC curves serve as a common scale for evaluating the additional merit of variables added to the model because odds ratios for two different variables may not be comparable (27). The accuracy, sensitivity, specificity, positive predictive value, and negative predictive value were also calculated across the independent test set for the clinical model, the biomarker model, and the clinicogenomic model.
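
For readers less familiar with ROC analysis, the area under the curve can be computed directly as the probability that a randomly chosen cancer patient receives a higher model probability than a randomly chosen patient without cancer. The sketch below shows that calculation together with the 0.5 classification rule; the function names and the toy probabilities are hypothetical and are not study data.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 1/2."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def classify(probability, threshold=0.5):
    """Apply the rule from the text: probability >= 0.5 is called cancer."""
    return probability >= threshold

# Toy model probabilities (invented)
cancer_probs = [0.9, 0.8, 0.45]
control_probs = [0.6, 0.3, 0.2, 0.1]

auc = roc_auc(cancer_probs, control_probs)
calls_for_cancer = [classify(p) for p in cancer_probs]
```

Note that the AUC depends only on the ranking of the probabilities, whereas sensitivity and specificity depend on the chosen threshold.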

Subjective clinical assessment

Three independent pulmonary clinicians practicing at a tertiary medical center, blinded to the final diagnoses, evaluated each patient's clinical history at the time of bronchoscopy. The history included, but was not limited to, age, smoking status, cumulative tobacco exposure, comorbidities, symptoms/signs, radiographic findings, and positron emission tomography scan results if available. Based on this information, the clinicians classified each patient into one of the three risk groups: low (<10% assessed probability of lung cancer), medium (10-50% assessed probability of lung cancer), and high (>50% assessed probability of lung cancer). The final subjective assignment for each subject was decided by choosing the median opinion. The interrater reliability for the clinical classification of patients' nondiagnostic bronchoscopies was significant, indicating that the level of agreement between the clinicians was greater than would be expected by chance as measured by the κ statistic (κ = 0.57; P < 0.001; ref. 28).
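
The "median opinion" rule can be made concrete with a short sketch. It assumes the three risk groups are treated as an ordered scale (low < medium < high) and that an odd number of raters is used, as in this study; the function name and example ratings are hypothetical.

```python
RISK_LEVELS = ["low", "medium", "high"]  # ordered from lowest to highest risk

def median_opinion(ratings):
    """Return the middle rating after sorting on the ordinal risk scale."""
    ranks = sorted(RISK_LEVELS.index(r) for r in ratings)
    return RISK_LEVELS[ranks[len(ranks) // 2]]

# Invented example ratings from three clinicians
split_case = median_opinion(["low", "high", "medium"])
majority_case = median_opinion(["high", "high", "low"])
```

With three raters, the median coincides with the majority opinion whenever two clinicians agree, and with the middle category when all three differ.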

Comparison of subjective clinical assessment with the clinicogenomic model

The sample size for building a comprehensive clinical model to predict the risk of having lung cancer was limited as was the scope of variables that were available for inclusion in the clinical and clinicogenomic models. We therefore sought to determine if the clinical model performs similarly to the subjective clinical assessment made by pulmonary specialists because this assessment is (a) “trained” on the large number of patients seen over each clinician's career and (b) considers all of the information contained within a patient's medical records. A Wilcoxon test was used to assess whether or not the clinical model–derived probability of having lung cancer varied between samples classified as low, medium, or high cancer risk by the clinicians.

Statistical analysis

All statistical analyses were conducted using R statistical software version 2.2.1.

Evaluating the gene expression biomarker as an independent predictor of lung cancer

The demographic and clinical characteristics as well as the mean and SD for the biomarker scores stratified by cancer status and membership in the training or test sets are shown in Table 1. Age, race, pack-years of smoking, lymphadenopathy, mass size, and the biomarker score were significantly different (P < 0.001) between patients with and without lung cancer. The test and training sets, however, were well balanced for the variables used in the analyses (although the incidence of having a mass size >3 cm was somewhat lower in the test set compared with the training set; P = 0.047). Information about the cell type, stage, and location of the tumors in the cancer patients, as well as the fraction of diagnostic bronchoscopies for each subgroup, is shown in Table 2. Effect estimates and derived odds ratios for the variables in each of the three logistic regression models are shown in Table 3. We found that the optimal clinical model for this cohort did not include pack-years. This is likely due to the strong correlation between age and pack-years (r = 0.56; P < 0.001) in the training set. Clinical and clinicogenomic models constructed with pack-years instead of age yielded similar results when tested on the independent test set (n = 62), with accuracies of 87% and 85% and area under the ROC curves of 0.94 and 0.86, respectively. The optimal clinical model did not include smoking status (former versus current smokers) regardless of how time since quitting was dichotomized. In addition, dichotomizing mass size using a threshold value of 2 cm (instead of 3 cm) produced clinical and clinicogenomic models with similar overall accuracy.

Table 3

Logistic regression models fitted on training set samples

| Model | Range | Coefficient | OR (95% CI) | P |
|---|---|---|---|---|
| Biomarker alone | | | | |
| Intercept | NA | 0.07 | NA | 0.776 |
| Biomarker | −18.88 to 16.91 | 0.13 | 1.14 (1.06-1.21) | 0.00017 |
| Clinical variables alone | | | | |
| Intercept | NA | −5.01 | NA | 0.003 |
| Age | 23-79 | 0.07 | 1.07 (1.02-1.13) | 0.008 |
| Mass size | 0-1 | 2.19 | 8.91 (2.08-38.25) | 0.003 |
| Lymphadenopathy | 0-1 | 2.09 | 8.12 (1.45-45.63) | 0.017 |
| Biomarker + clinical variables | | | | |
| Intercept | NA | −4.9 | NA | 0.014 |
| Biomarker | −18.88 to 16.91 | 0.13 | 1.13 (1.04-1.24) | 0.005 |
| Age | 23-79 | 0.07 | 1.07 (1.00-1.14) | 0.036 |
| Mass size | 0-1 | 1.85 | 6.38 (1.39-29.34) | 0.017 |
| Lymphadenopathy | 0-1 | 2.75 | 15.69 (2.23-110.28) | 0.006 |

NOTE: The range, regression coefficients, odds ratio (OR), 95% confidence interval for the odds ratio (95% CI), and the P value of the variables across the training set samples (n = 76) are reported.
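
To show how the fitted coefficients translate into a predicted probability, the sketch below plugs the rounded values reported for the biomarker + clinical variables model in Table 3 into the logistic function. Because the published coefficients are rounded to two decimals, the outputs are approximate (which is also why exponentiating a rounded coefficient does not always reproduce the tabulated odds ratio exactly). The function names and the two example patients are hypothetical.

```python
import math

# Rounded coefficients copied from Table 3 (biomarker + clinical variables)
CLINICOGENOMIC = {
    "intercept": -4.9,
    "biomarker": 0.13,
    "age": 0.07,
    "mass_size": 1.85,        # 1 if mass > 3 cm, else 0
    "lymphadenopathy": 2.75,  # 1 if nodes > 1 cm on chest CT, else 0
}

def cancer_probability(biomarker, age, mass_size, lymphadenopathy,
                       coef=CLINICOGENOMIC):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum of b_i * x_i)))."""
    logit = (coef["intercept"]
             + coef["biomarker"] * biomarker
             + coef["age"] * age
             + coef["mass_size"] * mass_size
             + coef["lymphadenopathy"] * lymphadenopathy)
    return 1.0 / (1.0 + math.exp(-logit))

def odds_ratio(coefficient):
    """Odds ratio per one-unit increase in a predictor."""
    return math.exp(coefficient)

# Hypothetical patients (invented values, for illustration only)
p_high = cancer_probability(biomarker=5.0, age=65, mass_size=1, lymphadenopathy=0)
p_low = cancer_probability(biomarker=-5.0, age=40, mass_size=0, lymphadenopathy=0)
```

Under the ≥0.5 rule used in this study, the first hypothetical patient would be classified as having lung cancer and the second as not having lung cancer.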

A logistic regression model describing the likelihood of having lung cancer derived from the biomarker score produced equivalent results to the weighted-voting algorithm predictions of lung cancer status previously described (7), resulting in eight versus seven incorrect classifications, indicating that the biomarker score is an accurate way to model the original biomarker prediction algorithm in the clinicogenomic model. The biomarker score is a significant predictor of lung cancer likelihood both in the biomarker-only model (P < 0.001) and in the clinicogenomic model (P < 0.005). In the clinicogenomic model, the coefficients of the clinical variables are largely unchanged from the clinical model, and the coefficient of the biomarker is largely unchanged from the biomarker-only model. These data suggest that the gene expression biomarker and the clinical variables are independent predictors of lung cancer risk.

Evaluating the performance of the clinicogenomic model

The three models were used to predict the cancer status of a subset of the training samples with nondiagnostic bronchoscopies (n = 56), the independent test samples (n = 62), and these two sets combined (n = 118). ROC curves were used to compare the performance of the clinical model with that of the clinicogenomic model (Fig. 2). The clinicogenomic model had better performance than the clinical model in all three sample sets. Whereas this difference in performance does not reach statistical significance in the test set, when the training and test sets were combined, there was a significant difference in the area under the curve between the clinicogenomic and clinical models (P < 0.05). The performance of the models in the training set samples does not seem to be any better than in the test set samples (P = 0.25, for the difference in the area under the ROC curves; the area under the curve difference is 0.065; 95% confidence interval, −0.046 to 0.174). This suggests that the models do not overfit the training data and that it is therefore reasonable to combine the training and test sets to assess the significance of the difference in the performance of the clinical and clinicogenomic models.
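
For two independent sample sets, a two-sample z test on AUCs takes the form z = (AUC_a − AUC_b) / sqrt(SE_a² + SE_b²). The individual standard errors are not reported here, so the sketch below instead backs the combined standard error out of the reported 95% confidence interval, assuming a symmetric normal-approximation interval. It is a consistency check on the reported numbers, not the authors' code, and the function name is hypothetical.

```python
from statistics import NormalDist

def z_test_from_ci(diff, ci_low, ci_high, level=0.95):
    """Recover z and a two-sided P value from a difference and its normal-theory CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    se = (ci_high - ci_low) / (2 * z_crit)          # CI half-width divided by z_crit
    z = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Reported training-vs-test AUC difference and its 95% confidence interval
z_stat, p_value = z_test_from_ci(0.065, -0.046, 0.174)
```

Under these assumptions the recovered P value rounds to the reported 0.25.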

Fig. 2

ROC curves for the clinical model and the clinicogenomic model across the different sample sets. The clinical model (gray line) includes the following variables: age, mass size, and lymphadenopathy; the clinical and biomarker model includes the above variables and the biomarker score (black line). Both models were derived using the training set samples (n = 76). A, ROC analysis of the nondiagnostic training set samples (n = 56). The area under the curve for the clinical and clinicogenomic model is 0.84 and 0.90, respectively. B, ROC analysis of the test samples (n = 62). The area under the curve for the clinical and clinicogenomic model is 0.94 and 0.97, respectively. C, ROC analysis of the combined training and test sets (n = 118). The area under the curve for the clinical and clinicogenomic model is 0.89 and 0.94, respectively, which represents a significant difference between the two curves (P < 0.05).

The sensitivity, specificity, positive predictive value, and negative predictive value for each of the three models were evaluated across the test set (Fig. 3). The combined clinicogenomic model increases the sensitivity and negative predictive value to 100% and results in higher specificity and positive predictive value compared with the other models. Cancer subjects with peripheral lesions were well represented in the test set (70.6%), and the clinicogenomic model was equally accurate among peripheral or central lung tumors. The clinicogenomic model also accurately predicted lesions with a mass size <3 cm as well as poorly defined radiographic infiltrates in the test set (Table 4). In addition, the performance of the clinical and clinicogenomic models does not seem to be specific to samples with nondiagnostic bronchoscopies because these models had sensitivities of 90% and 95% on independent samples with diagnostic bronchoscopies (n = 25). Finally, training the clinical and clinicogenomic models across only the training samples with nondiagnostic bronchoscopies (n = 56) resulted in similar accuracies in the test set (82% and 91%, respectively) and a significant difference in the area under the ROC curves between the models (P < 0.05).
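
The reported test-set percentages can be cross-checked with simple arithmetic. The counts below are back-calculated rather than reported by the authors: Table 1 gives 17 cancers among the 62 test patients (hence 45 without cancer), 100% sensitivity implies 17 true positives, and 41 of 45 true negatives is the only count that rounds to 91% specificity. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard 2x2 diagnostic accuracy measures, returned as fractions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Inferred confusion matrix for the clinicogenomic model on the test set (n = 62)
inferred = diagnostic_metrics(tp=17, fp=4, tn=41, fn=0)
as_percent = {name: round(100 * value) for name, value in inferred.items()}
```

These inferred counts reproduce the reported sensitivity and negative predictive value of 100%, specificity of 91%, positive predictive value of 81%, and overall accuracy of 94%.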

Fig. 3

Performance of three logistic regression models across the test set samples. Samples with model-derived probabilities of having lung cancer ≥0.5 were classified as cancer, and samples with probabilities <0.5 were classified as noncancer. Orange, samples with a final diagnosis of cancer; blue, samples with a final diagnosis of no cancer. The saturation of the colors is representative of the proportion of each final diagnosis group classified as having cancer or no cancer by each of the models. For each model, the sensitivity (Sens), specificity (Spec), positive predictive value (PPV), and the negative predictive value (NPV) are shown. A, clinical model; B, biomarker model; C, clinicogenomic model. The clinical model and the biomarker model perform similarly, with accuracies of 84% and 87%, respectively. The clinicogenomic model has a greater accuracy (94%), specificity, and positive predictive value than either of the other two models.
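The performance measures in this caption all follow from a 2 × 2 table of predicted versus final diagnosis at the 0.5 probability cutoff. The Python sketch below shows that arithmetic. The counts passed to it are an assumption: they are not reported directly but are inferred from percentages given in this article for the clinicogenomic model (12 peripheral cancers making up 70.6% of cancers implies 17 cancers among the 62 test samples, and a specificity of 91% implies 41 of the 45 noncancer samples classified correctly). The function names are hypothetical.

```python
def classify(probabilities, threshold=0.5):
    """Call a sample cancer when its model probability reaches the threshold."""
    return [p >= threshold for p in probabilities]


def diagnostic_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV, NPV, and accuracy as fractions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Assumed counts, inferred from the reported percentages (not stated in the text).
metrics = diagnostic_metrics(tp=17, fn=0, fp=4, tn=41)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

Under these assumed counts the sketch gives 100% sensitivity and negative predictive value, 91.1% specificity, 81.0% positive predictive value, and 93.5% accuracy, which round to the figures reported for the clinicogenomic model.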

Table 4

The accuracy of the clinicogenomic model stratified by cancer status and mass size or location in the test set (n = 62)

                              Cancer                No cancer
                              n     Accuracy (%)    n     Accuracy (%)
Mass size (cm)
  >3                                100.0                 0.0
  ≤3                                100.0           37    91.9
  Poorly defined infiltrate         100.0                 100.0
Location
  Central                           100.0                 100.0
  Peripheral                  12    100.0           17    76.5
  Other*                            NA              25    100.0

*Radiographic findings that cannot be characterized as central versus peripheral.

Comparing the clinicogenomic model with the clinical subjective assessment

To evaluate whether or not the clinical model is comprehensive given the relatively small number of variables it contains, we assessed whether it correlates with the median subjective assessment of three pulmonary physicians. There was an association between the clinical model predictions and the clinical subjective assessment across the test set samples (Fig. 4). The clinical model probabilities were significantly different between the three physician-assessed risk groups (P < 0.01).

Fig. 4

Association between the probability of having lung cancer as predicted by the clinical model and physician's subjective assessment across the test set samples (n = 62). The model-derived probabilities are shown on the y-axis, and the subjective clinical assessment on the x-axis. Red circles, complete agreement among three clinicians; black, agreement between two clinicians; green, no agreement. There are significant differences (P < 0.01, Wilcoxon test) between the probabilities in the low versus medium group, the medium versus high group, and the low versus high group. Cancer status of each subject stratified by subjective risk assessment is shown in Fig. 5.

Given the association between the clinical model and subjective clinical assessment, we examined the predictions made by the clinicogenomic model stratified by cancer status and subjective clinical assessment category in the test set samples (Fig. 5). Physician opinion, although based on all the clinical data, is most uncertain for the 11 samples in the medium risk category. The clinical model classifies 7 of these 11 samples correctly, whereas the clinicogenomic model correctly classifies all 11.

Fig. 5

The clinicogenomic model–derived lung cancer predictions stratified by cancer status and the physician's subjective assessment across the test set samples (n = 62). Dark gray, a final diagnosis of cancer; light gray, a final diagnosis of noncancer. Squares, correct clinicogenomic model predictions; circles, incorrect model predictions. Each of the samples classified as having a medium risk of lung cancer by physicians was correctly predicted by the clinicogenomic model.

A previous study by our group reported a gene expression biomarker capable of distinguishing cytologically normal large airway epithelial cells from smokers with and without lung cancer (7). These cells can be collected in a relatively noninvasive manner from bronchial airway brushings of patients undergoing bronchoscopy for the suspicion of lung cancer. The cytopathology of cells obtained during bronchoscopy is 100% specific for lung cancer, but has a limited sensitivity of between 30% and 80% depending on the stage and location of the cancer, with early-stage disease and peripheral cancers having the lowest sensitivity (6). As a result, physicians are confronted with a difficult decision on how to manage the care of patients with potentially early-stage curable disease, when bronchoscopy does not return any cells with aberrant cytopathology. Often the decision about whether to proceed with more sensitive and often more invasive diagnostic procedures or to determine if the initial suspicious radiographic finding resolves in subsequent repeat imaging studies is based on a subjective assessment of the patient's clinical and radiographic risk factors for lung cancer. As the large airway gene expression biomarker uses material that can be easily collected at the time of bronchoscopy (prolonging the procedure by only 2-3 additional minutes), this test could be a useful component of this decision-making process if the biomarker captures information about lung cancer risk that is otherwise occult.

Our results suggest that the pattern of gene expression in large airway epithelial cells reflects information about the presence of lung cancer that is independent of other clinical risk factors. This interpretation results from a comparison of models that contain either clinical variables or the biomarker with a combined clinicogenomic model. The comparison shows that the biomarker is significantly associated with the probability of having lung cancer in both the biomarker and clinicogenomic models and that the importance of each of the variables in the combined clinicogenomic model is similar to their importance in the initial uncombined models.

Given the independence of the biomarker and clinical models, it is not surprising that the clinicogenomic model is a better predictor of lung cancer than either of the initial models in an independent test set. ROC curve analysis shows that the clinicogenomic model performs significantly better than the clinical model. Furthermore, the clinicogenomic model increases the sensitivity, specificity, positive predictive value, and negative predictive value of the clinical model, and its accuracy does not seem to be influenced by the size or location of the lesion. However, these findings need to be validated in larger patient cohorts. One way to accomplish such validation would be to incorporate gene expression measurements into large epidemiologic studies investigating lung cancer risk or lung cancer screening trials involving high-risk smokers.

Despite the limitations of a small sample size and limited clinical parameters, we are encouraged that subjective clinical assessment based on a patient's complete medical record is associated with the clinical model probabilities. This is particularly important given that certain variables, such as positron emission tomography scan findings, were not included in the clinical model because these studies were done on only a small number of the subjects in our cohort. All available data, such as positron emission tomography scan findings, were, however, considered by the pulmonary physicians as part of their subjective assessment of lung cancer likelihood. Further, the clinicogenomic model seems to correctly classify patients assigned to the medium risk subgroup by the clinical subjective assessment. This subgroup of patients is one that is likely to be especially challenging to manage clinically as almost a third of these patients went on to have a final diagnosis of lung cancer.

Our data suggest that a clinicogenomic model that combines gene expression with clinical risk factors for lung cancer has high diagnostic specificity and positive predictive value among patients with nondiagnostic bronchoscopies, including those with small and/or peripheral lesions on chest imaging. This model might therefore serve to identify those patients who would benefit from further invasive testing (e.g., lung biopsy) to confirm the presumptive lung cancer diagnosis and thereby expedite the diagnosis and treatment of their underlying malignancy. In addition, the clinicogenomic model results in modest increases in diagnostic sensitivity and negative predictive value. Use of this clinicogenomic diagnostic might therefore also reduce the number of individuals without lung cancer who are subjected to additional and more invasive procedures to rule out a lung cancer diagnosis following a nondiagnostic bronchoscopy. If the ultimate sensitivity and negative predictive value of the clinicogenomic model remain close to 100%, clinicians could confidently use less invasive and less costly approaches (e.g., repeat computed tomography scan in 3-6 months) to follow up patients with a low clinicogenomic lung cancer risk score.

The ability of gene expression profiles within cytologically normal airway epithelium to serve as a biomarker for lung cancer raises questions about the underlying biology of the cancer-specific molecular changes observed in these cells. The high diagnostic accuracy for the biomarker in the setting of small peripheral lung lesions suggests that changes in airway gene expression between smokers with and without lung cancer are unlikely to be a direct effect of the tumor. The presence of antioxidant and inflammation-related genes in the gene expression biomarker (7) raises the possibility that the biomarker detects an airway-wide cancer-specific difference in response to tobacco smoke exposure. Given the hypothesis that this field of injury may provide information about the host-carcinogen interaction, alterations in gene expression could precede the development of lung cancer and explain the somewhat lower specificity of the biomarker relative to its sensitivity. If this is true, the biomarker might potentially be a useful tool to identify smokers at highest risk for disease who may benefit from chemopreventive strategies.

The gene expression pattern of histologically normal large airway epithelial cells collected at the time of bronchoscopy can be used as a biomarker that provides information about lung cancer risk that is independent of clinical parameters. In patients with suspect lung cancer who do not have a definitive diagnosis after routine cytology/pathology of materials retrieved by bronchoscopy, a clinicogenomic model that combines clinical factors and the large airway gene expression biomarker results in improved sensitivity, specificity, positive predictive value, and negative predictive value over the clinical model alone. This suggests that the integrative clinicogenomic model may help expedite invasive diagnostic testing for those smokers with underlying lung tumors and decrease the number of individuals without lung cancer requiring further invasive diagnostic testing to rule out suspicion of disease.

A Spira: Founder equity in ExProDx, Inc; ME Lenburg: Consultant, equity in ExProDx, Inc. The other authors disclosed no potential conflicts of interest.

We thank Gang Liu, Xuemei Yang, Sherry Zhang, Frank Schembri, and Norman Gerry for support with collection of samples and for performing the microarray experiments; Jerome Brody for guidance with the study design and for critical review of the manuscript; and George O'Connor for critical review of the manuscript.

1. Parkin DM, Bray F, Ferlay J, Pisani P. Global cancer statistics, 2002. CA Cancer J Clin 2005;55:74-108.
2. Shields PG. Molecular epidemiology of lung cancer. Ann Oncol 1999;10 Suppl 5:S7-11.
3. Hoffman PC, Mauer AM, Vokes EE. Lung cancer. Lancet 2000;355:479-85.
4. Postmus PE. Bronchoscopy for lung cancer. Chest 2005;128:16-8.
5. Mazzone P, Jain P, Arroliga AC, Matthay RA. Bronchoscopy and needle biopsy techniques for diagnosis and staging of lung cancer. Clin Chest Med 2002;23:137-58, ix.
6. Schreiber G, McCrory DC. Performance characteristics of different modalities for diagnosis of suspected lung cancer: summary of published evidence. Chest 2003;123:115-28S.
7. Spira A, Beane JE, Shah V, et al. Airway epithelial gene expression in the diagnostic evaluation of smokers with suspect lung cancer. Nat Med 2007;13:361-6.
8. Pittman J, Huang E, Dressman H, et al. Integrated modeling of clinical and gene expression information for personalized prediction of disease outcomes. Proc Natl Acad Sci U S A 2004;101:8431-6.
9. Li L. Survival prediction of diffuse large-B-cell lymphoma based on both clinical and gene expression information. Bioinformatics 2006;22:466-71.
10. Stephenson AJ, Smith A, Kattan MW, et al. Integration of gene expression profiling and clinical variables to predict prostate carcinoma recurrence after radical prostatectomy. Cancer 2005;104:290-8.
11. West M, Ginsburg GS, Huang AT, Nevins JR. Embracing the complexity of genomic data for personalized medicine. Genome Res 2006;16:559-66.
12. McWilliams A, Lam S. New approaches to lung cancer prevention. Curr Oncol Rep 2002;4:487-94.
13. Trunk G, Gracey DR, Byrd RB. The management and evaluation of the solitary pulmonary nodule. Chest 1974;66:236-9.
14. Thurston SW, Liu G, Miller DP, Christiani DC. Modeling lung cancer risk in case-control studies using a new dose metric of smoking. Cancer Epidemiol Biomarkers Prev 2005;14:2296-302.
15. Gurney JW. Determining the likelihood of malignancy in solitary pulmonary nodules with Bayesian analysis. Part I. Theory. Radiology 1993;186:405-13.
16. Mannino DM, Aguayo SM, Petty TL, Redd SC. Low lung function and incident lung cancer in the United States: data from the First National Health and Nutrition Examination Survey follow-up. Arch Intern Med 2003;163:1475-80.
17. Wahidi MM, Govert JA, Goudar RK, Gould MK, McCrory DC. Evidence for the treatment of patients with pulmonary nodules: when is it lung cancer? ACCP evidence-based clinical practice guidelines (2nd ed.). Chest 2007;132:94-107S.
18. Ung YC, Maziak DE, Vanderveen JA, et al. 18Fluorodeoxyglucose positron emission tomography in the diagnosis and staging of lung cancer: a systematic review. J Natl Cancer Inst 2007.
19. Cummings SR, Lillington GA, Richard RJ. Estimating the probability of malignancy in solitary pulmonary nodules. A Bayesian approach. Am Rev Respir Dis 1986;134:449-52.
20. Swensen SJ, Silverstein MD, Ilstrup DM, Schleck CD, Edell ES. The probability of malignancy in solitary pulmonary nodules. Application to small radiologically indeterminate nodules. Arch Intern Med 1997;157:849-55.
21. Swensen SJ, Silverstein MD, Edell ES, et al. Solitary pulmonary nodules: clinical prediction model versus physicians. Mayo Clin Proc 1999;74:319-29.
22. Spitz MR, Hong WK, Amos CI, et al. A risk model for prediction of lung cancer. J Natl Cancer Inst 2007;99:715-26.
23. Golub TR, Slonim DK, Tamayo P, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999;286:531-7.
24. Akaike H. A new look at the statistical model identification. IEEE Trans Automatic Control 1974;19:716-23.
25. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29-36.
26. Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 1983;148:839-43.
27. Sullivan PM, Etzioni R, Feng Z, et al. Phases of biomarker development for early detection of cancer. J Natl Cancer Inst 2001;93:1054-61.
28. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas 1960;20:37-46.

Supplementary data