Conventional clinical and pathologic risk factors in stage I non-small cell lung cancer (NSCLC) provide limited prognostic information. Novel prognostic biomarkers are needed to identify the patients at highest risk of recurrence, who stand to gain the greatest absolute risk reduction from adjuvant chemotherapy or radiation. Transcriptome, epigenome, and global genetic variation analyses of resected, frozen NSCLC tumor samples (n=81; 29 stage IA, 52 stage IB) were performed in order to develop an integrated prognostic signature. Only tumors resected within 3 months of the first biopsy-proven diagnosis were used, to reduce lead-time bias arising from the timing of removal of the primary tumor. Here, the association of gene expression with disease-free survival (DFS) is described. Whole-genome expression data were quantile-normalized using the Illumina GenomeStudio 2011.1 software. Log-transformed data were then processed in Biometric Research Branch (BRB)-ArrayTools. Prognostic models were developed from the top-ranking genes using the supervised principal component survival algorithm (PCA). Risk group membership was then assigned based on this multivariate model using leave-one-out cross-validation and class prediction modeling. Both models were adjusted for age, gender, and tumor histology. For stage IA cases, a 113-gene signature selected by fitting a Cox proportional hazards model (Cox) discriminated between poor and good DFS (α=0.001). Upon further analysis using a support vector machine (SVM) class prediction algorithm, we identified an 18-gene classifier that predicted outcome (sensitivity [SN]=0.952, specificity [SP]=0.571). A 10-fold cross-validated ROC curve yielded an area under the curve of 0.748. For stage IB cases, the Cox model identified 141 genes (α=0.01), but the SVM prediction algorithm showed reduced SN/SP (0.75/0.25) for a 12-gene classifier, suggesting possibly greater heterogeneity among the IB cases. Nine genes overlapped between the PCA and class prediction model gene signatures. These nine are members of common lung tumorigenesis pathways, i.e., G-protein signaling regulation (KRAS/BRAF/RASSF1), tyrosine protein kinase signaling (EGFR/ERBB2), and nucleotide excision repair (ERCC5 and ERCC6), or are genes implicated in cell differentiation and regulated by miR200B (HGD) or miR338 (TXNRD1), miRNAs independently found to be associated with prognosis in our cohort. These initial results demonstrate that a gene expression profile can distinguish among stage I NSCLC tumors and predict prognosis, but they require further validation. Supported in part by NIH 5P50 CA090440, P30CA047904, and UPMC Institutional Funds.
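
For readers unfamiliar with how sensitivity (SN) and specificity (SP) are obtained from leave-one-out cross-validation, the following is a minimal illustrative sketch in Python. It is not the BRB-ArrayTools analysis used in this study: the nearest-centroid classifier is a simple stand-in for the SVM, the function names are hypothetical, and the expression values and labels are invented toy data, not study data.

```python
from statistics import mean


def nearest_centroid_predict(train_X, train_y, x):
    """Hypothetical stand-in classifier (NOT the SVM used in the study).

    Assigns x to the class whose mean expression profile is closest.
    """
    centroids = {}
    for label in set(train_y):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # sorted() makes tie-breaking deterministic
    return min(sorted(centroids), key=lambda lab: sq_dist(x, centroids[lab]))


def loocv_predictions(X, y, predict=nearest_centroid_predict):
    """Leave-one-out cross-validation: each sample is predicted by a
    model trained on all the other samples."""
    preds = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(predict(train_X, train_y, X[i]))
    return preds


def sensitivity_specificity(y_true, y_pred, positive=1):
    """SN = TP / (TP + FN); SP = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sn = tp / (tp + fn) if (tp + fn) else float("nan")
    sp = tn / (tn + fp) if (tn + fp) else float("nan")
    return sn, sp


# Invented toy data: two "genes" per sample, 1 = poor DFS, 0 = good DFS.
TOY_X = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1],
         [2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]
TOY_Y = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    preds = loocv_predictions(TOY_X, TOY_Y)
    print(sensitivity_specificity(TOY_Y, preds))
```

Because every sample is classified by a model that never saw it, SN and SP computed this way estimate performance on unseen cases, although with small cohorts such estimates remain imprecise.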

Citation Format: {Authors}. {Abstract title} [abstract]. In: Proceedings of the 103rd Annual Meeting of the American Association for Cancer Research; 2012 Mar 31-Apr 4; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2012;72(8 Suppl):Abstract nr 1722. doi:1538-7445.AM2012-1722