Abstract
Purpose: Pulmonary carcinoid tumors account for up to 5% of all lung malignancies in adults, comprise 30% of all carcinoid malignancies, and are defined histologically as typical carcinoid (TC) and atypical carcinoid (AC) tumors. The role of specific genomic alterations in the pathogenesis of pulmonary carcinoid tumors remains poorly understood. We sought to identify genomic alterations and pathways that are deregulated in these tumors to find novel therapeutic targets for pulmonary carcinoid tumors.
Experimental Design: We performed integrated genomic analysis of carcinoid tumors comprising whole genome and exome sequencing, mRNA expression profiling and SNP genotyping of specimens from normal lung, TC and AC, and small cell lung carcinoma (SCLC) to fully represent the lung neuroendocrine tumor spectrum.
Results: Analysis of sequencing data found recurrent mutations in cancer genes including ATP1A2, CNNM1, MACF1, RAB38, NF1, RAD51C, TAF1L, EPHB2, POLR3B, and AGFG1. The mutated genes are involved in biological processes including cellular metabolism, cell division cycle, cell death, apoptosis, and immune regulation. The top most significantly mutated genes were TMEM41B, DEFB127, WDYHV1, and TBPL1. Pathway analysis of significantly mutated and cancer driver genes implicated MAPK/ERK and amyloid beta precursor protein (APP) pathways whereas analysis of CNV and gene expression data suggested deregulation of the NF-κB and MAPK/ERK pathways. The mutation signature was predominantly C>T and T>C transitions with a minor contribution of T>G transversions.
Conclusions: This study identified mutated genes affecting cancer relevant pathways and biological processes that could provide opportunities for developing targeted therapies for pulmonary carcinoid tumors. Clin Cancer Res; 24(7); 1691–704. ©2018 AACR.
Translational Relevance
Pulmonary neuroendocrine tumors are classified into typical carcinoid tumors, atypical carcinoid tumors, small-cell lung cancers, and large-cell neuroendocrine carcinoma. Complete surgical resection of the tumor remains the treatment of choice for cure in patients able to tolerate surgery. For patients with tumors not amenable to surgical resection, there are currently no targeted therapies available and chemotherapeutic options have low response rates of ∼20% or less. There is therefore the need to identify genomic alterations and signaling pathways deregulated in these tumors to aid the development of alternative therapies. We performed integrated genomic analysis and identified recurrent mutated genes including ATP1A2, CNNM1, MACF1, RAB38, NF1, RAD51C, TAF1L, EPHB2, POLR3B, and AGFG1. Pathway analysis of mutated genes implicated MAPK/ERK and APP signaling pathways whereas analysis of CNV and gene expression data identified deregulation of the NF-ĸB and MAPK/ERK pathways. Targeting recurrent mutated genes as well as ERK and NF-ĸB pathways might represent therapeutic options for pulmonary carcinoid tumors.
Introduction
Pulmonary neuroendocrine tumors (NET) include a spectrum of tumors, classified as typical carcinoid (TC) tumors, atypical carcinoid (AC) tumors, large-cell neuroendocrine carcinoma (LCNEC), and small-cell lung cancer (SCLC; ref. 1). Pulmonary carcinoid tumors account for up to 5% of all lung cancers, and for unclear reasons have shown the greatest increase in incidence compared with NETs from other sites from 1973 to 2005 according to epidemiological data from the SEER database (2, 3). Despite similarities in their morphology, structure, and immunohistochemistry, there are dramatic differences in clinical behavior, prognosis, and treatment response within the spectrum of pulmonary NETs (4). Unfortunately, an immense scientific gap exists in understanding the biology of these tumors, in part due to the lack of genomic information and suitable reagents such as cell lines and tumor xenograft models. As a consequence, there have been no major advances in the treatment and prognosis for these cancers (2, 3, 5, 6). Currently, the only effective treatment to achieve cure is complete surgical excision, which may be unattainable depending upon the location and extent of the tumor. Both chemotherapy and radiation therapy have limited success in treating these tumors and therefore the outcome for patients with metastatic disease is poor. A recent study advancing the field by Fernandez-Cuesta and colleagues, examining genomic alterations in pulmonary carcinoids using copy number analysis, exome/genome, and transcriptome sequencing identified recurrent mutations in chromatin remodeling genes. The study found that 40% of the cases studied harbored mutations in histone modifier genes such as MEN1, PSIP1, and ARID1A, and 20% of cases had mutations in the components of the SWI/SNF complex (7). The study found MEN1, ARID1A, and EIF1AX to be significantly mutated (q < 0.2; ref. 7). These results remain to be confirmed in further studies.
We analyzed genomic alterations in pulmonary carcinoid tumors using a variety of approaches, including mRNA expression, SNP genotyping, and a combination of exome, and whole genome sequencing. Hierarchical clustering of differentially expressed genes clearly segregated histologies of normal lung, carcinoid, and SCLC from each other. Despite this, relatively little separation was found between the 31 TC and 11 AC tumors in the cohort. We also sequenced the exomes of 20 tumor–normal pairs and performed whole genome sequencing on a subset of 5 tumor–normal pairs and found 126 functional and disease-causing mutations in 114 genes with known biological or clinical significance. The overall number of genomic rearrangements identified from whole genome sequencing showed a relatively low frequency of genomic rearrangements compared with that present in lung adenocarcinoma from either smoker or non-smoker patients. Recurrent mutations were identified in ATP1A2, CNNM1, MACF1, RAB38, NF1, RAD51C, TAF1L, EPHB2, POLR3B, and AGFG1.
We identified cancer relevant biological processes that were affected by these mutated genes including cellular metabolism, cell division cycle, cell death and apoptosis, and immune regulation in majority of the cases analyzed. Pathway analysis of mutated genes implicated deregulation of MAPK/ERK and APP pathways as relevant to pathogenesis of pulmonary carcinoid tumors. Mutation significant analysis identified TMEM41B, DEFB127, WDYHV1, and TBPL1 to be among significantly mutated genes. Pathway analysis of differentially expressed genes with CNV changes identified the involvement of NF-κB and MAPK/ERK signaling pathways. These results suggest several potential opportunities for development of new therapeutic approaches to treat pulmonary carcinoid tumors.
Materials and Methods
This study was reviewed and approved by the Mayo Clinic Institutional Review Board and the Biospecimen Protocol Review Group. Informed consent was obtained prior to specimen collection and medical chart review for each patient. Prospectively collected samples of surgically resected TC tumors (n = 39), AC tumors (n = 12), SCLC (n = 12), and corresponding normal lung tissue (n = 26) were obtained from the Mayo Clinic Lung Tissue Registry. Patient demographics, clinical, surgical and pathologic data, including tumor recurrence, tumor metastasis, and patient survival were extracted from medical records. The tumor stage was adjusted according to the seventh edition of the TNM classification of malignant tumors (6, 8, 9). A summary of the patient characteristics is provided in Table 1 and Supplementary Table S1.
Patient characteristics
Characteristics . | . | Typical carcinoid . | Atypical carcinoid . | SCLC . | Total . | P . |
---|---|---|---|---|---|---|
Number of cases | 39 | 12 | 12 | 63 | ||
Gender | Male | 14 (35.9%) | 4 (33.3%) | 6 (50.0%) | 24 (38.1%) | P = 0.632 |
Female | 25 (64.1%) | 8 (66.7%) | 6 (50.0%) | 39 (61.9%) | ||
Mean age at surgery (years) | 59.9 ± 2.1 | 59.9 ± 3.8 | 67.0 ± 3.8 | 61.3 ± 13.4 | P = 0.260 | |
Mean follow-up (years) | 6.7 ± 5.1 | 6.6 ± 4.0 | 0.66 ± 1.12 | 5.5 ± 5.0 | P = 0.0004 | |
Smoking status | Never smoker | 25 (64.1%) | 5 (41.7%) | 1 (8.3%) | 31 (49.2%) | P = 0.0008 |
Former smoker | 11 (28.2%) | 5 (41.7%) | 4 (33.3%) | 20 (31.8%) | ||
Active smoker | 3 (7.7%) | 2 (16.6%) | 7 (58.3%) | 11 (19.1%) | ||
Stage | I | 30 (76.9%) | 7 (58.3%) | 3 (25%) | 40 (63.5%) | P = 0.0177 |
II | 6 (15.4%) | 1 (8.3%) | 3 (25%) | 10 (15.9%) | ||
III | 3 (7.7%) | 3 (25%) | 4 (33.3%) | 10 (15.9%) | ||
IV | 0 (0.0%) | 1 (8.3%) | 2 (16.7%) | 3 (4.8%) | ||
Vital stats | Dead | 14 (35.9%) | 6 (50%) | 12 (100%) | 30 (50%) | P = 0.0005 |
Alive | 25 (65.1%) | 6 (50%) | 0 (0%) | 30 (50%) |
Characteristics . | . | Typical carcinoid . | Atypical carcinoid . | SCLC . | Total . | P . |
---|---|---|---|---|---|---|
Number of cases | 39 | 12 | 12 | 63 | ||
Gender | Male | 14 (35.9%) | 4 (33.3%) | 6 (50.0%) | 24 (38.1%) | P = 0.632 |
Female | 25 (64.1%) | 8 (66.7%) | 6 (50.0%) | 39 (61.9%) | ||
Mean age at surgery (years) | 59.9 ± 2.1 | 59.9 ± 3.8 | 67.0 ± 3.8 | 61.3 ± 13.4 | P = 0.260 | |
Mean follow-up (years) | 6.7 ± 5.1 | 6.6 ± 4.0 | 0.66 ± 1.12 | 5.5 ± 5.0 | P = 0.0004 | |
Smoking status | Never smoker | 25 (64.1%) | 5 (41.7%) | 1 (8.3%) | 31 (49.2%) | P = 0.0008 |
Former smoker | 11 (28.2%) | 5 (41.7%) | 4 (33.3%) | 20 (31.8%) | ||
Active smoker | 3 (7.7%) | 2 (16.6%) | 7 (58.3%) | 11 (19.1%) | ||
Stage | I | 30 (76.9%) | 7 (58.3%) | 3 (25%) | 40 (63.5%) | P = 0.0177 |
II | 6 (15.4%) | 1 (8.3%) | 3 (25%) | 10 (15.9%) | ||
III | 3 (7.7%) | 3 (25%) | 4 (33.3%) | 10 (15.9%) | ||
IV | 0 (0.0%) | 1 (8.3%) | 2 (16.7%) | 3 (4.8%) | ||
Vital stats | Dead | 14 (35.9%) | 6 (50%) | 12 (100%) | 30 (50%) | P = 0.0005 |
Alive | 25 (65.1%) | 6 (50%) | 0 (0%) | 30 (50%) |
Sample acquisition and preparation
The specimens utilized in this study were snap frozen in liquid nitrogen in the frozen section pathology laboratory within 30 minutes of resection and then transferred to −80°C for permanent storage in the Mayo Clinic Lung Tissue Registry. Prior to processing, the histology of the tumor specimens and the nonneoplastic histology of the matched surrounding normal lung tissue were confirmed by a dedicated lung pathologist (M.C. Aubry). RNA and DNA extraction from macrodissected specimens was conducted through the institutional Biospecimen Accessioning and Processing (BAP) Shared Resource.
Gene expression profiling
RNA for gene expression analysis was extracted using the RNeasy Mini Kit and TissueLyser (Qiagen) according to the manufacturer's instructions. About 100 to 200 ng of total RNA was used for whole genome mRNA expression profiling, which was performed on 64 human lung samples including 31 TC, 11 AC, 12 SCLCs, and 9 adjacent normal lung tissues using the Illumina Human WG-6_v2 BeadArray Chip in the Mayo Clinic Advanced Microarray Shared Resource. Gene expression data were loaded into the Beadstudio v3 software (Illumina Inc.), and data were normalized using the cubic spline method. Data were analyzed with Partek Genomics suite software. The normalized data were subsequently analyzed by principal component analysis (PCA) to determine if any intrinsic clustering or outliers existed within the data set. ANOVA analysis, including fold change, P value and false discovery rate (FDR) were performed to compare the following 4 groups (i) TC versus AC; (ii) carcinoid versus normal tissue; (iii) carcinoid versus SCLC; and (iv) SCLC versus normal tissue. A total of 1,181 genes were filtered across all groups using criteria of P < 0.001 and |fold change| >3. There were 683, 416, and 612 genes filtered separately in each of groups 2, 3, and 4. The normalized and nonnormalized gene expression data have been deposited with the Gene Expression Omnibus (GEO) database with Accession Number GSE108055.
Genotyping and data analysis
Genotyping experiments were performed on a subset of 39 samples (16 TC, 11 AC, and 12 SCLC) at the Genotyping Shared Resource. DNA was extracted using the QIAamp DNA Mini Kit (Qiagen). The samples were processed in two batches. A total of 250 ng DNA was utilized per sample for genotyping with the Illumina BeadStudio HumanHap550 and HumanHap550-Duo BeadChips (Illumina Inc.) targeting 550,000 SNP assays per sample. The genomic annotations are based on NCBI human genome build 37 (UCSC Genome Browser). CNV analysis was performed using Partek Genome Suit 6.4 software in the default setting. Sex chromosomes were excluded from the analysis.
DNA sequencing
Whole exome sequencing.
Whole exome sequencing of 20 tumor/normal pairs was performed by BGI Americas Corporation (Cambridge, MA) and the Mayo Clinic Genomic Core facility to 50× coverage. Data analysis for exome sequencing was first carried out by Personal Genome Diagnostics (PGDX; John Hopkins University, Baltimore, MD) using methodology previously described (10, 11). The sequences were aligned to the human genome reference sequence (hg18) using the Eland algorithm of CASAVA 1.7 software (Illumina). The chastity filter of the BaseCall software was used to select sequence reads for subsequent analysis. Point mutations and small insertions and deletions were identified using the ELANDv2 algorithm after mapping with CASAVA 1.7 software. The ELANDv2 algorithm of CASAVA 1.7 software (Illumina) was then applied to identify point mutations and small insertions and deletions. Known polymorphisms recorded in dbSNP were removed from the analysis. Potential somatic mutations were filtered and visually inspected as described previously (10). A second analysis was performed using Agilent SureCall Genomic Analysis software to map the reads and call somatic mutations. Pre-alignment included processing the read sequences to trim low quality bases from the ends and remove adaptor sequences. Mapping short reads against a reference genome was performed using Burrows-Wheeler Alignment (BWA) to generate BAM alignment files. Post-alignment processing included filtering of duplicates, region padding, and realignment were performed to minimize erroneous alignment of read ends. Alignment parameters included a mismatch penalty of 4, base quality encoding of phred-33, score threshold for a match of 1, maximum number of permitted mismatches of 2 and a minimum matching seed length of 19. SNPPET algorithm was used to call somatic mutations from tumor–normal pair samples according to the following parameters: aberration score threshold of 6, minimum statistical score for the caller algorithm to consider a given genomic interval significant is 6, discard multiple hit reads with mapping quality of 0, central tendency summarization method was used, variant score threshold of 0.3 based on the Phred scaled quality value, minimum quality score of 20 for a base within a read must have to be called a mutation, variant call quality threshold of 100, minimum allele frequency of 0.1, minimum number of reads supporting variant allele of 10. Multiple alleles at a locus were reported. For tumor–normal pair analysis, SNPPET compares the variants found in tumor samples to those found in matched normal samples to assess alterations that are not related to the tumor. Two models tare fitted to tumor and normal samples and used to evaluate variant due to sequencing error and true variant. Variants called “somatic” by SNPPET are those more likely to be a true variant than a sequencing error in the tumor data and is also absent in the matched normal sample. Conversely, variant called “germline” by SNPPET are those more likely to be true variant in the normal sample. Variants are called “ambiguous” if there is insufficient data in either sample to make a call for the variant. Insufficient data include poor base quality, small number of reads supporting the variant, or variant location being at the edge of all the reads that support it.
We also performed secondary analysis using Sentieon Tnseq, a modified BWA-, GATK-, and Mutect2-based pipelines. Sentieon TNseq is a tumor–normal pair somatic variant detection product that allows for MuTect (calls SNV in somatic variants) or MuTect2 (calls both SNV and INDEL in somatic variants) analysis with but more efficient computing algorithms. Raw sequencing reads were aligned to the human reference genome (UCSC hg19) with BWA-MEM followed by sorting and indexing using Sentieon driver. QC results including alignment summary, GC bias, base quality by sequencing cycle, base quality score distribution, and insert size metrics were collected. Sentieon driver was used to remove duplicate reads and to perform joint indel realignment for tumor–normal paired samples. Somatic variants were called in paired samples using Sentieon TNseq TNsnv algorithms.
The genomic variant file in VFC format was then uploaded into the ingenuity variant analysis package for tertiary analysis. Cancer-relevant mutations were identified by applying a number of filters including confident filter, common filter, predicted deleterious filter, genetic analysis filter, and cancer driver variant filter. Confident filter was then applied to keep variants with call quality of 20 for cases and controls and read depth of at least 10 in any sample. Common variant filter was then applied to exclude all variants observed in the 1000 Genome project, Complete Genomics genome and NHLBI ESP exomes with allele frequency of at least 1%. Application of predicted deleterious filter kept mutations that were experimentally observed to be pathogenic or mutations associated with gain of function of a gene identified in the literature, gene fusions, inferred activating mutations determined by ingenuity, predicted gain of function by bSIFT, or mutations associated with loss of function of a gene by frameshift, in-frame indel, stop codon change, missense, and likely splice site loss of up to bases into intron. Additional filtering using genetic filter kept mutations that are associated with gain of function or homozygous, heterozygous, het-ambiguous, haploinsufficiency, and hemizygous. Cancer driver variant filtering was applied to keep mutations that are found in cancer-associated mouse knockouts, cellular processes and pathways, cancer therapeutic targets, in published cancer literature, and found in the COSMIC or TCGA database at a frequency of 0.01. Manual reanalysis of all mutations in all cases after filtering was performed to remove mutations found in repetitive genomic regions. The mutations were annotated with cancer and clinically relevant information using Oncotator (Broad) to identify cancer drivers predicted to have damaging effect on protein function.
The somatic mutation rate in each sample was calculated by filtering out noncoding mutations and dividing the results by 60 MB of exome capture coverage.
Whole genome sequencing.
Whole genome sequencing and data analysis of 5 T/N pairs of carcinoid samples was performed by Complete Genomics Inc. according to their published protocol (12). Approximately 7.5 μg of the unamplified human genomic DNA was sheared into 500 bp fragments and four separate artificial adapter sequences added to the fragment through an iterative process using type IIS restriction endonucleases. DNA nanoballs (DNB) were then created from the modified fragments using rolling circle amplification of about 200-fold in solution. Sequencing was performed using the Combinatorial Probe-Anchor Ligation (cPAL) technology to 100× coverage (12). Mapping of reads from whole genome sequencing and identification of single nucleotide polymorphisms (SNP), indels, substitutions, copy number variants (CNV), structural variants (SV), and mobile element insertions (MEI) was achieved using the Complete Genomics analysis tool (12). The mutation, CNV and SV files were uploaded into the ingenuity Variant analysis application (Qiagen) for further analysis. Cancer-relevant mutations were identified according to the filtering criteria described above for the exome sequencing.
Validation of selected mutations by resequencing
Targeted sequencing was used to confirm somatic cancer driver mutations. Mutations identified by different pipelines were validated separately. Primers flanking the location of genomic alterations were designed using primer3 online tool and amplicons generated by PCR. Primers for mutations identified by PDGX were designed using HG18 and those identified by Agilent SureCall, Sentien mutect2 and IVA were designed using Hg19. The amplicons were then used for Nextera library prep (Illumina) and sequencing performed on the Illumina Hiseq 2500 to produce 151 bp paired end reads. The reads were aligned to HG19 using the BWA aligner and variant callers GATK (Broad) and SureCall (Agilent) used for mutation detection. Mutations that were validated in the targeted mutations were presented.
Mutational significant analysis
Mutation significant analysis was performed on the MAF file of somatic mutations from 20 carcinoid (14 typical and 6 atypical) samples using the MutSigCV algorithm described in several genome projects (13, 14). This version of MutSig incorporates several modifications including assessment of genome-wide background mutation rate, which varies based on patient-specific and genomic position-based factors. The patient-specific factors involve the overall mutation rate and spectrum, while the genomic position-specific factors include HiC chromatic assessment defined by the extent of DNA compression, timing of DNA replication, and the level of gene expression. The input files were the mutation annotation file (MAF) file, coverage table, covariates table, genome build Hg19/GRCh37, output filename base and mutation type dictionary file. The MAF file was created from the Agilent SureCall output variant calling file (VCF) of somatic mutations after filtering out noncoding regions including mutations in RNA, 3″ flank, 3′UTR, 5″ flank, 5′UTR, IGR, and introns. The analysis produced tab-delimited report where the “nnei,” “x,” and “X” values provides information regarding the calculation of the background mutation rate for each gene.
Mutation signature detection
Data from non-negative matrix factorization (NMF) was applied to detect mutation signatures from WES somatic mutation data from 14 TC and 6 AC samples. The NMF is a mathematical model that deconstructs a complex multidimensional data to detect the fundamental signature of the data (15, 16). The signature profile of each sample is displayed using the pyrimidine of the mutated Watson–Crick base pair of six substitution subtypes: C>A, C>G, C>T, T>A, T>C, and T>G. Each substitution is analyzed by combining information on the bases immediately 5′ and 3′ to each mutated base generating 96 possible mutation types (6 types of substitution ∗ 4 types of 5′ base ∗ 4 types of 3′ base). Mutational signatures are based on the relative proportions of mutations generated by each signature based on the actual trinucleotide frequencies of the reference human genome version GRCh37. Given a matrix A of size P × Q, NMF can be used to decompose matrix A into two nonnegative matrices (A = P * Q + U), W and H have sizes P × k, and k × Q, respectively, where k is the rank and U is the matrix. The matrix A in this dataset is considered a multidimensional data set that is comprising 96 features (P) made up of mutation counts of each mutation type (C > A, C > G, C > T, T > A, T > C, T > G) at each 5′ and 3′ base context from 20 (Q) lung carcinoid cancer cases, giving a matrix 96 × 20. This was factorized into two matrices: W with size 96 × k and H with size k × 21 and NMF performed using k = 2 … 20.
Pathway analysis
To identify molecular networks and signaling pathways affected by genomic changes in carcinoid tumors, a list of genes that were differentially expressed and located in regions with corresponding copy number changes, as well as mutated genes from sequencing analysis, were separately subjected to pathway analysis using the Ingenuity Pathway Analysis application (Qiagen). The CNV/differentially expressed gene dataset was queried for interactions between members of the dataset. For the mutation dataset, a query for all known interactions with members of our dataset was used. The IPA software is a large curated database and analysis system for identifying signaling pathways, molecular networks, and biological and metabolic processes that are most significantly changed in a dataset. We applied the Core Pathway Analysis functionality on genes differentially expressed between tumor and normal samples with corresponding directional copy number changes to identify the interaction and relationship between genes based on the information in the IPA database. The Ingenuity Knowledgebase pools information from the literature, to identify molecular networks, upstream regulators, signaling pathways, and biological and metabolic processes affected. The analysis also looked for the location of the genes within the cell, which genes act upstream or downstream of other genes, and which gene or genes act as nodes for a particular network. The relationships identified include expression regulation, protein–protein binding, protein–DNA binding, activation regulation, transcription, translocation among others (17, 18).
Statistical analysis
We used JMP statistical software to performed t test, χ2 and ANOVA analysis to determine any correlation or differences between histology and number of mutations with other clinical characteristics. A P value of 0.05 was used to determine statistical significance.
Results
Clinicopathologic features
A total of 63 prospectively collected samples of surgically resected TC tumors (TC; n = 39), AC tumors (AC; n = 11), SCLC (n = 12), and adjacent normal lung tissue (n = 10) were included in this study. The tumor stage was adjusted according to the 7th edition of the TNM classification of malignant tumors (1, 6, 9). The mean age of patients was 68.4 ± 12.6 years with a mean clinical follow-up of 4.3 ± 3.7 years. The cases were evenly distributed between never smokers (50.8%) and current or former smokers. Most of the cases were stage I (63.5%) followed by stages II and III (15.9%) and stage IV (4.8%; Table 1 and Supplementary S1). There was statistical significant correlation between the different types of NETs and mean follow-up years, smoking status, stage and vital statistics (Table 1). There was no statistically significant difference between the number of mutations and smoking status, histology, stage, and vital stats (Supplementary Table S2).
Classification of pulmonary NETs
The World Health Organization (WHO) classification of lung tumors and the American Joint Commission on Cancer (AJCC) staging manual currently defines pulmonary carcinoids as typical and atypical based on the mitotic index and the presence or absence of necrosis. TCs have less than 2 mitotic figures per mm2 (or per 10 high power fields) and no necrosis, whereas ACs have 2 to 10 mitotic figures per mm2 (or per 10 high power fields) or focal tumor necrosis (1). Our goal was to use gene expression and high-density single-nucleotide polymorphism (SNP) arrays to evaluate this classification of pulmonary NETs based on differential gene expression and copy number variation (SNV). Hierarchical clustering of the differentially expressed genes separated the samples into three distinct groups of carcinoids, SCLC, and normal (Fig. 1). This suggests that gene expression profiling can be used to distinguish between carcinoid tumors and SCLC, but was not able differentiate between TC and AC. This is not surprising since the determination of whether a carcinoid is called typical or atypical is based on minor histologic variations as described above, whereas the histologic differences are more significant when comparing SCLC and carcinoids. Given the relatively small number of tumors with lymph node metastases or recurrence, no correlations were observed with clinical outcome data and gene expression for carcinoid tumors.
Clustering of gene expression segregates carcinoids from SCLC but does not distinguish between TC and AC. A total of 1,181 genes were filtered by P < 0.001 and |fold change| > 3 were clustered for 64 human lung samples, including 31 TC, 11 AC, 12 SCLC, and 10 corresponding normal lung tissue (NL).
Clustering of gene expression segregates carcinoids from SCLC but does not distinguish between TC and AC. A total of 1,181 genes were filtered by P < 0.001 and |fold change| > 3 were clustered for 64 human lung samples, including 31 TC, 11 AC, 12 SCLC, and 10 corresponding normal lung tissue (NL).
Integrated analysis of gene expression and genotyping
Previous studies have identified chromosomal aberrations in different types of NETs using approaches such as CGH, array CGH, and PCR (19–25). Recurrent chromosomal changes included the 11q deletion found in both TC and AC. The frequencies of the various chromosomal alterations are shown in Supplementary Table S3. Genotyping was performed on a total of 39 NETs including 16 typical, 11 atypical, and 12 SCLCs. CNV analysis was focused on CNVs spanning five or more array probes, each containing 50 nucleotides. Consistent with the gene expression array data, there were no significant differences between typical and ACs. We then performed integrated analysis of the genotyping and gene expression data for the carcinoid tumors. The 27 carcinoid samples were run in pairs on both DNA SNP array and mRNA gene expression microarray. Through integration analysis of two sets of data for each patient using Gene Name as the mapping ID to overlap CNV data with the significantly differential expressed gene list filtered by ANOVA P value < 0.05 and |FC| ≥ 2, a total of 1,883 genes were identified. Finally, 293 genes were identified by further filtering with same expression direction, such as combination of increased gene expression with CN amplification or decreased expression with CN deletion (Fig. 2A).
Large-scale genomic alterations in pulmonary NETs identify recurrent alteration in genes with links to NF-κB and MAPK pathways. A, Integration of differential gene expression and CNV showing regions of copy number changes resulting from deletion or amplification with corresponding directional changes in gene expression. Genomic locations are shown on the left, and differentially expressed genes with increased expression in regions of amplification (red) and decreased expression in regions of deletions (green) are listed on the right. Genes listed on the right are involved in ERK and NF-κB pathways and were selected from C and D. B, Frequency of top genes amplified and upregulated (red) and genes deleted and downregulated (green) occurring in at least three samples (Supplementary Table S4). ANOVA P value was <0.05 and |fold change| ≥2. C and D, Genetic interaction network shows differentially expressed genes with copy number changes that are linked to the NF-κB and ERK or MAPK pathways. Green represent downregulated genes, whereas red indicated upregulated genes. Solid lines indicate direct relationships between proteins whiles dotted lines imply indirect interactions.
Large-scale genomic alterations in pulmonary NETs identify recurrent alteration in genes with links to NF-κB and MAPK pathways. A, Integration of differential gene expression and CNV showing regions of copy number changes resulting from deletion or amplification with corresponding directional changes in gene expression. Genomic locations are shown on the left, and differentially expressed genes with increased expression in regions of amplification (red) and decreased expression in regions of deletions (green) are listed on the right. Genes listed on the right are involved in ERK and NF-κB pathways and were selected from C and D. B, Frequency of top genes amplified and upregulated (red) and genes deleted and downregulated (green) occurring in at least three samples (Supplementary Table S4). ANOVA P value was <0.05 and |fold change| ≥2. C and D, Genetic interaction network shows differentially expressed genes with copy number changes that are linked to the NF-κB and ERK or MAPK pathways. Green represent downregulated genes, whereas red indicated upregulated genes. Solid lines indicate direct relationships between proteins whiles dotted lines imply indirect interactions.
Top recurrent genes overexpressed in regions of amplification included FIGF, DCX, ITM2A, MAOA, and HBA2 whereas recurrent genes that were downregulated and located in deleted regions included PCSK1N, GDI1, FSTL3, RHBDL1, and IFITM2 (Fig. 2B; Supplementary Table S4).
Ingenuity pathway analysis found several differentially expressed and CNV genes to be involved in the NF-κB and ERK/MAPK pathways (Fig. 2A, C and D). The affected genes with direct interaction with the NF-κB (complex) included CASP4, CAT, GLRX, GPX1, MSR1, MUC1, and TNFRSF12A. MUC1, TNFRSF12A, and GLRX act upstream and activate the NF-κB pathway, whereas CAT, GPX1, and MSR1/SDA downregulate the pathway (26–30). CASP4 is a target of the NF-κB pathway (31). IPA analysis of RNAseq data from Fernandez-Cuesta and colleagues also found deregulation of the NF-κB pathway, thus confirming our results (7). Genes in the ERK pathway involved EFNA1, PCSKIN, SCN2A, PROS1, C4BPA, CD6, FIGF, and HBA1 genes. Functional analysis using the DAVID functional annotation database found biological processes involved in respond to wounding and nervous system development to be deregulated in these samples (Supplementary Table S5).
Significantly mutated genes
We performed whole exome sequencing of 20 carcinoid cases with available blood or adjacent normal controls and selected a subset of 5 tumors and normal for whole genome sequencing. We used Agilent SureCall variant caller to identify somatic mutations. Mutation frequency per megabase of somatic mutations in each sample was calculated after excluding noncoding and silent mutations (Fig. 3A). To identify significantly mutated genes in carcinoid tumors, we used the MutSig algorithm to determine which genes harbored mutations at a higher frequency than expected by chance, taking into account the sample-specific mutation rate; the ratio of nonsynonymous to synonymous mutations in each gene; and the median expression level of each gene in carcinoid tumors. The analysis identified 12 statistically significant genes with an FDR cutoff of 0.1 (q1 ≤ 0.1; Fig. 3B; Table 2 and Supplementary Table S6), and 42 genes with FDR cutoff of 0.5 (q1 ≤ 0.5). Biological pathways and networks impacted by the significantly mutated genes were the NF-κB and APP pathways (Supplementary Fig. S1; Supplementary Table S7). The genes linked to the NFKB pathway included C43BP, CIART, PXYD5, GUCA1C, IRX6, LANCL1, PDE1A, PRG2, SGK2, TBPL1, and TMEM41B. Those affecting APP pathway were ALPPL2, DGKA, FAM3B, IFI16, MCL1, OLFM4, PDLIM3, PGF, RIT1, TNFRSF11B, and VAV3. The Amyloid-β precursor protein (APP) protein is found in several proteins including the brain and spinal cord and implicated in Alzheimer disease. Several recent studies have implicated APP protein in cancer development in breast, pancreas, prostate, and non–small cell lung cancers (32–37). The APP protein has also been reported to contribute to the pathogenicity of breast cancer (34, 38, 39). NF-κB activity and signaling pathways promote cancer cell proliferation, metastasis, epithelial–mesenchymal transition, and resistance to treatment (40–42). The study by Fernandez-Cuesta and colleagues identified three chromatin remodeling genes including MEN1, ARID1A, and EIF1AX to be significantly mutated (q < 0.2; ref. 7) in pulmonary carcinoid tumors. We looked for chromatin remodeling genes significantly mutated in our dataset and found three genes: DPF1 (P = 3.59E−03), RNF212 (P = 2.63E−02), and TAPBPL (P = 4.51E−02) but all with FDR q = 1 (Table 2 and Supplementary Table S8).
Significantly mutated genes in pulmonary carcinoid tumor. A, A rank of total number of nonsynonymous mutations in individual patients. B, A heat map of mutations in significantly mutated genes color coded by the type of mutations. Right, Top 15 genes with the lowest q-value; Left, histogram showing the percentage of samples with mutations the each of the 15 significantly mutated genes. C, Distribution of base substitutions of each carcinoid sample from mutation signature analysis.
Significantly mutated genes in pulmonary carcinoid tumor. A, A rank of total number of nonsynonymous mutations in individual patients. B, A heat map of mutations in significantly mutated genes color coded by the type of mutations. Right, Top 15 genes with the lowest q-value; Left, histogram showing the percentage of samples with mutations the each of the 15 significantly mutated genes. C, Distribution of base substitutions of each carcinoid sample from mutation signature analysis.
The top 40 significantly mutated genes with q-value of 0.5 or lower prioritized by MutSigCV analysis are presented
Genes . | Nonsilent . | Silent . | nnei . | P . | q . |
---|---|---|---|---|---|
TMEM41B | 6 | 0 | 50 | 2.54E−07 | 3.49E−03 |
DEFB127 | 11 | 0 | 50 | 3.70E−07 | 3.49E−03 |
WDYHV1 | 7 | 0 | 50 | 2.07E−06 | 1.07E−02 |
TBPL1 | 5 | 0 | 50 | 2.27E−06 | 1.07E−02 |
ALPPL2 | 11 | 0 | 50 | 7.98E−06 | 3.01E−02 |
ABCE1 | 6 | 0 | 48 | 1.12E−05 | 3.52E−02 |
MT4 | 8 | 0 | 43 | 3.45E−05 | 9.30E−02 |
GUCA1C | 6 | 0 | 50 | 5.52E−05 | 1.30E−01 |
OR52N1 | 13 | 1 | 50 | 6.55E−05 | 1.37E−01 |
LANCL1 | 6 | 0 | 50 | 8.41E−05 | 1.45E−01 |
RIT1 | 5 | 0 | 50 | 8.44E−05 | 1.45E−01 |
THG1L | 10 | 0 | 50 | 9.23E−05 | 1.45E−01 |
LCORL | 5 | 0 | 50 | 1.25E−04 | 1.75E−01 |
TMEM161B | 7 | 0 | 26 | 1.30E−04 | 1.75E−01 |
IRX6 | 5 | 0 | 50 | 1.49E−04 | 1.87E−01 |
GABRA3 | 6 | 0 | 14 | 2.18E−04 | 2.23E−01 |
MCL1 | 5 | 0 | 50 | 2.22E−04 | 2.23E−01 |
OLFM4 | 5 | 1 | 50 | 2.24E−04 | 2.23E−01 |
TNFRSF11B | 5 | 1 | 50 | 2.25E−04 | 2.23E−01 |
PRG2 | 7 | 1 | 13 | 2.52E−04 | 2.34E−01 |
SNRPD2 | 3 | 0 | 50 | 2.61E−04 | 2.34E−01 |
VAV3 | 9 | 1 | 39 | 3.02E−04 | 2.59E−01 |
C4BPB | 5 | 1 | 6 | 4.49E−04 | 3.42E−01 |
FXYD5 | 5 | 0 | 50 | 4.50E−04 | 3.42E−01 |
SGK2 | 6 | 0 | 50 | 4.53E−04 | 3.42E−01 |
SAP30BP | 4 | 0 | 50 | 4.74E−04 | 3.42E−01 |
PGF | 3 | 0 | 50 | 4.89E−04 | 3.42E−01 |
PDE1A | 6 | 0 | 8 | 5.21E−04 | 3.50E−01 |
ZFAND2B | 4 | 0 | 50 | 5.56E−04 | 3.50E−01 |
TMEM155 | 6 | 0 | 50 | 5.95E−04 | 3.62E−01 |
DGKA | 5 | 0 | 47 | 6.93E−04 | 4.08E−01 |
IFI16 | 15 | 0 | 22 | 7.14E−04 | 4.08E−01 |
ADCK4 | 6 | 2 | 43 | 7.36E−04 | 4.08E−01 |
NADK | 6 | 0 | 50 | 8.65E−04 | 4.62E−01 |
FAM3B | 4 | 0 | 50 | 8.81E−04 | 4.62E−01 |
PDLIM3 | 4 | 0 | 44 | 9.23E−04 | 4.70E−01 |
PCDP1 | 10 | 2 | 19 | 9.73E−04 | 4.76E−01 |
THYN1 | 6 | 0 | 50 | 9.87E−04 | 4.76E−01 |
MYL7 | 3 | 0 | 16 | 1.01E−03 | 4.76E−01 |
CDK2AP1 | 3 | 0 | 50 | 1.04E−03 | 4.77E−01 |
DPF1* | 4 | 0 | 48 | 3.59E−03 | 1 |
RNF212* | 2 | 0 | 50 | 2.63E−02 | 1 |
TAPBPL* | 5 | 0 | 33 | 4.51E−02 | 1 |
Genes . | Nonsilent . | Silent . | nnei . | P . | q . |
---|---|---|---|---|---|
TMEM41B | 6 | 0 | 50 | 2.54E−07 | 3.49E−03 |
DEFB127 | 11 | 0 | 50 | 3.70E−07 | 3.49E−03 |
WDYHV1 | 7 | 0 | 50 | 2.07E−06 | 1.07E−02 |
TBPL1 | 5 | 0 | 50 | 2.27E−06 | 1.07E−02 |
ALPPL2 | 11 | 0 | 50 | 7.98E−06 | 3.01E−02 |
ABCE1 | 6 | 0 | 48 | 1.12E−05 | 3.52E−02 |
MT4 | 8 | 0 | 43 | 3.45E−05 | 9.30E−02 |
GUCA1C | 6 | 0 | 50 | 5.52E−05 | 1.30E−01 |
OR52N1 | 13 | 1 | 50 | 6.55E−05 | 1.37E−01 |
LANCL1 | 6 | 0 | 50 | 8.41E−05 | 1.45E−01 |
RIT1 | 5 | 0 | 50 | 8.44E−05 | 1.45E−01 |
THG1L | 10 | 0 | 50 | 9.23E−05 | 1.45E−01 |
LCORL | 5 | 0 | 50 | 1.25E−04 | 1.75E−01 |
TMEM161B | 7 | 0 | 26 | 1.30E−04 | 1.75E−01 |
IRX6 | 5 | 0 | 50 | 1.49E−04 | 1.87E−01 |
GABRA3 | 6 | 0 | 14 | 2.18E−04 | 2.23E−01 |
MCL1 | 5 | 0 | 50 | 2.22E−04 | 2.23E−01 |
OLFM4 | 5 | 1 | 50 | 2.24E−04 | 2.23E−01 |
TNFRSF11B | 5 | 1 | 50 | 2.25E−04 | 2.23E−01 |
PRG2 | 7 | 1 | 13 | 2.52E−04 | 2.34E−01 |
SNRPD2 | 3 | 0 | 50 | 2.61E−04 | 2.34E−01 |
VAV3 | 9 | 1 | 39 | 3.02E−04 | 2.59E−01 |
C4BPB | 5 | 1 | 6 | 4.49E−04 | 3.42E−01 |
FXYD5 | 5 | 0 | 50 | 4.50E−04 | 3.42E−01 |
SGK2 | 6 | 0 | 50 | 4.53E−04 | 3.42E−01 |
SAP30BP | 4 | 0 | 50 | 4.74E−04 | 3.42E−01 |
PGF | 3 | 0 | 50 | 4.89E−04 | 3.42E−01 |
PDE1A | 6 | 0 | 8 | 5.21E−04 | 3.50E−01 |
ZFAND2B | 4 | 0 | 50 | 5.56E−04 | 3.50E−01 |
TMEM155 | 6 | 0 | 50 | 5.95E−04 | 3.62E−01 |
DGKA | 5 | 0 | 47 | 6.93E−04 | 4.08E−01 |
IFI16 | 15 | 0 | 22 | 7.14E−04 | 4.08E−01 |
ADCK4 | 6 | 2 | 43 | 7.36E−04 | 4.08E−01 |
NADK | 6 | 0 | 50 | 8.65E−04 | 4.62E−01 |
FAM3B | 4 | 0 | 50 | 8.81E−04 | 4.62E−01 |
PDLIM3 | 4 | 0 | 44 | 9.23E−04 | 4.70E−01 |
PCDP1 | 10 | 2 | 19 | 9.73E−04 | 4.76E−01 |
THYN1 | 6 | 0 | 50 | 9.87E−04 | 4.76E−01 |
MYL7 | 3 | 0 | 16 | 1.01E−03 | 4.76E−01 |
CDK2AP1 | 3 | 0 | 50 | 1.04E−03 | 4.77E−01 |
DPF1* | 4 | 0 | 48 | 3.59E−03 | 1 |
RNF212* | 2 | 0 | 50 | 2.63E−02 | 1 |
TAPBPL* | 5 | 0 | 33 | 4.51E−02 | 1 |
NOTE: Genes with asterisks are chromatin remodeling genes.
Mutation signature of carcinoid tumors
The patterns of somatic mutations—mutational signatures—caused by different mutational processes in cancer genomes have recently been uncovered thanks to advances in genome sequencing and development of computational tools to decode those patterns (43, 44). Analysis of the pan-cancer genome comprising genomic data from all major cancers identified more than 20 mutational signatures (44). We extracted mutational signatures from carcinoid tumors using six classes of base substitution—C > A, C > G, C > T, T > A, T > C, T > G—incorporating information immediately 5′ and 3′ to each mutated base, giving 96 possible mutation classification using nonnegative factorization analysis. The analysis revealed a signature characterized predominantly by a combination of C > T and T > C transition substitutions, with minor contribution from T>G transversion substitutions (Fig. 4). Although this mutation signature is novel from the 22 signatures identified from the pan-cancer genome, it is identical to the mutational process 1 signature reported by Alexandrov and Stratton (43), which was explained to indicate activity of endogenous mutational process that may have occurred throughout the lifespan of patients.
Mutational signature pattern of substitutions in pulmonary carcinoid cancers based on trinucleotide frequency. Substitution mutation types are described on the x-axes while percent mutations are shown on the y-axis. The proportion of mutation bars for each of the six classes of substitution is displayed by a different color. The substitution classes were defined using a 96-substitution classification equivalent to the six substitution types and their immediate 5′ and 3′ sequence context.
Mutational signature pattern of substitutions in pulmonary carcinoid cancers based on trinucleotide frequency. Substitution mutation types are described on the x-axes while percent mutations are shown on the y-axis. The proportion of mutation bars for each of the six classes of substitution is displayed by a different color. The substitution classes were defined using a 96-substitution classification equivalent to the six substitution types and their immediate 5′ and 3′ sequence context.
Mutation in cancer driver genes
To identify mutated cancer driver genes, we performed further filtering using the ingenuity variant analysis software (https://www.qiagenbioinformatics.com/products/ingenuity-variant-analysis) from QIAGEN, Inc. A total of 126 cancer driver mutations predicted to have damaging effect on the protein function were found in 114 genes. Genes with recurrent mutations in at least two samples were ATP1A2, CNNM1, MACF1, RAB38, NF1, RAD51C, TAF1L, EPHB2, POLR3B, and AGFG1. The mutations were annotated using Oncotator to identify biological processes impacted. The mutations included 85 missense, 4 nonsense, 6 splice site, and 1 nonstop mutations (Fig. 5A and Supplementary Table S9). The top biological processes affected mutated genes included cellular metabolism, cell division cycle, cell death and apoptosis, immune regulation, and chromatin remodeling genes. Other known cancer driver genes also mutated were KRAS, KDR, NFE2L1, MAP2K3, MSH3, PRRX2, RB1, SELENBP1, TACR1, and CSMD3 (Fig. 5A). Pathway analysis of mutated genes implicated ERK, one of four MAPK signaling pathways and Amyloid precursor protein (APP) pathways as deregulated in pulmonary carcinoid tumors (Fig. 5B and C). We also identified missense mutations in a number of chromatin remodeling genes including ERCC6, CTBP2, HDAC6, RAD54L, WDR11, CHD4, EP400, RB1, RNF213, BRD4, ZRANB3, NSD1, TAPBP, ARID1B, HDAC9, BRD3, and HCFC1 that were predicted by mutation taster to be disease causing and mutation assessor to be high-impact functional mutations. However, each of the mutations was identified by either SNPPET or MuTect2 but not both variant callers, suggesting low confident mutations (Table S12).
Mutation spectrum of lung carcinoid tumors. A, Heat map matrix of single nucleotide variations and small indels identified through whole genome and exome sequencing. Genes involved in this pathway are labeled in red. Scale represents the number of mutated genes in each sample. B and C, Ingenuity pathway analysis identified network of genes directly or indirectly involved in the regulation of the ERK1,2 and APP as one of the principal nodes. Solid lines indicate direct relationships between proteins whiles dotted lines imply indirect interactions.
Mutation spectrum of lung carcinoid tumors. A, Heat map matrix of single nucleotide variations and small indels identified through whole genome and exome sequencing. Genes involved in this pathway are labeled in red. Scale represents the number of mutated genes in each sample. B and C, Ingenuity pathway analysis identified network of genes directly or indirectly involved in the regulation of the ERK1,2 and APP as one of the principal nodes. Solid lines indicate direct relationships between proteins whiles dotted lines imply indirect interactions.
Mutation rate in pulmonary tumors was low compared with tumors from non–small cell lung cancer and other organs. DGKI inhibits KRAS activation, which is required for NF-κB activation. PTHLH, which acts downstream of Gli3, has been shown to activate the NF-κB pathway (45–48). The NF-κB pathway upregulates expression of both HERC5 and TACR1 (NK-1R) but its activation is inhibited by IRAK-3 (IRAK-M; refs. 49–51).
Genomic structural variations
Some cancers are driven not by single nucleotide mutations and small indels but by large structural alterations. For example, the fusion of ALK with an upstream partner, EML4, has been found in 3% to 7% of unselected NSCLC (52–54). Data from whole genome sequencing were analyzed to identify structural variations such as deletions, duplications, and inversions. Circos plots showing the overall genomic alterations confirm a distribution of significantly fewer genomic alterations in carcinoids compared with lung adenocarcinomas (unpublished data; Fig. 6A). The average somatic structural variations (SSV) identified in the carcinoids was 42 compared with 1,348 SSVs in adenocarcinomas (Fig. 6B). Both the carcinoid and adenocarcinoma samples were processed using the same protocols, sequenced with same platform and at similar coverage, and analyzed using the same analysis pipelines. The majority of the SSVs were intrachromosomal compared with a few that affected different chromosomes. Recurrent alterations in two out of five samples were found in protein phosphatase PTPRF, eukaryotic translation initiation factor EIF4E, and WD repeat domain containing protein WDR47 genes. Of the 5 cases with SNP, microarray, and WGS data, we found several differentially expressed genes with CNV that were identified by both whole genome sequencing and SNP analyses (Supplementary Table S10). They included amplification and upregulation of NELL1, TPH1, TMC5, ABCC8, SPON1, and SERGEF and deletion with downregulation in PARP14, KCNMB2, SLCO2A1, B2M, DUOXA1, SQRDL, ANXA2, MYO5C, SCG3, AQP9, PTPLAD1, and AKAP13.
Distribution of somatic structural mutations in lung carcinoid tumors. A, Circos plot showing markedly reduced genomic rearrangements in carcinoids compared with adenocarcinomas. B. Distribution of structural variation types in carcinoid tumors in reference to adenocarcinomas (unpublished).
Distribution of somatic structural mutations in lung carcinoid tumors. A, Circos plot showing markedly reduced genomic rearrangements in carcinoids compared with adenocarcinomas. B. Distribution of structural variation types in carcinoid tumors in reference to adenocarcinomas (unpublished).
Discussion
Pulmonary carcinoid tumors are defined histologically as TC and AC and are characterized by neuroendocrine differentiation, the potential to metastasize, and the production of bioactive amines which can result in the carcinoid syndrome. Little is known about the biology of pulmonary carcinoid tumors (2). Furthermore, although there are similarities between types of pulmonary NETs in terms of their morphology, structure, and immunohistochemistry, there are also dramatic differences in clinical behavior and prognosis within the spectrum of pulmonary NETs as well as with respect to their therapeutic approaches and treatment response (4). Our gene expression profiling analysis clearly segregated carcinoids and SCLC into different groups, although it was unable to differentiate between TC and AC. This demonstrates the subtlety of difference between these two tumor types despite their histopathologic classification. Integrated analysis also found a correlation between differentially expressed genes and areas of CNV. Several previous studies have identified the 11q deletion as a recurrent chromosomal abnormality in carcinoid tumors (19–25). Petzmann and colleagues reported 11q deletions in 55.6% of typical and 72.7% of ACs in one study and 50% and 83% in another (24, 25). Our genotyping analysis also found 11q deletions in a number of TC and ACs with corresponding downregulation of the IFITM1, IFITM2, SCGB1A1 genes. We also performed whole genome and whole exome sequencing on a number of the samples and compared the mutation rate per megabase with that in other lung cancer types and found that pulmonary NETs have one of the lowest mutation rates observed. The mutation rate in cancer driver genes ranged from as low as zero in some tumors to as high as 7 per megabase. It is possible such tumors may be driven by other genomic alterations such as structural variations, which can be identified by whole genome sequencing. Recurrent and damaging mutations in cancer relevant genes included ATP1A2, CNNM1, MACF1, RAB38, NF1, RAD51C, TAF1L, EPHB2, POLR3B, and AGFG1. The mutated genes affected critical biological processes including cellular metabolism, cell division cycle, cell death, apoptosis, and immune regulation. Pathway analysis of the mutated genes implicated ERK and APP pathways as deregulated in pulmonary carcinoid tumors. A previous study by Debelenko and colleagues in 1997 of 11 sporadic lung carcinoids to look for loss of heterozygosity (LOH) and mutations in the MEN1 gene using dideoxy fingerprinting found both copies of the MEN1 gene to be inactivated in 36% of the sporadic tumors (55). Mutect2 identified one somatic missense variant in the MEN1 gene (p.T546A) in four different samples that was not found by SNPPET. SNPPET found a p.T546A variant in one sample that was different from those found by mutect2 and two silent variants in two different samples. None of the variant overlapped between the two pipelines and none of them was found by either mutation taster or mutation assessor to be disease causing or considered functional variant. A recent similar sequencing study identified alterations in chromatin-remodeling genes including histone modifiers and the SWI/SNF complex in 22% of the cases studied (7). Key mutated genes identified included MEN1, PSP1, EIF1AX, and ARIDIA. We found three chromatin-remodeling genes, DPF1, RNF212 and TAPBPL, that were significantly mutated and additional genes in that pathway that were not significant (Table 2 and Supplementary Table S8). Although both SNPPET and mutect2 separately identified mutations in several chromatin remodeling genes including ERCC6, CTBP2, HDAC6, RAD54L, WDR11, CHD4, EP400, RB1, RNF213, BRD4, ZRANB3, NSD1, TAPBP, ARID1B, HDAC9, BRD3, and HCFC1, not a single one of them was identified by both pipelines, although it is possible some of them may be real. The low rate of mutations in chromatin remodeling genes in our samples compared with other studies could be due in part to differences in variant calling algorithms by different callers as well as different filtering parameters. New methods of annotation and filtering allow for better prediction of variants likely to be disease causing mutations and could explain differences between studies. Mutational significant analysis uses somatic mutations generated by the variant caller and therefore could be impacted significantly by the different number of mutations identified by different variant calling algorithms. Second, the fact that different studies produce different significantly mutated genes affecting the same biological process or signaling pathway underscore the effect of cancer heterogeneity, which could have a bigger effect on studies with smaller samples. The pathway analysis identified the NF-κB and MAPK/ERK pathways as deregulated in both the differentially expressed and mutation dataset. Analysis of RNAseq data from Fernandez-Cuesta and colleagues also implicated NF-κB and MAPK/ERK pathway alterations (Supplementary Table S11) and confirmed our findings. The NF-κB pathway has pro- and antitumorigenic roles in tumor cells. Mutations in genes encoding the NF-κB subunits and IκB proteins have been identified in both solid and hematologic malignancies (56–58). Although it will be difficult to accurately discern mechanism of action from sequencing data alone, it is possible that the NF-κB may crosstalk with the MAPK/ERK pathway through API activation to affect cell survival and apoptosis (59). The identification of this pathway by both differentially expressed and mutated genes is a unique finding in our study, and by comparison this pathway was not found to be important in a cohort of adenocarcinoma cases. Recurrent genes located in deleted regions and downregulated in 15% of the cases were CFD, GDI1, IFITM1, and IFITM2. Although our genomic analysis of 20 samples and that published by Fernandez-Cuesta and colleagues are important initial observations for using genomics to find drivers of pulmonary NETs development and progression, larger sequencing studies of whole exomes or genomes is needed to comprehensively evaluate genomic alterations in pulmonary NET carcinogenesis and identify recurrent driver mutations. Given our data, further studies into the direct roles of many of the genes and proteins that may be regulated by NF-κB regulation are needed to elucidate their contribution to lung pulmonary NET development.
In summary, our study used gene expression profiling to successfully classify NETs into SCLC and carcinoid subtypes, but the profile could not distinguish between TC and ACs. Some of the differentially expressed genes correlated with copy number changes in their respective chromosome location. Mutational signature and significance analyses of exome sequencing revealed mutation in ERK and APP pathways as well as mutation in chromatin remodeling genes in carcinoid tumors. The predominant mutational signatures in the cohort were C>T and T>C transitions. A small number of recurrent nonsynonymous mutations in a number of cancer driver genes were identified that may act as driver mutations in specific tumors.
Disclosure of Potential Conflicts of Interest
G. Vasmatzis is an employee of WholeGenome LLC. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: M.K. Asiedu, C.F. Thomas Jr, J. Jen, J. Molina, R. Kuang, M.C. Aubry, P. Yang, D.A. Wigle
Development of methodology: M.K. Asiedu, C.F. Thomas Jr, J. Dong, J. Jen, J. Molina, R. Kuang, D.A. Wigle
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): M.K. Asiedu, C.F. Thomas Jr, S.C. Schulte, P. Khadka, Z. Sun, F. Kosari, J. Jen, J. Molina, M.C. Aubry, P. Yang, D.A. Wigle
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): M.K. Asiedu, C.F. Thomas Jr, J. Dong, P. Khadka, Z. Sun, J. Molina, G. Vasmatzis, R. Kuang, D.A. Wigle
Writing, review, and/or revision of the manuscript: M.K. Asiedu, C.F. Thomas Jr, J. Dong, S.C. Schulte, P. Khadka, Z. Sun, J. Jen, J. Molina, G. Vasmatzis, R. Kuang, M.C. Aubry, P. Yang, D.A. Wigle
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): M.K. Asiedu, C.F. Thomas Jr, D.A. Wigle
Study supervision: M.K. Asiedu, C.F. Thomas Jr, D.A. Wigle
Acknowledgments
This work was financially supported by grants from the Mayo Clinic Center for Individualized Medicine (CIM) and the Mayo Clinic-University of Minnesota Bio-Medical Informatics and Computational Biology collaboration.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.