Abstract
Targeted next-generation sequencing of DNA has become more widely used in the management of patients with lung adenocarcinoma; however, no clear mitogenic driver alteration is found in some cases. We evaluated the incremental benefit of targeted RNA sequencing (RNAseq) in the identification of gene fusions and MET exon 14 (METex14) alterations in DNA sequencing (DNAseq) driver–negative lung cancers.
Lung cancers driver negative by MSK-IMPACT underwent further analysis using a custom RNAseq panel (MSK-Fusion). Tumor mutation burden (TMB) was assessed as a potential prioritization criterion for targeted RNAseq.
As part of prospective clinical genomic testing, we profiled 2,522 lung adenocarcinomas using MSK-IMPACT, which identified 195 (7.7%) fusions and 119 (4.7%) METex14 alterations. Among 275 driver-negative cases with available tissue, 254 (92%) had sufficient material for RNAseq. A previously undetected alteration was identified in 14% (36/254) of cases, 33 of which were actionable (27 in-frame fusions, 6 METex14). Of these 33 patients, 10 then received matched targeted therapy, which achieved clinical benefit in 8 (80%). In the 32% (81/254) of DNAseq driver–negative cases with low TMB [0–5 mutations/Megabase (mut/Mb)], 25 (31%) were positive for previously undetected gene fusions on RNAseq, whereas, in 151 cases with TMB >5 mut/Mb, only 7% were positive for fusions (P < 0.0001).
Targeted RNAseq assays should be used in all cases that appear driver negative by DNAseq assays to ensure comprehensive detection of actionable gene rearrangements. Furthermore, we observed a significant enrichment for fusions in DNAseq driver–negative samples with low TMB, supporting the prioritization of such cases for additional RNAseq.
See related commentary by Davies and Aisner, p. 4586
This article is featured in Highlights of This Issue, p. 4581
Inhibitors targeting kinase fusions have shown dramatic and durable responses in lung cancer patients, making their comprehensive detection critical. Here, we evaluated the incremental benefit of targeted RNA sequencing (RNAseq) in the identification of gene fusions in patients where no clear mitogenic driver alteration is found by DNA sequencing (DNAseq)–based panel testing. We found actionable alterations (kinase fusions or MET exon 14 skipping) in 13% of cases apparently driver negative by previous DNAseq testing. Among the driver-negative samples tested by RNAseq, those with low tumor mutation burden (TMB) were significantly enriched for gene fusions when compared with the ones with higher TMB. In a clinical setting, such patients should be prioritized for RNAseq. Thus, a rational, algorithmic approach to the use of targeted RNA-based next-generation sequencing (NGS) to complement large panel DNA-based NGS testing can be highly effective in comprehensively uncovering targetable gene fusions or oncogenic isoforms not just in lung cancer but also more generally across different tumor types.
Introduction
The identification of ALK and ROS1 kinase fusions in non–small cell lung cancer (NSCLC) has led to the approval of a number of successful targeted therapies, which has revolutionized the treatment of patients whose tumors harbor those fusions (1–9). Inhibitors targeting lower frequency fusions (NTRK1/2/3, RET, and NRG1) and mutations causing MET exon 14 (METex14) skipping have also shown dramatic and durable responses in patients enrolled in clinical trials (10–12). FDA approval was recently granted to a TRK inhibitor (larotrectinib) in patients with tumors harboring an NTRK fusion. Gene fusions are also becoming increasingly important mechanisms of acquired resistance to tyrosine kinase inhibitors in lung adenocarcinomas (13–16). The widespread clinical implementation of next-generation sequencing (NGS), along with technical advances, has resulted in enhanced detection of oncogenic gene fusions and intense interest in their clinical targeting (17–22).
Targeted DNA-based NGS techniques specifically designed to detect rearrangements in kinases can effectively detect oncogenic kinase fusions with high confidence (23–25). For instance, the FDA-cleared MSK-IMPACT large panel, hybrid capture–based NGS assay (21, 26), is designed to detect many common kinase fusions, including those involving ALK, RET, and ROS1, and METex14 skipping mutations, via tiling of the appropriate introns for hybrid capture. However, there are technical limitations to the ability of such DNA-based assays to detect gene fusions (27). First, such assays can only identify fusions in genes where the genomic rearrangements occur in typically short introns effectively covered in the panel (Fig. 1). Some clinically important fusions arise from rearrangements in very long introns, the tiling of which would significantly compromise coverage of the remainder of the genes on the panel. Moreover, some introns harbor repetitive sequence elements also present elsewhere in the genome that therefore cannot be assessed by short-read NGS due to the difficulty in uniquely mapping such reads, resulting in gaps in the coverage of certain introns and hence blind spots in the detection of potential rearrangement breakpoints. Second, DNA sequencing (DNAseq) assays provide no direct evidence that the rearrangement produces a fusion expressed at the mRNA level (28), a particular problem for rearrangements that appear noncanonical at the genomic DNA level. To address this need, our laboratory has validated an RNA-based custom solid tumor Fusion-Panel (MSK-Fusion; refs. 29, 30) that utilizes Archer Anchored Multiplex PCR (AMP™) technology (31) to detect gene fusions in carcinomas and sarcomas.
Tumor mutation burden (TMB) is an emerging potential biomarker for immunotherapy (32–34). Nivolumab and ipilimumab have recently been found to be more effective in extending progression-free survival in patient subsets with higher TMB (35–37). Recent studies have observed that most tumors with oncogenic kinase driver alterations have low TMB (38, 39). Our large cohort of prospectively sequenced clinical samples provides the opportunity to more broadly examine the relationship between TMB status and gene fusions in lung cancer, where targetable kinase fusions are frequently detected. Moreover, we reasoned that low TMB could be an indicator of the greater likelihood of occult gene fusions in driver-negative tumors that could benefit from RNA sequencing (RNAseq) using the MSK-Fusion panel.
In the present study, we conducted a retrospective sequencing analysis using the MSK-Fusion panel on lung adenocarcinomas that were previously profiled by MSK-IMPACT and were found to lack an oncogenic driver (40). We aimed to elucidate the importance of following DNAseq by RNAseq for the comprehensive detection of gene fusions, determine the clinical feasibility of having adequate tissue for both DNA and RNA testing, and explore the possible correlation of TMB with the likelihood of kinase fusion detection via additional RNAseq testing.
Materials and Methods
We identified patients with NSCLC who underwent targeted DNAseq using the MSK-IMPACT assay from January 2014 through January 2018. Lung adenocarcinoma cases lacking an oncogenic activating alteration (defined as hotspot mutations in BRAF, EGFR, NRAS, KRAS, ERBB2, MAP2K1, or MET; amplification of EGFR, ERBB2, FGFR1, or MET; or rearrangements involving ALK, RET, ROS1, NTRK1/2/3, NRG1, or BRAF) were subject to further analysis using the MSK-Fusion panel (RNAseq). This study was performed after Memorial Sloan Kettering Cancer Center (MSK) Institutional Review Board Approval. All patients provided informed written consent for these somatic genomic analyses. The studies were conducted in accordance with the Declaration of Helsinki and the U.S. Common Rule. MSK-IMPACT and the MSK-Fusion assays have been approved by the New York State Department of Health as clinical assays. MSK-IMPACT also received FDA clearance as a class 2 in vitro diagnostic test (tumor profiling assay) in November 2017.
Patients identified to have an actionable gene fusion by RNAseq in their tumor were reviewed for treatment outcomes including rate of matching to targeted therapy and overall response rate (ORR). ORR to matched targeted therapy was assessed with RECIST version 1.1 by a dedicated study radiologist.
RNA extraction and quality control
A minimum of ten unstained slides and one hematoxylin and eosin–stained slide from formalin-fixed paraffin-embedded tissue (FFPE) were obtained for each sample and reviewed by a pathologist. Macrodissection was performed whenever indicated. Note that 10 μL of mineral oil was applied to each slide before scraping the tissue and placing it in a 1.5 mL Eppendorf tube. An additional 800 μL of mineral oil was added to each tube for tissue deparaffinization. RNA extraction was then performed using the standard RNeasy FFPE Kit and protocol (Qiagen, Catalog #73504). To address challenges around limited or unavailable tissue for RNA testing, our laboratory has tested RNA extraction on lysed cell material (lysate) left over from DNA extraction, which is stored at room temperature for 1 year before it is moved to 4°C. Lysate material is obtained from FFPE tissue scraped and deparaffinized as indicated above. Note that 150 μL of proteinase K and 250 μL of lysis buffer are added to the 1.5 mL Eppendorf tube and incubated overnight. About 250 μL of lysate is obtained, of which 40 μL is saved at room temperature for RNA extraction and used whenever the corresponding tissue is exhausted.
Total extracted RNA was quantified using the Qubit Broad Range RNA Assay Kit (Life Tech., Catalog #Q10211) and run on the TapeStation using RNA ScreenTape (Agilent, Catalog #5067-5576). Each RNA sample was tested using the Archer PreSeq RNA QC Assay, a qPCR-based method for assessing RNA quality, prior to library preparation and sequencing. A Ct value (41) >28 indicated a low-quality RNA sample and would be deemed insufficient for RNAseq. Samples with at least 50 ng (200 ng preferred) of RNA were used for testing.
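The acceptance criteria above can be expressed as a simple gate. The following is a minimal sketch and not the laboratory's actual software: the function name, the returned labels, and the treatment of a Ct of exactly 28 as passing are assumptions made for illustration.

```python
def rna_qc_status(ct_value: float, rna_ng: float) -> str:
    """Classify an RNA sample using the thresholds stated in the text.

    ct_value: PreSeq qPCR Ct (values above 28 indicate low-quality RNA).
    rna_ng:   total RNA available in nanograms (50 ng minimum, 200 ng preferred).
    The labels returned here are hypothetical, chosen for this sketch.
    """
    MAX_CT = 28.0          # Ct > 28 is deemed insufficient for RNAseq
    MIN_INPUT_NG = 50.0    # minimum RNA input
    PREFERRED_NG = 200.0   # preferred RNA input

    if ct_value > MAX_CT:
        return "insufficient_quality"
    if rna_ng < MIN_INPUT_NG:
        return "insufficient_quantity"
    if rna_ng < PREFERRED_NG:
        return "acceptable"
    return "preferred"
```

Checking quality before quantity mirrors the order in which the criteria are described; a laboratory could reasonably apply them in the other order.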
Sequencing assays and analysis
RNAseq.
cDNA libraries were made using the Archer FusionPlex standard protocol and supplied reagents including Archer Universal RNA Reagent Kit for Illumina (Catalog #AK-0040-8), Archer MBC adapters (Catalog #SA0040-45), and our custom-designed Gene Specific Primer (GSP) Pool kit. Fusion unidirectional GSPs have been designed to target specific exons in 62 genes known to be involved in chromosomal rearrangements based on current literature. GSPs, in combination with adapter-specific primers, enrich for known and novel fusion transcripts. The assay includes 346 GSPs ranging from 18 to 39 base pairs in length designed by Archer to hybridize in either the 5′ or 3′ direction to the relevant exons of each gene. The 62 target genes and their corresponding NCBI RefSeq ID used for gene annotation are listed in Supplementary Table S1.
A detailed description of the Anchored Multiplex Technology is available elsewhere (31). Briefly, cDNA undergoes end repair, dA-tailing, and ligation with half-functional Illumina molecular barcode adapters (MBC). These sequencing adapters contain molecular barcodes that allow for read deduplication and quantitative analysis. A clean-up after all enzymatic steps is performed using AMPURE XP magnetic beads (Fisher Scientific, Catalog #NC0110018). Cleaned ligated fragments are subject to two consecutive rounds of PCR amplifications using two sets of gene-specific primers (GSP1 pool used in PCR1 and a nested GSP2 pool designed 3′ downstream of GSP1, used in PCR2) and universal primers complementary to the Illumina adapters. This allows for the enrichment of fusion transcripts with the knowledge of only one of the gene partners. At the end of the two PCR steps, the final targeted amplicons are ready for 2 × 150 bp sequencing on an Illumina MiSeq sequencer. At the end of MiSeq sequencing, fastq files are automatically generated using the MiSeq reporter software (Version 2.6.2.3) and analyzed using the Archer analysis software (Version 5.0.4).
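To illustrate how molecular barcodes support read deduplication, the sketch below counts unique molecules per target by collapsing reads that share a barcode and a start position. This is a simplified illustration and not the algorithm used by the Archer analysis software; the read representation (target, barcode, start), the target labels, and the function name are all assumptions.

```python
from collections import defaultdict

def unique_molecule_counts(reads):
    """Count unique molecules per target.

    reads: iterable of (target, barcode, start) tuples. Reads sharing all
    three values are treated as PCR duplicates of one original molecule.
    Real pipelines also tolerate barcode sequencing errors; this sketch
    uses exact matching only.
    """
    seen = defaultdict(set)
    for target, barcode, start in reads:
        seen[target].add((barcode, start))
    return {target: len(molecules) for target, molecules in seen.items()}

# Hypothetical reads: two PCR duplicates of one molecule plus one
# distinct molecule on the first target, and one read on the second.
reads = [
    ("ROS1_ex34", "ACGTACGT", 101),
    ("ROS1_ex34", "ACGTACGT", 101),
    ("ROS1_ex34", "TTGACCAA", 101),
    ("ALK_ex20", "GGCATCGA", 57),
]
counts = unique_molecule_counts(reads)
```

Counting molecules rather than raw reads is what makes the unique-read coverage figures used for quality control independent of PCR amplification depth.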
DNAseq and TMB.
A detailed description of the MSK-IMPACT workflow and data analysis is available elsewhere (21, 26). TMB was calculated as the total number of mutations reported for a patient divided by the coding region target territory of MSK-IMPACT and is expressed as the number of somatic base substitution and indel alterations per Megabase (Mb).
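As a worked illustration of this definition, the sketch below divides a mutation count by the target territory in megabases. The territory size is passed as a parameter because it is not stated here; the 1.0 Mb value in the usage line is a placeholder and not the actual MSK-IMPACT coding territory.

```python
def tumor_mutation_burden(n_mutations: int, territory_bp: int) -> float:
    """Return TMB in mutations per megabase.

    n_mutations:  somatic base substitutions plus indels reported for the sample.
    territory_bp: coding target territory of the panel, in base pairs.
    """
    if territory_bp <= 0:
        raise ValueError("territory_bp must be positive")
    return n_mutations / (territory_bp / 1_000_000)

# Placeholder territory of 1.0 Mb, used only to show the arithmetic
example_tmb = tumor_mutation_burden(n_mutations=5, territory_bp=1_000_000)
```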
Results
Clinical characteristics and patient demographics
A total of 2,522 tumors from unique lung adenocarcinoma patients were profiled by MSK-IMPACT between January 2014 and December 2017, of which 589 cases lacked a driver alteration and were considered for further RNAseq analysis. Additional specimens for RNA extraction were available for 275 (47%) of the 589 samples, 21 of which were found to be insufficient for testing due to low quality of RNA, resulting in 254 cases amenable for RNAseq (Fig. 2). The clinical characteristics of the patients tested are described in Supplementary Table S2. The remaining 314 of the 589 cases did not have available tissue for RNA extraction because all submitted material was used for DNA extraction, and no additional recuts could be requested as the original block was either exhausted or not available.
RNAseq in MSK-IMPACT driver–negative lung adenocarcinomas
Among the 2,522 unique lung adenocarcinomas profiled by MSK-IMPACT, 1,933 (77%) were positive for oncogenic drivers as previously defined (40). KRAS (785) and EGFR (643) were the 2 genes with the most commonly detected oncogenic mutations in 31% and 25% of the patients, respectively, in a mutually exclusive fashion. Other known mitogenic driver alterations were also identified and included mutations in BRAF (56), ERBB2 (61), MET exon 14 (55), NRAS (19), MAP2K1 (17) or gene fusions involving ALK (84), ROS1 (47), RET (42), BRAF (6), FGFR3 (5), NTRK1 (4), NRG1 (3), FGFR1 (1), FGFR2 (1), or high level, genomically focal amplification of MET (18) and ERBB2 (13), most of which represent actionable or potentially actionable alterations classified as OncoKB Levels 1 to 3 events in lung adenocarcinoma (ref. 42; Fig. 3A).
Two hundred and fifty-four cases where a driver alteration was not detected by MSK-IMPACT (DNAseq) were subject to further analysis using the RNA-based MSK-Solid Fusion panel (RNAseq). Twenty-two cases failed sequencing due to low coverage, defined as an average of fewer than 50 unique RNA reads per targeted region (<50X). Among the 232 (91%) successfully sequenced samples, 196 samples remained driver negative by both DNAseq and RNAseq, and 36 were positive for mitogenic driver alterations (Figs. 2 and 3A). Among the 36 driver-positive cases, 33 showed actionable alterations, either METex14 skipping (n = 6) or in-frame fusions involving one of the following kinase genes: 28% ROS1 (n = 10), 13.8% NRG1 (n = 5), 11% ALK (n = 4), 8% RET (n = 3), 5% NTRK3 (n = 2), 2.7% BRAF (n = 1), 2.7% FGFR2 (n = 1), and 2.7% NTRK2 (n = 1).
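The cohort counts reported in this section can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the Results; the variable names and the rounding helper are illustrative.

```python
profiled = 2522           # lung adenocarcinomas profiled by MSK-IMPACT
driver_negative = 589     # lacked a driver alteration by DNAseq
tissue_available = 275    # had additional specimens for RNA extraction
low_quality_rna = 21      # insufficient RNA quality
failed_sequencing = 22    # average unique coverage below 50X
rnaseq_positive = 36      # driver alteration found by RNAseq

driver_positive = profiled - driver_negative
amenable = tissue_available - low_quality_rna
sequenced = amenable - failed_sequencing
still_negative = sequenced - rnaseq_positive

def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

summary = {
    "driver_positive": (driver_positive, pct(driver_positive, profiled)),
    "amenable": (amenable, pct(amenable, tissue_available)),
    "sequenced": (sequenced, pct(sequenced, amenable)),
    "rnaseq_positive_of_amenable": (rnaseq_positive, pct(rnaseq_positive, amenable)),
}
```

Note that the 14% detection rate uses the 254 cases amenable to RNAseq as the denominator, not the 232 that were successfully sequenced.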
The gene fusions identified represent a diverse landscape of fusion partners (Fig. 3B and C), some of which are novel. For example, Chromobox 5 (CBX5) and Striatin (STRN) are novel fusion partners for FGFR2 and NTRK2, respectively. In addition, some of the identified gene fusions have not been previously observed in lung adenocarcinomas: RNA Binding Protein, MRNA Processing Factor (RBPMS), and Sequestosome 1 (SQSTM1) were the gene fusion partners involved in NTRK3 fusions. Both of these fusions were previously detected in papillary thyroid carcinoma (43–45). More details about these fusions including the exons involved and Refseq IDs are included in Supplementary Table S3. A novel in-frame fusion involving the first 2 exons of histone deacetylase 5 (HDAC5) and exons 1 through 22 of Phosphatidylinositol-4,5-Bisphosphate 3-Kinase Catalytic Subunit Alpha (PIK3CA) was also detected. Gene fusions involving the full length of PIK3CA were previously reported and are potentially actionable (46). An additional novel fusion involving YWHAE (Tyrosine 3-Monooxygenase/Tryptophan 5-Monooxygenase Activation Protein Epsilon) and the tumor-suppressor gene SMYD4 (SET and MYND Domain Containing 4) was identified. Gene fusions involving tumor-suppressor genes were identified in different tumor types including lung adenocarcinomas and showed a trend for a decreased tumor-suppressor expression (19).
RNAseq fusions not detected by MSK-IMPACT due to panel design
Fifty-two percent (15/29) of the gene fusions detected by RNAseq were not expected to be called by MSK-IMPACT due to the lack of coverage of introns inferred to be the site of the genomic breakpoints (Fig. 3B): SLC34A2-ROS1 (n = 5), SLC3A2-NRG1 (n = 2), CD74-NRG1 (n = 2), SDC4-ROS1, SDC4-NRG1, SQSTM1-NTRK3, RBPMS-NTRK3, HDAC5-PIK3CA, and YWHAE-SMYD4. All SLC34A2-ROS1 and SDC4-ROS1 fusions involved exon 32 of ROS1, which suggests, although not unequivocally, that ROS1 intron 31 is involved in the genomic breakpoint. As previously described (47), intron 31 of ROS1 is known to harbor repetitive elements; most of this intron is excluded from the MSK-IMPACT hybrid capture bait design as the reads would be difficult or impossible to reliably map to the genome. Thus, intronic repetitive regions are not covered by MSK-IMPACT. A small portion of the 5′ region of intron 31 is covered by MSK-IMPACT (Supplementary Fig. S1), and unless the genomic breakpoint occurs in that specific region, a rearrangement would not be detected by DNAseq.
Likewise, DNAseq of introns is not an effective modality to detect NTRK3 fusions because the NTRK3 introns involved in recurrent genomic breakpoints, introns 13 and 14, respectively span 93 and 92 Kb. Tiling such large introns would result in a significant increase to the overall DNA panel size. Only NTRK3 fusions with ETV6 as the 5′ partner are expected to be detected by MSK-IMPACT because the panel captures ETV6 intronic regions known to be involved in fusions. As with NTRK3, NRG1 relevant introns are not captured by MSK-IMPACT due to their large size. CD74 intron 6 is tiled in the MSK-IMPACT panel. However, the CD74-NRG1 fusion detected by RNAseq involved exon 8 indicating the possibility that the genomic breakpoint of this fusion took place in intron 7, which is not captured by the DNA panel. HDAC5-PIK3CA and YWHAE-SMYD4 fusions are novel fusions without specific intronic tiling in the DNAseq panel.
RNAseq-only fusions expected to be called by MSK-IMPACT
Nearly half (48%; 14/29) of the additional gene fusions identified by RNAseq would have been expected to be detected by the MSK-IMPACT panel based on its design (Fig. 3C): CD74-ROS1 (n = 3), EML4-ALK (n = 2), KIF5B-RET (n = 2), KIF5B-ALK (n = 1), CLTC-ALK (n = 1), AGK-BRAF (n = 1), FGFR2-CBX5 (n = 1), STRN-NTRK2 (n = 1), CCDC6-RET (n = 1), and LRIG3-ROS1 (n = 1). The above fusions involved ROS1 exons 34 and 35, ALK exons 20 and 19, RET exon 12, BRAF exon 8, NTRK2 exon 16, and FGFR2 exon 17. For all of these, the corresponding introns (ROS1 introns 33/34, ALK introns 19/18, RET intron 11, BRAF intron 7, and NTRK2 intron 15) are effectively tiled in the MSK-IMPACT DNAseq panel (Supplementary Table S4). Upon manual review in Integrative Genomics Viewer (IGV) (48, 49), those specific intronic regions had sufficient sequencing coverage except for two fusion-positive samples, CD74-ROS1 (exon 34) and EML4-ALK (exon 20), where the DNA quantity was suboptimal and resulted in low sequencing coverage, and for the KIF5B-RET–positive tumor where intron 11 had lower coverage. This could have led to less sensitivity for the detection of fusions involving this particular intron (Supplementary Table S5). In addition, both the samples positive for CLTC-ALK and CCDC6-RET fusions by RNAseq had low tumor purity (<20%), which was assessed by a pathologist but also confirmed by the fact that no DNA mutations were called in the sample, including silent mutations (Supplementary Table S5). This demonstrates the ability of RNAseq to detect fusion events even in specimens with a proportion of tumor cells that is suboptimal or inadequate for DNAseq, presumably because high expression of the fusion mRNA can “compensate” for low tumor content. Finally, in the six samples positive for AGK-BRAF, CD74-ROS1, KIF5B-RET, KIF5B-ALK, EML4-ALK, and FGFR2-CBX5 fusions, a structural variant involving one of the fusion partners was detected by DNAseq. In these cases, it is likely that the oncogenic fusion was caused by one or more complex DNA rearrangements that could not be fully captured by our DNA panel (Supplementary Table S5). This further illustrates the advantage of RNAseq in detecting gene fusions that are challenging to capture by targeted DNAseq assay designs.
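The reasoning used in this and the preceding subsection can be sketched as a lookup: the first kinase-gene exon retained in the fusion transcript implies a genomic breakpoint in the intron immediately upstream, and DNAseq is expected to detect the fusion only if that intron is tiled. The table below contains only the introns named in the text as effectively tiled and is not the full MSK-IMPACT design; the function names are hypothetical.

```python
# Introns described in the text as effectively tiled by the DNA panel.
# Illustrative subset only; the complete design is in Supplementary Table S4.
TILED_INTRONS = {
    "ROS1": {33, 34},
    "ALK": {18, 19},
    "RET": {11},
    "BRAF": {7},
    "NTRK2": {15},
}

def inferred_breakpoint_intron(first_retained_exon: int) -> int:
    """For the 3' kinase partner, a transcript starting at exon N implies
    a genomic breakpoint in intron N - 1 (an inference, not a certainty)."""
    return first_retained_exon - 1

def expected_by_dnaseq(gene: str, first_retained_exon: int) -> bool:
    """True if the inferred breakpoint intron is in the tiled subset above."""
    intron = inferred_breakpoint_intron(first_retained_exon)
    return intron in TILED_INTRONS.get(gene, set())
```

Under these assumptions, `expected_by_dnaseq("ROS1", 34)` is true while `expected_by_dnaseq("ROS1", 32)` is false. The sketch ignores partial tiling, such as the small 5′ portion of ROS1 intron 31 that is covered, and it cannot account for complex rearrangements whose breakpoints fall outside the inferred intron.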
METex14 skipping
RNAseq identified 6 samples positive for METex14 skipping in which DNAseq had reported no causative MET alteration, including no canonical MET splice site mutations. Upon further manual review of DNAseq variants in MET introns 13 and 14, noncanonical MET deletions involving intronic nucleotide sequences up to 26 base pairs from the splice site were detected in five of six samples (Supplementary Table S6). One sample was negative for MET mutations by DNAseq, possibly indicating a different mechanism leading to METex14 skipping.
Low TMB in cases with kinase fusions
TMB was assessed for all MSK-IMPACT cases that were positive for a driver alteration including hotspot mutations, amplifications, and gene fusions (n = 1,933). TMB median was calculated and compared between all fusion-positive [1.97 mutations (mut)/Mb, interquartile (IQ) range, 0.88–3.51] and fusion-negative samples (5.58 mut/Mb, IQ range, 2.95–8.85), representing a significant difference in TMB (P < 0.00001, Mann–Whitney test; Fig. 4A) and indicating an enrichment for kinase fusions in low TMB samples. To see if a TMB cutoff could be used to identify cases in which additional RNAseq testing would be the most fruitful and therefore of highest priority, TMB was assessed in the 232 DNAseq driver–negative cases that successfully underwent RNAseq; in this subset, 81 cases had low TMB (0–5 mut/Mb), of which 31% were fusion positive. In contrast, in the 151 cases with higher TMB (>5 mut/Mb), only 7% of the cases were positive for fusions (P < 0.0001, Mann–Whitney test), further supporting the notion that gene fusions are enriched in low TMB samples (Fig. 4B).
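The second comparison above amounts to stratifying driver-negative cases at 5 mut/Mb and computing the fusion-positive fraction in each stratum. The sketch below shows that calculation on made-up records; the data are hypothetical and the function name is an assumption. Only the 5 mut/Mb cutoff is taken from the text.

```python
def fusion_yield_by_tmb(cases, cutoff=5.0):
    """Return the fusion-positive fraction for low- and high-TMB strata.

    cases:  iterable of (tmb_mut_per_mb, fusion_positive) pairs.
    cutoff: TMB values at or below the cutoff count as low (0-5 mut/Mb).
    Returns a dict mapping stratum name to a fraction, or None if empty.
    """
    totals = {"low": 0, "high": 0}
    positives = {"low": 0, "high": 0}
    for tmb, fusion_positive in cases:
        stratum = "low" if tmb <= cutoff else "high"
        totals[stratum] += 1
        if fusion_positive:
            positives[stratum] += 1
    return {
        s: (positives[s] / totals[s] if totals[s] else None)
        for s in totals
    }

# Hypothetical toy cohort, not study data
toy_cases = [(1.0, True), (2.9, False), (4.4, True), (5.0, False),
             (6.1, False), (8.8, False), (11.4, True), (15.0, False)]
yields = fusion_yield_by_tmb(toy_cases)
```

In this toy cohort the low-TMB stratum has a higher yield than the high-TMB stratum, which is the direction of the effect reported above; the magnitudes are arbitrary.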
Complete landscape of fusions in lung adenocarcinomas
Next, we used the combined NGS data to provide the most complete and accurate assessment to date of the prevalence of known kinase fusions in lung adenocarcinoma in our patient population. Comprehensive DNAseq and RNAseq in 2,522 unique lung adenocarcinomas identified 223 high-confidence in-frame and targetable gene fusions (Fig. 5) involving NRG1 (0.32%) and the following kinase genes: ALK (3.49%), ROS1 (2.26%), RET (1.78%), BRAF (0.28%), FGFR3 (0.20%), NTRK1 (0.16%), NTRK3 (0.08%), FGFR2 (0.08%), FGFR1 (0.04%), NTRK2 (0.04%), PIK3CA (0.04%), MET (0.04%), and EGFR (0.04%). In addition, our analysis also provides further evidence of the promiscuity of certain 5′ partner genes that are found to recombine with multiple kinase genes. For example, KIF5B is a common upstream fusion partner to RET (n = 31) but also to ALK (n = 3) and EGFR (n = 1). Similarly, CD74 and SDC4 partner with both NRG1 (n = 5 and n = 1) and ROS1 (n = 29 and n = 8), respectively, to form fusion transcripts.
Clinical outcomes of patients with RNAseq fusion–positive DNAseq-negative tumors
Of the 33 RNAseq-positive/DNAseq-negative patients with potentially targetable alterations (27 with kinase gene fusions and 6 with METex14), 10 went on to be matched to targeted therapy. Alterations in the tumors from these 10 patients included 1 ALK fusion, 4 ROS1 fusions, 2 NTRK fusions, 2 NRG1 fusions, and 1 METex14 skipping alteration. Treatment and response to therapy, as defined by RECIST version 1.1, are outlined in Fig. 6, which shows that 8 patients (80%) had clinical benefit from the matched targeted therapy identified thanks to the additional RNAseq testing. Of these 10 patients, 8 had TMBs below 5 mut/Mb, while the remaining two had TMBs of 7.9 and 11.4 mut/Mb, respectively. The other 23 patients did not receive targeted therapy for a variety of reasons: 6 did not have metastatic disease, 4 were on active surveillance with stable disease after prior treatment modalities, 6 were already on other systemic therapy at the time of the result, 1 patient was lost to follow-up, and in 5 retrospective patients, the RNAseq results were only available postmortem.
Discussion
The number of kinase inhibitors successfully targeting oncogenic gene fusions and rearrangements is increasing, providing better disease management options for patients with cancer (19, 50–55). Therefore, the accurate detection and characterization of those events is clinically essential. Targeted DNA-based sequencing offers a comprehensive tool to detect all types of oncogenic alterations including some structural variants. However, due to the frequent complexity of DNA rearrangements and assay design limitations, it is plausible that some important gene fusions and rearrangements are not accurately detected by DNA-based sequencing techniques. In this study, we have used a clinically validated targeted RNAseq assay (MSK-Fusion) to test lung adenocarcinomas lacking oncogenic driver alterations by DNAseq (MSK-IMPACT). We have demonstrated that 14% (n = 36) of the tested DNAseq-negative cases were positive for fusions or rearrangements by RNAseq. In addition, a clinical benefit was achieved in 80% of the patients whose tumors were positive for fusions or METex14 skipping and who received matched targeted therapy. Importantly, as we have previously found (38, 39), tumors with low TMB are enriched for the presence of a targetable oncogenic driver. This may reflect the fact that most major oncogenic alterations driving MAPK signaling in lung adenocarcinoma [with the exception of KRAS G12C mutations (56), MAP2K1 mutations (57), and some non-V600E BRAF mutations (58–60)] are typically seen in never smokers whose tumors therefore do not show the elevated TMB consequent to smoking-induced mutagenesis. Based on this observation, we suggest that, in more resource-limited settings, the yield of additional RNAseq testing could be increased by focusing on cases that are driver-negative by DNAseq and show low TMB.
One of the challenges of DNA-based gene fusion detection is that most genomic breakpoints that produce fusion genes take place in introns, which cannot always be fully covered by hybrid capture–based NGS either because they contain repetitive elements (61, 62) or because they are too long for targeted panel assays. For example, 34% (10/29) of the fusion transcripts not detected by DNAseq included the ROS1 gene. Six of those fusions involved ROS1 exon 32, suggesting that the genomic breakpoint may have occurred in intron 31. This intron is known to include numerous repetitive elements. These can be present at many other sites in the genome, and inclusion of baits for these regions would simply result in unmappable reads; therefore, such repetitive regions are not covered in hybrid capture–based NGS assays, and hence genomic breaks in these regions are usually missed. In addition, several introns that are known to be involved in genomic breakpoints tend to be very long. For example, each of introns 13 and 14 of the kinase gene NTRK3 or intron 5 of NRG1 is close to 100 Kb in length (UCSC Genome Browser), which is close to 10% of the total size of MSK-IMPACT. Tiling such introns is not only technically challenging but also impractical in terms of overall sequencing throughput and cost for high-volume clinical laboratories that have to make optimal use of resources and limited sequencing capacity.
Nearly half (48%) of the gene fusions not detected by DNAseq involved exons where the presumably involved introns are well covered by the DNA panel. Several explanations are possible. First, the genomic breakpoint causing the rearrangement may have taken place in a different intron that was not tiled by the panel. The second possible reason for missing gene fusions by DNAseq is low tumor purity. Although our cutoff for DNAseq is 20% tumor content by histologic assessment, the true proportion of tumor cells in the sample can be lower when estimated using somatic mutation variant frequencies. For example, two samples in our cohort were positive for CLTC-ALK and CCDC6-RET fusions by RNAseq, whereas in both cases, no DNA mutations, including silent ones, were detected in the sample. This indicates that the true tumor purity of those samples is likely very low (<5%) and highlights one of the advantages of RNAseq, where a highly expressed event can still be detected in the context of low tumor purity. Finally, suboptimal gDNA quality/quantity leading to low-quality sequencing reads can also interfere with gene fusion detection. DNAseq did not detect a CD74-ROS1 fusion and an EML4-ALK fusion in two separate samples with low gDNA input. In both cases, poor DNA quality was confirmed by further quality control of the sequencing results.
In six cases, METex14 skipping events were detected by RNAseq, but no METex14 splice mutations were called in the corresponding DNA by MSK-IMPACT. Upon further visual inspection of reads in IGV (48, 49), we identified intronic MET mutations located up to 40 base pairs away from the splice site in intron 13 (Supplementary Table S6). The MSK-IMPACT pipeline was not originally configured to call mutations that far into the intron. Because of this finding, any sample with a putative splicing mutation in MET detected in intron 13 or 14 by MSK-IMPACT is now reflexed to RNAseq to confirm the presence of METex14 skipping at the RNA level.
One of our study limitations is that out of the 589 driver-negative lung adenocarcinoma cases that were candidates for RNAseq follow-up, only 275 (47%) had available tissue for RNA extraction. Given the rate of detection in the subset of cases with adequate material (14%), the total number of fusions detected in this study would have been significantly higher if material was available for all cases. This highlights the fact that, in the real-world setting, additional material for RNA extraction is unavailable in many cases. Of the lung adenocarcinomas submitted for DNAseq, 70% were from very limited samples, such as small lung biopsies (53%) or cytology (17%) specimens. Often, all of the unstained FFPE sections are used up for DNA extraction with little left for RNA extraction. In addition, recuts from the original FFPE block are often not possible due to exhaustion of tumor material. In order to circumvent the challenge around limited material in our clinical laboratory, we have validated RNA extraction on limited amounts of lysate remaining after automated DNA extraction and stored at room temperature (Supplementary Fig. S2). This has allowed us to have immediate access to adequate material for RNA extraction and to enable comprehensive DNAseq and RNAseq for most of the eligible cases in our clinical laboratory. Clinical requests for RNAseq testing can occur up to approximately 1 to 2 months after the cell lysate is originally preserved at room temperature. Although the extracted RNA quality and quantity are compatible with downstream sequencing for the majority of the cases, this approach has not been systematically evaluated on lysate material saved at room temperature for longer timeframes.
Another study limitation is that RNAseq was performed using a targeted amplicon-based panel, which included a limited number of genes. In addition, a primer design was only included for the canonical exons known to be involved in gene fusions or isoforms. It is also possible that our targeted RNAseq panel assay has missed as yet undescribed but possibly clinically important gene fusions that could be detected with other sequencing approaches including targeted hybridization capture–based RNAseq (23, 25) or whole transcriptome sequencing (63).
It has been previously noted that tumors positive for gene fusions contain a low number of mutations (64). In this study, we have also demonstrated that, in a cohort of driver-positive lung adenocarcinomas assessed by DNAseq, the fusion-positive samples had a significantly lower TMB than the fusion-negative ones. In addition, among the driver-negative samples tested by RNAseq, those with low TMB were enriched for gene fusions when compared with those with higher TMB. These results indicate that driver-negative tumors with low TMB are more likely to harbor fusions than those with higher TMB. In a clinical setting, such patients should be prioritized for RNAseq for the potential detection of targetable gene fusions, although our results do not support limiting additional RNAseq testing to this patient subset. Indeed, we recommend that all patients whose tumors are driver negative by DNAseq go on to RNAseq to ensure that no driver alterations are missed. Overall, we find that a rational, algorithmic approach to the use of targeted RNA-based NGS to complement increasingly routine large panel DNA-based NGS testing can be a highly effective strategy to comprehensively uncover targetable gene fusions or oncogenic isoforms, not just in lung adenocarcinomas but also more generally across different tumor types.
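The prioritization described above could be sketched as follows. This is a minimal illustration, not the workflow used in our laboratory: the `Case` record, the `rnaseq_queue` function, and the example cases are hypothetical, and only the low-TMB range (0–5 mut/Mb) is taken from this study. Every DNAseq driver-negative case stays in the queue; TMB only determines the order.

```python
from dataclasses import dataclass

# Upper bound of the low-TMB group (0-5 mut/Mb) used in this study.
LOW_TMB_MAX = 5.0


@dataclass
class Case:
    """Hypothetical record for one tumor profiled by DNAseq."""
    case_id: str
    driver_negative: bool
    tmb: float  # mutations per megabase


def rnaseq_queue(cases):
    """Return all driver-negative cases, ordered by ascending TMB.

    No case is excluded on the basis of TMB; low-TMB cases simply
    come first and are flagged as high priority.
    """
    eligible = sorted((c for c in cases if c.driver_negative),
                      key=lambda c: c.tmb)
    return [(c.case_id, "high" if c.tmb <= LOW_TMB_MAX else "standard")
            for c in eligible]


# Made-up example cases for illustration only.
cases = [
    Case("A", True, 12.3),
    Case("B", True, 1.8),
    Case("C", False, 0.9),   # driver positive by DNAseq, not queued
    Case("D", True, 4.4),
]
queue = rnaseq_queue(cases)
print(queue)
```

Sorting rather than filtering reflects the recommendation that low TMB should guide the order of testing without restricting which driver-negative cases are tested.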
Disclosure of Potential Conflicts of Interest
R. Benayed reports receiving commercial research grants from ArcherDx. M. Offin is a consultant/advisory board member for PharmaMar. C.M. Rudin is a consultant/advisory board member for AbbVie, Amgen, Ascentage, AstraZeneca, Bristol-Myers Squibb, Celgene, Daiichi Sankyo, Genentech/Roche, Ipsen, Loxo, PharmaMar, and Harpoon. D.M. Hyman reports receiving commercial research grants from Loxo, PUMA Biotechnology, AstraZeneca, and Bayer Pharmaceuticals, and is a consultant/advisory board member for Chugai Pharma, CytomX Therapeutics, Boehringer Ingelheim, AstraZeneca, Pfizer, Bayer Pharmaceuticals, and Genentech/Roche. M.E. Arcila reports receiving speakers bureau honoraria from Invivoscribe. M.F. Berger reports receiving other commercial research support from Illumina, and is a consultant/advisory board member for Roche. M.G. Kris is a consultant/advisory board member for AstraZeneca, Regeneron, and Pfizer, and reports receiving other remuneration from Genentech. A. Drilon is a consultant/advisory board member for Ignyta/Roche/Genentech, Loxo/Bayer/Lilly, TP Therapeutics, AstraZeneca, Pfizer, Blueprint Medicines, Takeda/Ariad/Millennium, Helsinn Therapeutics, Beigene, BergenBio, Hengrui, Exelixis, Tyra, and Verastem, and reports receiving other remuneration from MORE Health, GlaxoSmithKline, Foundation Medicine, Merck, Teva, Taiho, Medscape, OncLive, PeerVoice, PER, Targeted Oncology, Research to Practice, and Wolters Kluwer. M. Ladanyi reports receiving commercial research grants from LOXO and Helsinn Therapeutics, and is a consultant/advisory board member for AstraZeneca, Bristol-Myers Squibb, Takeda, Bayer, and Merck. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: R. Benayed, M. Offin, M.E. Arcila, A. Zehir, M.G. Kris, A. Drilon, M. Ladanyi
Development of methodology: M. Offin, M.E. Arcila, M.F. Berger, A. Zehir, A. Drilon
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): R. Benayed, M. Offin, K. Mullaney, P. Sukhadia, K. Rios, P. Desmeules, R. Ptashkin, J. Chang, D. Halpenny, C.M. Rudin, M.E. Arcila, A. Zehir, M.G. Kris, A. Drilon
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): R. Benayed, M. Offin, R. Ptashkin, H. Won, J. Chang, A.M. Schram, M.E. Arcila, M.F. Berger, A. Zehir, M.G. Kris, A. Drilon, M. Ladanyi
Writing, review, and/or revision of the manuscript: R. Benayed, M. Offin, R. Ptashkin, J. Chang, D. Halpenny, A.M. Schram, C.M. Rudin, D.M. Hyman, M.E. Arcila, A. Zehir, M.G. Kris, A. Drilon, M. Ladanyi
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): M. Offin, P. Desmeules, R. Ptashkin, D.M. Hyman, M.G. Kris
Study supervision: M. Offin, D.M. Hyman, M. Ladanyi
Acknowledgments
The authors gratefully acknowledge J. Keith Killian for his expert technical advice.
This research was supported in part by the NCI of the NIH (P01 CA 129243, T32 CA009207, and P30 CA008748) and in part by a research grant from LOXO Oncology.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.