Despite significant advances in cancer precision medicine, a significant hurdle to its broader adoption remains the multitude of variants of unknown significance identified by clinical tumor sequencing and the lack of biologically validated methods to distinguish between functional and benign variants. Here we used functional data on MAP2K1 and MAP2K2 mutations generated in real-time within a co-clinical trial framework to benchmark the predictive value of a three-part in silico methodology. Our computational approach to variant classification incorporated hotspot analysis, three-dimensional molecular dynamics simulation, and sequence paralogy. In silico prediction accurately distinguished functional from benign MAP2K1 and MAP2K2 mutants, yet drug sensitivity varied widely among activating mutant alleles. These results suggest that multifaceted in silico modeling can inform patient accrual to MEK/ERK inhibitor clinical trials, but computational methods need to be paired with laboratory- and clinic-based efforts designed to unravel variabilities in drug response.
Leveraging prospective functional characterization of MEK1/2 mutants, it was found that hotspot analysis, molecular dynamics simulation, and sequence paralogy are complementary tools that can robustly prioritize variants for biologic, therapeutic, and clinical validation.
See related commentary by Whitehead and Sebolt-Leopold, p. 4042
Prospective tumor sequencing is increasingly used by clinicians to guide treatment selection in patients with cancer (1). Despite widespread enthusiasm for this approach, only a small number of cancer-associated genes have been clinically validated as predictive biomarkers of drug response. Furthermore, even in well-studied genes, the majority of mutations identified by clinical tumor profiling are of unknown biologic and clinical significance.
Here, we set out to formally evaluate the accuracy and predictive value of a multifaceted in silico approach to distinguish functional from benign mutations within the context of a co-clinical trial paradigm. We used real-time tumor mutation profiles generated as part of an ongoing institution-wide prospective tumor sequencing initiative, to direct preclinical discovery efforts with the goal of informing patient care. To do so, we conducted a computational and biochemical comparative analysis of mutations in the MAPK kinase genes, MAP2K1 and MAP2K2, which encode the MEK1 and MEK2 kinases, respectively. Upon activation by RAF kinases, MEK1/2 phosphorylate ERK, thereby promoting proliferation and survival. Activation of MAPK signaling is common in human cancer, most often mediated by mutations in the RAS genes, BRAF, NF1, or upstream receptors. Mutations in MAP2K1/2 are less common, and their phenotypic contribution remains poorly understood. A small number of recurrent MEK1 mutations have been shown to be oncogenic (2–4). In addition, dramatic and durable antitumor responses to MEK inhibition have been reported in patients with MEK1-mutant histiocytosis, but only anecdotally in other cancer types (2, 5–7). The likelihood that such responses will be observed more broadly remains unknown. We therefore sought to biologically validate an in silico discovery platform that could distinguish between functional and therapeutically actionable versus benign MAP2K1 and MAP2K2 mutations.
Materials and Methods
42K MSK-IMPACT sequencing cohort and hotspot data analysis
We assembled somatic mutational data from 42,434 retrospectively and prospectively sequenced human cancers using an approach analogous to those described previously (8, 9). As compared with prior hotspot analyses, all mutational data from the studies associated with The Cancer Genome Atlas (TCGA) were derived from the single multicenter mutation calling in multiple cancers project (MC3; ref. 10). We also included incrementally accrued patients enrolled on our institutional prospective tumor profiling initiative (11), numbering a total of 21,918 patients (51.6% of the full study cohort). For this prospective study, beginning in May 2014, patients with cancer seen at Memorial Sloan Kettering Cancer Center (New York, NY) were offered matched tumor-germline DNA sequencing at physician discretion on an institutional protocol (ClinicalTrials.gov identifier: NCT01775072). Written informed consent was obtained from all participating patients and the study was conducted in accordance with recognized ethical guidelines. MSK-IMPACT data are broadly available via AACR Genie (12). After excluding studies and samples of insufficient quality and breadth, we performed hotspot analysis of MAP2K1 and MAP2K2 as described previously (8, 9). MAP2K1 and MAP2K2 mutational hotspots were considered significant if they exceeded a FDR of 1% (q<0.01). Trending hotspots were those with a q-value between 0.01 and 0.25, corresponding to a FDR of 25%.
Cell lines and culture conditions
HEK-293H cells were purchased from ATCC and HEK-293T cells from Invitrogen and maintained in DME-HG with 10% FBS, 2 mmol/L l-glutamine and 50 units/mL each of penicillin and streptomycin and used within 3 months of passages after receipt. CRISPR knock-in lines were derived by subcloning guide RNA sequences targeting MEK1 F53L and K57N (gcagcagcgaaagcgccttg and cttctgcttctgggtaagaa) and E102_I103del (tggagatcaaacccgcaatc and tgatctccagatgaattagc) into a px330-U6-Chimeric-CBh-hSpCas9 mammalian expression vector (Addgene, catalog no. 42230) followed by transient transfection into 293T cells along with a 200 bp donor vector encoding the missense mutation or deletion of interest plus a mutated PAM site (F53L:tttggaacaggaccaacttggaggccttgcagaagaagctggaggagctagagcttgatgagcagcagcgaaagcgccttgaagcacttcttacccagaagcagaaggtgggagaactgaaggatgacgactttgagaagatcagtgagctgggggctggcaatggcggtgtg;K57N:acttggaggccttgcagaagaagctggaggagctagagcttgatgagcagcagcgaaagcgccttgaagcatttcttacccagaatcagaaggtgggagaactgaaggatgacgactttgagaagatcagtgagctgggggctggcaatggcggtgtggtgttcaaggtct;E102_I103del:1)tatctttcatcccttcctccctctttctttcataaaacctctctttcttccacctttctccagctaattcatctgaaacccgcaatccgaaaccagatcataagggagctgcaggttctgcatgagtgcaactctccgtacatcgtgggc,2)tatctttcatcccttcctccctctttctttcataaaacctctctttcttccacctttctacagctaattcatctgaaacccgcaatccggaaccagatcataagggagctgcaggttctgcatgagtgcaactctccgtacatcgtgggc). Cells were then single cell sorted into 96-well plates and single colonies were screened for perfect integration of the mutant basepair by PCR of the affected region of MEK1, followed by Sanger sequencing. As further confirmation select subclones were analyzed by MSK-IMPACT genomic sequencing and PCR/Sanger sequencing to confirm the presence of the mutant and lack of off-target mutations. CRISPR lines were maintained in the same conditions as parental 293T and limited to use within 15 passages. All lines tested Mycoplasma negative.
MEK1/2 transfections, drug treatment, and Western blotting
MAP2K1 mutants were generated from the C-terminal GFP-tagged MEK1 plasmid (pEGPF-N1-MEK1, Addgene, catalog no. 14746) and MAP2K2-mutant constructs were generated off of the N-terminal myc-tagged MEK2 plasmid (pcDNA3.1-Hygro_MEK2, Addgene, catalog no. 40776) using the QuikChange II XL Site-Directed Mutagenesis Kit (Agilent, catalog no. 200522) and verified by Sanger sequencing. HEK-293H or HEK-293T cells were transiently transfected with the wild-type or mutant MEK1-GFP plasmid using Lipofectamine 2000 Transfection Reagent (Invitrogen, catalog no. 11668500). Trametinib was obtained from GlaxoSmithKline and SCH772984 from Merck. All drugs were dissolved in DMSO to yield 10 mmol/L stock and stored at −80°C. At 24 hours post-transfection, media was aspirated and refreshed with or without targeted inhibitors for various times (1 hour for large mutant panels without drug, or as indicated with drug), at which point, cells were washed with cold PBS and harvested on ice and pelleted. Briefly, cells were then lysed in 1% NP-40 buffer with protease and phosphatase inhibitors, protein quantified using the Pierce BCA kit and prepared, and run on Western blots as reported previously (13).
Antibodies, enhanced chemiluminescence, and Western blot imaging
Rabbit polyclonal antibodies recognizing phosphorylated MEK1/2 (S217/221; catalog no. 9121L), MEK1/2 (catalog no. 9122), phosphorylated ERK1/2 (T202/Y204; catalog no. 9101), ERK1/2 (catalog no. 9102), and phosphorylated p90RSK T356/S363 (catalog no. 9344) were obtained from Cell Signaling Technology. Rabbit monoclonal antibodies recognizing RSK (catalog no. 9355), p90RSK S380 (catalog no. 9341), GFP [D5.1 XP(R); catalog no. 2956], and GAPDH (catalog no. 2118) were obtained from Cell Signaling Technology. After incubation with primary antibody overnight, membranes were probed with horseradish peroxidase–conjugated secondary antibody (Amersham NA934) for 1 hour, washed and proteins were detected by chemiluminescence (SuperSignal West Pico Plus, Thermo Fisher Scientific, catalog no. 34577) and visualized using the Fuji LAS-4000 imager (GE Healthcare Lifesciences).
Molecular dynamics simulation
Protein preparation: The crystal structure [Protein Data Bank(PDB) ID: 3SLS; ref. 14] of wild-type MEK1 was retrieved from the PDB (www.rcsb.org; ref. 15). Structures of mutant MEK1 mutations were constructed using SWISS-MODEL (16) with the wild-type structure as a template. The ligand (ANP) from the crystal structure was used as the ligand of each MEK1-simulated structures.
Simulation: All of the molecular dynamics simulations were conducted in a periodic boundary box with SPC water model using the Gromacs 5.1.4 package (17) with the Gromos 53a6 force field (18). The parameterization of the ligand was produced by PRODRG2.5 server (19, 20). To neutralize the systems, chloride ions and sodium ions were added to a random place in the simulation box (18). The molecular dynamics simulations were subjected to three steps. First, the main chain of the residues in the protein and heavy atoms (C, O, and N) of the ligand were constrained to minimize the energy, and the same process was performed with the unconstrained system. In the second step, each system was slowly heated from 0 to 300 K with constrained and unconstrained conditions successively. Finally, the 100 ns unconstrained production molecular dynamics simulation was performed. The hydrogen atoms were constrained using the SHAKE algorithm (21). The particle mesh Ewald summation algorithm was taken to calculate the long-range electrostatic interactions (22). All production simulations were performed under constant (atmospheric) pressure maintained by the Langevin piston method (23) and an optimum temperature (300 K) with 2 fs time step. System configurations were updated for every 2.0 ps. The protein secondary structure analysis was performed with the plug-in of Gromacs software for the simulation of trajectories and the secondary structure data were extracted for the probability analysis. Percent probability was based on the time of the existence of the α-helix or β-sheet during the simulation divided by the whole simulation time. Each secondary structure was represented by different letters in the simulation data (e.g., α-helix corresponds to G, β-sheet corresponds to E). The probability of the secondary structure of each residue in the simulation process was obtained by counting the probability of each letter. First, the total number of frames in the simulation (Total) was determined, followed by the total number of frames for each residue in the α-helix secondary structure (G). Then the % probability of α-helix for each residue was calculated using G/Total*100. Similarly, the probability of β-sheet for each residue was determined by E/Total*100.
MEK2 paralogy analysis
Paralogous residue positions were determined between MEK1 and MEK2 using pairwise alignment between their respective peptide sequences (MEK1: ENSP00000302486 and MEK2: ENSP00000262948) based on the BLOSUM62 matrix to assign amino acid similarity using a gap-opening penalty of 11 and a gap extension penalty of 1. For each MEK1 residue, a 30 amino acid window on either side was used as the region of alignment to MEK2, which avoids biasing the alignment at a specific residue based on a region of worse overall alignment. E-values indicate that only the N-terminal sequence was of lower alignment quality and therefore paralogy more difficult to determine.
Pan-cancer landscape of MAP2K1 mutations
To define the landscape of MAP2K1 alterations pan-cancer, we analyzed sequencing data from 42,434 patient tumors encompassing 32 organ types. A total of 20,516 tumor samples originated from previously published resources (TCGA, individual publications, etc.). The remaining 21,918 tumor/normal pairs were tumors from our ongoing prospective MSK-IMPACT genomic profiling initiative (11). MAP2K1 alterations were identified in 1.1% of all tumors (458 of 42,434 tumors), of which, the majority were missense mutations (81%), followed by in-frame deletions (4.3%). The incidence of MAP2K1 mutations ranged from 6.6% in melanoma and 6% in bowel cancers to less than 2% in other tumor types (Fig. 1A). A notable outlier was histiocytosis, where MAP2K1 mutations were identified in 26% of tumors (Fig. 1A). MAP2K1 mutations were distributed throughout the gene, with mutational clusters localized to helix-A in the negative regulatory domain and helix-C in the kinase domain (Fig. 1B and C; refs. 3, 24, 25).
We next performed an in-depth in silico analysis to catalog MAP2K1 mutations as driver versus benign events. First, we employed a statistical framework for mutational hotspot detection that adjusts for gene- and site-specific background mutation rates, among other variables (8, 9). Thirteen statistically significant mutational hotspot sites in MAP2K1 were identified, which accounted for 49.3% of the MAP2K1-mutated tumors (Fig. 1A, black; Fig 1B and C, dark blue; Supplementary Table S1). Eleven were missense mutants, including K57, the most frequently mutated site, and novel variants at N122 and P193. Two were clusters of largely uncharacterized in-frame deletions, one encompassing residues E41-Q58, and a second spanning I99-I107 (Fig. 1B and C). We observed lineage-specific patterns of mutation with K57N most common in lung and bowel cancers and P124S/L in melanoma (Supplementary Fig. S1). The balance of patients harbored one or more of six trending hotspots, defined as those that fell just below the threshold of statistical significance (Fig. 1B and C, light blue), and non-hotspot mutations, defined as those consistent with the background mutation rate (Fig. 1A and C, gray).
We next interrogated the predictive value of hotspot status with the ability of MEK1 variants to activate ERK. In total, we generated 115 MEK1 mutants, including 92 missense and 23 insertion/deletion (indel) variants that were either identified within our 42K cohort, previously reported, or were mutants not yet found somatically, but generated on the basis of paralogy. Twelve of 13 MAP2K1 hotspot sites (93.6%; 29/31 alleles tested at these 13 sites) validated as biologically active in our screen (Fig. 2A and B). In contrast, only 1/6 trending hotspot sites (28.6%; 2/7 alleles tested at these 6 sites) and 4/45 non-hotspot sites (10.4%; 5/48 alleles tested at these 45 sites) induced ERK phosphorylation (Fig. 2B).
Among the hotspot sites, we confirmed that MEK1 point mutants predicted to disrupt helix-A in the negative regulatory domain (F53L, Q56P, K57E/N, and D67N) or distort the position of helix-C in the kinase domain (C121S, P124S/L, G128D/V, and Y130C; Fig. 2A) were activating, consistent with the literature (2, 3, 25–32). We additionally expressed rare mutant alleles at these hotspot sites, including F53I/V, K57T, P124Q/R, and Y130H, and found them to be activating as well. MEK1 E203K (n = 26), the lone hotspot peak in the distal kinase domain, also validated as ERK-activating, consistent with prior studies. Interestingly, our efforts to assess the impact of all patient-derived variants within a hotspot site revealed that while the less common E203R and E203V mutants (n = 2 each) were functional, E203D, a conservative amino acid change present in 2 patients, did not induce pERK levels above wild-type MEK, and E203Q did so only minimally (Fig. 2A; Supplementary Fig. S2). The two remaining missense hotspots were previously uncharacterized variants at N122 and P193. N122D activated pERK and would be predicted to disrupt helix-C similarly to C121S (Fig. 2A). P193S was the only hotspot mutation that did not induce pERK (Supplementary Fig. S2).
Among the trending hotspots, only L177M/V induced pERK expression (Fig. 2A; Supplementary Fig. S2). Its mechanism of action is unclear as it is not located near other activating mutations in either sequence or structure. The other trending hotspots (R47, R49, R108, A257, and G294) were not activating, even though R49 clusters in three-dimensional (3D) proximity to other activating MEK1 mutations (Fig. 2A; ref. 33).
Of the 45 non-hotspot sites tested, only the previously known MEK1 I103N and the novel T55P, H119Q/Y and N109_R113del mutations (see below for indel) were activating, all of which are situated in or near critical domains (Fig. 2A; Supplementary Fig. S2). Notably, MEK1 T55P was previously identified as a germline variant in a patient with a RASopathy syndrome, as have seven of the activating MEK1 somatic hotspots (F53, D67, P124, G128, Y130, and E203), and the non-hotspot E44 mutation (34–36). Despite its previous association with a RASopathy (Noonan syndrome), the E44G/K mutants did not activate ERK, consistent with a thorough review of this patient's clinical records, which noted that the patient's mother was an asymptomatic carrier of the E44G variant (36). We also confirmed that MEK1 L42F, a germline variant yet to arise somatically, induced pERK (Supplementary Fig. S2). Finally, only one of 10 mutants N-terminal to F53 was activating (L42F) nor were any of the 25 mutants C-terminal to I204. In sum, the results robustly support the value of hotspot analysis for predicting the oncogenic status of previously uncharacterized genomic variants.
To further refine our ability to computationally predict the functional status of MEK1 mutants, we performed 3D molecular dynamics simulation to infer the ability of specific amino acid substitutions to impact MEK1 conformational activation. Systematic evaluation of the structural changes that occur following RAF-mediated phosphorylation of wild-type MEK1 demonstrated that disordering of the empiric α-helical structure of the activation (T-) loop is critical for kinase activation and could be used to predict MEK1-mutant activation status (29–31, 37). Molecular dynamics simulation of the biochemically benign, non-hotspot R49H, Q58H, and R201C MEK1 mutants predicted retention of the α-helix structure similar to wild-type MEK1, despite their location within regulatory domains or near known hotspots (Fig. 2C and D). In contrast, molecular dynamics simulation predicted that the novel activating non-hotspot variants MEK1 H119Y and L177M, as well as previously characterized activating hotspots P124S and E203K, would significantly disrupt the α-helix conformation of the T-loop similar to phosphomimetic-MEK1 (Fig. 2C and D). Thus, molecular dynamics simulation could distinguish activating MEK1 mutants and therefore help prioritize nonrecurrent MEK1 mutations for functional evaluation.
Hotspot and molecular dynamics simulation analysis of two novel clusters of MEK1 in-frame deletions
Hotspot analysis identified two clusters of somatic MEK1 in-frame indels. Four unique negative regulatory domain mutants were identified in patients with histiocytosis (E41_F53del), lung adenocarcinoma (E41_L54del), melanoma (E51_Q58del), and pheochromocytoma (F53_Q58delinsL). Furthermore, 14 patients with diverse tumor types harbored five unique helix-C deletions, the most common being E102_I103del, a previously characterized activating deletion (Fig. 1C; refs. 4, 25, 38, 39). All nine of these MEK1 hotspot in-frame indels robustly induced ERK phosphorylation (Fig. 3A; Supplementary Fig. S2).
As prior large-scale sequencing studies, including in melanoma, failed to report MEK1 indels, we reviewed subsequent MSK-IMPACT analyses, the multi-institutional AACR GENIE dataset (12), and the literature to substantiate our discovery of these clusters. This analysis identified 14 additional nonrecurrent somatic or germline MEK1 indels, only two of which had been previously characterized (Supplementary Table S2; refs. 2, 40). Twelve of these 14 MEK1 indels induced pERK expression, including the lone insertion mutation (L50_E51insI), as well as all seven negative regulatory deletions and all four helix-C deletions (Fig. 3A; Supplementary Fig. S2). Notably, a germline MEK1 K59del identified in cardiofaciocutaneous syndrome patients was less activating than other indels in that region (35, 41, 42). Also of note, MEK1 N109_R113del (n = 2), localized within helix-C, was activating but fell just below the statistical threshold used to define trending hotspots (Supplementary Table S1). Of the two remaining indels, R201_E203delinsQ was not activating in our assay as it effectively creates a E203Q mutation, which was similarly minimally activating (Supplementary Fig. S2). D152_L155del did not fall within either hotspot cluster or near any other functional domain and was not activating (Supplementary Fig. S2).
To extend prior work that revealed the activating potential of disrupting helix-A of the negative regulatory domain (43), we sought to define the boundaries within which deletions in helix-C induced MEK1 activity. We therefore designed 6-bp deletions (two amino acids, paired) walking 5′ to 3′ in the helix-C region from residues A95 to L115 and observed robust activation of ERK in the subregion from L98 to R113, with the highest activity involving or including deleted residues between L101 and I107 (Fig. 3B). These data suggest that all novel MEK1 indels involving amino acids L98 to R113 should be considered as likely activating.
Consistent with our biochemical and hotspot analyses, molecular dynamics simulation also predicted that MEK1 I99_K104del, E102_I103del, and N109_R113del would induce loss of the α-helical conformation of the T-loop and thus induce kinase activation (Fig. 3C and D). Conversely, molecular dynamics simulation of R96_K97del and E114_L115del, functionally benign MEK1 indels that lie outside of the essential helix-C residues, were predicted to retain the α-helical T-loop structure similar to wild-type MEK1 (Fig. 3C and D). Thus, individually and collectively, hotspot, biological, and structural analyses revealed the presence of two clusters of activating MEK1 in-frame indels.
Leveraging paralogy between MEK1 and MEK2 to predict the functionality of MEK2 mutants
MAP2K2 alterations were present in only 0.7% of tumors (294 of 42,434) in our combined retrospective and prospective cohort, likely less than the requisite cohort size for adequately powered hotspot analysis. Given that MAP2K1 and MAP2K2 are highly homologous genes that share 80% base pair identity, with 90% conservation of their kinase domains, MAP2K2 was an attractive candidate to benchmark paralogy analysis as a method to distinguish between functional and benign variants (44–46). Multiple sequence alignment of the MEK1 and MEK2 protein coding sequences demonstrated that all residues within the negative regulatory and helix-C domains were identical, except for one amino acid (Supplementary Fig. S3). Therefore, we tested whether paralogy to MEK1 could reliably predict the activation status of previously uncharacterized MEK2 mutations (Fig. 4A).
All 14 MEK2 mutations paralogous to activating MEK1 mutants, hotspot and non-hotspot alike, induced ERK phosphorylation. These mutations included four classified as hotspots (MEK2 F57, V64, P128, Y134) and two as trending hotspots (Q60, C125; Fig. 4B; Supplementary Tables S3 and S4). As predicted on the basis of paralogy to activating MEK1-negative regulatory domain indels, we also found that MEK2 L46_E55del, a germline RASopathy variant yet to be identified somatically (36), was activating (Fig. 4B). None of 12 MEK2 mutations paralogous to experimentally validated benign MEK1 variants induced ERK phosphorylation (Fig. 4B; Supplementary Fig. S3; Supplementary Table S3). Overall, assessment of paralogy to MEK1 accurately discriminated activating from benign variants in MEK2.
Differential MEK and ERK inhibitor sensitivity among MEK1/2 mutants
Both durable responses and acquired resistance to MEK inhibition have been reported in MEK1-mutant patients (2, 5–7, 26, 40, 47, 48). To define the MEK/ERK inhibitor sensitivity of the novel MEK1 and MEK2 variants identified above, we compared the ability of MEK and ERK inhibitors to abrogate ERK pathway output. We observed significant differences among ERK-activating MEK1 and MEK2 mutations in the ability of trametinib, an FDA-approved MEK inhibitor, to inhibit ERK phosphorylation (Fig. 5A; Supplementary Fig. S4). The novel missense alleles L42F, T55P, H119Q/Y, N122D, and L177M were sensitive to MEK inhibition, though at different thresholds. Negative regulatory domain indels spanning E41-E62 were also highly sensitive to trametinib, whereas helix-C indels from I99-I107 were less responsive. This is consistent with recent reports demonstrating that cells expressing MEK1 E102_I103del require higher concentrations of trametinib to inhibit ERK than MEK1 point mutants such as K57N (4). The novel N109_R113del, just C-terminal to the helix-C hotspot cluster, proved to be an exception as it was highly sensitive to trametinib.
The sensitivity of activating MEK2 mutants to trametinib also varied as a function of mutant allele in a pattern similar to their paralogous MEK1 alleles (Fig. 5B; Supplementary Fig. S4). We next extended this therapeutic analysis to the effects of ERK inhibition using the ATP-competitive ERK1/2 kinase inhibitor SCH772984 (49). In contrast to trametinib, SCH772984 uniformly inhibited downstream pathway activity, as assessed by downregulation of RSK phosphorylation, in cells expressing MEK1 and MEK2 missense and indel mutants (Figs. 5C and D; Supplementary Fig. S5).
Transient or stable overexpression models enable faster and broader assessment of mutant allele activity and drug sensitivity than knock-in cell line or mouse models. While time-efficient, oncogene overexpression can confound assessment of drug sensitivity by altering relative levels of mutant and wild-type protein expression or by influencing other protein–protein interactions (50). We therefore used CRISPR/Cas9 to generate isogenic 293T cells in which the F53L, K57N, and E102_I103del mutants were knocked-in and expressed under the control of the endogenous MEK1 promoter. As predicted, CRISPR knock-in MEK1 F53L, K57N, and E102_I103del cells had increased ERK phosphorylation, with E102_I103del resulting in the greatest induction (Fig. 5E). Basal pERK was, however, markedly lower in each of the CRISPR knock-in lines as compared with 293T cells transfected with K57N or E102_I103del (Fig. 5F, 0 nmol/L trametinib lanes). As such, trametinib was able to more efficiently suppress pERK in CRISPR knock-in cells, especially in the case of MEK1 E102_I103del (Fig. 5F). Moreover, SCH772984 effectively abrogated p-RSK in both models (Fig. 5G).
One limitation to the broader adoption of cancer precision medicine is the large number of variants of unknown significance identified during clinical tumor and germline next-generation sequencing. Emerging algorithmic strategies and high-throughput phenotyping screens have attempted to address this challenge, but experimentally validated consensus methodology is lacking. To gauge the utility of in silico methods for the stratification of variants of unknown significance as functional or benign, we leveraged a cohort of 42,434 sequenced tumor/normal matched pairs and comprehensive patient- and paralogy-directed functional analysis of the MAP2K1 and MAP2K2 genes. We found that an integrated, multifaceted computational analysis could, with strong predictive value, categorize mutants as biologically active versus likely inert.
It has long been presumed that mutational recurrence within a population is indicative of positive selection and thus likely to be predictive of functionality. More recently, this dogma has been questioned on the basis of data suggesting that a subset of hotspots are nonfunctional variants arising at inherently mutable sites (51). For instance, APOBEC3A-mediated deamination of cytosine within DNA short hairpin loops can increase passenger mutation rates 200-fold compared with nonstem-loop sites enriched in known drivers (52). The hotspot algorithm used here was designed to account for many of these innate covariates including nucleotide context mutability, gene- and site-specific mutation rates, maximizing sensitivity for hotspot discovery (minimizing false negatives) while seeking to maintain a low FDR of 1%. In support of the robustness of this computational approach, 93.6% of MEK1 hotspots were confirmed to induce ERK activation versus only five of 51 trending and non-hotspot sites. Our review of the postulated 27 optimal APOBEC3A substrate sites in MEK1 and 15 in MEK2 (52), indicate that none were mutated in MEK1 in the >42K sequenced tumors to date, suggesting that APOBEC3A-induced mutagenesis is not a confounding factor in this case.
The differences in inherent mutability of individual sites throughout the human genome does, however, reinforce the importance of careful biological verification of statistical methods as performed here using large, collaborative sequencing datasets to test and train algorithmic models coupled with subsequent laboratory validation. Our experience suggests that no single hotspot algorithm is likely to account for all genomic covariates or achieve a balance of sensitivity and specificity that is appropriate for all settings. For example, in a research versus clinical context, an algorithm designed to maximize specificity may fail to flag a subset of rare actionable variants that could be critical to individual active cancer patients in need of novel therapies.
To account for the limitations of the hotspot approach to variant classification and to help identify driver mutations in the longtail of the frequency distribution, we have proposed systematic integration of multiple in silico techniques that add critical dimensions and context, such as protein folding, sequence conservation or association with germline pathogenicity, to help strengthen or refine hotspot predictions. In our study, disordering of the T-loop α-helix served as a structural proxy for distinguishing activating from benign mutants. Provided that one can isolate discrete conformational changes that influence protein activation, molecular dynamics simulation could also be a valuable method for inferring the mechanism by which a particular mutant induces kinase activation when sequence location is uninformative; information that could also influence drug sensitivity. Paralogy was also highly effective for candidate mutation discovery in MAP2K2, a gene for which the mutation rate is sufficiently low as to make hotspot analyses equivocal. For example, paralogy accurately predicted the functionality of the nine trending or non-hotspot MEK2 variants that aligned to activating MEK1 mutants. Moreover, paralogy analysis refined one trending (R53Q) and two hotspot (R112Q and R231H) calls to nonfunctional, similar to their inert MEK1 counterparts.
Whereas our functional studies were guided by and largely restricted to the spectrum of MAP2K1 and MAP2K2 mutations identified to date in patients with cancer, several laboratories have taken the reciprocal approach of systematically screening pooled variants using high-throughput cancer phenotyping assays (53–58). These methods provide a rich array of data, but are time and labor intensive, require conversion of phenotypes into computational endpoints, and work best when a single phenotype (e.g., proliferation) is a robust surrogate of tumorigenic output. There is also the potential for masking of weaker variants in such screens by the confounding effects of protein overexpression or the failure to correctly classify variants that have neomorphic phenotypes. We acknowledge that our use of ERK phosphorylation as a surrogate for MEK1/2 variant activity, while straightforward, may also not account for neomorphic phenotypes, or phenotypes that only manifest in the context of specific co-alterations or lineage specific factors.
As several MEK inhibitors are FDA-approved, one primary motivation for delineating all activating MEK1/2 mutations is to identify the most appropriate candidates for treatment with selective ERK pathway inhibitors. However, while our in silico methods robustly identified activating MEK1/2 mutants, MEK inhibitor sensitivity varied widely among mutants. Neither inferences from domain location nor position in 3D space explained the variable MEK inhibitor responses observed. MEK1/2 mutants were uniformly sensitive to ERK inhibition, suggesting that ERK inhibition may be a more broadly effective treatment approach for patients harboring MAP2K1/2 mutations. These data highlight the critical need to generate and train new robust algorithmic methodologies to predict response to treatment to help guide the management of patients with previously uncharacterized variants of unknown significance.
Accrual of MEK-mutant patients to clinical trials of MEK or ERK inhibitors has been limited to date, and review of patient records at our institution revealed few examples of off-label MEK inhibitor use. Still, dramatic and durable response of MEK1-mutant tumors to MEK inhibitors have been observed in histiocytosis and in an exceptional responder with low-grade serous ovarian cancer (2, 5–7). However, some of the exact mutants (e.g., Q56P, P124L) that were present in histiocytosis patients that responded to MEK inhibitors have been postulated as a basis for acquired resistance to RAF and/or MEK inhibitor therapy in melanoma and colorectal cancer (26, 47, 48, 59). This clinical result suggests that cellular context including lineage, co-mutation pattern, gene dosage, or allelic imbalance may have equal or even greater influence on patient response than biochemical allele sensitivity. If so, patients with mutant alleles that are less sensitive to trametinib, such as E102_I103del, could still respond if the mutation arose in certain lineage contexts, such as histiocytosis, that are less likely to harbor co-mutations that diminish MEK dependence or that exhibit exquisite lineage dependence on ERK signaling. In such patients, even partial inhibition of MAP kinase signaling for short duration may be sufficient to induce a tumor response. Such tumors would be analogous to BCR-ABL–mutant chronic myelogenous leukemia, where even transient exposure to dasatinib can provoke an irreversible apoptotic program and clinical response (60).
In sum, our efforts indicate that multifaceted in silico analysis is an efficient and accurate predictor of mutant allele biologic activity. We find that molecular dynamics simulation and sequence paralogy are robust tools that can complement hotspot analysis and help prioritize recurrent and nonrecurrent variants for in depth biologic, therapeutic, and clinical validation.
Disclosure of Potential Conflicts of Interest
M.T. Chang reports personal fees from Genentech (Genentech employee at time of publication but not during the time in which the work was performed) outside the submitted work. Z. Yao reports personal fees from MapKure LLC (serve as a SAB member of MapKure) outside the submitted work. A.N. Shoushtari reports personal fees and nonfinancial support from Bristol Myers Squibb and Immunocore; personal fees from Castle Biosciences; nonfinancial support from AstraZeneca, Xcovery, and Polaris outside the submitted work. O. Abdel-Wahab reports grants and personal fees from H3 Biomedicine Inc.; personal fees from Amgen, Foundation Medicine Inc., Janssen, Merck, and Incyte outside the submitted work. T. Merghoub has served as a consultant for Leap Therapeutics, Immunos Therapeutics, and Pfizer, and cofounder of Imvaq therapeutics. T. Merghoub has equity in Imvaq therapeutics and reports grants from Bristol Myers Squibb, Surface Oncology, Kyn Therapeutics, Infinity Pharmaceuticals, Peregrine Pharmeceuticals, Adaptive Biotechnologies, Leap Therapeutics, and Aprea. T. Merghoub is an inventor on patent applications related to work on oncolytic viral therapy, alphavirus-based vaccines, neo-antigen modeling, CD40, GITR, OX40, PD-1, and CTLA-4. M.F. Berger reports personal fees from Roche and grants from Grail outside the submitted work. N. Rosen reports grants and personal fees from Chugai (grant, consultant); personal fees from Sanofi (consultant), Jubilant (consultant), Ribon (SAB), Boeringer (consulting), Astrazeneca (SAB), Novartis (SAB), Tarveda (SAB), and Array-Pfizer (unpaid consultant); other items from Effector (consultant without pay), Revmed (consulting without pay), Kura (equity), and Concarlo (consultant); personal fees and other items from Zai Lab (SAB, equity), MAPCure (SAB, equity), Beigene (SAB, equity), Fortress (consulting, equity), and Zai Labs (SAB, equity). In addition, N. Rosen has a patent for Lutris, ointment for treating Mek inhibitor-induced rash. B.S. Taylor reports grants and personal fees from Genentech; personal fees from Boehringer Ingelheim and Loxo Oncology at Lilly outside the submitted work. D.B. Solit reports personal fees from Pfizer (advisory board member), Loxo Oncology (advisory board member, Stock Options), Lilly Oncology (advisory board member), Vivideon Therapeutics (advisory board member), and Illumina (advisory board member) outside the submitted work. No potential conflicts of interest were disclosed by the other authors.
A.J. Hanrahan: Conceptualization, data curation, formal analysis, supervision, validation, investigation, visualization, methodology, writing-original draft, project administration, writing-review and editing. B.E. Sylvester: Conceptualization, formal analysis, validation, investigation, methodology, writing-review and editing. M.T. Chang: Data curation, formal analysis, validation, investigation, visualization, methodology, writing-review and editing. A. Elzein: Formal analysis, validation, investigation, writing-review and editing. J. Gao: Data curation, formal analysis, validation, investigation, visualization, methodology, writing-original draft, writing-review and editing. W. Han: Software, formal analysis, validation, investigation, visualization, methodology, writing-original draft, writing-review and editing. Y. Liu: Software, formal analysis, validation, investigation, visualization, methodology, writing-original draft, writing-review and editing. D. Xu: Software, formal analysis, validation, investigation, visualization, methodology, writing-original draft, writing-review and editing. S.P. Gao: Formal analysis, validation, investigation, methodology, writing-review and editing. A.N. Gorelick: Data curation, software, formal analysis, validation, investigation, methodology, writing-review and editing. A.M. Jones: Formal analysis, validation, investigation, writing-review and editing. A.J. Kiliti: Formal analysis, validation, investigation, methodology, writing-review and editing. M.H. Nissan: Formal analysis, validation, investigation, writing-review and editing. C.A. Nimura: Formal analysis, validation, investigation, writing-review and editing. A.N. Poteshman: Formal analysis, validation, investigation, writing-review and editing. Z. Yao: Formal analysis, validation, writing-review and editing. Y. Gao: Formal analysis, validation, writing-review and editing. W. Hu: Data curation, formal analysis, writing-review and editing. H.C. Wise: Validation, writing-review and editing. E.I. Gavrila: Data curation, writing-review and editing. A.N. Shoushtari: Resources, data curation, writing-review and editing. S. Tiwari: Resources, investigation, writing-review and editing. A. Viale: Resources, formal analysis, investigation, writing-review and editing. O. Abdel-Wahab: Resources, validation, writing-review and editing. T. Merghoub: Resources, formal analysis, investigation, writing-review and editing. M.F. Berger: Resources, data curation, software, formal analysis, methodology, writing-review and editing. N. Rosen: Formal analysis, writing-review and editing. B.S. Taylor: Resources, data curation, software, formal analysis, funding acquisition, validation, investigation, visualization, methodology, writing-original draft, writing-review and editing. D.B. Solit: Conceptualization, formal analysis, supervision, funding acquisition, visualization, methodology, writing-original draft, project administration, writing-review and editing.
This research was supported by the Marie-Josée and Henry R. Kravis Center for Molecular Oncology (CMO), Cycle for Survival, the NIH [R01 CA229624 (to D.B. Solit), R01 CA234361 (to D.B. Solit), U54 OD020355 (to D.B. Solit and B.S. Taylor), R01 CA207244 (to B.S. Taylor), R01 CA204749 (to B.S. Taylor), R35-GM126985 (to D. Xu), and R21-LM012790 (to D. Xu), T32 TROT fellowship (to B.E. Sylvester), 1 R01 CA201247 (to O. Abdel-Wahab)], the NIH/NCI (Cancer Center support grant P30 CA008748R01, R01 CA056821 to T. Merghoub), and the NCI (ITCR grant U24-CA220457-01 to J. Gao). B.S. Taylor was also supported by research grants from the American Cancer Society (RSG-15-067-01-TBG), Prostate Cancer Foundation, Anna Fuller Fund, and the Josie Robertson Foundation. J. Gao was supported by the Fund for Innovation in Cancer Informatics from the Brown Performance Group (the-ici-fund.org). O. Abdel-Wahab was also supported by grants from the Histiocyte Society, the Erdheim–Chester Disease Global Alliance, the Functional Genomics Initiative of Memorial Sloan Kettering Cancer Center, the Histiocytosis Association, the Leukemia & Lymphoma Society and the Frame Fund. T. Merghoub was also funded in part through Swim Across America, Ludwig Institute for Cancer Research, Parker Institute for Cancer Immunotherapy, and Virginia B. Squiers Foundation. The authors would like to acknowledge the Integrated Genomics Operation, MSKCC CMO (S. Duygu Selcuklu and Pavitra Rao) and MSKCC Diagnostic Molecular Pathology. We would also like to acknowledge the AACR Project GENIE registry and consortium for their commitment to data sharing. We further acknowledge Rony Seger and Dustin Maly for sharing their MEK1/2 vectors via Addgene.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.