Abstract
The lack of knowledge about the relationship between tumor genotypes and therapeutic responses remains one of the most critical gaps in enabling the effective use of cancer therapies. Here, we couple a multiplexed and quantitative experimental platform with robust statistical methods to enable pharmacogenomic mapping of lung cancer treatment responses in vivo. The complex map of genotype-specific treatment responses uncovered that over 20% of possible interactions show significant resistance or sensitivity. Known and novel interactions were identified, and one of these interactions, the resistance of KEAP1-mutant lung tumors to platinum therapy, was validated using a large patient response data set. These results highlight the broad impact of tumor suppressor genotype on treatment responses and define a strategy to identify the determinants of precision therapies.
An experimental and analytical framework to generate in vivo pharmacogenomic maps that relate tumor genotypes to therapeutic responses reveals a surprisingly complex map of genotype-specific resistance and sensitivity.
Introduction
Efforts over the past decade have generated many novel cancer therapies (1, 2). However, patient responses are heterogeneous, with some patients responding well and others showing limited or no response (3, 4). Although it is believed that the genetic complexity of cancer underlies a significant portion of the variation in therapeutic response, the map of such pharmacogenomic interactions is currently lacking (5–7). Despite widespread tumor genotyping, only a few driver mutations currently inform clinical treatment decisions and clinical trial designs (8–10). This is driven by the fact that we do not yet know which tumor suppressor alterations influence sensitivity or resistance to specific therapies. The very premise that tumor suppressor genotype substantially affects therapeutic responses remains largely untested.
The pharmacogenomic landscape of cancer drug responses has been investigated using cell lines, patient-derived xenografts (PDX), and patient treatment outcome data (5, 11–14). However, such genotype–treatment interactions are notoriously difficult to measure using these systems for four major reasons: the large numbers of driver and passenger mutations, the observational instead of manipulative nature of the experiments, lack of the appropriate autochthonous in vivo environment, and the high stochasticity of tumor growth. Specifically, cell lines grown in vitro lack the appropriate in vivo environment, do not represent all cancer subtypes, and often carry additional alterations that arise during passaging (15). PDXs and human cell line transplantation models recapitulate some aspects of in vivo growth, but growth factor/receptor incompatibility, growth in nonorthotopic sites, and the obligate absence of the adaptive immune system compromise these approaches (16–18). Furthermore, human tumor–derived systems almost invariably have large numbers of mutations and genomic alterations. Thus, even large-scale analyses often lack the statistical power to glean cause-and-effect relationships between individual genomic alterations and therapeutic responses (5, 14). The same logic applies to patient treatment response data, which are generally too limited in scale to provide sufficient statistical power to confidently associate tumor suppressor genotypes with metrics of clinical response (19). Such data are particularly sparse for unapproved therapies (limited to clinical trial results) and are nonexistent for preclinical therapeutic candidates.
A cost-effective system that introduces defined genomic alterations, measures the response of a large number of isogenic tumors, and recapitulates the in vivo physiologic context could be valuable for uncovering genotype–treatment relationships. Here we present such a system based on tumor barcoding in genetically engineered mouse models. Genetically engineered mouse models of human cancer are important preclinical models, as they recapitulate the physiologic, tissue, and immunologic context of tumor growth (20, 21). These models uniquely enable the introduction of defined genomic alterations into adult somatic cells, which leads to the generation of autochthonous tumors (20). These tumors can recapitulate the genomic alterations, gene-expression state, histopathology, and therapy-refractive nature of corresponding human cancers (11, 22). Despite the potential value of these models in preclinical translation studies, the breadth of their utility has been limited in practice by the fact that they are neither readily scalable nor sufficiently quantitative (23–27).
To increase the scope and precision of in vivo cancer modeling and to assess tumor suppressor gene function in a multiplexed manner, we previously developed a system that couples tumor barcoding with high-throughput barcode sequencing (Tuba-seq; ref. 26). This method integrates CRISPR/Cas9-based somatic genome engineering and molecular barcoding into well-established Cre/Lox-based genetically engineered mouse models of oncogenic Kras-driven lung cancer (28). The initiation of lung tumors with pools of barcoded Lenti-sgRNA/Cre viral vectors enables the generation of many tumors of different genotypes in parallel. All neoplastic cells within each clonal tumor have the same two-component barcode, in which an sgID region identifies the sgRNA and a random barcode (BC) is unique to each tumor. Thus, high-throughput sequencing of the sgID-BC region from bulk tumor-bearing lungs can quantify the number of neoplastic cells in each tumor of each genotype (28). Previous Tuba-seq studies quantify tumor suppressor effects and their interaction with other tumor suppressor genes, focusing only on comparisons within mice (28–30). Comparisons of tumor distributions across mice are more challenging and required improvements in accuracy as well as new analytical methods.
Here, we optimize multiple key aspects of the Tuba-seq approach. The greatly improved accuracy in tumor calling enabled us to compare tumor size distributions between groups of mice, i.e., treated and untreated groups, and to generate a large-scale map that relates tumor genotype to therapeutic responses in vivo. We developed a new analytical and computational framework, pharmacogenomic tumor barcoding with high-throughput barcode sequencing (PGx-Tuba-seq). We quantify the treatment responses of tens of thousands of oncogenic KRAS-driven lung tumors of eleven different tumor suppressor genotypes to a diverse panel of therapies, and uncover a surprisingly complex pharmacogenomic map of resistance and sensitivity. PGx-Tuba-seq represents a more tractable method to uncover the therapeutic response of different tumor genotypes than previous in vitro and in vivo screening approaches.
Materials and Methods
Mice, tumor initiation, and drug treatment
All animal experiments have been approved by Institutional Animal Care at Stanford University with protocol number 26696. Lung tumors were initiated by intratracheal delivery of the same lentiviral pools (26). 1.1 × 105 and 2.2 × 104 infectious unit/mouse were administered to each KrasLSL-G12D (K); R26LSL-Tomato (T) (hereafter KT), and KT;H11LSL-Cas9 mouse (31–33), respectively. Drug treatments were started 15 weeks after tumor initiation. For the main pharmacogenomic mapping experiment, mice were assigned to eight treatment arms or were left untreated for 3 weeks (Fig. 1A; Table 1).
Type . | Treatment . | Dose . | Frequency . | Route of administration . |
---|---|---|---|---|
Monotherapy | Palbociclib | 100 mg/kg | Daily | Oral gavage |
Monotherapy | Everolimus | 10 mg/kg | Daily | Oral gavage |
Monotherapy | Phenformin | 100 mg/kg | Daily | Oral gavage |
Monotherapy | Paclitaxel | 20 mg/kg | Every other day | Intraperitoneal injection |
Monotherapy | Trametinib | 0.3 mg/kg | Daily | Oral gavage |
Combination | Paclitaxel | 20 mg/kg | Every other day | Intraperitoneal injection |
+ Trametinib | 0.3 mg/kg | Daily | Oral gavage | |
Combination | Carboplatin | 50 mg/kg | Every five days | Intraperitoneal injection |
+ Paclitaxel | 20 mg/kg | Every other day | Intraperitoneal injection | |
Combination | Carboplatin | 50 mg/kg | Every five days | Intraperitoneal injection |
+ Paclitaxel | 20 mg/kg | Every other day | Intraperitoneal injection | |
+ Trametinib | 0.3 mg/kg | Daily | Oral gavage |
Type . | Treatment . | Dose . | Frequency . | Route of administration . |
---|---|---|---|---|
Monotherapy | Palbociclib | 100 mg/kg | Daily | Oral gavage |
Monotherapy | Everolimus | 10 mg/kg | Daily | Oral gavage |
Monotherapy | Phenformin | 100 mg/kg | Daily | Oral gavage |
Monotherapy | Paclitaxel | 20 mg/kg | Every other day | Intraperitoneal injection |
Monotherapy | Trametinib | 0.3 mg/kg | Daily | Oral gavage |
Combination | Paclitaxel | 20 mg/kg | Every other day | Intraperitoneal injection |
+ Trametinib | 0.3 mg/kg | Daily | Oral gavage | |
Combination | Carboplatin | 50 mg/kg | Every five days | Intraperitoneal injection |
+ Paclitaxel | 20 mg/kg | Every other day | Intraperitoneal injection | |
Combination | Carboplatin | 50 mg/kg | Every five days | Intraperitoneal injection |
+ Paclitaxel | 20 mg/kg | Every other day | Intraperitoneal injection | |
+ Trametinib | 0.3 mg/kg | Daily | Oral gavage |
Tuba-seq library generation
Genomic DNA was isolated from bulk tumor-bearing lung tissue from each mouse (26). Three benchmark control cell lines (∼5 × 105 cells/cell line) were added to each mouse lung sample prior to lysis to enable the calculation of the absolute neoplastic cell number within each tumor (28). To reduce the errors of the Tuba-seq pipeline by orders of magnitude, we implemented multiple critical changes to the library preparation, sequencing, and analysis (Tables 2 and 3). Q5 High-Fidelity 2x Master Mix (NEB, M0494X) was used to amplify the sgID-BC region from 32 μg of genomic DNA (34). To improve sequencing quality, we used unique dual-indexed primers and added 6–9 random nucleotides (Ns) to the flanking ends of both index primers before the sequence-specific primer regions (35). The libraries were pooled based on lung weight to ensure even read depth and sequenced on an Illumina HiSeq 2500 platform (Admera Health) with paired-end 150 bp reads.
Module . | Previous (Rogers et al., 2017) . | Current . | Purpose . |
---|---|---|---|
Viral production | Pooled | Each viral vector was prepared separately | Eliminate lentiviral template switching |
Library preparation | Taq polymerase | Q5 polymerase | Reduce PCR errors |
Library preparation | Single indexing | Dual unique indexing | Eliminate the impact of index hopping during sequencing on tumor calling |
Sequencing | Single-end | Paired-end | Reduce “spurious tumors” created by sequencing errors |
Read processing and tumor calling | DADA2 clustering | Stringent filtering on reads | Eliminate “spurious tumors” created by PCR and sequencing errors |
Remove spurious tumors recursively based on hamming distance | |||
Read processing and tumor calling | No restriction on BC length | Require exact length match | Eliminate “spurious tumors” created by PCR and sequencing errors |
Module . | Previous (Rogers et al., 2017) . | Current . | Purpose . |
---|---|---|---|
Viral production | Pooled | Each viral vector was prepared separately | Eliminate lentiviral template switching |
Library preparation | Taq polymerase | Q5 polymerase | Reduce PCR errors |
Library preparation | Single indexing | Dual unique indexing | Eliminate the impact of index hopping during sequencing on tumor calling |
Sequencing | Single-end | Paired-end | Reduce “spurious tumors” created by sequencing errors |
Read processing and tumor calling | DADA2 clustering | Stringent filtering on reads | Eliminate “spurious tumors” created by PCR and sequencing errors |
Remove spurious tumors recursively based on hamming distance | |||
Read processing and tumor calling | No restriction on BC length | Require exact length match | Eliminate “spurious tumors” created by PCR and sequencing errors |
Processing reads to identify the sgID and barcode and removal of “spurious tumor” generated by read errors
We required both the forward and reverse sequencing reads to match perfectly within the BC region. FASTQ files were processed to identify the sgID and BC counts for each tumor. The sgID region identified the targeted tumor suppressor gene. The number of reads with each unique sgID-BC in each sample was summed to calculate each putative tumor's size. PCR and sequencing errors within the random barcode regions generate spurious tumors. We used stringent criteria to reduce and even eliminate the effects of PCR and sequencing errors on tumor calls, greatly reducing spurious tumors (Fig. 1B; Supplementary Fig. S1A) when quantifying relative tumor sizes (Fig. 1C; Supplementary Fig. S1B–S1D), showing larger effect sizes (Supplementary Fig. S2A–S2D).
Developing unbiased procedures for detecting genotype-specific drug effects
Previous Tuba-seq analyses focused on comparing the sizes of tumors of different genotypes within individual mice (28, 30). Such analyses are largely robust to multiple sources of variation among mice (Supplementary Fig. S3A–S3D). Here we needed to compare tumor sizes between the untreated and treated groups when analyzing genotype-specific drug responses. We used the same viral pool to initiate tumors in all mice; therefore, the relative representation of transduced epithelial cells containing each Lenti-sgRNA/Cre is constant and does not vary across mice.
Null model of tumor responses with no genotype specificity
We assume that the therapy affects all tumors proportionally to their sizes such that the size of each tumor changes from X to X1 = X × S after the treatment, where S is the proportion of remaining cancer cells. Under the null model (H0) of no genotype-specific drug responses, S is constant and does not depend on tumor genotype. Under the alternative model H1, S varies depending on the genotype: SsgID,j = SInert × (1 + Gj), with Gj representing the genotype-specific therapeutic response (GSTR) of tumors of specific genotypes to the drug j. If Gj > 0, the inactivation of the tumor suppressor gene confers relative resistance; if Gj < 0, the inactivation of the tumor suppressor gene confers relative sensitivity.
The most extreme treatment reduced tumor sizes by ∼87%. Although the depth of sequencing varied across mice and treatments, we wanted to reliably identify tumors in each treated and untreated mouse. Thus, we chose to use the cutoff of L = 1,000 cells in the untreated mice, allowing reliable detection and accurate size estimates of tumors in each mouse.
Calculation of proportional size reduction as the drug effect
To estimate the value of the tumor reduction factor S that leads to the best match between the distributions of inert tumors (those with inert sgRNAs) in the treated and untreated groups, we calculated the value of S such that the median number of shrunk tumors across all the untreated mice was closest to the median of the number of observed tumors with the size above or equal to 1,000 cells across all the mice in the treated group.
Using relative tumor number (ScoreRTN) to estimate GSTR
Our first approach defines response as the number of tumors that exceed a minimum size threshold (Fig. 2A and B). The null hypothesis for each genotype is that the number of tumors above the cutoff L in the untreated mice should match the number above the new cutoff L×S in the treated mice. If a GSTR exists, the tumors with a specific sgID are more resistant to the drug than the inert tumors, and more of such tumors should remain above the adjusted cutoff of L×S than expected, whereas if they are more sensitive, fewer such tumors should remain above the adjusted cutoff of L×S. We first calculate the ratio of the number of tumors above the cutoff L in the untreated mice of a particular sgID to that of the inert tumors (RTNi,j,L),
where |{C_{i,j,k}}$| is the total number of tumors observed in mouse k in treatment group j (j = untreated here) carrying sgID i above the cutoff L. We then calculate the similar ratio for the treated mice with a modified cutoff L×S,
The null hypothesis can then be expressed as the expectation that
or alternatively that:
Under the alternative hypothesis where |{\rm{\ ScoreRT}}{{\rm{N}}_{i,j}}\ \ne \ 0$|, a positive sign of |{\rm{ScoreRT}}{{\rm{N}}_{i,j}}$| suggests that the tumors with a particular sgID are more resistant than the inert tumors, whereas a negative sign suggests the tumors are more sensitive than inert tumors.
Use relative geometric mean (ScoreRGM) to estimate GSTR
The second metric, ScoreRGM, compares the geometric mean of tumors carrying sgID i relative to the inert tumors in the untreated and treated groups (Fig. 2A and C). If we analyze a comparable number of tumors in the untreated and treated mice, with no GSTR, the relative growth advantage of tumors carrying a specific sgID (sgID i) relative to inert tumors, represented by the relative geometric mean, will remain constant. If the tumors with a specific sgID (sgID i) are resistant to the drug, the relative geometric mean for sgID i will be larger in the treated group, whereas if sensitive, the relative geometric mean will be smaller. Although RTN does not use the numeric value of tumor size other than comparing it with the cutoff, RGM incorporates such tumor size profile information. Hence, RGM and RTN are partly redundant but do incorporate partially different information about GSTR.
We denote the total tumor count (T) with a certain sgRNA (i) in an individual mouse (k) in the treated group (j) as Ti,j,k. Here, we do not limit tumors to those above 1,000 cells but rather count any tumor with greater than or equal to 2 reads (after the stringent filtering described above) as a tumor. For an untreated mouse, the proportion of initiated tumors of each sgID can be approximated by Ri, the ratio of Ti,untreated,k to TInert,untreated,k:
We then take the top N tumors with sgRNA i from mouse k treated by drug j as
where |{C_{i,j,k}}$| is the total number of inert tumors observed in each mouse above the cutoff |L \times S$| (S = 1 for the untreated group), and then we calculate the geometric mean for all tumors containing the sgID and inert tumors across all mice in the group.
The score for the relative geometric mean is calculated as
where |{\rm{G}}{{\rm{M}}_{i, j}}$| is the geometric mean for tumors containing sgID i in treatment group j in the selected N tumors. Under the null hypothesis, |{\rm{ScoreRG}}{{\rm{M}}_{i,j}}\ = \ 0$|. Under the alternative hypothesis where |{\rm{ScoreRG}}{{\rm{M}}_{i,j}}\ \ne \ 0$|, a positive sign of |{\rm{ScoreRG}}{{\rm{M}}_{i,j}}$| suggests that the tumors with a particular sgID are more resistant than the inert tumors, whereas a negative sign of the score suggests that these tumors are more sensitive than the inert tumors.
Calculating ScoreGSTR (|\hat{\bi G}$|) as the combined score
Although ScoreRTN and ScoreRGM may have an emphasis on different aspects of GSTR on tumor size distribution, it is helpful to have a single combined score. We calculated a combined score of GSTR (|\hat{G}$|) by taking the inverse variance weighted average of ScoreRTN and ScoreRGM, then converting it to the linear scale (Fig. 3).
If |\hat{G}$| > 0, the genotype is relatively resistant to the therapy, and if |\hat{G}$| < 0, the genotype is relatively sensitive to the therapy.
To be conservative, for the combined score to be called significant, we require at least one significant P value (P < 0.05), and one marginally significant P value (P < 0.1) for the two statistics ScoreRTN and ScoreRGM.
Comparing with human cell line response database GDSC
The drug-sensitivity data from human cell lines were downloaded from the Genomics of Drug Sensitivity in Cancer (GDSC) database (www.cancerrxgene.org; ref. 5). Due to the limited number of LUAD cell lines, we focused on comparing the results from Pan-cancer cell lines. All 5 monotherapies used in our study were assessed by GDSC. Except for Keap1 and Rbm10, which are not reported for everolimus and paclitaxel, the GSTR of all other 51 gene–drug pairs were quantified by GDSC. The effect size and FDR-corrected P values were used for comparison.
Analysis of clinical data for resistance to chemotherapy
Despite relatively widespread genotyping, clinical treatment data and response data are extremely limited. Memorial Sloan Kettering Cancer Center (MSKCC, New York, NY) has a tremendous program to genotype patients and to collect clinical data. Most patients with oncogenic KRAS-driven lung cancer get platinum doublet therapy. Patients with metastatic or recurrent lung adenocarcinoma harboring a KRAS mutation in codons 11, 12, or 61, as detected by MSK-IMPACT (36), were reviewed. Patients who received platinum chemotherapy (carboplatin or cisplatin) with pemetrexed ± bevacizumab as first-line treatment were included (n = 216). Treatment efficacy was measured as time of first treatment with platinum doublet chemotherapy to start of next systemic therapy, or death if no subsequent therapy was received. Patients who continued on platinum doublet therapy at the last follow-up were censored. The retrospective chart review was approved by the MSK Institutional Review Board.
Kaplan–Meier estimator plots of time-to-next treatment for patients with and without mutations at each of the 11 tumor suppressor genes of interest were generated. In addition, a multivariable Cox proportional hazards model analysis was performed, integrating the mutational status of the 11 genes as individual input features to assess the independent effect of co-occurring mutations.
Data availability statement
The sequencing data set generated and analyzed during the current study is available in the Gene-Expression Omnibus database (accession code: GSE146448). Other data and relevant code are available in https://github.com/lichuan199010/Tuba-seq-analysis-and-summary-statistics.
Results
Development of the PGx-Tuba-seq pipeline
To eliminate sgRNA-sgID/barcode uncoupling due to lentiviral template switching and to minimize PCR, sequencing, and clustering errors, we made multiple improvements to our Tuba-seq experimental protocols and analysis pipeline (Fig. 1A; Tables 2 and 3; and Materials and Methods; ref. 26). We initiated lung tumors in KrasLSL-G12D/+;Rosa26LSL-Tomato;H11LSL-Cas9 (KT;H11LSL-Cas9) mice and control Cas9-negative KT mice with a pool of barcoded Lenti-sgRNA/Cre vectors targeting eleven putative tumor suppressors and four control vectors with inert sgRNAs (Lenti-sgTSPool/Cre; Fig. 1A). To eliminate template switching during lentiviral reverse transcription, we generated each vector separately and pooled each viral vector immediately prior to tumor initiation (37). Tumor suppressors were selected based on common occurrence in human lung adenocarcinomas and previously suggested roles in oncogenesis (26). Eighteen weeks after tumor initiation, the sgID-BC region from each bulk tumor-bearing lung was PCR amplified and sequenced to quantify the number of neoplastic cells in each tumor (Fig. 1A).
Our new analysis pipeline essentially eliminated the impact of read errors, as assessed by two metrics, including the spurious tumors generated from spike-in barcodes with known sequences and correspondence of tumor barcodes with those from the lentiviral plasmid pool (Fig. 1B; Supplementary Fig. S1A). Quantification of the impact of tumor suppressor gene inactivation on tumor growth in KT;H11LSL-Cas9 mice using our optimized method uncovered effects that were generally consistent with our previous analyses, but with greater magnitudes of tumor suppression (Fig. 1C; Supplementary Figs. S1C and S1D and S2A–S2D; sign test for differences in magnitudes, P = 0.001; ref. 28). Consistent with the robustness of our methods, analysis of the KT mice with Lenti-sgTSPool/Cre-initiated tumor revealed no false-positive tumor-suppressive effects (Supplementary Fig. S1C and S1D). These technical improvements to the Tuba-seq method further enhance the ability of this technology to be applied to study a variety of questions in tumor progression and evolution, as well as quantification of the pharmacogenomic interactions as performed in this study.
When quantifying tumor suppressor gene effects using Tuba-seq, each mouse represents an internally controlled experiment in which metrics of tumor size can be compared between tumors of each tumor suppressor genotype and tumors initiated with inert sgRNAs within the same mouse (Fig. 1C; Supplementary Fig. S1B–S1D; ref. 26). In contrast, comparing tumor size distributions between groups of mice, such as between untreated and drug-treated groups, requires methods that address the technical and biological differences among mice. To understand the statistical properties and potential biases intrinsic to this type of analysis, we rigorously modeled drug responses and genotype-specific responses. We initially performed our modeling with the assumption that cancer cells in tumors of all sizes respond equally to each treatment, whereas the treatment effects can vary by genotype. Specifically, we estimated the drug effect on inert tumors and then applied this effect to all tumors to calculate an expected distribution of tumor sizes after treatment (Fig. 2A and Materials and Methods). GSTRs were quantified by comparing the observed distribution of tumor sizes for tumors of a certain genotype after treatment with the expected distribution derived from the untreated mice. We developed two statistics to characterize GSTRs: (i) ScoreRTN—relative tumor number, which compares the relative numbers of tumors above a certain size after treatment; and (ii) ScoreRGM—relative geometric mean, which constitutes the relative change in the geometric mean of tumors from the full distribution of tumor sizes (Materials and Methods). By assessing the performance of the two statistics, we showed that both statistics are unbiased (Supplementary Fig. S3E–S3H) and exhibit substantial and similar power (Supplementary Fig. S4A–S4C), although one statistic may outperform the other if the genotype-specific response is not uniform across tumor sizes (Methods; Fig. 2B and C; Supplementary Fig. S5A and S5B). Moreover, by performing power analysis and plotting the ROC curves for both statistics across multiple sample sizes (i.e., number of mice/group), we confirmed the high sensitivity and specificity of our system (Fig. 2B and C; Supplementary Fig. S4A–S4C). We also found that relaxing the assumption that tumors of all sizes respond proportionally to treatment did not change our results substantially (Supplementary Fig. S5A and S5B).
Complex pharmacogenomic map uncovered using the PGx-Tuba-seq pipeline
We applied Tuba-seq and our statistical metrics to assess the GSTRs of 11 genotypes of lung tumors to a panel of eight single and combination therapies (Figs. 1A and 3A; Table 1). These therapies were chosen to perturb diverse signaling pathways and assess the genotype dependency of chemotherapy responses. KT;H11LSL-Cas9 mice with Lenti-sgTSPool/Cre-initiated lung tumors were treated for three weeks with one of the eight therapies followed by Tuba-seq analysis (Figs. 1A and 3A). The total cancer cell numbers estimated by Tuba-seq were highly correlated with total tumor-bearing lung weights, which varied substantially among mice even within the same groups (Supplementary Fig. S6A–S6C). Despite expected mouse-to-mouse variation, comparing the overall tumor burden and the number of tumors with inert sgRNAs in the untreated and treated mice revealed significant overall therapeutic effects for five out of the eight treatments (Supplementary Fig. S6D).
We compared the tumor size profiles of treated mice with those of untreated mice and calculated the ScoreRTN and ScoreRGM (Supplementary Fig. S7A). For both statistics, we estimated the magnitudes of GSTRs and the associated P values using bootstrapping. Across all genotypes and treatments, the two statistics were well correlated in magnitude as expected under the model of proportional tumor responses (Supplementary Fig. S7B; r = 0.86, P = 10−46). Among the 88 assessed genotype–treatment pairs, 20 and 17 significant GSTRs (P < 0.05) were identified by ScoreRTN and ScoreRGM, respectively. Of these, 19 genotype–treatment interactions were significant by one statistic (P < 0.05) and at least marginally significant (P < 0.1) by the other (Supplementary Fig. S7A and S7B; Supplementary Table S1). We derived a composite measure of GSTR (|\hat{G}$|) with the magnitude estimated from the inverse variance weighted average of the two statistics (Materials and Methods, Fig. 3B). Analysis of genotype-specific effects across treatments highlighted similarities among tumor suppressors, including those of Lkb1 and Setd2 that we have previously suggest to have redundant tumor-suppressive effects (28). Furthermore, combination treatments clustered with their corresponding single therapies (Supplementary Fig. S7C and S7D), and an additive model shows good predictive power (Supplementary Fig. S7E and S7F). Power analysis showed that our findings were robust to the cancer cell number cutoff (Supplementary Fig. S8A), choice of inert sgRNAs (Supplementary Fig. S8B), and inaccurate estimation of drug effects (Supplementary Fig. S9A and S9B).
One of the detected GSTRs was well known in advance—the resistance of Rb1-deficient tumors to the CDK4/6 inhibitor, palbociclib. Our ability to rediscover this interaction serves as a positive control of our method and is consistent with the expectation that some pharmacogenomic interactions transcend cancer types (Supplementary Fig. S10A–S10E). This resistance is consistent with the biochemical features of this pathway (Supplementary Fig. S10F) and clinical findings in breast cancer and hepatocellular carcinoma (38–40).
To further test the performance of our experimental and statistical procedures, we performed two additional experiments. First, as a negative control for GSTR identification, we treated Cas9-negative KT mice with a combination of chemotherapy and MEK inhibition (Supplementary Fig. S11A). This treatment led to a dramatic reduction in tumor sizes compared with untreated KT mice (Supplementary Fig. S11B). Only one false-positive GSTR was identified (ScoreRTN, P = 0.03; ScoreRGM, P = 0.07) with a very weak magnitude of the effect (|\hat{G}$| = 0.093, whereas the minimum magnitude of significant GSTR interactions in the main experiment was 0.108; Fig. 3C; Supplementary Fig. S11C). Furthermore, none of the individual inert sgRNAs (sgNeo1, sgNeo2, sgNeo3, and sgNT) had a significant effect by either metric for any of the eight treatments in our main pharmacogenomic mapping experiment, adding confidence in the veracity of the detected GSTRs (Fig. 3B and C).
Simulations suggest that these cohort sizes have substantial albeit imperfect power (Supplementary Fig. S4A–S4C); therefore, we next attempted to rediscover the genotype–palbociclib interactions. We initiated tumors in a similar, yet somewhat smaller cohort of KT;H11LSL-Cas9 mice with Lenti-sgTSPool/Cre and repeated the palbociclib treatment. Analyses of these mice again identified Rb1 inactivation as a mediator of palbociclib resistance (Fig. 3D; Supplementary Fig. S10B). Smad4-deficient tumors, which showed modest resistance in our initial experiment, showed nominal resistance in the repeat experiment (|\hat{G}$| = 0.167), although this interaction was not significant (P = 0.17 and 0.20 for ScoreRTN and ScoreRGM, respectively). Given the magnitude of this GSTR and our sample sizes, this false negative is not surprising. Assuming a true positive rate of 80%, which is considered desirable (41, 42), when identifying two genuine GSTR signals (Smad4 and Rb1, for instance) in two independent experiments, the probability of missing at least one of these findings is 1–80%4 = 59%.
Multiple sources of evidence confirm the findings of our PGx-Tuba-seq analysis
Although most of the detected pharmacogenomic interactions we uncovered are novel, several lines of evidence derived from clinical and preclinical data are consistent with our observations. For instance, Lkb1-inactivation reduced sensitivity to mTOR inhibition in our data, which is supported by a previous in vitro study (43) and anecdotal data from the analysis of lung adenocarcinoma patient-derived primary cultures (Supplementary Fig. S12A–S12C; ref. 12). Moreover, previous studies have shown that KrasG12D;Lkb1−/− lung tumors are sensitive to phenformin (25) and resistant to MEK inhibition (23).
The ultimate goal of our study was to find genotype–treatment responses that predicted lung adenocarcinoma patient responses. Lung adenocarcinoma patients are often treated with first-line platinum-containing combination therapies. In our analysis, Keap1-inactivation led to resistance to treatments that included carboplatin, while not promoting significant resistance to the other therapies (Fig. 3B). Interestingly, KEAP1 inactivation has been previously suggested to reduce responses to chemotherapy (44–46). To further investigate the clinical impact of tumor suppressor genotype on lung adenocarcinoma responses, we queried the tumor suppressor genotype and therapeutic benefit of platinum-containing treatments (assessed as time-to-next-treatment) of 216 patients with oncogenic KRAS-driven human lung adenocarcinomas treated at Memorial Sloan Kettering Cancer Center (Materials and Methods). When each gene was assessed individually (Supplementary Fig. S13A–S13K), both KEAP1 and LKB1 mutations were associated with worse clinical outcomes (P = 6 × 10−6, Fig. 4A and P = 0.06; Supplementary Fig. S13C and S13J, respectively). However, the marginally significant effect of LKB1 mutation appears to be driven primarily by the co-occurrence of KEAP1 and LKB1 mutations (Supplementary Fig. S13L; refs. 47, 48). This finding is also well supported by our pharmacogenomic data in which Lkb1 inactivation did not confer resistance to platinum-containing treatments (Fig. 3B). We further quantified the hazard ratio of the mutational status of the 11 genes accounting for the effect of other coincident mutations. This analysis confirmed that mutations of KEAP1 correlated with a shorter time-to-next treatment, which is consistent with our Tuba-seq results as well as previous studies on the impact of KEAP1/NRF2 pathway alterations on platinum responses (Fig. 4A and B; refs. 44, 49). Our in vivo pharmacogenomic platform, in which the responses of tumors with defined genotypes can be quantified, establishes direct causal relationships between genotype and treatment responses, and enables accurate interpretation of patient data.
Comparisons with the cell line and PDX data
Although the positive and negative predictive values of cancer cell line studies are often questioned (50), the scale at which these in vitro studies can be performed has enabled the generation of drug response data across large panels of cell lines (11, 51, 52). Our study constitutes the largest in vivo survey of GSTRs; thus, we compared our findings to a study of cell line–therapeutic responses (GDSC; ref. 5) in which all five of our monotherapies were assessed (paclitaxel, palbociclib, phenformin, everolimus/rapamycin, and trametinib; ref. 5). Among the genotype–treatment pairs assessed in both studies, nine had significant effects in our analysis, but only one of these genotype–treatment pairs was significant in GDSC (RB1-palbociclib; Fig. 4C and D; Supplementary Fig. S14A and S14B). Note that in general we would not expect excellent agreement between our results and the cell line studies, given the lack of the autochthonous environment as well as the complexity of genetic backgrounds and mutation load in cell lines (50, 53).
The PRISM/DepMap compound screen has also quantified genotype-specific treatment responses (52). We tested whether mutation of each tumor suppressor gene is associated with a better or worse response for each genotype–treatment pair (the Mann–Whitney U test with FDR correction). The log viability measured by PRISM/DepMap compound screen and ScoreGSTR predicted by Tuba-seq were significantly correlated (ρ = 0.34, P = 0.01). Among the 9 significant genotype–treatment pairs predicted by Tuba-seq, 7 are in the same direction in the PRISM/DepMap compound screen data set, although only three of these effects were significant in PRISM (Supplementary Table S2). This is likely driven in part by their small sample size and the fact that in the PRISM/DepMap compound screen data set, the results are correlative and ignore all co-occurring mutations, whereas our analysis establishes a direct causal relationship.
PDX models can also be used to test for the association between genotype and drug response. Gao and colleagues conducted a very broad PDX study, generating a total of 4,759 response curves from ∼1,000 PDXs treated with 62 treatments (Supplementary Table S3; ref. 14). We used two-way ANOVA to determine whether there are any significant genotype–treatment pairs in these PDX data where the therapies overlap with our Tuba-seq results. Overall, there was no significant correlation between our ScoreGSTR and these ANOVA results (r = 0.124, P = 0.623; ρ = 0.07, P = 0.792; Supplementary Table S4). Given the large number of mutations per PDX (642 on average for the cancers used for comparison) and the small number of response curves measured per gene–drug pair (median number of treated PDXs that have the gene of interest mutated was 6; see Supplementary Table S3), the lack of correlation is not surprising. This PDX study, despite its extremely large scope, failed to identify the positive control genotype–treatment pair of RB1-mutated tumors being resistant to CDK4/6 inhibitors. These PDX results also did not uncover that KEAP1 inactivation leads to resistance to chemotherapy, which is an interaction that has been confirmed with clinical data (Fig. 4A; ref. 44).
Discussion
Here we described and validated a scalable and quantitative in vivo pharmacogenomic preclinical model, which has high power to identify genotype–treatment responses using modest-size cohorts of mice. Although the number of mice required is modest, the total number of assayed tumors is large—on the order of thousands per mouse—providing the ability to assay a large number of tumor suppressors in the same experiment at a reasonable cost (Supplementary Fig. S15A–S15C). Indeed, although genetically engineered mouse models are key preclinical models to study genotype-specific treatment responses, traditional approaches are neither rigorously quantitative nor scalable, requiring impractically large numbers of mice. For instance, we estimated that with ten mice per group, the sensitivity of our approach would be > 99% to detect a genotype-specific treatment resistance that results in tumor sizes that are 50% larger than control tumors. If we had used a more traditional approach of comparing four cohorts of mice (with and without a specific tumor suppressor alteration and therapy-treated versus vehicle-treated), ∼300 mice/group would be required to achieve the same sensitivity for just one tumor suppressor genotype (Supplementary Fig. S15A–S15C). To build the pharmacogenomic map presented in this study, we would have needed to breed, initiate tumors in, and treat ∼10,000 mice instead of 58; thus, our system represents a >100-fold increase in throughput. Moreover, our power to detect effects is mostly limited by the number of mice per group and not by the number of tumors per mouse, allowing future iterations of this approach to query more genotypes per mouse.
We used one sgRNA per gene for the screening, and one may be concerned with the efficiency and off-target effects of the sgRNA. However, these sgRNAs have been extensively validated by previous studies. The ruggedness of the pharmacogenomic landscape further suggests the efficiency of the sgRNAs, with 7 of the 11 sgRNAs showing some genotype-specific treatment responses. Moreover, our pipeline is largely immune to off-target effects for sgRNAs, and such effects would not be expected to generate GSTRs (Supplementary Fig. S3 and Supplementary Methods and Discussion). Furthermore, neither differences in tumor number nor overall tumor burden across mice dramatically shifts tumor suppressive effects, suggesting that this method is not dramatically influenced by mouse-to-mouse variation (Supplementary Fig. S16A and S16B; ref. 29).
Our method is not only scalable and quantitative, but also allows the introduction of specific alterations into each tumor and the study of marginal effects of individual tumor suppressor genes in isolation, which is not possible using traditional cell lines or PDX approaches. Moreover, the use of genetically engineered mouse models allows autochthonous tumors to develop in their natural immunocompetent environment. This provides not only the ability to study immunotherapies but also the ability to recapitulate aspects of chemotherapy and targeted therapy responses that are influenced by adaptive immune responses.
The key result of this study, which had been suspected but never directly demonstrated, is that tumor suppressor genotype has a substantial impact on responses to a range of distinct therapies. The fact that this was not previously demonstrated experimentally is primarily due to the lack of appropriate systems, which underscores the need for higher-throughput quantitative preclinical models (27). Indeed, although databases like The Cancer Genome Atlas and GENIE databases provide valuable information on the mutational spectra in tumors, these databases generally lack treatment histories and cannot be used to study pharmacogenomic interactions. Prior cell line studies suggested that very few genotypes significantly affect drug responses (e.g., 0.24% of genotype–treatment pairs in GDSC), which we believe is largely due to the lack of statistical power. In contrast, we show that >20% of genotype–treatment pairs show interactions, suggesting a complex pharmacogenomic map of resistance and sensitivity of KRAS-driven lung adenocarcinoma.
There are some potential caveats for our PGx-Tuba-seq approach. We can only introduce a limited number of mutations into each tumor, reducing our ability to recapitulate the high tumor mutation burden and overall complexity of human tumors. Although we can study the genetic interaction among up to three genes, is it possible that even higher order interactions could modify the pharmacogenomic landscape. Furthermore, the extent to which our results recapitulate responses in patients remains unknown due to the lack of large-scale patient data sets. Thus, the interpretation of our results will benefit from further experimental, bioinformatic, and clinical evidence.
The complexity and rugged nature of this pharmacogenomic map has important implications for precision medicine. The complexity of human cancer genomics and the large number of potential therapies suggest that large-scale investigation of the pharmacogenomic maps in preclinical models will aid in patient selection. Our framework for in vivo functional genomic studies should easily allow a larger number of genes and additional monotherapies and combination therapies to be tested. Application to other genomic subtypes of lung cancer and potentially to other cancer types should further increase our knowledge of the pharmacogenomic determinants of therapy responses (54). We anticipate that the use of this platform to quantify the effects of additional therapies across a greater diversity of cancer genotypes will provide a cause-and-effect pharmacogenomic understanding from which novel biological hypotheses and precision treatment approaches will emerge.
Authors' Disclosures
H. Cai reports grants from the University of California during the conduct of the study. I.P. Winters reports grants from NIH during the conduct of the study; personal fees and other support from D2G Oncology, Inc. outside the submitted work; in addition, I.P. Winters has a patent for USPN 10,801,021 issued, licensed, and with royalties paid from D2G Oncology, Inc., a patent for USPN 10,738,300 issued, licensed, and with royalties paid from D2G Oncology, Inc., a patent for USSN 16/909,806 pending, a patent for USSN 17/017,920 pending, a patent for USSN 17/281,919 pending, a patent for USSN 63/129,161 pending, a patent for USSN 63/113,769 pending, a patent for USSN 63/132,359 pending, a patent for USSN 63/122,928 pending, a patent for USSN 63/144,866 pending, a patent for USSN 63/170,402 pending, a patent for USSN 63/173,976 pending, a patent for USSN 63/185,785 pending, and a patent for USSN 63/194,044 pending. C.M. Rudin reports personal fees from AbbVie, Amgen, AstraZeneca, Epizyme, Genentech/Roche, Ipsen, Jazz, Lilly, Syros, Bridge Medicines, Harpoon Therapeutics, and Earli outside the submitted work. D.A. Petrov reports other support from D2G Oncology outside the submitted work; in addition, D.A. Petrov is a co-founder and advisor for D2G Oncology Inc., and has a patent for compositions and methods for multiplexed quantitative analysis of cell lineages issued and licensed to D2G Oncology. M.M. Winslow reports grants from NIH during the conduct of the study; other support from D2G Oncology, Inc. outside the submitted work; in addition, M.M. Winslow has a patent for Stanford pending, issued, licensed, and with royalties paid from D2G Oncology, Inc.; and is a co-founder and advisor for D2G Oncology Inc. No disclosures were reported by the other authors.
Authors' Contributions
C. Li: Conceptualization, resources, data curation, software, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. W.-Y. Lin: Conceptualization, resources, data curation, formal analysis, visualization, methodology, writing–original draft, writing–review and editing. H. Rizvi: Resources, formal analysis, validation, methodology, writing–original draft, writing–review and editing. H. Cai: Resources, data curation, formal analysis, validation, investigation, methodology, writing–review and editing. C.D. McFarland: Conceptualization, formal analysis, investigation, methodology, writing–review and editing. Z.N. Rogers: Conceptualization, methodology, writing–review and editing. M. Yousefi: Resources, data curation, investigation, methodology, writing–review and editing. I.P. Winters: Conceptualization, methodology, writing–review and editing. C.M. Rudin: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, methodology, writing–original draft, project administration, writing–review and editing. D.A. Petrov: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. M.M. Winslow: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing.
Acknowledgments
The authors thank L. Andrejka for technical support; A. Orantes and S. Mello for administrative support; members of the Stanford Animal Care staff for excellent animal care; D. Feldser, J. Sage, J. Zhang, and members of the Winslow and Petrov laboratories for helpful comments. C. Li was the Connie and Bob Lurie Fellow of the Damon Runyon Cancer Research Foundation (DRG-2331). W-Y. Lin was supported by an American Association of Cancer Research Postdoctoral fellowship (17-40-18-LIN). H. Rizvi was supported by the Druckenmiller Center for Lung Cancer Research at Memorial Sloan Kettering. H. Cai was supported by a Tobacco-Related Disease Research Program Postdoctoral Fellowship (28FT-0019). C.D. McFarland was supported by NIH K99-CA226506. M. Yousefi was supported by a Stanford Dean's fellowship, an American Lung Association senior research training grant, and NIH F32-CA236311. I.P. Winters was supported by the NSF Graduate Research Fellowship Program and NIH F31-CA210627. This work was supported by NIH R01-CA207133 (to M.M. Winslow and D.A. Petrov), NIH R01-CA231253 (to M.M. Winslow and D.A. Petrov), NIH R01-CA234349 (to M.M. Winslow and D.A. Petrov) and NIH U01-CA199215 (C.M. Rudin).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.