Cancer differs significantly between men and women; even after adjusting for known epidemiologic risk factors, the sexes differ in incidence, outcome, and response to therapy. These differences occur in many but not all tumor types, and their origins remain largely unknown. Here, we compare somatic mutation profiles between tumors arising in men and in women. We discovered large differences in mutation density and sex biases in the frequency of mutation of specific genes; these differences may be associated with sex biases in DNA mismatch repair genes or microsatellite instability. Sex-biased genes include well-known drivers of cancer such as β-catenin and BAP1. Sex influenced biomarkers of patient outcome, where different genes were associated with tumor aggression in each sex. These data call for increased study and consideration of the molecular role of sex in cancer etiology, progression, treatment, and personalized therapy.

Significance: This study provides a comprehensive catalog of sex differences in somatic alterations, including in cancer driver genes, which influence prognostic biomarkers that predict patient outcome after definitive local therapy. Cancer Res; 78(19); 5527–37. ©2018 AACR.

Sex differences in cancer have been known at least since 1949 (1), with repeated demonstration that males have higher cancer risk both in studies using North American (e.g., SEER; ref. 2) and international databases (e.g., IARC; ref. 3). Most, but not all, tumor types show increased incidence in men: thyroid cancer occurs ∼2.5 times more frequently in women. These differences remain after controlling for known epidemiologic risk factors (3). At most tumor sites, cancers arising in men induce higher mortality (4); for example, there is a 3-fold increase in lethality from urinary bladder carcinomas in men relative to women (4). Further, there are significant differences in response to treatment: female patients with non–small cell lung cancer respond better to both surgery (5, 6) and chemotherapy (7, 8), even after accounting for differences in variables such as subtype. Female patients with colorectal cancer respond better to surgery, and this difference is driven by improved female survival in the rectal cancer subgroup (9). Similarly, female patients with colorectal cancer also respond better to chemotherapy, which is partially attributed to differences in tumor site and microsatellite instability (10). Finally, a propensity-matched study of nasopharyngeal carcinoma found that females have a survival advantage regardless of tumor stage, radiation technique, and chemotherapy regimen, but that this advantage declines and disappears during menopause (11). Some of these differences in treatment response may be attributed to differences in driver mutations between the sexes, and others to differences in epigenetics or chromatin conformation.

The origins and mechanisms of these sex differences remain a major unresolved question in cancer biology. They may be caused by differences in the expression of genes on the sex chromosomes, in hormone levels, in developmental biology, or in lifestyle features not reflected in current epidemiologic studies. Likely, a mixture of all these components contributes to sex differences in patient outcomes. We hypothesized that, independent of their mechanism, sex differences in cancer would be reflected by differences in somatic mutation profiles. That is, male and female tumors would acquire mutations at different rates and of different types. Recent intriguing data on missense mutations in melanoma support this hypothesis (12). We, therefore, undertook a systematic evaluation of sex-associated biases in mutations in cancer across a broad range of tumor types. Our study provides a comprehensive pan-cancer catalog of sex-biased mutations and a perspective on sex-specific prognostic biomarkers.

Data acquisition and processing

mRNA abundance, DNA genome-wide somatic copy-number and somatic mutation profiles for the Cancer Genome Atlas (TCGA) datasets were downloaded from Broad GDAC Firehose (https://gdac.broadinstitute.org/), release 2016-01-28. For mRNA abundance, Illumina HiSeq rnaseqv2 level 3 RSEM-normalized profiles were used. Genes with >75% of samples having zero reads were removed from the respective data set. GISTIC v2 (13) level 4 data were used for somatic copy-number analysis. mRNA abundance data were converted to log2 scale for subsequent analyses. Mutational profiles were based on TCGA-reported MutSig v2.0 calls. All preprocessing was performed in the R statistical environment (v3.1.3).

Patients younger than 18, older than 85 or lacking sex annotation were excluded from analysis, resulting in a sample size of 7,131 across all tumor types for copy-number alterations (CNA; 1.5% excluded, Supplementary Table S1) and 6,073 for somatic single-nucleotide variants (SNV; 1.5% excluded; Supplementary Table S1). Genes were excluded if they were mutated in fewer than 20 patients (for CNAs) or 5% of patients (for SNVs). Gene filters were applied independently for pan-cancer and per individual tumor type data set. All analyses excluded genes on the X and Y chromosomes.

Mutation load

Mutation load per patient was calculated as the sum of SNVs across all genes on the autosomes. Mutation load was Box–Cox transformed, and transformed values were compared between the sexes using unpaired two-sided t tests for both pan-cancer and tumor type–specific analysis. A linear regression model was used to adjust mutation load for tumor type for the pan-cancer comparison. Tumor type–specific P values were adjusted using the Benjamini–Hochberg false discovery rate procedure. Tumor types with q values meeting an FDR threshold of 10% were further analyzed using linear regression to adjust for tumor type–specific variables described in Supplementary Table S1. A multivariate q value threshold of 0.05 was then used to determine statistical significance. Full results are in Supplementary Table S2.
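The transformation and multiple-testing steps described above can be sketched as follows. This is an illustrative Python sketch rather than the R code used in the study: the function names are hypothetical, the Box–Cox parameter is passed in rather than estimated from the data, and the transform assumes strictly positive mutation loads.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a fixed lambda.

    Assumption: lambda is supplied by the caller; estimating it from the
    data (as statistical packages usually do) is not shown here.
    """
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so q values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

Tumor types whose q value falls below 0.1 would then be carried forward to the multivariate models.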

Genome instability

Genome instability was calculated as the percentage of the genome affected by copy-number alterations. The number of base pairs for each CNA segment was summed to obtain a total number of base pairs altered per patient. The total number of base pairs was divided by the number of assayed bases excluding the sex chromosomes (∼7.8 million bp) to obtain the percentage of the genome altered (PGA). Box–Cox transformed PGA was treated as a continuous variable and compared by sex using two-sided unpaired t tests for all tumor types combined (pan-cancer) and separately (tumor type–specific). Linear regression models were used to adjust PGA for tumor type, age, and race for the pan-cancer comparison. Tumor types where univariate testing indicated putative sex biases in PGA (FDR threshold of 10%) were also adjusted for tumor type–specific variables (Supplementary Table S1). A q value threshold of 0.05 was used to determine statistical significance for multivariate results and full results are presented in Supplementary Table S2.
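A minimal sketch of the PGA calculation, assuming each patient's CNA segments are available as non-overlapping (start, end, call) tuples with inclusive coordinates and that a call of 0 denotes a copy-number neutral segment. The data layout and function name are assumptions made for illustration.

```python
def percent_genome_altered(segments, assayed_bases):
    """Percentage of assayed autosomal bases covered by altered segments.

    segments: iterable of (start, end, call); call != 0 means altered.
    Assumption: coordinates are inclusive and segments do not overlap.
    """
    altered = sum(end - start + 1 for start, end, call in segments if call != 0)
    return 100.0 * altered / assayed_bases

# Toy example: 150 altered bases out of 1,000 assayed.
example = [(1, 100, 1), (201, 300, 0), (401, 450, -1)]
pga = percent_genome_altered(example, assayed_bases=1000)
```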

Genome-spanning CNA analysis

Adjacent genes whose copy-number profiles across patients were highly correlated (Pearson r > 0.95) were binned. The copy-number call for each patient was taken to be the majority call across all genes in each bin. Copy-number calls were collapsed to ternary (loss, neutral, gain) representation by combining loss groups (monoallelic and biallelic) and gain groups (low and high). The number of loss, neutral, and gain calls was summed per bin and sex, and assessed using univariate and multivariate techniques. For univariate analysis, proportional differences between the sexes for gains and losses were tested for each bin using proportions tests. To account for multiple testing, FDR correction was performed and an FDR threshold of 10% was used to select bins for further multivariate analysis.
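The binning and ternary collapse can be illustrated with the sketch below. It rests on assumptions not stated in the text: copy-number calls are coded −2 to 2, each gene is compared only with its immediate predecessor in genomic order, and constant profiles are treated as uncorrelated. All function names are hypothetical.

```python
import math
from collections import Counter

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: constant profiles never extend a bin
    return sxy / math.sqrt(sxx * syy)

def to_ternary(call):
    """Collapse calls coded -2..2 to loss (-1), neutral (0), or gain (1)."""
    return (call > 0) - (call < 0)

def bin_adjacent_genes(profiles, threshold=0.95):
    """Group genomically ordered genes into bins of gene indices.

    profiles: one list of per-patient calls per gene, in genomic order.
    A gene joins the current bin if it correlates with the previous gene.
    """
    if not profiles:
        return []
    bins = [[0]]
    for g in range(1, len(profiles)):
        if pearson(profiles[g - 1], profiles[g]) > threshold:
            bins[-1].append(g)
        else:
            bins.append([g])
    return bins

def majority_call(profiles, gene_bin, patient):
    """Most common ternary call across the genes of one bin for one patient."""
    calls = [to_ternary(profiles[g][patient]) for g in gene_bin]
    return Counter(calls).most_common(1)[0][0]
```

The per-bin counts of loss, neutral, and gain calls for each sex would then feed the proportions tests.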

After identifying candidate pan-cancer significant bins from univariate proportions testing, generalized linear modeling was used to reduce false positives that may arise from unbalanced tumor type subsets of the pan-cancer data. Multivariate logistic regression (MLR) was used to adjust ternary CNA data for sex, age, race, and tumor type. The MLR sex term was tested for significance and FDR corrected to identify bins with pan-cancer sex biases (q < 0.05).

The same approach was applied to each tumor type individually. Proportions tests were used to select bins for multivariate analysis (q value < 0.1). MLR was again used to adjust ternary copy-number calls for clinical variables; the model variables differed between tumor types depending on the available clinical data. Tumor type–specific models were fit independently for each univariately significant bin, and variable significance for each bin was extracted from the fitted models. P values were FDR corrected and a q value threshold of 0.05 was applied. A description of pan-cancer and tumor type–specific models, along with a breakdown of the data for each group, can be found in Supplementary Table S1 and results can be found in Supplementary Tables S3–S5.

CNA-mRNA functional analysis

Genes in bins altered by sex-biased CNAs after multivariate adjustment for kidney clear cell and kidney papillary cell cancers were further investigated to determine sex-biased functional effects. Available mRNA samples were matched to those used in CNA analysis. For each gene affected by a sex-biased loss, its mRNA abundance was modeled against sex, copy-number loss status, and a sex–copy-number loss interaction term. The interaction term was used to identify genes with sex-biased mRNA changes. FDR-adjusted P values and fold changes were extracted for visualization. A q value threshold of 0.05 was used for statistical significance. For genes affected by sex-biased gains, the same procedure was applied using copy-number gains.
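For intuition about the interaction term: when sex and copy-number loss status are both binary and no other covariates are included, the least-squares interaction coefficient equals a difference of differences in group means. The sketch below computes that point estimate only, with male and copy-number neutral as the reference levels. The record layout and function name are assumptions, and the significance testing used in the study is not reproduced.

```python
from statistics import mean

def interaction_estimate(records):
    """Sex-by-loss interaction estimate for one gene.

    records: iterable of (sex, loss, log2_mrna) with sex in {"M", "F"}
    and loss a boolean. All four sex/loss groups must be present,
    otherwise a KeyError is raised.
    """
    cells = {}
    for sex, loss, value in records:
        cells.setdefault((sex, bool(loss)), []).append(value)
    m = {key: mean(vals) for key, vals in cells.items()}
    effect_f = m[("F", True)] - m[("F", False)]  # loss effect in females
    effect_m = m[("M", True)] - m[("M", False)]  # loss effect in males
    return effect_f - effect_m
```

A nonzero estimate indicates that copy-number loss changes mRNA abundance by a different amount in the two sexes.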

CNA-mRNA survival analysis

Genes found to have significant or trending (FDR threshold of 10%) sex biases in the CNA-mRNA functional analysis were further analyzed using Cox proportional hazards modeling. That is, we focused on genes that were both altered by sex-biased CNAs (MLR q value < 0.05) and showed mRNA abundance differences between the copy-number neutral and loss/gain groups for either sex (sex–loss interaction q < 0.1). For each gene, the mRNA abundance was median dichotomized over all samples to identify low- and high-expression groups. Cox proportional hazard regression models incorporating sex, mRNA group, and a sex–mRNA group interaction were fit for overall survival after checking the proportional hazards assumption. FDR-adjusted interaction P values and log2 hazard ratios were extracted for visualization. A q value threshold of 0.1 was used to identify genes with sex-influenced survival.
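Median dichotomization can be sketched as below. The text does not say how values exactly equal to the median were assigned, so placing them in the low group is an assumption, as is the function name.

```python
from statistics import median

def median_dichotomize(values):
    """Label each sample 'high' or 'low' relative to the median over all samples.

    Assumption: values equal to the median are assigned to the low group.
    """
    cutoff = median(values)
    return ["high" if v > cutoff else "low" for v in values]

groups = median_dichotomize([7.2, 8.9, 8.1, 9.4])
```

The resulting group labels, sex, and their interaction would be the covariates of the Cox model.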

Genome-spanning SNV analysis

We focused on genes mutated in at least 5% of patients. All genes tested are listed in Supplementary Table S6. Mutation data were binarized to indicate presence or absence of SNV in each gene per patient. Proportions of mutated genes were compared between the sexes using proportions tests for univariate analysis. FDR correction was used to adjust P values and a q value threshold of 0.1 used to select genes for multivariate analysis.
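A two-sample proportions test for a single gene might look like the sketch below. It uses the pooled z statistic without continuity correction, whereas R's prop.test applies a continuity correction by default, so P values may differ slightly from those in the study. The function name and example counts are illustrative.

```python
import math

def two_proportion_test(x1, n1, x2, n2):
    """Two-sided pooled z test for equality of two proportions.

    x1/n1: mutated and total samples in group 1 (e.g., male-derived);
    x2/n2: the same for group 2. No continuity correction is applied.
    Returns (difference in proportions, P value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return p1 - p2, 1.0  # no variation: nothing to test
    z = (p1 - p2) / se
    # Two-sided normal tail probability via the complementary error function.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1 - p2, p_value

diff, p_value = two_proportion_test(60, 100, 40, 100)
```

Per-gene P values would then be FDR adjusted before genes are selected for logistic regression.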

After identifying pan-cancer univariately significant genes from proportions testing, binary logistic regression (LR) was used to reduce false positives that may arise from unbalanced tumor type subsets of the pan-cancer data. Age and race were also included in the pan-cancer model. FDR correction was again applied and genes with significant pan-cancer sex terms were extracted from the models (q value < 0.05).

LR was also used for multivariate analysis of each individual tumor type to adjust for clinical variables. The same model variables from the CNA MLR models were used. Tumor type–specific models were fitted independently per univariately selected gene and variable significance for each gene was extracted from the fitted models as P values. FDR correction was used to adjust P values and an LR q value threshold of 0.05 was used. A description of pan-cancer and tumor type–specific models can be found in Supplementary Table S1. A summary of results can be found in Supplementary Table S5.

Validation of sex biases

Copy-number data for tumor types with sex-biased CNAs were downloaded from the Progenetix database (14) as a meta-analysis data set. Matching genomic regions were analyzed using proportions tests to validate genes in sex-biased CNAs. Similarly, somatic SNV data were obtained from cBioPortal and the ICGC Data Portal and analyzed to validate sex-biased somatic SNV load and genes with sex-biased mutation frequencies. A description of validation data, data sources, and results is available in Supplementary Table S7.

Multigene prognostic models

Computationally purified tumor mRNA profiles for the Director's Challenge data were downloaded (15). The training and validation cohorts were processed and split as previously described and were checked for balance between male and female samples. Colon transcriptomic data were downloaded (16, 17) and reprocessed and normalized. Colon training and validation cohorts were balanced for data source, sex, and survival status. Survival modeling was performed using overall survival as the clinical endpoint for both datasets.

To identify genes univariately associated with survival, purified mRNA abundance was median dichotomized for each gene to identify low- and high-expression groups. Cox proportional hazard regression models included variables for sex, mRNA group, and the sex–mRNA group interaction, and P values and log2 hazard ratios were extracted for visualization. A P value threshold of 0.01 was used to determine statistical significance.

Ridge regression models were used to train 50,000 randomly generated 100-gene prognostic signatures. The glmnet package (v2.0-5) was used to run 10-fold cross-validation using glmnetcv (α = 0.1) and AUC as the type measure. Signatures were trained using the training cohort and validated in the validation cohort. Signatures were then run on male- and female-only validation patients, and Cox proportional hazards modeling was performed. Signatures that failed the proportional hazards assumption were removed from analysis. The same approach was used to train a signature using the top 100 univariately significant genes.
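Drawing the random gene sets is straightforward to sketch. The seed, gene identifiers, and function name below are hypothetical, and the model fitting with glmnet is not reproduced.

```python
import random

def random_signatures(genes, n_signatures, size, seed=0):
    """Draw gene sets of a fixed size, sampling without replacement within each set.

    Assumption: a fixed seed is used so the draw is reproducible.
    """
    rng = random.Random(seed)
    return [rng.sample(genes, size) for _ in range(n_signatures)]

# Toy example with placeholder gene names; the study drew 50,000 signatures.
gene_pool = [f"gene_{i}" for i in range(500)]
signatures = random_signatures(gene_pool, n_signatures=5, size=100)
```

Each signature would then be trained on the training cohort and scored separately in the male-only and female-only validation patients.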

Statistical analysis and data visualization

All statistical analyses and data visualization were performed in the R statistical environment (v3.2.1) using the BPG (v5.3.4; ref. 18), mlogit (v0.2-4), glmnet (v2.0-5), and pROC (v1.8) packages.

Sex biases in mutation burden

We leveraged data from TCGA studies comprising 7,131 matched tumor–normal pairs of 18 tumor types: 4,265 from males and 2,866 from females (Supplementary Table S1). We focused on somatic CNAs and SNVs in protein-coding genes as they are well-established driver events. These data are well powered to detect differences in driver-gene mutation frequencies between tumors arising in men and those arising in women (Supplementary Fig. S1). We excluded genes and regions of the X and Y chromosomes and analyzed autosomal differences (19).

We first compared pan-cancer mutational burden between tumors arising in men and those arising in women. Male-derived tumors exhibited a higher density of somatic-coding SNVs than female-derived tumors in univariate analysis (difference in means = 0.17; 95% CI, 0.14–0.20, P = 1.0 × 10−29, unpaired Welch t test on Box–Cox transformed mutation load; Supplementary Fig. S2). This sex bias persisted even after multivariate analysis adjusting for imbalances in sample numbers across tumor type, race, and age (linear regression P = 4.5 × 10−6; Supplementary Table S2). After finding sex differences on the pan-cancer level, we asked if there were such differences within individual cancer types and focused our analysis on each tumor type. Six of these showed univariate sex biases in mutation density (10% FDR threshold; Supplementary Fig. S2) and were further investigated using tumor type–specific multivariate modeling. Again, we used Box–Cox transformation and linear modeling to determine whether sex remained a significant variable after adjusting for possible confounders (linear regression q values given in Fig. 1A; model-specific variables described in Supplementary Table S1). Finally, because the association between sex and mutation load may be biased by later stage male-derived tumors (Supplementary Table S1), we created a sub–pan-cancer model using only tumor types with stage data and found that higher mutation prevalence in male-derived samples persisted after accounting for stage. A summary of univariate and multivariate results can be found in Supplementary Table S2.

Figure 1.

Mutation burden is sex biased. We found sex differences in somatic mutation load (A) and genome instability (B). Each point represents a sample (male-derived, blue; female-derived, pink). We focused on tumor types with univariately significant sex differences in mutation and show q values from multivariate modeling here. Red lines show mean mutation burden for each group. C, Mosaic map showing the relationship between microsatellite instability and sex in stomach and esophageal cancer. D, Higher male mutation prevalence emerges after adjusting for microsatellite instability. Adjusted Box–Cox transformed data are shown.


Of the six tumor types with univariate sex differences (Fig. 1A), males exhibited more somatic-coding SNVs in bladder urothelial cancer (BLCA: difference in Box–Cox means = 0.55; 95% CI, 0.20–0.90; multivariate q = 3.6 × 10−3), melanoma (SKCM: difference in Box–Cox means = 0.78; 95% CI, 0.29–1.3; multivariate q = 0.037), renal papillary cell cancer (KIRP: difference in Box–Cox means = 2.2; 95% CI, 0.81–3.6; multivariate q = 0.019), and liver hepatocellular cancer (LIHC: difference in Box–Cox means = 0.16; 95% CI, 0.049–0.27; multivariate q = 0.019). There was an opposite trend in glioblastoma where female-derived samples had higher mutation burden (GBM: difference in Box–Cox means = 1.6; 95% CI, 0.14–3.0; multivariate q = 0.094). Using independent sequencing datasets, we validated the male biases seen in bladder, liver, lung adenocarcinoma, and skin cancers (Supplementary Table S7).

To see if these sex differences affected multiple mutation types, we also compared the load of CNAs across tumor types based on the percentage of genome altered, which is a prognostic marker in several tumor types (20–22). A putative univariate sex bias in pan-cancer PGA was not significant after multivariate adjustment (Supplementary Fig. S2); however, 4/18 individual tumor types showed univariate sex differences in PGA (Supplementary Fig. S2). These were further analyzed with multivariate modeling to examine the influence of sex (Fig. 1B; Supplementary Table S2). Males showed elevated genomic instability in stomach and esophageal cancer (STES: difference in Box–Cox means = 1.7; 95% CI, 0.92–2.4; multivariate q = 9.7 × 10−3), head and neck cancer (HNSC: difference in Box–Cox means = 1.9; 95% CI, 1.0–2.8; multivariate q = 0.016), and kidney clear cell cancer (KIRC: difference in Box–Cox means = 0.40; 95% CI, 0.14–0.67; multivariate q = 0.019). A strong opposite trend was seen in sarcoma, where PGA was higher in female-derived tumors (SARC: difference in Box–Cox means = 1.5; 95% CI, 0.41–2.7; multivariate q = 0.021).

Measures of mutation burden such as SNV load and PGA may be correlated with defects in DNA mismatch repair (MMR). For example, microsatellite instability (MSI), a marker of defective DNA MMR, is more common in some tumor types (23) and could be a confounder in the relationship between mutation burden and sex. We further examined three tumor types with available MSI-monodinucleotide assay data: colorectal, pancreatic, and stomach and esophageal cancers. In samples with MSI data (Supplementary Table S1), we found an association between MSI and sex in stomach and esophageal cancer (Pearson χ2 P = 1.4 × 10−5; 40% of female-derived samples vs. 26% of male-derived samples; Fig. 1C) and colorectal cancer (Pearson χ2 P = 0.025; 33% of female-derived samples vs. 25% of male-derived samples; Supplementary Fig. S3). By contrast, MSI status was not sex associated in pancreatic cancer (Pearson χ2 P = 0.63). Incorporating MSI into our analyses of SNV burden and PGA, we first noted that MSI was associated with increased SNV burden but not PGA in all three tumor types. We then used multivariate models including MSI to examine the interplay between sex, MSI, and mutation burden. Intriguingly, though there was no univariate relationship between sex and SNV burden in stomach and esophageal cancer, a novel sex bias emerged after adjusting for MSI (multivariate P = 0.023; Fig. 1D). We observed the same effect in an independent data set (Supplementary Table S7). The association between sex and PGA persisted in this new model, reinforcing the sex bias in PGA for this tumor type. Because MSI is thought to result from defective DNA MMR, we also looked for sex biases specifically in a set of seven MMR genes (24). Though we did not find sex biases in the mutation rates of DNA MMR genes, we observed significantly lower mRNA abundance in female-derived tumors for MLH1 (male mean = 8.89, female mean = 8.5, 95% CI, 0.19–0.62, t test q = 0.0011) and PMS2 (male mean = 9.0, female mean = 8.87, 95% CI, 0.05–0.21, t test q = 0.0060). Taken together, this suggests that differential mRNA abundance may form a link between MMR and sex biases in mutation load in stomach and esophageal cancer. We did not find novel sex biases in colorectal or pancreatic mutation burden after accounting for MSI (Supplementary Fig. S3).

To investigate whether sex-biased mutation load is generally associated with DNA MMR, we also looked specifically at MMR genes in all tumor types with sex-biased mutation load. We found decreased MSH2 (male mean = 8.45, female mean = 8.83, 95% CI, 0.22–0.53, t test q = 3.98 × 10−6), MSH3 (male mean = 8.50, female mean = 8.71, 95% CI, 0.082–0.34, t test q = 1.51 × 10−3), MSH6 (male mean = 9.12, female mean = 9.65, 95% CI, 0.37–0.67, t test q = 4.57 × 10−10) and PMS1 (male mean = 7.71, female mean = 8.01, 95% CI, 0.14–0.46, t test q = 2.26 × 10−4) mRNA abundance in male kidney papillary tumors, corresponding with higher male mutation prevalence. Similarly, male mRNA abundance of PMS2 (male mean = 8.84, female mean = 8.97, 95% CI, 0.025–0.24, t test q = 0.055) and MLH3 (male mean = 8.70, female mean = 8.87, 95% CI, 0.039–0.30, t test q = 0.055) was also lower than that of female-derived tumors in liver cancer. This suggests that for some tumor types, differences in mutation load may be explained by sex biases in the efficiency of MMR. Taken together, this analysis of mutation burden identified sex biases across several tumor types even after adjusting for race, tumor stage, and smoking history, among others. Indeed, a sex bias in stomach and esophageal cancer was only discovered after adjusting for MSI status, highlighting its importance. Finally, changes in the abundances of DNA MMR mRNA form a candidate mechanism for sex biases in mutation density.

Sex biases in somatic CNAs

Differences in mutation density might reflect changes in specific driver genes, or alternatively global changes as might be induced by differences in DNA damage or repair. To distinguish these possibilities, we compared male- and female-derived tumors in the entire pan-cancer cohort (Fig. 2A). We binned adjacent genes across all samples so that all genes within a bin had highly correlated sample CNA profiles (Pearson r > 0.95). We then calculated the average bin CNA profile per sample and compared the rates of bin gain and loss between the sexes using proportions tests. Bins that were significant in univariate analysis (10% FDR threshold) were further analyzed using MLR. Bins with MLR q values < 0.05 contain genes lost or gained at significantly different rates between the sexes.

Figure 2.

Functional sex differences in CNAs are associated with outcome. Sex differences in CNAs for pan-cancer (A) and kidney clear cell cancer (B). Each plot shows, from top to bottom, the q value showing significance of sex from multivariate modeling, with yellow (green) points corresponding to 0.01 < q < 0.05 and deep blue (red) points corresponding to q < 0.01; the proportion of samples with aberration; the difference in proportion between male and female groups for amplifications; the same repeated for deletions; and the CNA profile heat map. The columns represent genes ordered by chromosome. Light blue and pink points represent data for male- and female-derived samples, respectively. C, Transcriptome differences between the sexes are seen in the interaction between sex and copy-number loss status in mRNA abundance modeling. Red points are genes with significant sex–copy-number loss interaction terms (q < 0.05). D, Genes with sex-biased copy-number loss and mRNA changes are associated with differential overall survival outcomes between the sexes. Again, the interaction term estimate and q values were used to determine genes with sex biases in survival. E, LATS1 is a marker of poor overall survival in women, but not in men.


In pan-cancer analysis, we discovered sex-associated differential CNAs in broad genomic segments covering 3,442 of the 23,693 genes annotated to autosomes. The vast majority of these (94.5%; 3,251 genes) were amplifications. Concordant with PGA observations (Fig. 1B), most were more prevalent in male-derived tumors (Supplementary Tables S3 and S4). Numerous cancer driver genes were sex biased in their CNA profile. For example, the MYC oncogene was amplified in 48% of male-derived tumors vs. 37% of female-derived tumors (q = 0.037, MLR). Hence, sex biases are seen in both genome-wide and in pan-cancer gene-specific CNA mutation profiles.

To evaluate if these large-scale pan-cancer differences in CNAs of specific genes also occurred in individual tumor types, we applied the same methodology to each tumor type. We created tumor type–specific gene bins and again used multivariate modeling to control for tumor type–specific factors (Methods; Supplementary Table S1). Sex-biased CNAs affecting thousands of genes were detected in eight tumor types: kidney clear cell, kidney papillary, head and neck, stomach and esophageal, liver, bladder, and both lung adenocarcinoma and squamous cell cancer (Fig. 2B; Supplementary Figs. S4–S10; Supplementary Table S5). Some sex-biased events were highly focal, such as female-biased loss of NCKAP5 in head and neck cancer (19% of male-derived vs. 37% of female-derived tumors; q = 0.046, MLR; Supplementary Fig. S5; Supplementary Table S3). Other sex-biased events covered broad genomic segments, such as whole-chromosome arms.

We performed pathway enrichment analysis for each tumor type to investigate functional consequences of sex-biased CNAs. Significant gene ontology terms related to genes in sex-biased gains and losses were found using g:Profiler (25) and interaction networks were visualized in Cytoscape (26) using Enrichment Map (Supplementary Fig. S11; ref. 27). The largest perturbed networks included metabolic processes in liver cancer, as well as nuclear organization and regulation processes in kidney clear cell cancer. In head and neck cancer, sex-biased CNAs affect genes related to lipoprotein and sterol activity. Immune-related processes were significant in several tumor types including stomach and esophageal and both kidney clear cell and papillary cancers. These pathway results suggest sex-biased CNAs may lead to downstream biases in biological processes.

To further characterize the consequences of sex-biased CNAs, we focused on kidney clear cell tumors (KIRC), a tumor type with robust statistical power (nmale = 336; nfemale = 185; Supplementary Fig. S1) and strong evidence of sex-biased PGA (Fig. 1B). After multivariate adjustment for age, race, stage, and grade, we identified 3,581 genes contained in sex-biased losses and 138 genes contained in sex-biased gains. All of these were more commonly mutated in male-derived tumors (Fig. 2B; Supplementary Tables S4 and S5). All sex-biased CNAs were broad events, with large losses of chromosomes 3, 6p, 8q, and 9 (covering the driver genes TSC1 and CDKN2A (28)). Most prominent of these was a large region from 3p11.1 to 3p12.3 deleted in ∼60% of male-derived tumors but only ∼35% of female-derived tumors (q < 10−3 for all genes; MLR).

To determine if these sex-biased CNAs influence the tumor transcriptome, we evaluated mRNA abundances in matched patient samples. We first focused on genes within large segments of sex-biased losses. We used linear regression to model mRNA as a function of sex, copy-number loss vs. no copy-number loss, and the interaction between sex and copy-number loss status. This allowed us to identify not only mRNA changes associated with copy-number loss alone, but also interactions where sex and copy-number loss synergize for an additional effect on mRNA. Approximately half of genes in regions affected by sex-biased copy-number losses were associated with changes in mRNA abundance (Supplementary Fig. S12), indicating that sex-biased CNAs have transcriptional consequences. In addition, there were multiple genes with interaction effects (10% FDR threshold) on chromosomes 3, 6, and 9 (Supplementary Fig. S12, red lines), including genes where sex and copy-number loss together changed mRNA abundance over 2-fold relative to their effects in isolation. These sex-copy-number interactions suggested that sex-biased copy-number changes induce transcriptional changes, and in some cases these changes vary by sex.

Next, we extended our focus to all genes affected by sex-biased losses (proportions test q < 0.1 and MLR q < 0.05) whose mRNA was repressed across samples with the loss. Applying the same linear regression model, we examined the effect of copy-number loss in the sexes and again extracted both the copy-number loss and the sex-copy-number loss interaction terms. Of the 2,165 genes, 74% showed associations between copy-number loss and decreased mRNA abundance (Supplementary Fig. S13, black points). In addition, copy-number loss affected mRNA abundance differently between the sexes in 36 genes that showed significant interactions between sex and copy-number loss (sex-loss interaction, q < 0.1; Fig. 2C, red points). Thus, sex-biased CNAs are associated with divergent transcriptomes in male- and female-derived tumors.

Finally, to demonstrate that these transcriptomic divergences are functional and clinically relevant, we evaluated the association of the 36 genes with sex biases in both CNAs and mRNA abundance (sex–loss interaction q < 0.1) with overall patient survival. Using univariate Cox proportional hazards modeling, we identified 16 sex-biased genes associated with outcome in both male and female tumors (Fig. 2D). Several genes showed strikingly divergent clinical associations, and all 16 with sex-biased survival were more prognostic in female-derived samples than male. For instance, loss of LATS1 was a marker of good prognosis in women (HR = 0.39; 95% CI, 0.17–0.85, q = 0.03), but not men (HR = 1.2; 95% CI, 0.80–1.8, q = 0.67; Fig. 2E). Conversely, UBAC1 loss was a marker of poor overall survival in women (HR = 2.64; 95% CI, 1.5–4.6, q = 0.0037) but not men (HR = 1.4; 95% CI, 0.95–2.1, q = 0.34; Supplementary Fig. S14). Similar patterns of sex-associated CNAs inducing divergent transcriptomes associated with clinical aggressivity were observed for KIRP (Supplementary Fig. S15), demonstrating the generality of this phenomenon.
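To make the direction and strength of these hazard ratios easier to read, the sketch below recovers an approximate standard error, z score, and unadjusted two-sided P value from a reported hazard ratio and its 95% confidence interval. It assumes a Wald interval that is symmetric on the log scale and that each ratio compares samples with the loss to samples without it; the function name is hypothetical. The resulting P values are approximations and are not expected to match the FDR-adjusted q values quoted in the text.

```python
import math
from statistics import NormalDist

def wald_from_ci(hr, lower, upper, level=0.95):
    """Approximate the log-HR standard error, z score and two-sided P value
    from a hazard ratio and its confidence interval (Wald approximation)."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p

# Hazard ratios and intervals quoted in the text for female patients.
lats1 = wald_from_ci(0.39, 0.17, 0.85)
ubac1 = wald_from_ci(2.64, 1.5, 4.6)
```

A negative z score corresponds to a hazard ratio below 1 (lower hazard in the group with the loss), and a positive z score to a hazard ratio above 1 (higher hazard).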

Taken together, these data demonstrate that the frequency of somatic CNAs in specific genes is sex biased in many, but not all tumor types. These differences do not appear to be a result of well-known clinical or epidemiologic factors. Sex-biased CNAs are associated with sex biases in the transcriptome (and presumably the proteome as well), and these transcriptomic differences are associated with differences in clinical outcome within and between the sexes.

Sex biases in somatic SNVs

We next asked whether sex differences were specific to somatic CNAs or if they also occurred in other mutation types. We compared the proportions of male-derived (nmale = 3,591) and female-derived (nfemale = 2,482) samples with SNVs in pan-cancer univariate analysis. Similar to our CNA analysis, we adjusted for unequal sample numbers of the tumor types and other factors using LR, here with a binary response variable indicating whether the gene harbored SNVs or not. In total, we tested 103 genes that were mutated in at least 5% of samples (Supplementary Table S6). Of these, four genes showed sex biases after adjustment for tumor type, age, and race, and all four showed elevated mutation rates in male-derived samples (Fig. 3A). Some of these mutations may be passengers and reflect increased DNA damage in male-derived tumors.
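The adjusted comparison described above can be sketched as a logistic regression of mutation status on sex plus a tumor-type indicator. This is a minimal illustration, assuming that LR denotes logistic regression, using only two hypothetical tumor types, and omitting age and race; all counts and function names are invented. It is fitted with a plain Newton-Raphson loop so that it has no dependencies outside the standard library.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson maximum likelihood fit; rows of X include a leading 1.0."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

def expand(cells):
    """Turn (male, type_b, n_mutated, n_total) counts into per-sample rows."""
    X, y = [], []
    for male, type_b, n_mut, n_total in cells:
        for i in range(n_total):
            X.append([1.0, float(male), float(type_b)])
            y.append(1 if i < n_mut else 0)
    return X, y

# Hypothetical counts for one gene in two tumor types (not study data).
cells = [(1, 0, 30, 100), (0, 0, 10, 100), (1, 1, 20, 50), (0, 1, 8, 50)]
X, y = expand(cells)
beta = fit_logistic(X, y)
odds_ratio_male = math.exp(beta[1])  # sex effect adjusted for tumor type
```

The exponentiated sex coefficient is the odds ratio for mutation in male versus female samples after adjusting for tumor type. In practice a statistics package would be used to obtain standard errors and P values for each gene.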

Figure 3.

Sex biases in driver SNVs. Sex differences in somatic SNVs for pan-cancer (A), stomach and esophageal carcinoma (B), hepatocellular carcinoma (C), and kidney clear cell cancer (D). Each plot shows, from top to bottom, the q value for significance of sex from multivariate modeling, with yellow points corresponding to 0.01 < q < 0.05 and green points corresponding to q < 0.01; proportion of samples with aberration; difference in proportion between male and female groups; mutation prevalence across all samples and a heat map showing mutation status for each sample. E, BAP1 mRNA abundance compared across sex and mutation status for kidney clear cell cancer. The P value for the sex–SNV interaction from mRNA modeling is shown. F, Mutated BAP1 is associated with poor prognosis in female patients, but not male patients in kidney clear cell cancer.

Similarly to our CNA analysis, we next evaluated each of the 18 tumor types independently. We screened for candidate mutations using univariate analyses and FDR adjustment and then performed multivariable modeling. We excluded genes mutated in less than 5% of samples, meaning many lower-frequency sex-biased genes have not yet been uncovered and our results represent a lower bound of sex biases in somatic SNVs. Of the 18 tumor types evaluated, four exhibited sex-biased mutations in specific genes: stomach and esophageal, hepatocellular carcinoma, and both kidney clear cell and kidney papillary cell cancers (Fig. 3B–D; Supplementary Fig. S16; Supplementary Table S6). In stomach and esophageal cancer, all 10 sex-biased genes were mutated in a greater fraction of female-derived samples, including a number of transcription factors such as ZFHX3 (95% CI of the difference, 2.5%–15%, q = 0.018, LR), ZBTB20 (95% CI of the difference, 2.2%–14%, q = 0.034, LR), and GTF3C1 (95% CI of the difference, 4.0%–16%, q = 0.012, LR; Fig. 3B; Supplementary Table S6).
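The screening step relies on FDR adjustment of the univariate P values. The sketch below shows one common way to compute such q values, the Benjamini-Hochberg procedure; the text does not state which FDR method was used, so this choice, the function name, and the example P values are assumptions for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (q values) in input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q[i] = running_min
    return q

# Hypothetical univariate P values for five genes.
p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)
# Genes passing a 10% FDR screen would go on to multivariable modeling.
candidates = [i for i, value in enumerate(q) if value < 0.1]
```

Genes that pass the screen would then be tested in the multivariable model, which is why the final q values in the text come from the adjusted analysis rather than from this first pass.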

The largest differences in mutation frequency were seen in liver carcinoma, where two genes showed dramatic sex biases in mutation frequency (Fig. 3C; Supplementary Table S6). Male tumors were strongly enriched for mutations in β-catenin (CTNNB1), with 33% of male-derived tumors harboring a mutation compared with 12% of female-derived tumors (95% CI for the difference, 12%–30%, q = 0.0014, LR). These large differences suggest mutational associations with etiologic factors. For example, CTNNB1 mutations tend to occur more frequently in tumors associated with hepatitis B (95% CI for the difference, −1.9% to 27%, P = 0.07), but sex remains significant even after accounting for viral and alcohol risk factors. We validated this higher male mutation frequency in CTNNB1 in an independent patient cohort from the Liver Cancer—NCC, JP project on the ICGC Data Portal (17% higher; 95% CI for the difference, 9.7%–25%, P = 2.7 × 10⁻⁵; Supplementary Table S7).
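A confidence interval for a difference in mutation frequency between the sexes can be sketched with a simple Wald interval for two independent proportions. The counts below are hypothetical, chosen only to give frequencies of 33% and 12%, and the interval method is an assumption; the intervals reported in the text may have been computed differently.

```python
import math
from statistics import NormalDist

def diff_in_proportions_ci(x1, n1, x2, n2, level=0.95):
    """Wald confidence interval for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    d = p1 - p2
    return d, d - z * se, d + z * se

# Hypothetical counts: 66 of 200 male and 12 of 100 female tumors mutated.
diff, lower, upper = diff_in_proportions_ci(66, 200, 12, 100)
```

An interval that excludes zero indicates a sex difference in mutation frequency before any adjustment for tumor type or other covariates.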

The deubiquitinating enzyme BAP1 was almost exclusively mutated in female-derived hepatocellular tumors, occurring in 14% of female-derived tumors and 1.6% of male-derived tumors (95% CI of difference, 5.6%–20%, q = 0.017, LR). BAP1 mutations were also enriched in female-derived kidney clear cell tumors, occurring in 15% of these tumors compared with 6.1% of male-derived tumors (95% CI of difference, 1.7%–15%, q = 0.001, LR; Fig. 3D; Supplementary Table S6); these tumors are not thought to be virally associated. BAP1 has been implicated as a tumor suppressor and is frequently inactivated in kidney clear cell cancer (28, 29). Comparison of mRNA abundance between hepatocellular carcinoma samples with mutated and wild-type BAP1 revealed striking sex differences: female-derived tumors with mutated BAP1 had 1.4-fold lower mRNA abundance than those with wild-type BAP1, whereas male-derived samples showed a 4-fold decrease (Supplementary Fig. S17). Linear modeling confirmed the significant interaction between sex and BAP1 mutation status (P = 5.8 × 10⁻⁵). The same sex-associated mRNA differences were not observed in kidney clear cell cancer (Fig. 3E), but we did observe a striking interaction in survival modeling. BAP1 mutation was associated with poor prognosis in female patients (HR = 2.59; 95% CI, 1.40–4.81, P = 0.0025; Fig. 3F) but not male patients (HR = 0.80; 95% CI, 0.32–1.97, P = 0.62). Indeed, the interaction between sex and BAP1 mutation was significant in Cox proportional hazards survival modeling (interaction q = 0.0025). Mutation of BAP1 is known to be associated with worse prognosis in kidney clear cell cancer, but evidence on its sex-biased prognostic value is conflicting (30).

We extended this mRNA and survival analysis to other sex-biased SNVs in liver, kidney papillary, and stomach and esophageal cancer but did not find additional sex–SNV interactions in these data (Supplementary Fig. S18). However, we noted that EP400 encodes a chromatin remodeling protein thought to be involved in ATM-mediated DNA damage response (31). We returned to the mutation prevalence data to investigate whether sex-biased EP400 mutation in stomach and esophageal cancer was associated with sex biases in mutation burden. Not only is mutated EP400 itself associated with higher SNV load (mutated EP400 mean SNV load = 4.82, wild-type EP400 mean = 3.82; 95% CI of the difference, 0.80–1.20, t test P = 4.70 × 10⁻¹³; Supplementary Fig. S19), but there is also an interaction between EP400 mutation and sex: mutation of this gene is associated with a greater increase in mutation burden in female-derived samples than in male-derived samples (Supplementary Fig. S19, interaction P = 0.009). Further, given the relationship between MSI, mutation burden, and sex we described previously, we examined whether there was a relationship between EP400 mutation and MSI-positive samples and found no association (P > 0.05). Finally, we also validated sex-biased EP400 mutation in an independent data set (Supplementary Table S7). Overall, our analysis of somatic SNVs revealed that sex-biased mutation frequency is associated with impacts on mRNA abundance, survival, and mutation burden in several tumor types.

Clinical relevance of transcriptomic sex differences

The differential clinical impact of sex-biased kidney renal cell genes (Figs. 2B–E, 3E and F) suggested that sex may influence the accuracy of biomarkers used to personalize therapy. We asked if sex-naïve approaches to prognostic biomarker development result in biomarkers that predict survival accurately across all samples, but better in one sex than the other. The sex biases in mutational profiles and transcriptional changes suggest that biomarkers developed using data from both sexes without annotation may contain predictive features biased toward the sex in which that tumor type most frequently occurs. We focused on multigene prognostic mRNA signatures, such as those developed for non–small cell lung cancer to identify early-stage patients who might benefit from intensification of therapy (32, 33). We used the benchmark Director's Challenge data set of 443 tumor samples (223 men and 220 women) with mRNA abundance profiles (34) after deconvolution of tumor and stromal expression (15).

Univariate Cox proportional hazards modeling identified stark differences between male- and female-derived tumors (Fig. 4A). Overall, 0.8% of genes were prognostic in both sexes (black points) and 1.5% were prognostic in patients of only one sex (blue and pink points). Strikingly, 79 genes (0.9%) had mRNA-based groups that interacted with sex for an additional effect on survival (red points, P < 0.01). These divergences could be of significant magnitude. For example, elevated tumor abundance of SPINK1 was associated with poor outcome in women only (interaction P = 0.0032; Fig. 4B), while FBXO46 was prognostic in males and not females (interaction P = 0.0070; Supplementary Fig. S20). To assess the generality of these results, we assessed a set of 783 patients with colorectal cancer with median 3.5-year survival. There were again large differences in the magnitude and even direction of association between expression and outcome between the sexes (Fig. 4C; Supplementary Fig. S21).

Figure 4.

Sex differences influence prognostic biomarker accuracy. Comparing female and male hazard ratios from univariate Cox proportional hazards modeling in non–small cell lung cancer (A) and colon cancer (C). Red points, genes with significant interaction terms between sex and risk group. Blue and pink points are genes prognostic only in males and females, respectively. Gray points, genes not significant in either sex. B, SPINK1 is prognostic in females but not males in non–small cell lung cancer. D, Sex-specific receiver operating characteristic curves for a 100-gene non–small cell lung cancer signature fit on the combined sex training cohort and tested on female and male test cohorts. Blue lines, males; pink lines, females.

To assess the performance of multigene biomarkers, we applied ridge regression to the top 100 univariately prognostic genes found in the combined sex training cohort. The multigene signature attained an AUC of 0.63 and was prognostic in the validation cohort when sex was not considered (HR = 2.3; 95% CI, 1.32–4.01, P = 0.0035; Supplementary Fig. S22). However, this overall value hid significant sex bias: the signature performed very well in men (AUC = 0.73), but was indistinguishable from chance in women (AUC = 0.54; Fig. 4D). Finally, we verified that male- and female-derived tumors showed fundamentally distinct distributions using the independent training and validation cohorts defined by the data set generators (34) and empirically estimating the null distributions by training 50,000 randomly generated signatures (Supplementary Fig. S23; ref. 35). Together, these results show that large sex differences observed in driver genes lead to differences in the development and application of biomarkers for personalized therapy.
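The way a pooled AUC can mask sex-specific performance can be shown with a small sketch. It assumes a binary outcome and a single risk score per patient, which is a simplification of the time-to-event setting used in the study; the scores, outcomes, and function names are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the probability that a randomly chosen event case scores higher than a
    randomly chosen non-event case, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_by_group(scores, labels, groups):
    """Compute the pooled AUC and the AUC within each group."""
    out = {"all": auc(scores, labels)}
    for g in sorted(set(groups)):
        idx = [i for i, v in enumerate(groups) if v == g]
        out[g] = auc([scores[i] for i in idx], [labels[i] for i in idx])
    return out

# Hypothetical risk scores and outcomes (1 = event), six patients per sex.
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.6, 0.4, 0.5, 0.5, 0.4, 0.6]
labels = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
groups = ["M"] * 6 + ["F"] * 6
result = auc_by_group(scores, labels, groups)
```

In this toy data the pooled AUC looks good even though the score carries no information in one group, which is the pattern that motivates reporting performance separately for each sex.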

The broad and unexplained sex differences in cancer outcomes represent a major gap in our understanding of the disease. We evaluated the molecular origins of these differences by comparing somatic mutation profiles in male- and female-derived tumors across a broad range of tumor types. We discovered large differences in mutation density and sex biases in the frequency of mutation of specific genes. These differences, however, are not uniform across tumor types. Rather, some show very significant sex bias in their mutational profiles, while others show no detectable sex biases. Further, some tumor types show sex bias in SNV mutation profiles, others in CNA mutational profiles, and still others in both. The mechanisms by which these differences occur remain to be elucidated.

Candidate mechanisms include differential chromatin architecture, mutagen exposures and DNA repair efficacy and bias. Indeed, our analysis of stomach and esophageal cancer suggests a complex relationship between sex and the cancer genome landscape. Our analysis of microsatellite instability in this tumor type posits a mechanism in dysfunctional DNA repair where baseline somatic SNV load is lower in female-derived samples. However, the high proportion of MSI-positive female-derived samples as well as lower mRNA abundance of DNA MMR genes MLH1 and PMS2 lead to higher SNV load in these individuals. Independently, EP400 is not only more frequently mutated in female-derived samples, it also has a greater impact and drives overall female somatic SNV burden higher. As a result, though overall SNV burden appears similar between male- and female-derived samples, more female samples harbor defects in DNA repair. Additional work is needed to further elucidate the interplay between microsatellite instability, DNA repair machinery, mutation load, and sex.

Our statistical modeling incorporates clinical and environmental variables to approach the true association of sex with the genomic characteristic of interest. However, it is important to note the limitations of this method in capturing all confounding variables. First, information on environmental variables is incomplete and may not be accurately reported. Second, adding variables increases model complexity and may decrease overall performance if there is insufficient data to support the model. Nevertheless, the tumor type–specific models in this analysis represent a foundation for putative sex differences, and our findings should be taken in context of each tumor type and its associated risk factors. Another challenge of our study lies in validating our findings in datasets with sufficient power and survival data of similar quality. Though we were able to validate a subset of sex-biased CNAs and SNVs, there remain putative TCGA-specific sex biases. These validation challenges may be due to methodological differences between datasets included in meta-analysis and to the high level of heterogeneity in environmental factors that have yet to be accounted for.

Existing literature on sex differences in cancer genomics largely focuses on individual tumor types and on specific genes or on a single data type (23, 26, 36). A previous pan-cancer study incorporating multiple mutation types focused on male-biased loss of function on X chromosome genes (19). Our analysis complements these sex chromosome–specific findings with a more general methodology by broadly analyzing both SNV and CNA mutations using transparent tumor type–specific models to generate a catalog of sex-biased events. We also describe, for the first time, a relationship between mutation load and sex-biased DNA repair deficiency.

The potential consequences of sex-biased SNVs and CNAs range from perturbations of biological pathways such as metabolic processes to changes in mRNA abundance and prognostic biomarker performance. Significant insight into these questions on mechanism will arise from on-going primary tumor whole-genome sequencing and chromatin profiling efforts. Independent of their origins, these mutational sex biases have significant consequences for both preclinical and translational research. Preclinically, the sex of an experimental model (e.g., cell-line, organoid, patient-derived xenograft) may influence the effects of driver-gene mutations and, therefore, should be explicitly considered. From a translational perspective, our results suggest that in some cases, distinct multigene panels should be used to predict prognosis or drug sensitivity in men and women. Overall, these data call for increased study and consideration of the role of sex in cancer etiology, progression, treatment, and personalized therapy.

No potential conflicts of interest were disclosed.

Conception and design: C.H. Li, Y.-J. Shiah, P.C. Boutros

Development of methodology: C.H. Li, P.C. Boutros

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): C.H. Li, S. Haider, Y.-J. Shiah, K. Thai

Writing, review, and/or revision of the manuscript: C.H. Li, K. Thai, P.C. Boutros

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): C.H. Li, K. Thai, P.C. Boutros

Study supervision: P.C. Boutros

This study was conducted with the support of the Ontario Institute for Cancer Research to P.C. Boutros through funding provided by the Government of Ontario. This work was supported by the Discovery Frontiers: Advancing Big Data Science in Genomics Research program, which is jointly funded by the Natural Sciences and Engineering Research Council (NSERC) of Canada, the Canadian Institutes of Health Research (CIHR), Genome Canada, and the Canada Foundation for Innovation (CFI). P.C. Boutros was supported by a Terry Fox Research Institute New Investigator Award and a CIHR New Investigator Award. This work was supported by an NSERC Discovery grant and by Canadian Institutes of Health Research, grant # SVB-145586, to P.C. Boutros. The authors thank all the members of the Boutros lab for insightful discussions. The results described here are in part based upon data generated by the TCGA Research Network: http://cancergenome.nih.gov/.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Clemmesen J, Busk T. Cancer mortality among males and females in Denmark, England, and Switzerland; incidence of accessible and inaccessible cancers in Danish towns and rural areas. Cancer Res 1949;9:415–21.
2. Cook MB, Dawsey SM, Freedman ND, Inskip PD, Wichner SM, Quraishi SM, et al. Sex disparities in cancer incidence by period and age. Cancer Epidemiol Biomarkers Prev 2009;18:1174–82.
3. Edgren G, Liang L, Adami H-O, Chang ET. Enigmatic sex disparities in cancer incidence. Eur J Epidemiol 2012;27:187–96.
4. Cook MB, McGlynn KA, Devesa SS, Freedman ND, Anderson WF. Sex disparities in cancer mortality and survival. Cancer Epidemiol Biomarkers Prev 2011;20:1629–37.
5. Ferguson MK, Wang J, Hoffman PC, Haraf DJ, Olak J, Masters GA, et al. Sex-associated differences in survival of patients undergoing resection for lung cancer. Ann Thorac Surg 2000;69:245–9.
6. Minami H, Yoshimura M, Miyamoto Y, Matsuoka H, Tsubota N. Lung cancer in women: sex-associated differences in survival of patients undergoing resection for lung cancer. Chest 2000;118:1603–9.
7. Wakelee HA, Wang W, Schiller JH, Langer CJ, Sandler AB, Belani CP, et al. Survival differences by sex for patients with advanced non-small cell lung cancer on Eastern Cooperative Oncology Group trial 1594. J Thorac Oncol 2006;1:441–6.
8. Kris MG, Natale RB, Herbst RS, Lynch TJ Jr, Prager D, Belani CP, et al. Efficacy of gefitinib, an inhibitor of the epidermal growth factor receptor tyrosine kinase, in symptomatic patients with non–small cell lung cancer: a randomized trial. JAMA 2003;290:2149–58.
9. Wichmann MW, Muller C, Hornung HM, Lau-Werner U, Schildberg FW. Gender differences in long-term survival of patients with colorectal cancer. Br J Surg 2001;88:1092–8.
10. Elsaleh H, Joseph D, Grieu F, Zeps N, Spry N, Iacopetta B. Association of tumor site and sex with survival benefit from adjuvant chemotherapy in colorectal cancer. Lancet 2000;355:1745–50.
11. OuYang PY, Zhang LN, Lan XW, Xie C, Zhang WW, Wang QX, et al. The significant survival advantage of female sex in nasopharyngeal carcinoma: a propensity-matched analysis. Br J Cancer 2015;112:1554–61.
12. Gupta S, Artomov M, Goggins W, Daly M, Tsao H. Gender disparity and mutation burden in metastatic melanoma. J Natl Cancer Inst 2015;107:djv221.
13. Mermel CH, Schumacher SE, Hill B, Meyerson ML, Beroukhim R, Getz G. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol 2011;12:R41.
14. Cai H, Kumar N, Ai N, Gupta S, Rath P, Baudis M. Progenetix: 12 years of oncogenomic data curation. Nucleic Acids Res 2014;42:D1055–62.
15. Quon G, Haider S, Deshwar AG, Cui A, Boutros PC, Morris Q. Computational purification of individual tumor gene expression profiles leads to significant improvements in prognostic prediction. Genome Med 2013;5:29.
16. Jorissen RN, Gibbs P, Christie M, Prakash S, Lipton L, Desai J, et al. Metastasis-associated gene expression changes predict poor outcomes in patients with dukes stage B and C colorectal cancer. Clin Cancer Res 2009;15:7642–51.
17. Marisa L, de Reyniès A, Duval A, Selves J, Gaub MP, Vescovo L, et al. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value. PLoS Med 2013;10:e1001453.
18. P'ng C, Green J, Chong LC, Waggott D, Prokopec SD, Shamsi M, et al. BPG: seamless, automated and interactive visualization of scientific data. bioRxiv 2017. doi:10.1101/156067.
19. Dunford A, Weinstock DM, Savova V, Schumacher SE, Cleary JP, Yoda A, et al. Tumor-suppressor genes that escape from X-inactivation contribute to cancer sex bias. Nat Genet 2017;49:10–6.
20. Vollan HK, Rueda OM, Chin SF, Curtis C, Turashvili G, Shah S, et al. A tumor DNA complex aberration index is an independent predictor of survival in breast and ovarian cancer. Mol Oncol 2015;9:115–27.
21. Lalonde E, Ishkanian AS, Sykes J, Fraser M, Ross-Adams H, Erho N, et al. Tumor genomic and microenvironmental heterogeneity for integrated prediction of 5-year biochemical recurrence of prostate cancer: a retrospective cohort study. Lancet Oncol 2014;15:1521–32.
22. Hieronymus H, Schultz N, Gopalan A, Carver BS, Chang MT, Xiao Y, et al. Copy number alteration burden predicts prostate cancer relapse. Proc Natl Acad Sci U S A 2014;111:11139–44.
23. Shah SN, Hile SE, Eckert KA. Defective mismatch repair, microsatellite mutation bias, and variability in clinical cancer phenotypes. Cancer Res 2010;70:431–5.
24. Li GM. Mechanisms and functions of DNA mismatch repair. Cell Res 2008;18:85–98.
25. Reimand J, Arak T, Adler P, Kolberg L, Reisberg S, Peterson H, et al. g:Profiler—a web server for functional interpretation of gene lists (2016 update). Nucleic Acids Res 2016;44:W83–9.
26. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 2003;13:2498–504.
27. Merico D, Isserlin R, Stueker O, Emili A, Bader GD. Enrichment map: a network-based method for gene-set enrichment visualization and interpretation. PLoS One 2010;5:e13984.
28. Cancer Genome Atlas Research Network. Comprehensive molecular characterization of clear cell renal cell carcinoma. Nature 2013;499:43–9.
29. Peña-Llopis S, Vega-Rubín-de-Celis S, Liao A, Leng N, Pavía-Jiménez A, Wang S, et al. BAP1 loss defines a new class of renal cell carcinoma. Nat Genet 2012;44:751–9.
30. Ricketts CJ, Linehan WM. Gender specific mutation incidence and survival associations in clear cell renal cell carcinoma (CCRCC). PLoS One 2015;10:e0140257.
31. Smith RJ, Savoian MS, Weber LE, Park JH. Ataxia telangiectasia mutated (ATM) interacts with p400 ATPase for an efficient DNA damage response. BMC Mol Biol 2016;17:22.
32. Kratz JR, He J, Van Den Eeden SK, Zhu ZH, Gao W, Pham PT, et al. A practical molecular assay to predict survival in resected non-squamous, non-small-cell lung cancer: development and international validation studies. Lancet 2012;379:823–32.
33. Lau SK, Boutros PC, Pintilie M, Blackhall FH, Zhu CQ, Strumpf D, et al. Three-gene prognostic classifier for early-stage non small-cell lung cancer. J Clin Oncol 2007;25:5562–9.
34. Director's Challenge Consortium for the Molecular Classification of Lung Adenocarcinoma, Shedden K, Taylor JM, Enkemann SA, Tsao MS, Yeatman TJ, et al. Gene expression–based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study. Nat Med 2008;14:822–7.
35. Boutros PC, Lau SK, Pintilie M, Liu N, Shepherd FA, Der SD, et al. Prognostic gene signatures for non-small-cell lung cancer. Proc Natl Acad Sci U S A 2009;106:2824–8.
36. Xiao D, Pan H, Li F, Wu K, Zhang X, He J. Analysis of ultra-deep targeted sequencing reveals mutation burden is associated with gender and clinical outcome in lung adenocarcinoma. Oncotarget 2016;7:22857–64.