Summary: Carter and colleagues propose a systematic analysis of the germline and somatic genome in cancer. They identify interactions that occur between germline and somatic variants. This elucidates the function of the germline genome in the context of cancer risk and development. Cancer Discov; 7(4); 354–5. ©2017 AACR.

See related article by Carter et al., p. 410.

Most cancer genomics research focuses on somatic events, such as acquired mutations, but increasing evidence suggests that inherited germline genetic variation also plays a key role in cancer risk (1, 2), pharmacogenomics (3, 4), and gene regulation (5, 6). We have known for decades that cancer can be heritable, and studies identifying specific germline genetic variants that predispose individuals to cancer date back to the 1990s and earlier; one example is the discovery of the highly penetrant BRCA1 gene (7). The sequencing of the human genome, the development of affordable genotyping technologies, and the subsequent use of genome-wide association studies (GWAS) further extended the list of known cancer risk variants. Larger sample sizes have allowed us to discover germline variants with ever smaller effects on disease risk. For example, the largest meta-analysis of GWAS data for breast cancer risk, which compared germline variation between 62,533 patients with breast cancer and 60,976 controls, identified a total of 84 germline loci that explained about 16% of familial risk (8). Cheaper genome sequencing technologies are also allowing us to discover rare cancer risk variants. Given that germline cancer risk variants have now been extensively catalogued, it is surprising that very little is known about the molecular function of most of these variants and, in particular, how they affect disease development. Indeed, studies that systematically evaluate the entire somatic and germline cancer genome and establish a link between the two are completely lacking.

In the current study, Carter and colleagues (9) take a first step toward bridging this gap and propose a novel way of studying the effect of the germline genome on cancer, one that aims to assign function to these variants. Rather than comparing the germline genome between cohorts of cases (patients with cancer) and controls, as in GWAS, they compare the germline genome within a large group of patients with cancer. This allows them to ask two questions that have not previously been addressed: (i) How does germline genetic variation affect the propensity of cancer to occur in one tissue, rather than another? (ii) Does germline genetic variation affect the somatic mutation profiles in cancers?

To address these questions, the authors use data from The Cancer Genome Atlas (TCGA), a large-scale project that has collected genomic information on over 10,000 patients with cancer using multiple molecular profiling technologies. The germline genome of these patients was profiled from blood samples using genotyping microarrays, which capture most common genetic variants (those with a minor allele frequency of >1%).

The first of the authors' questions (“How does germline genetic variation affect the propensity of cancer to occur in one tissue, rather than another?”) is addressed using an approach similar to a conventional GWAS. However, instead of comparing patients with cancer to healthy controls, the authors compare genotypes in each of the 22 cancer types included in TCGA with those of the patients from the other 21 cancer types pooled. The analysis recovered a signal that replicated: of 916 markers prioritized in the discovery cohort, 395 replicated in the validation cohort at a false discovery rate (FDR) of 0.25, a larger proportion than would be expected by chance. The authors go on to demonstrate that the analysis recovered several known risk variants and additionally highlighted specific examples that represent promising novel candidate genes for follow-up. Given these encouraging results, it may in the future be possible to supplement such an analysis with genotype data collected from other cancer sequencing studies or even from cancer risk GWAS.
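The one-versus-rest comparison described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the carrier/non-carrier coding of a marker, the use of a Pearson chi-square statistic, and the toy counts are all assumptions made for illustration. A real analysis would also need to account for ancestry, covariates, and multiple testing.

```python
from collections import Counter

def chi_square(a, b, c, d):
    """Pearson chi-square statistic (1 d.f., no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a margin is empty, so the table carries no information
    return n * (a * d - b * c) ** 2 / denom

def one_vs_rest(samples):
    """Compare each cancer type against all other types pooled, for one marker.

    samples: iterable of (cancer_type, is_carrier) pairs (hypothetical coding).
    Returns {cancer_type: (table, chi2)} where table is
    (carriers_in_type, noncarriers_in_type, carriers_in_rest, noncarriers_in_rest).
    """
    samples = list(samples)
    totals = Counter(t for t, _ in samples)
    carriers = Counter(t for t, is_carrier in samples if is_carrier)
    n = len(samples)
    n_carriers = sum(carriers.values())
    results = {}
    for cancer_type, n_type in totals.items():
        a = carriers[cancer_type]          # carriers with this cancer type
        b = n_type - a                     # non-carriers with this cancer type
        c = n_carriers - a                 # carriers with any other cancer type
        d = (n - n_type) - c               # non-carriers with any other cancer type
        results[cancer_type] = ((a, b, c, d), chi_square(a, b, c, d))
    return results

# Invented toy cohort (not TCGA data): 100 patients in each of three cancer types.
toy = (
    [("BRCA", True)] * 30 + [("BRCA", False)] * 70
    + [("LUAD", True)] * 10 + [("LUAD", False)] * 90
    + [("COAD", True)] * 10 + [("COAD", False)] * 90
)
results = one_vs_rest(toy)
for cancer_type, (table, chi2) in sorted(results.items()):
    print(cancer_type, table, round(chi2, 2))
```

In this toy cohort the marker is carried by 30% of one cancer type and 10% of the others, so that type stands out against the pooled remainder. The same loop would be repeated for every marker, which is why replication in an independent cohort matters.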

The second of the authors' questions (“Does germline genetic variation affect the somatic mutation profiles in cancers?”) is also addressed using TCGA, which carried out both germline genotyping and tumor exome sequencing. Using these data, it has been possible to compare the frequency of specific somatic mutations, given a specific germline genotype. The analysis focused on the association between germline genotype and the frequency of somatic mutations in 138 well-established cancer genes. The sample size is small given the large number of statistical tests, but the authors argue that for some of the associations studied, the effect sizes recovered are much larger than those seen in a conventional GWAS for complex traits or disease risk. For example, in TCGA data, the authors observed that a haplotype on chromosome 15q22.2 was associated with a 14-fold increased frequency of a copy-number variation affecting GNAQ.
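To make the idea of a fold increase in somatic mutation frequency given germline genotype concrete, the following is a minimal sketch. It is not the authors' statistical model: the function names, the one-sided Fisher exact test, and the counts are assumptions for illustration, and the counts are invented so that the fold increase comes out near 14.

```python
from math import comb

def fold_increase(mut_carriers, n_carriers, mut_noncarriers, n_noncarriers):
    """Somatic mutation frequency in germline carriers divided by that in non-carriers."""
    freq_carriers = mut_carriers / n_carriers
    freq_noncarriers = mut_noncarriers / n_noncarriers
    return freq_carriers / freq_noncarriers

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for enrichment of mutations among carriers.

    2x2 table: a = carriers with mutation, b = carriers without,
               c = non-carriers with mutation, d = non-carriers without.
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    n = a + b + c + d
    n_carriers = a + b
    n_mutated = a + c
    upper = min(n_carriers, n_mutated)
    total = comb(n, n_mutated)
    tail = sum(
        comb(n_carriers, k) * comb(n - n_carriers, n_mutated - k)
        for k in range(a, upper + 1)
    )
    return tail / total

# Invented counts (not from TCGA): 7 of 50 carriers versus 10 of 1,000
# non-carriers have a somatic mutation in the gene of interest.
fold = fold_increase(7, 50, 10, 1000)
p_value = fisher_one_sided(7, 43, 10, 990)
print(f"fold increase ~ {fold:.1f}, one-sided p = {p_value:.2e}")
```

Even with a large fold increase, a p-value of this kind has to survive correction for every germline marker and cancer gene pair tested, which is the sample-size concern raised above.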

Leveraging this hypothesis-generating computational approach, the authors discuss and experimentally validate two of their candidate associations. In any germline association study, establishing causal relationships is complicated by linkage disequilibrium and by the fact that germline variants may act on a phenotype via a distal gene (10); here, establishing causality may be further complicated if the frequencies of both germline and somatic variants differ between cancer types or subtypes. Thus, the authors wisely leverage additional biological knowledge and focus validation on instances where the germline variant and the somatic variant likely affect genes that participate in the same pathway. They first highlight that an intronic SNP in RBFOX1 is associated with somatic mutation of SF3B1, which in turn is associated with splicing of several genes. They also demonstrate that patients who inherit specific germline variants on chromosome 19 are 4 times more likely to have somatic mutations in the tumor suppressor PTEN than patients with cancer who do not inherit these variants. Two of the genes in the germline locus (GNA11 and STK11) are known to act in the PIK3CA/mTOR pathway, in which PTEN plays a repressive role. In one of several experimental follow-ups, the authors demonstrated that an increase in GNA11 expression in HEK293T cells led to an increase in mTOR signaling; this effect was amplified upon PTEN knockdown. The findings are consistent with germline variants on chromosome 19 increasing the activity of GNA11, which in turn provides an additional selective advantage for PTEN inactivation. The validation work demonstrates the ability of creative computational analysis of large datasets to generate important biological hypotheses that can be investigated at the bench.

One future direction may involve further integration of the results of this novel analysis with GWAS: although the authors suggest that many germline variants are associated with specific somatic mutation profiles among patients with cancer, it would be interesting to assess whether these same variants are also enriched among known cancer risk variants. It is equally interesting to consider that germline variants that affect the somatic mutation profiles of tumors may not necessarily affect cancer risk.

Finally, the authors propose a novel method of discovering new cancer genes, using the rationale that if certain germline loci are associated with somatic mutation profiles in known cancer genes, the same loci may also be associated with somatic mutation profiles of as-yet-unknown cancer genes. This analysis yielded 20 additional genes whose somatic mutation profiles differed according to genotype at select germline variants. These genes represent a promising starting point for follow-up work and, if validated, could reveal new cancer genes that provide a selective advantage during tumorigenesis, but possibly only on a specific germline genetic background.

Overall, the authors have presented a systematic integrative analysis of germline genome variation and somatic mutation profiles in patients with cancer. This new way of studying cancer has shed light on both disease risk and development. The article opens the door to further integrative analysis of somatic and germline variation in cancer, which will further our understanding of the mechanisms by which germline variants are relevant in cancer. The findings also serve as an opportunity to highlight the benefit of investment in computational methodologies and in projects focused on intelligent reuse of existing data. Continued investment in open science, data sharing, and new computational approaches will certainly yield further benefit to the cancer research community, in both validating and broadening the analyses proposed by Carter and colleagues. Furthermore, these investments will lead to novel methods to understand cancer in unforeseen ways, as our talented community of computational biologists continues to explore these invaluable public resources.

No potential conflicts of interest were disclosed.

R.S. Huang has received support from the Avon Foundation research grant, NIH/NIGMS grant K08GM089941, NIH/NCI grant R21 CA139278, NIH/NIGMS grant U01GM61393, the Circle of Service Foundation Early Career Investigator award, The University of Chicago Support Grant (#P30 CA14599), the Breast Cancer SPORE Career Development Award (CA125183), the National Center for Advancing Translational Sciences of the NIH (UL1RR024999), The University of Chicago CTSA core subsidy grant, and a Conquer Cancer Foundation of ASCO Translational Research Professorship Award In Memory of Merrill J. Egorin, MD (awarded to Dr. M.J. Ratain). P. Geeleher received support from the Chicago Biomedical Consortium grant PDR-020.

1. Chang CQ, Yesupriya A, Rowell JL, Pimentel CB, Clyne M, Gwinn M, et al. A systematic review of cancer GWAS and candidate gene meta-analyses reveals limited overlap but similar effect sizes. Eur J Hum Genet 2014;22:402–8.
2. Kar SP, Beesley J, Amin Al Olama A, Michailidou K, Tyrer J, Kote-Jarai ZS, et al. Genome-wide meta-analyses of breast, ovarian, and prostate cancer association studies identify multiple new susceptibility loci shared by at least two cancer types. Cancer Discov 2016;6:1052–67.
3. Relling MV, Evans WE. Pharmacogenomics in the clinic. Nature 2015;526:343–50.
4. Morrison G, Lenkala D, LaCroix B, Ziliak D, Abramson V, Morrow PK, et al. Utility of patient-derived lymphoblastoid cell lines as an ex vivo capecitabine sensitivity prediction model for breast cancer patients. Oncotarget 2016;7:38359–66.
5. Li Q, Seo J-H, Stranger B, McKenna A, Pe'er I, LaFramboise T, et al. Integrative eQTL-based analyses reveal the biology of breast cancer risk loci. Cell 2013;152:633–41.
6. Ongen H, Andersen CL, Bramsen JB, Oster B, Rasmussen MH, Ferreira PG, et al. Putative cis-regulatory drivers in colorectal cancer. Nature 2014;512:87.
7. Hall JM, Lee MK, Newman B, Morrow JE, Anderson LA, Huey B, et al. Linkage of early-onset familial breast cancer to chromosome 17q21. Science 1990;250:1684–9.
8. Michailidou K, Beesley J, Lindstrom S, Canisius S, Dennis J, Lush MJ, et al. Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer. Nat Genet 2015;47:373–80.
9. Carter H, Marty R, Hofree M, Gross A, Jensen J, Fisch KM, et al. Interaction landscape of inherited polymorphisms with somatic events in cancer. Cancer Discov 2017;7:410–23.
10. Smemo S, Tena JJ, Kim K-H, Gamazon ER, Sakabe NJ, Gómez-Marín C, et al. Obesity-associated variants within FTO form long-range functional connections with IRX3. Nature 2014;507:371–5.