Abstract
High numbers of lymphocytes in tumor tissue, including T regulatory cells (Treg), have been associated with better colorectal cancer survival. Tregs, a subset of CD4+ T lymphocytes, are mediators of immunosuppression in cancer, and therefore variants in genes related to Treg differentiation and function could be associated with colorectal cancer prognosis.
In a prospective German cohort of 3,593 colorectal cancer patients, we assessed the association of 771 single-nucleotide polymorphisms (SNP) in 58 Treg-related genes with overall and colorectal cancer–specific survival using Cox regression models. Effect modification by microsatellite instability (MSI) status was also investigated because tumors with MSI show greater lymphocytic infiltration and have been associated with better prognosis. Replication of significant results was attempted in 2,047 colorectal cancer patients of the International Survival Analysis in Colorectal Cancer Consortium (ISACC).
A significant association of the TGFBR3 SNP rs7524066 with more favorable colorectal cancer–specific survival [hazard ratio (HR) per minor allele: 0.83; 95% confidence interval (CI), 0.74–0.94; P value: 0.0033] was replicated in ISACC (HR: 0.82; 95% CI, 0.68–0.98; P value: 0.03). Suggestive evidence for association was found with two IL7 SNPs, rs16906568 and rs7845577. Thirteen SNPs with differential associations with overall survival according to MSI in the discovery analysis were not confirmed.
Common genetic variation in the Treg pathway implicating genes such as TGFBR3 and IL7 was shown to be associated with prognosis of colorectal cancer patients.
The implicated genes warrant further investigation.
Introduction
Colorectal cancer is the third most common cancer in the world (1). Implementation of population screening and the availability of new treatments have led to a decline in colorectal cancer–specific mortality and an increase in the 5-year survival (2–5). Colorectal cancer prognosis is very heterogeneous and dependent on several factors. The main prognostic factors that have been identified include TNM stage, tumor grade, presence of metastases (especially in the liver), baseline alkaline phosphatase levels, baseline C-reactive protein and albumin levels (Glasgow prognostic score), as well as the neutrophil-to-lymphocyte ratio and preoperative carcinoembryonic antigen (CEA) levels (6–10). In addition, the tumor microenvironment has been increasingly recognized to influence the prognosis of cancers, including colorectal cancer (11).
T regulatory cells (Treg), a subset of CD4+ T lymphocytes expressing the transcription factor FOXP3, are a heterogeneous cell population that plays a central role in the maintenance of self-tolerance and immune homeostasis by suppressing the activation, proliferation, and function of numerous immune cells (12, 13). In tumor tissue, Tregs are able to suppress the antitumor immune response and contribute to the development of an immunosuppressive tumor microenvironment. The presence of high numbers of Tregs has a negative prognostic effect in many cancer types such as breast cancer, melanoma, or cervical cancer (14). In contrast, for colorectal cancer patients, the presence of a high number of tumor-infiltrating lymphocytes, including Tregs, in the tumor microenvironment has been associated with more favorable survival (14–22). Tumors with microsatellite instability (MSI) are frequently characterized by inflammatory lymphocytic infiltration and tend to be associated with better survival than non–MSI-high colorectal cancers (23–25). This may, in part, be due to more effective immune responses involving Tregs (26, 27).
Genetic variation in inflammatory genes could play a role in the survival of patients after colorectal cancer diagnosis (28). To gain further insight into the biological mechanisms linking the Treg pathway and survival after colorectal cancer, we investigated common, inherited single-nucleotide polymorphisms (SNP) affecting genes involved in the regulation of Treg functions.
No studies have so far investigated a possible influence of genetic variants in Treg-related genes on the prognosis of colorectal cancer patients. Therefore, our aim was to investigate the association between 771 germline variants in 58 Treg-related genes and the overall and disease-specific survival of colorectal cancer patients, and to assess possible effect modification by MSI status.
Materials and Methods
The study sample consisted of colorectal cancer patients recruited into the ongoing population-based case–control study DACHS (Darmkrebs: Chancen der Verhütung durch Screening) conducted in the Rhine-Neckar Odenwald region in southwestern Germany (29, 30). Cases diagnosed between January 2003 and December 2013 were included if they were older than 30 years of age (with no upper limit), were able to communicate in German, were able to participate in a personal interview of around 1 hour and were a resident of the Rhine-Neckar Odenwald region. Only histologically confirmed cases who were diagnosed with their first primary colorectal cancer (ICD-10:C18-C20) were included. All patients gave their written informed consent. The study was approved by the relevant ethical committees of the University of Heidelberg and the State Medical Boards of Baden-Württemberg and Rhineland-Palatinate, Germany, and was conducted in agreement with the Declaration of Helsinki.
At baseline, trained interviewers collected information on the patient's demographics, anthropometric indices, medical history including reproductive history, and lifestyle factors. A blood sample was requested at baseline, and for a minority of patients who refused to provide blood, a mouthwash sample was collected instead (0.9% of participants). After five years of follow-up, information on treatment and disease course was collected from the treating physician. Vital status was collected from the population registries, and cause of death was verified by death certificates from health authorities.
Genotype data
The Flexigene kit was used to extract DNA from EDTA blood and mouthwash samples of the DACHS patients, and quantification of the DNA was performed using the Quant-iT PicoGreen dsDNA reagent and kit (Invitrogen/Life Technologies).
Genotyping was performed in collaboration with the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO); details have been previously described (31). DACHS samples were genotyped using the whole-genome Illumina CytoSNP assay (Illumina) for patients recruited in 2003–2006, the Illumina HumanOmniExpress BeadChip Kit for patients recruited in 2007–2010, and the Illumina HumanOmniExpress BeadChip Kit or the Illumina Infinium OncoArray-500K BeadChip for those recruited in 2011–2013. For quality control, genotyped variants were excluded based on call rate (<98%), lack of Hardy–Weinberg equilibrium in controls (HWE; P < 1 × 10⁻⁴), and low minor allele frequency (MAF < 0.05) as described elsewhere (31–34). Samples were imputed using as reference panel the cosmopolitan haplotypes from phase I of the 1000 Genomes Project (for patients recruited between 2003 and 2010) or the Haplotype Reference Consortium (for patients recruited between 2011 and 2013; ref. 35) using the University of Michigan Imputation Server (36). Before imputation, Shapeit2 was used to phase the GWAS data (37).
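The variant-level quality control filters described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the GECCO pipeline: the function names are hypothetical, the Hardy–Weinberg check uses a 1-degree-of-freedom chi-square goodness-of-fit test (the text does not state which HWE test was applied), and the same genotype counts are used for both the HWE check and the MAF, whereas the study evaluated HWE in controls.

```python
import math

def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) goodness-of-fit P value for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df
    return math.erfc(math.sqrt(chi2 / 2.0))

def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of the rarer allele, from genotype counts."""
    p = (2 * n_aa + n_ab) / (2 * (n_aa + n_ab + n_bb))
    return min(p, 1.0 - p)

def passes_variant_qc(call_rate, genotype_counts,
                      min_call_rate=0.98, min_hwe_p=1e-4, min_maf=0.05):
    """Apply the three exclusion criteria quoted in the text (thresholds as defaults)."""
    if call_rate < min_call_rate:
        return False
    if minor_allele_frequency(*genotype_counts) < min_maf:
        return False
    return hwe_chisq_pvalue(*genotype_counts) >= min_hwe_p

# Hypothetical genotype counts (AA, AB, BB), for illustration only
print(passes_variant_qc(0.995, (360, 480, 160)))   # counts in exact HWE proportions
print(passes_variant_qc(0.995, (500, 0, 500)))     # no heterozygotes at all
```

In practice an exact HWE test is often preferred for rare variants; the chi-square version is used here only because it needs nothing beyond the standard library.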
SNP selection
Through extensive literature research, the most important genes in the Treg pathway were selected. Tagging SNPs were selected to represent genetic variation across the genes. SNPs in these genes as well as SNPs in the flanking regions (e.g., ±10 kb) were considered, after which Haploview 4.2 (Broad Institute) was used for the selection of tagging SNPs, with a pairwise tagging approach based on reference data from the HapMap project [Utah resident with Northern and Western Europe ancestry (CEU population), Phase II/Release 24]. In total, 771 SNPs in 58 genes were selected for this analysis (see Supplementary Table S1).
MSI data
Formalin-fixed paraffin-embedded (FFPE) tumor samples were used to determine MSI status. FFPE samples were collected from the different pathology departments at the cooperating hospitals and were stored at the tissue bank at the National Center for Tumor Diseases (NCT) in Heidelberg. The area with the highest tumor cell concentration was identified microscopically, and DNA was then isolated from this section using the DNeasy kit from Qiagen. A mononucleotide marker panel including BAT25, BAT26, and CAT25 was used to determine MSI status. Tumors showing amplifications in two or more markers were classified as MSI-high (38). These markers have a sensitivity of 98.2% and a specificity of 100% to differentiate MSI-high from non–MSI-high tumors (38, 39).
Data analysis
Cox regression models were used to test the individual SNP associations with overall survival and colorectal cancer–specific survival. Hazard ratios (HR) and 95% confidence intervals (CI) were calculated for each SNP. Survival time was calculated from date of diagnosis until date of death by any cause, death by colorectal cancer, or date of last contact. Median follow-up time was calculated using the reverse Kaplan–Meier method (40). Age, sex, and TNM stage were included in the model as relevant prognostic factors. Additional covariates were determined using backward elimination of a set of variables including grade (1, 2 vs. 3, 4), family history of colorectal cancer in first-degree relatives (no vs. yes), smoking, body mass index (18.5–25, 25–30, 30+ kg/m2), alcohol intake (0 and quartiles in subjects with alcohol intake >0 g/day), and physical activity (0 and quartiles). The variables body mass index (at diagnosis) and alcohol intake were retained in the final model as they were significantly associated with overall survival. Heterogeneity in the associations of the Treg gene polymorphisms with overall survival according to MSI status was assessed statistically using interaction terms between MSI status (high, non-high) and the polymorphisms and was evaluated using the likelihood ratio test. The SNP association according to MSI status was also estimated in subgroup analysis. The proportional hazards assumption was tested according to Grambsch and Therneau (41). The statistical analysis was carried out using SAS version 9.3 (SAS Institute) and R version 3.1.0 (www.r-project.org).
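To make the per-SNP model concrete, the following is a minimal sketch of a Cox proportional hazards fit with a single covariate (allele dosage 0, 1, or 2), maximizing the partial likelihood by Newton–Raphson. It is not the SAS or R code used in the study: the function name and the toy data are hypothetical, no adjustment covariates are included, and ties are handled with the Breslow approximation as an assumption.

```python
import math

def fit_cox_one_covariate(times, events, x, max_iter=50, tol=1e-10):
    """Newton-Raphson fit of a Cox model with one covariate (Breslow ties)."""
    beta, info = 0.0, 0.0
    for _ in range(max_iter):
        score, info = 0.0, 0.0
        for t_i, d_i, x_i in zip(times, events, x):
            if not d_i:
                continue  # censored subjects contribute only through risk sets
            s0 = s1 = s2 = 0.0
            for t_j, x_j in zip(times, x):
                if t_j >= t_i:  # subject j is still at risk at time t_i
                    w = math.exp(beta * x_j)
                    s0 += w
                    s1 += w * x_j
                    s2 += w * x_j * x_j
            mean = s1 / s0
            score += x_i - mean              # first derivative of the log partial likelihood
            info += s2 / s0 - mean * mean    # negative second derivative
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    # Information is taken from the last iterate before the final (negligible) step
    se = 1.0 / math.sqrt(info)
    z = 1.959964  # 97.5th percentile of the standard normal
    return {
        "beta": beta,
        "hr": math.exp(beta),
        "ci": (math.exp(beta - z * se), math.exp(beta + z * se)),
        "p": math.erfc(abs(beta / se) / math.sqrt(2.0)),  # two-sided Wald P value
    }

# Hypothetical toy data: follow-up in months, event indicator, minor-allele dosage
times = [5, 8, 12, 3, 9, 15, 7, 11]
events = [1, 1, 0, 1, 1, 0, 1, 1]
dosage = [0, 1, 2, 0, 1, 2, 2, 0]
fit = fit_cox_one_covariate(times, events, dosage)
print(fit["hr"], fit["ci"], fit["p"])
```

The returned hazard ratio is "per minor allele" in the same sense as in the results tables: each additional copy of the allele multiplies the hazard by `hr`. A real analysis would add age, sex, TNM stage, and the other covariates as further columns and would use an established survival package.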
Replication set
For SNPs that showed a significant association with survival or an interaction with MSI (P < 0.01), replication was performed using studies participating in the International Survival Analysis in Colorectal Cancer Consortium (ISACC), a consortium aimed at investigating demographic, environmental, and genetic risk factors in association with colorectal cancer survival (42). For replication of the SNP associations with overall survival and colorectal cancer–specific survival, 1,821 colorectal cancer patients from six studies were included: Diet And Lifestyle Study (DALS), Health Professionals Follow-up Study (HPFS), Nurses' Health Study (NHS), Colon Cancer Family Registry (CCFR), Cancer Prevention Study II (CPS II), and the Melbourne Collaborative Cohort Study (MCCS). All studies were genotyped on Illumina GWAS platforms and imputed to the Haplotype Reference Consortium panel using the University of Michigan Imputation Server (35). Prior to imputation, Shapeit2 was used to phase the GWAS data (37). Details of genotyping and quality control (QC) for studies included in the validation are described elsewhere (32–34, 43). Patients included in ISACC studies are of European ancestry.
For replication of results on effect modification according to MSI status, 1,307 colorectal cancer patients from five studies (DALS, HPFS, NHS, CPS II, and MCCS) with available data on MSI were included. For DALS, MSI status was determined using 12 markers, a panel of 10 tetranucleotide repeats and the two mononucleotide repeats (BAT-26 and TGFBR2; ref. 44). For HPFS and NHS, MSI status was assessed based on the same 10 tetranucleotide repeats (45). For MCCS, MSI status was determined using a 10-loci panel in tumor DNA and matched normal tissue DNA (BAT25, BAT26, BAT40, MYCL, D5S346, D17S250, ACTC, D18S55, D10S197, and BAT34C4; ref. 46). For CPS II, determination of MSI status was based on the Bethesda Consensus Panel (47). Classification was based on ≥5 interpretable markers (unless all four markers were unstable, in which case the tumor was classified as MSI-high). For these studies, MSI-high tumors were defined when ≥30% of the markers showed instability.
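The ≥30% rule used for the replication studies can be written as a small classification function. This is a hedged sketch: the function name, the input encoding (`True` for an unstable marker, `False` for a stable one, `None` for an uninterpretable one), and the `min_interpretable` parameter are assumptions introduced for illustration and do not reproduce any study's laboratory protocol.

```python
def classify_msi(marker_results, min_interpretable=1):
    """Classify a tumor from per-marker instability calls.

    marker_results: list of True (unstable), False (stable), or None (uninterpretable).
    Returns "MSI-high", "non-MSI-high", or None if too few markers are interpretable.
    """
    interpretable = [m for m in marker_results if m is not None]
    if len(interpretable) < min_interpretable:
        return None
    unstable = sum(1 for m in interpretable if m)
    # Integer comparison avoids floating-point edge cases at exactly 30%
    if unstable * 100 >= 30 * len(interpretable):
        return "MSI-high"
    return "non-MSI-high"

# Hypothetical 10-marker panels
print(classify_msi([True, True, True] + [False] * 7))   # 3 of 10 unstable
print(classify_msi([True, True] + [False] * 8))         # 2 of 10 unstable
```

Computing the proportion over interpretable markers only is one possible convention; a panel-specific minimum number of interpretable markers can be passed through `min_interpretable`.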
Analyses were performed by pooling the samples from all studies and adjusting for study. We used the same statistical methods as for the discovery phase. To account for the overrepresentation of patients with a positive family history of colorectal cancer in the DALS Minnesota samples, we included family history as an additional covariate in the MSI interaction analysis given the correlation between positive family history and MSI-high tumors. Missing data on environmental factors were imputed using single imputation.
Functional annotation
To add functional information to significant variants, we used the NCI's “LDlink” web tool (https://ldlink.nci.nih.gov) to find all variants in linkage disequilibrium (LD; R2 ≥ 0.4 in phase III 1000 Genomes “EUR” population) with the significant variant. Subsequently, we used the variant effect predictor (VEP) tool of the Ensembl webpage (https://uswest.ensembl.org/info/docs/tools/vep/index.html) to show annotations (48).
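For readers unfamiliar with the LD measure used as the threshold here, the squared correlation between two biallelic variants can be computed from haplotype counts as r² = D² / [p_A(1 − p_A) p_B(1 − p_B)], where D = p_AB − p_A p_B. The sketch below illustrates this definition only; it does not call LDlink, and the function name and example counts are hypothetical.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """r^2 between two biallelic loci from counts of the four haplotypes."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n       # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / n       # frequency of allele B at locus 2
    d = n_AB / n - p_A * p_B      # linkage disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom

# Hypothetical haplotype counts
print(ld_r_squared(50, 0, 0, 50))     # alleles always travel together
print(ld_r_squared(25, 25, 25, 25))   # alleles are independent
```

A variant would then be retained for annotation if its r² with the index SNP meets the chosen threshold (0.4 in this analysis).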
Data availability
Genotyping data of the GECCO studies are available at the database of Genotypes and Phenotypes (dbGaP) for download at the accession number phs001078.v1.p1.
Results
Table 1 shows the characteristics of the DACHS study participants. The majority of patients were aged between 60 and 80 years at diagnosis with a median of 69 years, and 60.5% were male. Over 60% of the patients were diagnosed with TNM stage II or III disease, and 60.2% had a tumor in the colon. The median follow-up time was 63.0 months and, in total, 1,100 patients died during study follow-up. See the flow diagram in Fig. 1 for inclusion of patients in the discovery stage. In the DACHS study, 10.7% of patients with data on MSI status were MSI-high and 89.3% were non–MSI-high.
 | All | MSI-high | Non–MSI-high |
---|---|---|---|
N (number of events) | 3,593 (1,100) | 211 (50) | 1,754 (590) |
 | n (%) | n (%) | n (%) |
Sex | | | |
Female | 1,420 (39.5) | 110 (52.1) | 712 (40.6) |
Male | 2,173 (60.5) | 101 (47.9) | 1,042 (59.4) |
Age, years | | | |
<60 | 744 (20.7) | 36 (17.1) | 333 (19.0) |
60–<70 | 1,123 (31.3) | 59 (28.0) | 551 (31.4) |
70–<80 | 1,189 (33.1) | 61 (28.9) | 607 (34.6) |
≥80 | 537 (14.9) | 55 (26.1) | 263 (15.0) |
Median (interquartile range) | 69 (61–76) | 71 (64–80) | 69 (62–76) |
TNM stage | | | |
1 | 820 (22.8) | 30 (14.2) | 328 (18.7) |
2 | 1,088 (30.3) | 113 (53.6) | 554 (31.6) |
3 | 1,183 (32.9) | 62 (29.4) | 612 (34.9) |
4 | 502 (14.0) | 6 (2.8) | 260 (14.8) |
Site | | | |
Colon | 2,164 (60.2) | 200 (94.8) | 1,036 (59.1) |
Rectum | 1,429 (39.8) | 11 (5.2) | 718 (40.9) |
CRC-specific death | | | |
No | 2,811 (78.2) | 191 (90.5) | 1,334 (76.1) |
Yes | 718 (20.0) | 19 (9.0) | 404 (23.0) |
Missing | 64 (1.8) | 1 (0.5) | 16 (0.9) |
Recurrence | | | |
No | 2,576 (71.7) | 185 (87.7) | 1,206 (68.8) |
Yes | 983 (27.4) | 25 (11.8) | 542 (30.9) |
Missing | 34 (0.9) | 1 (0.5) | 6 (0.3) |
BMI category | | | |
Normal weight | 1,390 (38.7) | 65 (30.8) | 704 (40.1) |
Overweight | 1,527 (42.5) | 91 (43.1) | 728 (41.5) |
Obese | 676 (18.8) | 55 (26.1) | 322 (18.4) |
Current alcohol intake (g/day) | | | |
No alcohol | 1,093 (30.4) | 74 (35.1) | 524 (29.9) |
0.1–6.1 | 705 (19.6) | 52 (24.6) | 331 (18.9) |
6.1–15.6 | 624 (17.4) | 34 (16.1) | 303 (17.3) |
15.6–32.6 | 607 (16.9) | 31 (14.7) | 306 (17.4) |
≥32.6 | 564 (15.7) | 20 (9.5) | 290 (16.5) |
Family history of CRC | | | |
No | 3,086 (85.9) | 174 (82.5) | 1,505 (85.8) |
Yes | 507 (14.1) | 37 (17.5) | 249 (14.2) |
Abbreviations: BMI, body mass index; CRC, colorectal cancer.
Three SNPs showed an association with overall survival (nominal P < 0.01) in the single SNP analysis (Table 2; see Supplementary Table S2 for results of all SNPs). The minor alleles of two genetic variants were associated with an increased risk of dying: rs2290065 (CCR7) with an HR of 1.31 per minor allele (95% CI, 1.07–1.61) and rs10815237 (CD274) with an HR of 1.13 per minor allele (95% CI, 1.04–1.23). The minor allele of rs2421826 (CD44) was associated with better overall survival (HR: 0.89; 95% CI, 0.82–0.97).
 | | DACHS | | ISACC | |
---|---|---|---|---|---|
SNP | Gene | HR per minor allele (95% CI) | P value | HR per minor allele (95% CI) | P value |
Overall survival | |||||
rs10815237 | CD274 | 1.13 (1.04–1.23) | 0.0064 | 1.06 (0.94–1.19) | 0.38 |
rs2421826 | CD44 | 0.89 (0.82–0.97) | 0.0090 | 0.98 (0.87–1.10) | 0.71 |
rs2290065 | CCR7 | 1.31 (1.07–1.61) | 0.0095 | 0.92 (0.72–1.19) | 0.87 |
CRC-specific survival | |||||
rs10815237 | CD274 | 1.19 (1.07–1.32) | 0.0018 | 1.14 (0.97–1.35) | 0.16 |
rs7524066 | TGFBR3 | 0.83 (0.74–0.94) | 0.0033 | 0.82 (0.68–0.98) | 0.03 |
rs17571088 | TGFBR3 | 0.82 (0.71–0.94) | 0.0050 | 0.95 (0.78–1.17) | 0.65 |
rs1495578 | TGFBR2 | 1.17 (1.05–1.30) | 0.0046 | 1.09 (0.94–1.27) | 0.25 |
rs17623772 | TGFBR2 | 1.16 (1.04–1.29) | 0.0081 | 1.07 (0.91–1.25) | 0.41 |
rs4252328 | TGFB3 | 0.84 (0.73–0.95) | 0.0070 | 1.06 (0.90–1.27) | 0.48 |
rs16906568 | IL7 | 1.18 (1.05–1.33) | 0.0050 | 1.20 (1.00–1.43) | 0.05 |
rs7845577 | IL7 | 1.22 (1.05–1.42) | 0.0084 | 1.23 (0.98–1.55) | 0.07 |
rs2421826 | CD44 | 0.87 (0.78–0.97) | 0.0098 | 0.92 (0.79–1.08) | 0.31 |
Note: HRs, CIs, and P values are estimated from Cox proportional hazards regression analysis. Models were adjusted by age, sex, TNM stage, BMI at diagnosis, and current alcohol intake.
Abbreviations: BMI, body mass index; CRC, colorectal cancer.
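As a reading aid for Table 2, a two-sided Wald P value can be approximately recovered from a reported HR and its 95% CI, because the CI is symmetric on the log scale. The sketch below assumes a normal approximation and uses the rounded values printed in the table, so it will only roughly match the reported P values; the function name is hypothetical.

```python
import math

def wald_p_from_ci(hr, ci_low, ci_high):
    """Approximate two-sided Wald P value from an HR and its 95% CI."""
    z_975 = 1.959964                                   # 97.5th percentile of N(0, 1)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_975)
    z = math.log(hr) / se
    return math.erfc(abs(z) / math.sqrt(2.0))

# rs7524066 (TGFBR3), colorectal cancer-specific survival, values from Table 2
print(wald_p_from_ci(0.83, 0.74, 0.94))   # DACHS
print(wald_p_from_ci(0.82, 0.68, 0.98))   # ISACC
```

Because the inputs are rounded to two decimals, differences of this back-calculation from the published P values are expected and do not by themselves indicate an error.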
For colorectal cancer–specific survival, nine SNPs were associated at P < 0.01. Here, the minor alleles of two SNPs in each of the genes IL7 (rs7845577 and rs16906568; LD r2 = 0.35) and TGFBR2 (rs1495578 and rs17623772; LD r2 = 0.8) as well as rs10815237 in gene CD274 were associated with poorer colorectal cancer–specific survival. The latter SNP rs10815237 was also associated with overall survival. The minor alleles of the other four SNPs, two in TGFBR3 (rs7524066 and rs17571088; LD r2 = 0.36) and one each in TGFB3 (rs4252328) and CD44 (rs2421826), were associated with better colorectal cancer–specific survival (Table 2; see Supplementary Table S3 for all SNPs).
Differential associations by MSI status
The results of the effect modification analyses by MSI status (211 MSI-high tumors; 1,754 non–MSI-high tumors) are shown for all SNPs in Supplementary Table S4. In the replication sample (ISACC), 18.3% of samples were MSI-high and 81.7% were non–MSI-high. Thirteen SNPs showed a statistically significant interaction (nominal P < 0.01) with MSI status (Table 3). Three of the SNPs lie in the gene CD4 (rs7957426, rs10774451, and rs10849524); the minor alleles of two of them (rs7957426 and rs10774451) were associated with decreased survival in MSI-high tumors, whereas the minor allele of rs10849524 was associated with increased survival in MSI-high tumors. These SNPs were not associated with survival in non–MSI-high tumors. Five of the SNPs that showed significant heterogeneity were annotated to the gene HLA-DRA; the minor alleles of four of them (rs3129848, rs17496549, rs3135392, and rs9268644) were also associated with decreased survival in MSI-high tumors. None of these five SNPs was associated with survival in non–MSI-high tumors. Two SNPs in IL15RA (rs2228059 and rs1998521) also showed associations with overall survival, although in opposite directions for the minor alleles, and only in MSI-high tumors. The remaining SNPs, rs2069772 (IL2), rs11165376 (TGFBR3), and rs7135373 (IFNG), likewise showed associations with survival solely in MSI-high tumors.
 | | DACHS | | | | | ISACC | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|
 | | MSI-high (211 CRC patients) | | Non–MSI-high (1,754 CRC patients) | | | MSI-high (239 CRC patients) | | Non–MSI-high (1,068 CRC patients) | | |
SNP | Gene | HR per minor allele (95% CI) | P value | HR per minor allele (95% CI) | P value | P-value interaction | HR per minor allele (95% CI) | P value | HR per minor allele (95% CI) | P value | P-value interaction |
Overall survival | |||||||||||
rs7957426 | CD4 | 2.13 (1.39–3.26) | 0.0005 | 1.00 (0.89–1.13) | 0.9754 | 0.0035 | 1.28 (0.81–2.02) | 0.2888 | 0.94 (0.82–1.07) | 0.3493 | 0.3445 |
rs3129848 | CD4 | 2.00 (1.28–3.12) | 0.0024 | 0.97 (0.86–1.09) | 0.5737 | 0.0045 | 1.58 (0.99–2.53) | 0.0573 | 0.92 (0.81–1.05) | 0.2209 | 0.3800 |
rs10774451 | CD4 | 0.51 (0.32–0.80) | 0.0031 | 1.02 (0.90–1.14) | 0.7973 | 0.0052 | 1.59 (1.00–2.54) | 0.0501 | 0.91 (0.81–1.04) | 0.1711 | 0.3195 |
rs17496549 | HLA-DRA | 2.09 (1.20–3.64) | 0.0088 | 1.16 (0.97–1.39) | 0.0995 | 0.0052 | 0.49 (0.25–0.96) | 0.0389 | 1.34 (1.11–1.62) | 0.0026 | 0.2190 |
rs10849524 | HLA-DRA | 1.75 (1.10–2.78) | 0.0179 | 0.97 (0.85–1.10) | 0.6514 | 0.0041 | 0.50 (0.30–0.84) | 0.0080 | 1.15 (1.00–1.32) | 0.0513 | 0.3972 |
rs3135392 | HLA-DRA | 1.50 (1.00–2.26) | 0.0512 | 0.93 (0.82–1.05) | 0.2387 | 0.0067 | 0.59 (0.37–0.94) | 0.0269 | 0.99 (0.87–1.14) | 0.9220 | 0.1468 |
rs2069772 | HLA-DRA | 1.51 (0.99–2.31) | 0.0554 | 0.96 (0.85–1.07) | 0.4497 | 0.0097 | 2.30 (1.38–3.81) | 0.0013 | 0.82 (0.72–0.93) | 0.0027 | 0.8181 |
rs6911419 | HLA-DRA | 0.72 (0.48–1.09) | 0.1165 | 1.10 (0.98–1.23) | 0.0939 | 0.0075 | 2.28 (1.38–3.75) | 0.0012 | 0.90 (0.79–1.03) | 0.1388 | 0.5347 |
rs2228059 | IL15RA | 1.80 (1.15–2.82) | 0.0107 | 0.92 (0.82–1.03) | 0.1455 | 0.0075 | 0.98 (0.64–1.49) | 0.9113 | 0.96 (0.84–1.10) | 0.5549 | 0.8178 |
rs1998521 | IL15RA | 0.60 (0.38–0.94) | 0.0267 | 1.03 (0.92–1.16) | 0.6011 | 0.0076 | 0.86 (0.55–1.34) | 0.5042 | 1.00 (0.87–1.14) | 0.9774 | 0.6695 |
rs11165376 | TGFBR3 | 0.54 (0.31–0.93) | 0.0275 | 0.98 (0.86–1.12) | 0.8021 | 0.0082 | 1.13 (0.69–1.85) | 0.6343 | 0.92 (0.81–1.06) | 0.2581 | 0.5994 |
rs7135373 | IFNG | 0.47 (0.27–0.82) | 0.0079 | 0.99 (0.87–1.12) | 0.8357 | 0.0089 | 1.07 (0.65–1.77) | 0.7889 | 0.92 (0.80–1.05) | 0.2149 | 0.7597 |
rs9268644 | IL2 | 0.56 (0.33–0.95) | 0.0326 | 0.98 (0.86–1.11) | 0.7510 | 0.0075 | 1.51 (0.95–2.39) | 0.0830 | 0.95 (0.82–1.09) | 0.4507 | 0.6975 |
Note: HRs, CIs, and corresponding P values are estimated from Cox proportional hazards regression analysis. Interaction P values calculated using likelihood ratio tests comparing the model with and without interaction term.
Abbreviation: CRC, colorectal cancer.
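The interaction P values in Table 3 come from likelihood ratio tests comparing nested Cox models. Given the maximized log partial likelihoods of the two models, the test reduces to a short calculation, sketched below. The sketch assumes a single interaction term (SNP dosage × binary MSI status), so that the test statistic has 1 degree of freedom; the function name and the log-likelihood values are hypothetical and are not taken from the study.

```python
import math

def lrt_pvalue_1df(loglik_reduced, loglik_full):
    """Likelihood ratio test for one extra parameter in the full model."""
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # Upper tail of a chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical log partial likelihoods: model without and with SNP x MSI term
stat, p_value = lrt_pvalue_1df(-4321.70, -4317.95)
print(stat, p_value)
```

A statistic of about 6.63 or larger corresponds to P < 0.01, the threshold used to select SNPs for replication.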
Replication analysis
The characteristics of the participants from the ISACC studies included in the replication analysis are shown in Supplementary Table S5. None of the SNPs associated with overall survival in the DACHS discovery set showed a significant association with overall survival in the ISACC replication sample (see Table 2). One of the nine associations with colorectal cancer–specific survival in DACHS, that of rs7524066 (TGFBR3) with improved survival, was confirmed in the replication analysis with a similar magnitude of association (HR: 0.82; 95% CI, 0.68–0.98). Two more SNPs in the gene IL7 (rs16906568 and rs7845577), associated with poorer survival, showed similar effect sizes in the replication sample with borderline significance (P = 0.05 and 0.07, respectively). Although not reaching statistical significance, most SNPs showed the same direction of association in the replication sample except for rs4252328 (TGFB3). None of the SNPs that showed differential association by MSI status in DACHS were confirmed in ISACC.
Functional genomic annotation
The SNP rs7524066 (TGFBR3), for which the association with disease-specific survival was replicated, was further investigated to add functional information. We found 33 SNPs to be in LD with the investigated variant (R2 > 0.4; Supplementary Table S6). We further assessed the function of these variants using the VEP tool of the Ensembl webpage (Supplementary Table S7). We found that 15 of the LD SNPs are located in regulatory regions of the gene. Two of the SNPs are located in transcription factor binding sites (Supplementary Table S7).
Discussion
We investigated the association of 771 Treg-related genetic variants with overall and colorectal cancer–specific survival in a large cohort of colorectal cancer patients and performed replication of top findings in an independent cohort of colorectal cancer patients from the ISACC consortium. Although none of the SNPs associated with overall survival were confirmed in the independent data set, one of nine SNPs associated with colorectal cancer–specific survival in the discovery data set, rs7524066 (TGFBR3), was confirmed in the independent replication data set. The minor allele was similarly associated with better colorectal cancer–specific survival in the discovery sample (HR: 0.83; 95% CI, 0.74–0.94) and the replication sample (HR: 0.82; 95% CI, 0.68–0.98).
Based on previous observations of differential T-cell infiltration of colorectal tumors according to MSI status (26, 49, 50), we also evaluated SNPs' association with overall survival by MSI status. We found 13 SNPs in six different genes that showed interactions with MSI in the discovery set (P < 0.01) but were not able to replicate any of these interactions.
The SNP rs7524066, for which the association was replicated in the primary analysis of colorectal cancer–specific survival, is annotated to gene TGFBR3 (transforming growth factor β type III receptor), encoding one of the transforming growth factor β (TGFβ) receptors, which is also known as betaglycan (51). TGFβ is an important growth factor for normal development and homeostasis of all cells in the human body and can have both tumor-suppressor and tumor-promoting functions depending on context (52, 53). In contrast to the other two TGFβ receptors, TGFBR1 and TGFBR2, TGFBR3 does not have kinase activity (53). Still, it is not only a coreceptor but seems to act as a tumor suppressor for several cancer types (54, 55). TGFBR3 appears to suppress WNT/CTNNB1 (β-catenin) signaling (56), which is linked to colorectal cancer development and progression. Our in silico functional analyses indicated that several SNPs in LD with rs7524066 lie in regulatory regions of the gene TGFBR3 and therefore might modify gene regulation. These mechanisms support the plausibility of TGFBR3 being associated with colorectal cancer–specific survival.
Furthermore, the association with worse colorectal cancer–specific survival of the minor alleles of two SNPs (rs16906568 and rs7845577) related to the gene IL7 (LD r2 between the SNPs = 0.35) is of interest. IL7 is a cytokine that is important for B- and T-cell development. Expression of IL7 was higher in colorectal cancer patients compared with controls and was associated with metastatic disease (57). Therefore, IL7 variants could have a role in survival after colorectal cancer diagnosis.
The CD274 variant rs10815237 was associated with both overall and colorectal cancer–specific survival in the discovery sample. In the replication sample, it showed an association of fairly similar magnitude, particularly for colorectal cancer–specific survival, although this was not statistically significant. Tumor CD274 [programmed cell death ligand 1 (PD-L1)] expression has been associated inversely with Treg density in colorectal cancer (58). Tumor CD274 expression may also modify the prognostic association of aspirin (59). These data suggest that CD274 (PD-L1) may modify colorectal cancer behavior depending on other factors in the tumor immune microenvironment. It would be of interest to examine the interaction of the CD274 variant and Treg density (or aspirin use) in future prognostic studies.
One SNP (rs2421826) in the CD44 gene was also associated with both improved overall and improved colorectal cancer–specific survival in the discovery analysis but was not statistically significantly associated in the replication set. CD44 is a multistructural and multifunctional cell-surface adhesion molecule that is highly expressed in many cancers and involved in physiologic processes. Through interaction with extracellular matrix ligands, it promotes the migration and invasion processes involved in metastases. Functionally active CD44 is associated with enhanced suppressor activity of Treg (60). Expression of stem-like factors including CD44 has been associated with metastatic disease and poorer prognosis in colorectal cancer (61). Two SNPs in gene TGFBR2 were associated with worse colorectal cancer–specific survival, but were not replicated. One of the roles of TGFBR2 is Treg suppression (62), and its inactivation has been associated with the development of colorectal cancer (63). These mechanisms make it plausible that genetic variants in TGFBR2 could be associated with colorectal cancer–specific survival.
We were not able to confirm any of the results stratified by MSI status, which could be partly due to the difference in the panels used for MSI characterization. The inability to confirm many of the SNPs associated with overall or colorectal cancer–specific survival in the discovery sample could also be in part due to the limited power because the replication sample was smaller than the discovery sample. New association findings are generally biased upward in discovery data sets so that larger study samples are required for replication. Measurement of Treg/FOXP3 cell expression within the tumor may improve future studies investigating Treg-related SNP associations with colorectal cancer survival.
Research into Tregs remains challenging as the definition of Tregs has changed over the past decade. The interplay with other T helper cells, expression of surface markers, as well as expression of cytokines influencing functionality of T-cell subtypes add to the complexity of this research field. Several studies have shown that different factors, including activated state of the immune cells, their location in the cellular matrix, and ratio of different immune cells, can influence their involvement in tumor progression and subsequent colorectal cancer survival (21, 64). Two distinct subpopulations of Tregs were recently identified and shown to have differential impact on colorectal cancer prognosis (65). Genetic variation, which can be robustly measured, could help to provide further evidence for the prognostic impact of Tregs on colorectal cancer prognosis.
This study is one of the first to investigate genetic variation in the Treg pathway with respect to colorectal cancer survival. Replication was attempted in an independent sample of colorectal cancer patients. A large amount of genotype data was available, which enabled the comprehensive investigation of the Treg pathways. The quality control measures all showed that the genotype data were of high quality. A limitation is that there were no opportunities to perform functional analyses for the SNPs that were implicated through these analyses.
Although no strong associations were found, there is suggestive evidence based on these analyses, particularly for the TGFβ receptors and the biological functions of the implicated genes, to support further investigations of the aforementioned SNPs and genes with respect to colorectal cancer prognosis in large study samples.
Disclosure of Potential Conflicts of Interest
A.T. Chan reports personal fees from Bayer Pharma AG, Pfizer Inc., and Boehringer Ingelheim outside the submitted work. M. Gala has equity in New Amsterdam Genomics, Inc. This firm has not provided any funding for the research involved or had any role in study design. The firm provides clinical sequencing to medical providers and patients. S. Ogino reports grants from National Institutes of Health (R35 CA197735) during the conduct of the study. C.M. Ulrich reports being Cancer Center Director. Dr. Ulrich oversees all research activities, including some funded by pharmaceutical industry. Dr. Ulrich has personally not received any funds from for-profit institutions or corporations in the past 5 years. F. Macrae reports other support from Rhythm Biosciences (funding support for testing a diagnostic for colorectal cancer) outside the submitted work, as well as receiving aspirin from Bayer for the Australian CaPP3 noninferiority dose-finding trial in Lynch syndrome. The trial is supported by the Victorian Cancer Agency. R.L. Milne reports grants from National Health and Medical Research Council during the conduct of the study. H. Brenner reports grants from the German Federal Ministry of Education and Research during the conduct of the study. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
S. Neumeyer: Conceptualization, methodology, writing–original draft, project administration, writing–review and editing. X. Hua: Data curation, software, formal analysis, writing–review and editing. P. Seibold: Validation, writing–review and editing. L. Jansen: Data curation. A. Benner: Validation and methodology. B. Burwinkel: Validation and methodology. N. Halama: Data curation and validation. S.I. Berndt: Conceptualization, resources, funding acquisition, writing–review and editing. A.I. Phipps: Funding acquisition, writing–review and editing. L.C. Sakoda: Resources, supervision, writing–review and editing. R.E. Schoen: Resources, funding acquisition, writing–review and editing. M.L. Slattery: Resources, funding acquisition, writing–review and editing. A.T. Chan: Resources, funding acquisition, writing–review and editing. M. Gala: Resources, funding acquisition, writing–review and editing. A.D. Joshi: Resources, funding acquisition, writing–review and editing. S. Ogino: Resources, funding acquisition, writing–review and editing. M. Song: Resources, funding acquisition, writing–review and editing. E. Herpel: Data curation and validation. H. Bläker: Resources, data curation, and validation. M. Kloor: Resources, validation, and investigation. D. Scherer: Investigation and methodology. A. Ulrich: Investigation and methodology. C.M. Ulrich: Resources, funding acquisition, writing–review and editing. A.K. Win: Resources, funding acquisition, writing–review and editing. J.C. Figueiredo: Resources, funding acquisition, writing–review and editing. J.L. Hopper: Resources, funding acquisition, writing–review and editing. F. Macrae: Resources, funding acquisition, writing–review and editing. R.L. Milne: Resources and funding acquisition. G.G. Giles: Resources and funding acquisition. D.D. Buchanan: Resources, funding acquisition, and investigation. U. Peters: Resources, data curation, funding acquisition, investigation, writing–review and editing. M. Hoffmeister: Conceptualization, resources, funding acquisition, validation, investigation, writing–review and editing. H. Brenner: Resources, data curation, funding acquisition, writing–review and editing. P.A. Newcomb: Resources, funding acquisition, investigation, methodology, writing–review and editing. J. Chang-Claude: Conceptualization, resources, supervision, funding acquisition, investigation, methodology, writing–original draft, writing–review and editing.
Acknowledgments
DACHS: We thank all participants and cooperating clinicians, and Ute Handte-Daub, Utz Benscheid, Muhabbet Celik, and Ursula Eilber for excellent technical assistance.
Harvard cohort (NHS): The study protocol was approved by the institutional review boards of the Brigham and Women's Hospital and Harvard T.H. Chan School of Public Health, and those of participating registries as required. We would like to thank the participants and staff of the NHS for their valuable contributions as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, WY. The authors assume full responsibility for analyses and interpretation of these data.
CPS-II: The authors thank the CPS-II participants and Study Management Group for their invaluable contributions to this research. The authors would also like to acknowledge the contribution to this study from central cancer registries supported through the Centers for Disease Control and Prevention National Program of Cancer Registries, and cancer registries supported by the NCI Surveillance Epidemiology and End Results program.
CCFR: We gratefully acknowledge the generous contributions of our study participants, the dedication of study staff, and the financial support from the U.S. NCI; without each of these, this important registry would not exist.
DACHS: This work was supported by the German Research Council (BR 1704/6-1, BR 1704/6-3, BR 1704/6-4, CH 117/1-1, HO 5117/2-1, HE 5998/2-1, KL 2354/3-1, RO 2270/8-1, and BR 1704/17-1); the German Federal Ministry of Education and Research (01KH0404, 01ER0814, 01ER0815, 01ER1505A, and 01ER1505B); the Interdisciplinary Research Program of the National Center for Tumor Diseases (NCT), Germany; and German Cancer Research Center.
Fred Hutch core grant: This research was funded in part through the NIH/NCI Cancer Center Support Grant P30 CA015704 awarded to T. Lynch.
Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO): NCI, NIH, U.S. Department of Health and Human Services (U01 CA137088; R01 CA059045 and R01 CA248857 to U. Peters and R01 CA176272 to P.A. Newcomb).
DALS: NIH (R01 CA48998 to M.L. Slattery).
Harvard cohorts (HPFS, NHS, PHS): HPFS is supported by the NIH (P01 CA055075 to E. Giovannucci, UM1 CA167552 to W. Willett, U01 CA167552 to W. Willett, R01 CA137178 to A.T. Chan, R01 CA151993 and R35 CA197735 to S. Ogino), NHS by the NIH (R01 CA137178 to A.T. Chan, P01 CA087969 to E. Giovannucci, UM1 CA186107 to M. Stampfer, R01 CA151993 and R35 CA197735 to S. Ogino), and PHS by the NIH (R01 CA042182 to M. Stampfer).
Melbourne Collaborative Cohort Study (MCCS) cohort recruitment was funded by VicHealth and Cancer Council Victoria. The MCCS was further augmented by Australian National Health and Medical Research Council grants 209057, 396414, and 1074383 and by infrastructure provided by Cancer Council Victoria. Cases and their vital status were ascertained through the Victorian Cancer Registry and the Australian Institute of Health and Welfare, including the National Death Index and the Australian Cancer Database.
CPS-II: The American Cancer Society funds the creation, maintenance, and updating of the Cancer Prevention Study II (CPS-II) cohort. This study was conducted with Institutional Review Board approval.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.