Molecular features underlying colorectal cancer disparities remain uncharacterized. Here, we investigated somatic mutation patterns by race/ethnicity and sex among 5,856 non-Hispanic white (NHW), 535 non-Hispanic Black (NHB), and 512 Asian/Pacific Islander (API) patients with colorectal cancer (2,016 with early-onset colorectal cancer, defined as sequencing age <50 years). NHB patients with early-onset nonhypermutated colorectal cancer, but not API patients, had higher adjusted tumor mutation rates than NHW patients. There were significant differences for LRP1B, FLT4, FBXW7, RNF43, ATRX, APC, and PIK3CA mutation frequencies in early-onset nonhypermutated colorectal cancers between racial/ethnic groups. Heterogeneities by race/ethnicity were observed for the effect of APC, FLT4, and FAT1 between early-onset and late-onset nonhypermutated colorectal cancer. By sex, heterogeneity was observed for the effect of EP300, BRAF, WRN, KRAS, AXIN2, and SMAD2. Males and females with nonhypermutated colorectal cancer had different trends in EP300 mutations by age group. These findings define genomic patterns of early-onset nonhypermutated colorectal cancer by race/ethnicity and sex, yielding novel biological clues into early-onset colorectal cancer disparities.

Significance:

NHBs, but not APIs, with early-onset nonhypermutated colorectal cancer had higher adjusted tumor mutation rates versus NHWs. Differences for FLT4, FBXW7, RNF43, LRP1B, APC, PIK3CA, and ATRX mutation rates between racial/ethnic groups and EP300, KRAS, AXIN2, WRN, BRAF, and LRP1B mutation rates by sex were observed in tumors of young patients.

See related commentary by Shen et al., p. 530.

This article is highlighted in the In This Issue feature, p. 517

The incidence of colorectal cancer among adults younger than age 50 years (early-onset colorectal cancer) has continued to steadily increase over the last several decades, leading to approximately 18,000 new cases of early-onset colorectal cancer now diagnosed annually in the United States (1). It is projected that 1 in 10 colon cancer and 1 in 4 rectal cancer diagnoses will occur among adults younger than age 50 years by 2030 (2). Thus, to uncover the biological mechanisms that may be contributing to this alarming early-onset colorectal cancer epidemic, numerous studies have undertaken investigations into molecular features of early-onset colorectal cancer in comparison with colorectal tumors from adults diagnosed at 50+ years (late-onset colorectal cancer). Recent evidence from one single-institution study posited that colorectal cancers in young individuals may not biologically differ from late-onset tumors (3). However, findings from other studies worldwide are accumulating to support that early-onset colorectal cancer may harbor a distinct molecular phenotype and that tumor biology is a strong prognostic factor in early-onset colorectal cancer (4–10). Despite these key biological clues into early-onset colorectal cancer, no definitive causality has been derived for its etiology to date—which can be largely attributed to the complex and multifactorial nature of this disease.

The complexity of early-onset colorectal carcinogenesis reflects an intricate interplay of biology and genetics with health behaviors, early-life exposures, and social determinants of health. Collectively, these factors are also posited to be major drivers of pronounced disparities in the early-onset colorectal cancer burden—including race/ethnicity as well as sex (11). However, few studies have specifically examined early-onset colorectal tumor biology across diverse populations, such that there remains a significant barrier in discovering potential mechanisms for the development of precision therapeutic modalities that may help mitigate a disproportionate disease burden. In the present study of 6,903 non-Hispanic white (NHW), non-Hispanic Black (NHB), and Asian/Pacific Islander (API) patients (2,016 early-onset; 29.2%) with colorectal adenocarcinoma as well as clinical-grade targeted sequencing and clinicodemographic data from the American Association for Cancer Research (AACR) Project Genomics Evidence Neoplasia Information Exchange (GENIE) international consortium (Supplementary Table S1; refs. 12, 13), we investigated tumor mutational burden (TMB) and somatic cancer gene mutation patterns of early-onset colorectal cancer by race/ethnicity and sex.

TMB Patterns in Hypermutated Colorectal Tumors

In total, 9.5% of colorectal cancers (653 of 6,903) in our cohort were hypermutated [≥17.78 mutations/megabase (Mb); Fig. 1A; Supplementary Fig. S1A–S1C and Supplementary Table S1]. Among patients with hypermutated colorectal cancers, young individuals (n = 184) had a significantly higher TMB compared with late-onset cases (n = 469; P = 0.008; Supplementary Fig. S2). Although racial/ethnic patterns of TMB in hypermutated colorectal tumors could not be analyzed because of limited sample sizes, we found that males with early-onset hypermutated tumors had a higher TMB compared with females (P = 0.005). A similar sex-specific trend was noted among patients with late-onset hypermutated colorectal tumors (P = 0.003; Supplementary Fig. S2). Given the distinct biology of tumors with hypermutation (14), these 653 cases were excluded from further study.
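The hypermutation split described above can be illustrated with a minimal sketch. It assumes TMB is computed as the mutation count divided by the sequenced panel size in Mb, which is a common convention but is not spelled out in this section; the sample identifiers, counts, and panel sizes are hypothetical. Only the 17.78 mutations/Mb cutoff and the cohort counts come from the text.

```python
HYPERMUTATION_CUTOFF = 17.78  # mutations/Mb, the threshold stated in the text


def tumor_mutational_burden(n_mutations: int, panel_size_mb: float) -> float:
    """Mutations per megabase of sequenced territory (assumed definition)."""
    if panel_size_mb <= 0:
        raise ValueError("panel_size_mb must be positive")
    return n_mutations / panel_size_mb


def is_hypermutated(tmb: float, cutoff: float = HYPERMUTATION_CUTOFF) -> bool:
    """A tumor at or above the cutoff is treated as hypermutated."""
    return tmb >= cutoff


# Hypothetical samples: (sample id, mutation count, panel size in Mb).
samples = [("S1", 6, 1.2), ("S2", 40, 1.2), ("S3", 12, 0.8)]
flags = {
    sid: is_hypermutated(tumor_mutational_burden(n, mb))
    for sid, n, mb in samples
}

# Cohort-level arithmetic using the counts reported in the text.
hypermutated_fraction = 653 / 6903
nonhypermutated_n = 6903 - 653
```

Under these assumptions only the hypothetical sample S2 is flagged, and the cohort arithmetic reproduces the reported 9.5% and the 6,250 nonhypermutated tumors carried forward.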

Figure 1.

Genomic landscape of early-onset vs. late-onset nonhypermutated colorectal cancer: AACR Project GENIE. A, Mutation rates among 6,903 tumor samples from patients with colorectal cancer. Hypermutated tumors were defined using a cutoff (red line) of ≥17.78 mutations/Mb; tumors below the cutoff were considered nonhypermutated. B, Box plot of adjusted mutation rates between early-onset and late-onset cases with nonhypermutated colorectal cancer. The residual of adjusted mutation rates and P value were derived from models adjusted for race and ethnicity, sex, tumor site and histology, sequencing assay, and sample type. C, Forest plot and mutation frequencies of genes differentially mutated between early-onset and late-onset nonhypermutated colorectal cancers in adjusted models that reached statistical significance (P < 0.05). Genes with FDR < 0.05 are shaded in dark gray. CI, confidence interval; OR, odds ratio.


Distinct Genomic Patterns of Early-Onset Nonhypermutated Colorectal Cancer

Among 6,250 nonhypermutated colorectal cancers, tumors from patients with early-onset disease (n = 1,832) had an overall lower TMB compared with late-onset nonhypermutated colorectal cancers (n = 4,418; P = 7.2 × 10⁻¹⁰; Fig. 1B; Supplementary Fig. S3). The frequency of nonsilent somatic mutations for the most commonly mutated genes in early-onset and late-onset nonhypermutated colorectal cancers is presented in Supplementary Fig. S4. Overall, young patients with nonhypermutated colorectal tumors had significantly higher odds of presenting with nonsilent mutations in TP53, LRP1B, TCF7L2, and FBXW7 versus late-onset cases after adjustment for sex, race/ethnicity, tumor site and histology, sequencing assay, sample type, and TMB (Fig. 1C; Supplementary Table S2). In contrast, young patients had decreased odds of presenting with KDR, FLT4, and AMER1 nonsilent mutations in nonhypermutated tumors. Notably, patterns for TP53, LRP1B, and TCF7L2 persisted after adjusting for multiple comparisons (all FDR < 0.05). Together, these results suggest that young patients with nonhypermutated colorectal cancers harbor unique nonsilent somatic mutation patterns compared with late-onset cases, providing additional evidence to support distinct tumor characteristics of early-onset nonhypermutated colorectal cancer in a large and diverse patient cohort.
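To make the multiple-comparison step concrete, the following is a minimal sketch of a false discovery rate adjustment. It assumes the Benjamini-Hochberg procedure, which this section does not name explicitly, and the P values are hypothetical rather than taken from the supplementary tables.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene P values (not from the study).
p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)
n_significant = sum(value < 0.05 for value in q)
```

In this toy input four genes have a nominal P < 0.05 but only two remain below an FDR of 0.05, mirroring how some nominally significant genes in the text do not persist after adjustment.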

Racial/Ethnic Differences in Somatic Cancer Gene Mutations among Patients with Early-Onset Nonhypermutated Colorectal Cancer

Given the significant variation in early-onset colorectal cancer incidence and outcomes across racial and ethnic groups (15, 16), we next sought to explore genomic patterns of early-onset colorectal cancer between NHW, NHB, and API individuals. Among young patients with nonhypermutated colorectal tumors, NHB patients (n = 157), but not API patients (n = 164), had a significantly higher TMB compared with NHW patients (n = 1,511; P for NHB vs. NHW = 0.01; Fig. 2A). A similar trend was observed between NHB and NHW individuals with late-onset nonhypermutated colorectal cancer (P for NHB vs. NHW = 0.009; Fig. 2B). Together, the higher TMB observed in nonhypermutated colorectal tumors of NHB individuals may have clinical relevance given the emerging role of TMB as a predictive biomarker for therapeutic response.

Figure 2.

Racial/ethnic patterns of nonsilent somatic cancer gene mutations among patients with early-onset nonhypermutated colorectal cancer. A and B, Box plots of adjusted mutation rate residuals (TMB) across racial/ethnic groups for early-onset (A) and late-onset (B) nonhypermutated colorectal cancer. The residual of adjusted mutation rates and P values were derived from models adjusted for sex, tumor site and histology, sequencing assay, and sample type. C–E, Mutation frequencies of genes differentially mutated between early-onset and late-onset nonhypermutated colorectal cancer cases that reached statistical significance for API (C), NHB (D), and NHW (E) patients.


Among API patients with nonhypermutated colorectal cancers (n = 469), early-onset cases had decreased odds of presenting with nonsilent mutations in APC, RIF1, and PIK3CA compared with API individuals ages 50+ years at cancer sequencing in adjusted models (Fig. 2C and Table 1; Supplementary Table S3). In contrast, young APIs had increased odds of presenting with nonsilent mutations in FAT1, FLT4, FBXW7, and MTOR versus API individuals with nonhypermutated late-onset colorectal cancer. Young NHB patients with nonhypermutated colorectal cancers (n = 157) had statistically significantly higher odds of presenting with ATRX nonsilent mutations versus late-onset cases in adjusted models (Fig. 2D and Table 1; Supplementary Table S3). Although the results observed across NHB and API populations did not persist after adjustment for multiple testing, this may be in part attributable to the limited sample size and warrants an additional and independent study in diverse cohorts. Within the NHW population, individuals with early-onset nonhypermutated colorectal cancers had higher odds of presenting with LRP1B, TP53, TCF7L2, SMAD3, and FBXW7 nonsilent mutations (all P ≤ 0.02) and decreased odds of presenting with KDR, FLT4, RNF43, and BRAF mutations (all P ≤ 0.04) versus late-onset nonhypermutated colorectal cancer cases in adjusted models (Fig. 2E and Table 1; Supplementary Table S3). We also observed that the patterns for TP53, LRP1B, and TCF7L2 in nonhypermutated colorectal tumors from early-onset cases within the NHW population persisted after adjusting for multiple comparisons [TP53: odds ratio (OR) 1.39, 95% confidence interval (CI), 1.20–1.61, P = 1.36 × 10⁻⁵, FDR = 0.001; LRP1B: OR 4.75, 95% CI, 2.21–10.23, P = 6.77 × 10⁻⁵, FDR = 0.003; TCF7L2: OR 1.45, 95% CI, 1.17–1.80, P = 0.0008, FDR = 0.02; Table 1; Supplementary Table S3].
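As a rough consistency check on the reported estimates, an odds ratio and its 95% CI can be converted back into an approximate standard error, z statistic, and two-sided P value. This sketch assumes the intervals are Wald intervals on the log-odds scale, which the section does not state; the function name is illustrative, and rounding of the published values means the result only approximates the reported P.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_z_and_p(or_est: float, ci_low: float, ci_high: float):
    """Approximate z statistic and two-sided P from an OR and its 95% CI."""
    # Width of the CI on the log scale spans 2 * Z_95 standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = math.log(or_est) / se
    # Two-sided normal tail probability expressed through erfc.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# TP53 in the NHW population, as reported: OR 1.39 (95% CI, 1.20-1.61).
z_tp53, p_tp53 = wald_z_and_p(1.39, 1.20, 1.61)
```

For TP53 the back-calculated P is on the order of 10⁻⁵, the same order of magnitude as the reported 1.36 × 10⁻⁵.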

Table 1.

Baseline mutation probability, comparison, and heterogeneity of selected nonsilent somatic gene mutations by race and ethnicity among patients with early-onset and late-onset nonhypermutated colorectal cancer

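NHW (the early-onset and late-onset columns give the baseline mutation probability by age at cancer sequencing)

| Gene symbol | Baseline mutation probability | Early-onset CRC | Late-onset CRC | OR (95% CI) (a) | P | FDR |
| --- | --- | --- | --- | --- | --- | --- |
| TP53 | 0.75 | 0.79 | 0.74 | 1.39 (1.20–1.61) | 1.36E−05 | 0.001 |
| LRP1B | 0.08 | 0.18 | 0.04 | 4.75 (2.21–10.23) | 6.77E−05 | 0.003 |
| TCF7L2 | 0.09 | 0.10 | 0.08 | 1.45 (1.17–1.80) | 0.0008 | 0.02 |
| SMAD3 | 0.03 | 0.04 | 0.03 | 1.76 (1.14–2.71) | 0.01 | 0.18 |
| FLT4 | 0.02 | 0.01 | 0.03 | 0.53 (0.32–0.87) | 0.01 | 0.18 |
| KDR | 0.02 | 0.02 | 0.03 | 0.58 (0.37–0.92) | 0.02 | 0.23 |
| FBXW7 | 0.11 | 0.11 | 0.10 | 1.26 (1.04–1.54) | 0.02 | 0.23 |
| RNF43 | 0.04 | 0.03 | 0.04 | 0.67 (0.45–0.98) | 0.04 | 0.35 |
| BRAF | 0.09 | 0.07 | 0.09 | 0.79 (0.63–0.99) | 0.04 | 0.35 |
| APC | 0.74 | 0.75 | 0.74 | 1.15 (1.00–1.33) | 0.06 | 0.39 |
| RIF1 | 0.04 | 0.03 | 0.04 | 0.77 (0.39–1.53) | 0.45 | 0.85 |
| PIK3CA | 0.17 | 0.16 | 0.18 | 1.01 (0.86–1.20) | 0.88 | 0.93 |
| MTOR | 0.03 | 0.02 | 0.03 | 0.73 (0.50–1.08) | 0.12 | 0.58 |
| FAT1 | 0.04 | 0.04 | 0.04 | 0.93 (0.66–1.32) | 0.68 | 0.92 |
| ATRX | 0.03 | 0.02 | 0.03 | 0.84 (0.58–1.24) | 0.39 | 0.76 |

NHB

| Gene symbol | Baseline mutation probability | Early-onset CRC | Late-onset CRC | OR (95% CI) (a) | P | FDR |
| --- | --- | --- | --- | --- | --- | --- |
| TP53 | 0.74 | 0.77 | 0.72 | 1.31 (0.80–2.14) | 0.29 | 0.995 |
| LRP1B | 0.07 | 0.00 | 0.09 | - | - | - |
| TCF7L2 | 0.07 | 0.06 | 0.07 | 0.80 (0.32–1.97) | 0.62 | 0.999 |
| SMAD3 | 0.04 | 0.01 | 0.06 | 0.25 (0.03–2.05) | 0.20 | 0.995 |
| FLT4 | 0.02 | 0.01 | 0.03 | 0.19 (0.02–2.06) | 0.17 | 0.995 |
| KDR | 0.03 | 0.01 | 0.03 | 0.30 (0.05–1.84) | 0.19 | 0.995 |
| FBXW7 | 0.10 | 0.10 | 0.10 | 0.90 (0.45–1.80) | 0.76 | 0.999 |
| RNF43 | 0.02 | 0.03 | 0.02 | 1.60 (0.40–6.36) | 0.50 | 0.995 |
| BRAF | 0.04 | 0.04 | 0.04 | 1.21 (0.42–3.53) | 0.72 | 0.999 |
| APC | 0.78 | 0.76 | 0.79 | 0.94 (0.56–1.58) | 0.82 | 0.999 |
| RIF1 | 0.03 | 0.03 | 0.02 | 1.88 (0.09–37.61) | 0.68 | 0.999 |
| PIK3CA | 0.20 | 0.19 | 0.21 | 1.00 (0.58–1.70) | 0.99 | 0.999 |
| MTOR | 0.02 | 0.03 | 0.02 | 0.99 (0.27–3.65) | 0.98 | 0.999 |
| FAT1 | 0.04 | 0.06 | 0.03 | 2.35 (0.76–7.28) | 0.14 | 0.995 |
| ATRX | 0.03 | 0.06 | 0.02 | 3.29 (1.09–9.95) | 0.035 | 0.995 |

API

| Gene symbol | Baseline mutation probability | Early-onset CRC | Late-onset CRC | OR (95% CI) (a) | P | FDR |
| --- | --- | --- | --- | --- | --- | --- |
| TP53 | 0.80 | 0.84 | 0.78 | 1.66 (0.97–2.84) | 0.06 | 0.60 |
| LRP1B | 0.11 | 0.10 | 0.13 | - | - | - |
| TCF7L2 | 0.08 | 0.08 | 0.07 | 1.36 (0.63–2.93) | 0.43 | 0.96 |
| SMAD3 | 0.04 | 0.06 | 0.04 | 1.65 (0.58–4.67) | 0.35 | 0.93 |
| FLT4 | 0.03 | 0.05 | 0.02 | 3.45 (1.05–11.37) | 0.04 | 0.51 |
| KDR | 0.03 | 0.02 | 0.03 | 0.73 (0.16–3.31) | 0.69 | 0.99 |
| FBXW7 | 0.14 | 0.17 | 0.12 | 1.86 (1.03–3.38) | 0.04 | 0.51 |
| RNF43 | 0.05 | 0.06 | 0.05 | 1.35 (0.54–3.36) | 0.52 | 0.99 |
| BRAF | 0.06 | 0.07 | 0.05 | 1.15 (0.49–2.71) | 0.75 | 0.99 |
| APC | 0.69 | 0.59 | 0.74 | 0.53 (0.34–0.83) | 0.006 | 0.43 |
| RIF1 | 0.09 | 0.04 | 0.12 | 0.01 (0.00–0.58) | 0.03 | 0.51 |
| PIK3CA | 0.13 | 0.07 | 0.16 | 0.47 (0.23–0.94) | 0.03 | 0.51 |
| MTOR | 0.01 | 0.02 | 0.01 | 8.65 (1.15–64.90) | 0.04 | 0.51 |
| FAT1 | 0.02 | 0.05 | 0.01 | 4.37 (1.02–18.71) | 0.047 | 0.51 |
| ATRX | 0.04 | 0.04 | 0.04 | 1.16 (0.40–3.38) | 0.78 | 0.99 |

Across racial/ethnic groups

| Gene symbol | Mutation frequency: P | Heterogeneity: Cochran's Q-test | Heterogeneity: P-het (b) |
| --- | --- | --- | --- |
| TP53 | 0.25 | 0.49 | 0.78 |
| LRP1B | 2.45E−10 | - | - |
| TCF7L2 | 0.14 | 1.60 | 0.45 |
| SMAD3 | 0.07 | 3.16 | 0.21 |
| FLT4 | 0.0006 | 9.19 | 0.01 |
| KDR | 0.91 | 0.60 | 0.74 |
| FBXW7 | 0.04 | 2.53 | 0.28 |
| RNF43 | 0.046 | 3.09 | 0.21 |
| BRAF | 0.27 | 1.23 | 0.54 |
| APC | 0.00001 | 10.58 | 0.005 |
| RIF1 | 0.83 | 4.67 | 0.10 |
| PIK3CA | 0.003 | 4.49 | 0.11 |
| MTOR | 0.98 | 5.65 | 0.06 |
| FAT1 | 0.32 | 6.07 | 0.048 |
| ATRX | 0.03 | 5.29 | 0.07 |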

Abbreviation: CRC, colorectal cancer.

(a) ORs, 95% CIs, P, and FDR values were calculated for genes from models adjusted for patient sex, histology and site, sequencing assay, sample type, and TMB.

(b) P values were derived from Cochran's Q-test for heterogeneity across the racial/ethnic groups. Only genes with significant associations for nonsilent somatic mutations between early-onset and late-onset nonhypermutated colorectal cancer cases in at least one racial/ethnic group were tested.

Of these 15 identified genes, significant heterogeneities across racial/ethnic groups were observed for the effect of APC, FLT4, and FAT1 between early-onset and late-onset nonhypermutated colorectal cancer cases (Cochran's Q-test: P_het = 0.005, 0.01, and 0.048, respectively; Table 1). Moreover, statistically significantly different mutation frequencies in nonhypermutated colorectal cancers among young patients across racial/ethnic groups were observed for seven genes: FLT4 (χ² test: P = 0.0006), FBXW7 (P = 0.04), RNF43 (P = 0.046), LRP1B (P = 2.45 × 10⁻¹⁰), APC (P = 0.00001), PIK3CA (P = 0.003), and ATRX (P = 0.03; Table 1). In summary, these findings point to unique somatic gene mutation landscapes by race/ethnicity specifically within the population of individuals with early-onset nonhypermutated colorectal cancer.
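The heterogeneity test can be sketched from the published summary statistics alone. The example below assumes an inverse-variance weighted Cochran's Q on log odds ratios, with standard errors back-calculated from the 95% CIs in Table 1; the authors' exact computation is not given in this section, and the function names are illustrative. Because the inputs are rounded, the result only approximates the tabulated value.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_variance(or_est, ci_low, ci_high):
    """Log odds ratio and its variance, recovered from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return math.log(or_est), se ** 2


def cochran_q(estimates):
    """estimates: list of (log OR, variance). Returns (Q, degrees of freedom)."""
    weights = [1.0 / var for _, var in estimates]
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, estimates))
    return q, len(estimates) - 1


# APC, early-onset vs. late-onset, from Table 1: NHW, NHB, API.
apc = [
    log_or_and_variance(1.15, 1.00, 1.33),
    log_or_and_variance(0.94, 0.56, 1.58),
    log_or_and_variance(0.53, 0.34, 0.83),
]
q_apc, df_apc = cochran_q(apc)
# With 2 degrees of freedom the chi-square tail probability is exp(-Q / 2).
p_het_apc = math.exp(-q_apc / 2)
```

Under these assumptions the APC statistic lands close to the tabulated Cochran's Q of 10.58 and P_het of 0.005, with the small gap attributable to rounding of the published ORs and CIs.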

Sex Differences in Nonsilent Somatic Gene Mutation Profiles of Early-Onset Nonhypermutated Colorectal Cancer

The biological features contributing to early-onset colorectal cancer disparities by sex remain presently unknown (11). Therefore, we also sought to examine sex-specific differences in nonsilent somatic gene mutation profiles of early-onset nonhypermutated colorectal cancer cases in our cohort. Investigation of TMB among 1,832 early-onset nonhypermutated colorectal cancer cases by sex revealed that males presented with a lower TMB versus females after adjusting for race/ethnicity, tumor site and histology, sequencing assay, and sample type, although this difference did not reach statistical significance (P = 0.07; Fig. 3A). A significant pattern was observed for TMB by sex among 4,418 patients with late-onset nonhypermutated colorectal cancer in adjusted models (P = 0.004; Fig. 3A).

Figure 3.

Tumor genomic profiles by sex among patients with early-onset nonhypermutated colorectal cancer. A, Box plot of adjusted mutation rate residuals (TMB) by sex for early-onset and late-onset nonhypermutated colorectal cancer. The residual of adjusted mutation rates and P values were derived from models adjusted for race and ethnicity, tumor site and histology, sequencing assay, and sample type. B and C, Mutation frequencies of genes differentially mutated between early-onset and late-onset nonhypermutated colorectal cancer cases that reached statistical significance (P < 0.05) for females (B) and males (C). D, Inverse mutation frequencies for EP300 in nonhypermutated colorectal cancers among young patients by sex.


Among females, young patients with nonhypermutated colorectal cancer had statistically significantly lower odds of presenting with nonsilent mutations in EP300, AXIN2, WRN, BRAF, and KDR compared with late-onset cases in adjusted models. In contrast, females with early-onset nonhypermutated colorectal cancer had statistically significantly higher odds of presenting with TP53, SMAD2, APC, TCF7L2, and LRP1B nonsilent mutations in adjusted models (Fig. 3B; Supplementary Table S4). In particular, our observation that young female patients with nonhypermutated colorectal cancer had 54% increased odds of presenting with a TP53 mutation persisted after adjustment for multiple comparisons (OR 1.54, 95% CI, 1.26–1.88, P = 2.73 × 10⁻⁵, FDR = 0.002; Supplementary Table S4). Associations for BRAF and EP300 reached marginal significance after FDR adjustment (FDR = 0.057 and 0.09, respectively).

Young males with nonhypermutated colorectal cancer were statistically significantly more likely to present with nonsilent mutations in TCF7L2 and TP53, and less likely to present with KRAS mutations, versus males with late-onset nonhypermutated colorectal cancer (Fig. 3C; Supplementary Table S4). Although these findings did not remain significant after adjustment for multiple comparisons, the patterns for TCF7L2 and TP53 among young males are consistent with our observations among young females. In contrast to the observation that females with early-onset nonhypermutated colorectal cancer had 70% lower odds of presenting with nonsilent somatic mutations in EP300 (females: OR 0.30, 95% CI, 0.13–0.67, P = 0.004, FDR = 0.09), young males had 59% higher odds of presenting with EP300 nonsilent mutations (OR 1.59, 95% CI, 1.04–2.43, P = 0.03, FDR = 0.64; Fig. 3D; Supplementary Table S4).
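The percentages quoted for EP300 and TP53 follow directly from the odds ratios, as the short sketch below shows. The second function illustrates how a change in odds translates into a change in probability; the 5% baseline probability used there is hypothetical and chosen only to show that the two scales are close when the mutation is rare.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


def apply_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Probability obtained after multiplying the baseline odds by an OR."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


females_ep300 = percent_change_in_odds(0.30)  # reported OR for young females
males_ep300 = percent_change_in_odds(1.59)    # reported OR for young males
females_tp53 = percent_change_in_odds(1.54)   # reported OR for TP53 in females

# Hypothetical 5% baseline probability, for illustration only.
shifted_probability = apply_odds_ratio(0.05, 0.30)
```

With a rare mutation, a 70% reduction in odds corresponds to nearly the same relative reduction in probability, which is why the two readings are often used interchangeably; for common mutations such as TP53 the two scales diverge.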

Of these 11 identified genes, significant heterogeneities between males and females were observed for the effect of EP300, BRAF, WRN, KRAS, AXIN2, and SMAD2 between early-onset and late-onset nonhypermutated colorectal cancer cases in our cohort (Cochran's Q-test: P_het = 0.0004, 0.002, 0.03, 0.03, 0.04, and 0.04, respectively; Supplementary Table S4). Moreover, EP300 mutation frequencies in nonhypermutated colorectal cancers among young patients differed by sex (χ² test: P = 0.00003). Differences in KRAS, AXIN2, WRN, BRAF, and LRP1B mutation rates by sex were also noted in nonhypermutated colorectal tumors of young patients (all P < 0.03; Fig. 3B–D; Supplementary Table S4). Together, these results indicate that the molecular landscape of early-onset nonhypermutated colorectal cancer differs between males and females and may present potential targets for future validation and mechanistic studies by sex.
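With only two strata, heterogeneity between sexes reduces to a z-test on the difference of two log odds ratios, and the squared z statistic equals Cochran's Q with one degree of freedom. The sketch below applies this to the reported EP300 estimates, again assuming Wald intervals on the log scale; it is an approximation from rounded published values, not the authors' code.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_variance(or_est, ci_low, ci_high):
    """Log odds ratio and its variance, recovered from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return math.log(or_est), se ** 2


def two_group_heterogeneity(first, second):
    """Q statistic (1 df) and P for the difference between two log ORs."""
    (y1, v1), (y2, v2) = first, second
    z = (y1 - y2) / math.sqrt(v1 + v2)
    q = z ** 2
    # Chi-square tail probability with 1 df, written through erfc.
    p = math.erfc(math.sqrt(q / 2))
    return q, p


# EP300, early-onset vs. late-onset, as reported for each sex.
females = log_or_and_variance(0.30, 0.13, 0.67)
males = log_or_and_variance(1.59, 1.04, 2.43)
q_ep300, p_het_ep300 = two_group_heterogeneity(females, males)
```

The resulting P is of the same magnitude as the reported P_het of 0.0004 for EP300, which reflects the opposite direction of the association in females and males.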

Here, we defined distinct molecular patterns of early-onset nonhypermutated colorectal cancer by race/ethnicity and sex using clinical-grade sequencing of colorectal adenocarcinoma. This work was performed using a large cohort of 6,903 NHW, NHB, and API individuals from the AACR Project GENIE international consortium. Specific to nonhypermutated colorectal tumors, we observed striking differences for APC mutation rates in young patients across racial/ethnic groups, as 59% of API individuals, 76% of NHB individuals, and 75% of NHW individuals with early-onset colorectal cancer harbored a nonsilent APC mutation in nonhypermutated tumors. We also found significant heterogeneity for the effect of APC between early-onset and late-onset cases by race/ethnicity in nonhypermutated tumors. Indeed, our observations for APC align with prior publications (17, 18), including a recent study of patients with early-onset colorectal cancer (among them 137 non-Hispanic Asian, 128 NHB, and 105 white Hispanic individuals) that explored tumor mutation patterns for 22 genes. Among patients with early-onset colorectal cancer with both nonhypermutated and hypermutated tumors, Hein and colleagues noted differences in APC mutation rates when comparing racial/ethnic groups, with lower rates of APC mutations observed among non-Hispanic Asian patients compared with NHW patients (17). That study also reported that 21% of tumors from young non-Hispanic Asian individuals had FBXW7 mutations versus 15% of young NHW or 11.7% of young NHB patients. In our cohort, mutation rates for FBXW7 (17% of young API, 11% NHW, and 10% NHB patients), as well as for FLT4, RNF43, LRP1B, PIK3CA, and ATRX, also significantly differed across racial/ethnic groups for early-onset nonhypermutated tumors. One advantage to our approach was that we investigated approximately 4-fold more genes and restricted all analyses to nonsilent somatic gene mutations specific to patients with nonhypermutated colorectal tumors. Moreover, our comparison of early-onset and late-onset cases provided the opportunity to identify genomic features distinct to early-onset nonhypermutated colorectal cancer across both race/ethnicity and sex.
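The comparison of APC mutation frequencies across racial/ethnic groups can be approximated with a Pearson chi-square test on a 3 × 2 table. This sketch back-calculates mutated counts from the rounded early-onset frequencies (75%, 76%, 59%) and the early-onset group sizes given in the text (1,511 NHW, 157 NHB, 164 API); those counts are therefore assumptions, and the number of patients actually assayed for APC may differ, so the result is an order-of-magnitude check rather than a reproduction of the tabulated P value.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a 2D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# (group size, rounded APC mutation frequency) for NHW, NHB, API.
groups = [(1511, 0.75), (157, 0.76), (164, 0.59)]
# Approximate counts: [mutated, not mutated] per group (an assumption).
table = [[round(n * f), n - round(n * f)] for n, f in groups]

chi2_apc, df_apc_freq = chi_square_independence(table)
# With 2 degrees of freedom the chi-square tail probability is exp(-x / 2).
p_apc_freq = math.exp(-chi2_apc / 2)
```

Even with rounded inputs the test yields a very small P value, consistent in direction with the reported difference, which is driven mainly by the lower APC frequency among young API patients.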

Although these findings provide novel clues into biological mechanisms that may be underpinning the disproportionate early-onset colorectal cancer burden across racial/ethnic groups, one important acknowledgment with respect to these findings is that race and ethnicity are social constructs. It is vital to consider the role of several complex and related factors, particularly genetic ancestry as a biological construct, in early-onset colorectal cancer disparities (19) and in the interpretation of our results. It is equally important to compare our present findings with recent work that explored genomic features of colorectal cancer by genetic ancestry (8). Among 33,770 individuals of European ancestry and 5,301 individuals of African ancestry diagnosed with colorectal cancer, including hypermutated tumors, Myer and colleagues (8) observed that among all samples and specific to microsatellite stable, POLE/POLD-1–negative colorectal cancers (TMB < 10), the median TMB was significantly higher among individuals of African versus European ancestry. Although more than half of all cases in that cohort were ages 59 years and older at cancer sequencing, in our present study, we noted similar TMB patterns by race/ethnicity for both early-onset and late-onset nonhypermutated colorectal cancers. In particular, we found that for young patients with nonhypermutated colorectal tumors, NHB patients, but not API patients, had a significantly higher TMB compared with NHW patients. As genomic patterns of colorectal cancer have not yet been explored among individuals of East/South Asian ancestry, independent validation of our results in future cohorts with diverse, well-annotated early-onset colorectal cancer cases and available genetic ancestry data will be vital to accelerate the translation of these distinct early-onset colorectal cancer patterns by race/ethnicity into clinical application and reduce marked disparities in this disease burden.

Beyond these racial/ethnic differences in somatic cancer gene mutations for early-onset nonhypermutated colorectal cancers, our study is the first to our knowledge to identify genomic patterns specific to early-onset colorectal cancer by sex. As a biological variable, sex affects the function of the immune system. The complex interplay and effects of genetics, hormones, and the environment (e.g., gut microbiome) can all contribute to sex differences in immune responses (20) and to variations in the burden of colorectal cancer. This is supported by a comprehensive untargeted metabolomics study of colon tumor and normal tissues from patients ages 55+ years, in which Cai and colleagues revealed sex-specific metabolic subphenotypes in colon cancer (21). Here, we showed sex-specific differences for EP300, KRAS, AXIN2, WRN, BRAF, and LRP1B mutation rates specific to nonhypermutated colorectal tumors of young patients. We also identified significant heterogeneity by sex for the effect of EP300, BRAF, WRN, KRAS, AXIN2, and SMAD2 between early-onset and late-onset nonhypermutated colorectal cancer cases. Together, these findings support the hypothesis, which warrants further study, that differences in tumor biology may contribute to a sex-specific disease burden in early-onset colorectal cancer. As males harbor a 12% to 18% increased hazard of disease-specific death compared with females after early-onset colorectal cancer diagnosis (15), further investigation into the prognostic significance of these genes in early-onset colorectal cancer by sex may also yield significant implications in the clinical setting.

Use of clinical-grade targeted sequencing and clinicodemographic data for nearly 7,000 pathologically confirmed colorectal adenocarcinoma cases, of which nearly 30% had early-onset disease, from AACR Project GENIE (12, 13) is a considerable strength of this study. The limitations of this work also merit consideration. As the present study was limited to individuals who self-identified as NHW, NHB, and API, we do not yet know how diversity within these groups (e.g., Asian vs. Pacific Islander individuals) or in other populations (e.g., Hispanic, American Indian populations) contributes to the biology of early-onset colorectal cancer disparities. GENIE data largely stem from tertiary care centers and may not completely represent the target populations. Although the genomic data provided by GENIE are clinical grade and passed through stringent processing pipelines prior to release (22), variant calling was not independently validated by orthogonal approaches. The data released also precluded our ability to derive genetic ancestry or reliably define microsatellite instability status for colorectal cancer cases. Further, GENIE does not make available information on clinical outcomes, which limited our ability to investigate the possible prognostic value of these genes in early-onset nonhypermutated colorectal cancer by race/ethnicity or sex. GENIE also does not presently release tumor stage, grade, primary colon tumor site codes, or treatment history data. Consequently, we adjusted for tumor site (colon vs. rectum), histology, and primary sample type in our study. To consider that primary sample type (primary tumor vs. metastasis) may contribute to somatic mutation differences by race/ethnicity and sex in early-onset nonhypermutated colorectal cancer, we also repeated our primary TMB analyses while excluding patients with metastatic tissue used for clinical-grade sequencing, with concordant results. However, it is still possible that molecular patterns observed herein may be in part related to the disease stage and warrant validation in independent cohort studies. Although we were able to exclude 653 patients with hypermutated tumors to specifically focus on nonhypermutated colorectal cancer, information on family history of cancer, as well as germline genetic features, was also unavailable for query.

In conclusion, this study provides first-of-its-kind evidence to our knowledge that molecular features of early-onset colorectal cancer may differ by both race/ethnicity and sex. We also defined sex-specific and race/ethnicity-specific differences in TMB among patients with early-onset colorectal cancer that may begin to better inform treatment decision-making. Together, these findings warrant validation in subsequent epidemiologic and laboratory-based studies, from which the knowledge gained could yield unprecedented mechanistic insights into the biology underlying early-onset colorectal cancer disparities.

Study Population

Next-generation clinical sequencing data from tumor tissues and associated pathology reports have been released by the AACR Project GENIE consortium (12, 13). This study has been granted data access through the database of Genotypes and Phenotypes (dbGaP) project #24541. Somatic cancer gene mutation data as well as clinicopathologic and demographic data for colorectal cancer cases were downloaded from the GENIE project via Synapse (release 11.0; http://www.synapse.org/genie). This study was exempted from Institutional Review Board approval and informed consent, as deidentified GENIE data are publicly available (12, 13, 22). A total of 6,903 pathologically confirmed colorectal adenocarcinoma cases, with a unique patient record and matched clinical and sequencing data for patients who self-identified as NHB, NHW, or API, were included in the present study.

Available clinical and pathologic data for colorectal cancer cases with clinical-grade targeted sequencing in GENIE included site (colon, rectum), histology (colon, rectal, and colorectal adenocarcinoma; colorectal mucinous adenocarcinoma), and sample type (primary tumor or metastatic site). Demographic data included sex, race, and ethnicity (NHW, NHB, API), age at sequencing (a surrogate for diagnosis age; ref. 23), and sequencing center and assay (panel/platform).

Clinical-Grade Targeted Sequencing Data

Somatic mutation data from tumor tissues have been previously generated using clinical-grade targeted sequencing panels from multiple sequencing centers (12, 13). Detailed summaries of sequencing pipelines—including distributions of library selection, library strategy, platform, and specimen tumor cellularity; coverage and alteration types per panel/pipeline; preservation techniques; sequence assay genomic information; and genomic profiling at each center—have been previously described and are publicly accessible in the AACR GENIE Data Guide (22). Read depth for colorectal cancer tissues across the 10 sequencing centers included in this study is listed in Supplementary Table S5.

The bioinformatics pipelines used to detect mutations are also described in depth in the AACR GENIE Data Guide (22), including data preprocessing and alignment of reads, quality filters/controls, single-nucleotide somatic mutation and small insertion and deletion (indel) calls, and filtering of putative germline single-nucleotide variants and indels. To ensure consistent calling of somatic variations in tumor tissues, GENIE has applied a stringent filtering pipeline to remove putative germline variants and minimize artifacts (e.g., using pooled blood samples as controls, existing databases of known artifacts, and common germline variants from the 1000 Genomes Project or Exome Sequencing Project with allele frequencies >0.1%). GENIE has provided extensive functional annotation for somatic mutations based on curated bioinformatics analysis of functional genomic databases. To focus on putative functional mutations, we limited our analyses to nonsilent mutations (i.e., a binary variable for mutation carrier vs. noncarrier), which include missense, splicing, nonsense, truncating, frameshift insertions and deletions, and nonframeshift deletions.
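The carrier coding described above can be illustrated with a minimal Python sketch. The record layout, the patient and gene values, and the variant-class labels are hypothetical and chosen for illustration only; they are not the GENIE file format or vocabulary, and this is not the pipeline used in the study.

```python
# Hypothetical variant-class labels; the actual GENIE annotation may differ.
NONSILENT = {
    "missense", "splicing", "nonsense", "truncating",
    "frameshift_insertion", "frameshift_deletion", "nonframeshift_deletion",
}

def carrier_status(records, patients, gene):
    """Map each patient to 1 (>=1 nonsilent mutation in `gene`) or 0 (noncarrier)."""
    carriers = {
        pid for pid, g, variant_class in records
        if g == gene and variant_class in NONSILENT
    }
    return {pid: int(pid in carriers) for pid in patients}

def mutation_frequency(status):
    """Mutation carriers divided by total cases."""
    return sum(status.values()) / len(status)

# Illustrative records: (patient_id, gene, variant_class)
records = [
    ("P1", "APC", "nonsense"),
    ("P1", "APC", "frameshift_deletion"),  # a second hit does not change carrier status
    ("P2", "APC", "silent"),               # silent mutations are ignored
    ("P3", "KRAS", "missense"),
]
patients = ["P1", "P2", "P3", "P4"]

apc_status = carrier_status(records, patients, "APC")
apc_frequency = mutation_frequency(apc_status)
```

Coding each gene per patient as 0/1 means a patient with several nonsilent mutations in the same gene is counted once, which is what a carrier-based frequency requires.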

TMB and Hypermutation Status

We analyzed sequencing panel coverage for each sequencing assay (panel/platform) based on relevant genomic information released by AACR Project GENIE (Synapse; http://www.synapse.org/genie) with detailed covered gene regions for each sequencing assay (22). We calculated the total covered genomic regions based on the intragenic regions included in panels for each sequencing assay. Patients whose sequencing assay covered less than 500 kb of target regions were excluded from our study. TMB for each colorectal cancer case was quantified as the total number of somatic mutations per 1 Mb in tumor tissue. To focus our analyses on nonhypermutated colorectal cancer cases, a total of 653 cases with ≥17.78 somatic mutations/1 Mb (defined as hypermutated colorectal cancer) were removed based on our conserved inflection point estimation of the TMB distribution (Fig. 1A). Nonhypermutated colorectal cancers were defined as tumors with fewer than 17.78 somatic mutations/1 Mb.
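The TMB definition and exclusion rules above amount to simple arithmetic, sketched below in Python. The cutoff of 17.78 mutations/Mb and the 500 kb minimum coverage come from the text; the function names and example counts are hypothetical, and the sketch does not reproduce how covered regions were derived from the panel files.

```python
HYPERMUTATION_CUTOFF = 17.78   # somatic mutations per Mb (from the text)
MIN_COVERED_BP = 500_000       # assays covering less than 500 kb are excluded

def tmb_per_mb(n_mutations, covered_bp):
    """Somatic mutations per 1 Mb of covered sequence; None if the panel is too small."""
    if covered_bp < MIN_COVERED_BP:
        return None
    return n_mutations * 1_000_000 / covered_bp

def is_hypermutated(tmb):
    """Hypermutated if TMB is at or above the cutoff."""
    return tmb >= HYPERMUTATION_CUTOFF

# Illustrative cases: (mutation count, covered base pairs)
tmb_a = tmb_per_mb(12, 1_200_000)   # 10.0 mutations/Mb, nonhypermutated
tmb_b = tmb_per_mb(30, 1_000_000)   # 30.0 mutations/Mb, hypermutated
tmb_c = tmb_per_mb(5, 400_000)      # panel below 500 kb, excluded
```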

Statistical Analysis

Clinical and demographic features of the study population were summarized by frequency. Given the observed variation in TMB across individual sequencing assays (platforms/panels; Supplementary Fig. S3), all analyses were adjusted for sequencing assay in our study. Comparison of TMB between groups (early-onset vs. late-onset, sex, and race/ethnicity) was evaluated using multivariable linear regression adjusted for patient sex, race/ethnicity, colorectal tumor site and histology, sequencing assay, and sample type as appropriate. To visualize TMB, residuals of the adjusted mutation rates from multivariable linear regression models with the same adjustments were presented as a proxy (Figs. 1B, 2A and B, and 3A; Supplementary Fig. S2).
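To illustrate what a residual of an adjusted mutation rate represents, the Python sketch below treats a deliberately simplified case with a single categorical covariate (sequencing assay). In that special case, the fitted values of an ordinary least-squares model are the per-assay mean TMB values, so the residual is each tumor's TMB minus its assay mean. The study's models included additional covariates; the data and names here are hypothetical.

```python
from collections import defaultdict

def residuals_by_assay(tmb_values, assays):
    """Residuals from regressing TMB on one categorical covariate (assay).

    With a single categorical predictor, least-squares fitted values equal
    the group means, so no matrix algebra is needed for this special case.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, assay in zip(tmb_values, assays):
        totals[assay] += value
        counts[assay] += 1
    means = {assay: totals[assay] / counts[assay] for assay in totals}
    return [value - means[assay] for value, assay in zip(tmb_values, assays)]

# Illustrative data: panel_B reports systematically higher TMB than panel_A.
tmb_values = [4.0, 6.0, 9.0, 11.0]
assays = ["panel_A", "panel_A", "panel_B", "panel_B"]

residuals = residuals_by_assay(tmb_values, assays)
```

After adjustment, the two panels are centered on zero, so differences between patient groups are no longer confounded by which assay was used.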

The baseline mutation probability for each gene was estimated based on nonsilent somatic mutation frequency calculated from mutation carriers divided by total cases, as we have previously described (24). Overall comparison of nonsilent somatic mutations between early-onset and late-onset nonhypermutated colorectal cancer cases, as well as comparisons by racial/ethnic groups and sex, was performed using multivariable logistic regression analyses adjusted for patient sex, race/ethnicity, colorectal tumor site and histology, sequencing assay, sample type, and TMB as appropriate. All covariates were used as fixed effects. To control for multiple comparisons, FDR correction was performed on the nominal P values derived from our association analyses.
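The text does not state which FDR procedure was applied. As an assumption for illustration only, the sketch below implements the widely used Benjamini-Hochberg step-up adjustment in Python; the P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Illustrative nominal P values for four genes.
nominal = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(nominal)
```

Each adjusted value is the smallest FDR level at which the corresponding gene would be declared significant, so genes can be filtered by comparing `fdr` with the chosen threshold.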

Mutation frequencies for early-onset versus late-onset nonhypermutated colorectal cancer cases were visualized using bar graphs. Differences in mutation frequencies for each gene of interest by race/ethnicity and sex for early-onset nonhypermutated colorectal cancer cases were compared using χ² tests. Heterogeneity tests were conducted using Cochran's Q-test. All statistical tests were two-sided, with P < 0.05 considered to be statistically significant. Analyses were conducted using R software version 3.3.3 (R Project for Statistical Computing).
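As a sketch of a heterogeneity test, the Python example below computes Cochran's Q from stratum-specific effect estimates (for instance, log odds ratios for a gene in each sex or racial/ethnic group) and their standard errors, using inverse-variance weights. The input values are hypothetical, and the closed-form P values are given only for 1 or 2 degrees of freedom, which correspond to comparisons across two or three groups.

```python
import math

def cochran_q(estimates, std_errors):
    """Cochran's Q for heterogeneity across strata.

    Returns (Q, degrees of freedom, P value). The P value uses closed forms
    of the chi-square survival function and is None when df is not 1 or 2.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    df = len(estimates) - 1
    if df == 1:
        p_value = math.erfc(math.sqrt(q / 2.0))
    elif df == 2:
        p_value = math.exp(-q / 2.0)
    else:
        p_value = None
    return q, df, p_value

# Illustrative log odds ratios (early-onset vs. late-onset) in two strata.
q_stat, q_df, q_p = cochran_q([0.5, 0.1], [0.1, 0.1])
```

A small P value indicates that the stratum-specific effects differ more than expected from sampling error alone.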

Data Availability Statement

Data for AACR Project GENIE are available at http://www.synapse.org/genie, with terms of access provided at https://www.aacr.org/wp-content/uploads/2022/03/GENIE_data_guide_11.0-public.pdf. Data supporting the findings from this study are also available from the corresponding authors upon reasonable request.

A.N. Holowatyj reports grants from the NIH and the American Cancer Society during the conduct of the study, as well as grants from the Dalton Family Foundation, Pfizer, and the Appendix Cancer Pseudomyxoma Peritonei (ACPMP) Research Foundation outside the submitted work. No disclosures were reported by the other authors.

A.N. Holowatyj: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. W. Wen: Data curation, formal analysis, validation, investigation, visualization, writing–review and editing. T. Gibbs: Investigation, writing–review and editing. H.M. Seagle: Investigation, writing–review and editing. S.R. Keller: Investigation, writing–review and editing. D.R. Velez Edwards: Investigation, writing–review and editing. M.K. Washington: Investigation, writing–review and editing. C. Eng: Investigation, writing–review and editing. J. Perea: Investigation, writing–review and editing. W. Zheng: Investigation, writing–review and editing. X. Guo: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing.

A.N. Holowatyj was supported by the NIH (K12 HD043483 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development). This work was also supported by NIH/NCI grants (R37 CA227130, X. Guo; R01 CA188214, W. Zheng; and P50 CA236733, A.N. Holowatyj) and by the American Cancer Society (#IRG-19-139-59, A.N. Holowatyj). We acknowledge all the families and clinicians who contributed to the AACR Project GENIE international clinicogenomic data-sharing consortium.

The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

Note: Supplementary data for this article are available at Cancer Discovery Online (http://cancerdiscovery.aacrjournals.org/).

1. Siegel RL, Miller KD, Goding Sauer A, Fedewa SA, Butterly LF, Anderson JC, et al. Colorectal cancer statistics, 2020. CA Cancer J Clin 2020;70:145–64.
2. Bailey CE, Hu CY, You YN, Bednarski BK, Rodriguez-Bigas MA, Skibber JM, et al. Increasing disparities in the age-related incidences of colon and rectal cancers in the United States, 1975–2010. JAMA Surg 2015;150:17–22.
3. Cercek A, Chatila WK, Yaeger R, Walch H, Fernandes GDS, Krishnan A, et al. A comprehensive comparison of early-onset and average-onset colorectal cancers. J Natl Cancer Inst 2021;113:1683–92.
4. Holowatyj AN, Gigic B, Herpel E, Scalbert A, Schneider M, Ulrich CM, et al. Distinct molecular phenotype of sporadic colorectal cancers among young patients based on multiomics analysis. Gastroenterology 2020;158:1155–8.
5. Lieu CH, Golemis EA, Serebriiskii IG, Newberg J, Hemmerich A, Connelly C, et al. Comprehensive genomic landscapes in early and later onset colorectal cancer. Clin Cancer Res 2019;25:5852–8.
6. Kirzin S, Marisa L, Guimbaud R, De Reynies A, Legrain M, Laurent-Puig P, et al. Sporadic early-onset colorectal cancer is a specific sub-type of cancer: a morphological, molecular and genetics study. PLoS One 2014;9:e103159.
7. Willauer AN, Liu Y, Pereira AAL, Lam M, Morris JS, Raghav KPS, et al. Clinical and molecular characterization of early-onset colorectal cancer. Cancer 2019;125:2002–10.
8. Myer PA, Lee JK, Madison RW, Pradhan K, Newberg JY, Isasi CR, et al. The genomics of colorectal cancer in populations with African and European ancestry. Cancer Discov 2022;12:1282–93.
9. Jin Z, Dixon JG, Fiskum JM, Parekh HD, Sinicrope FA, Yothers G, et al. Clinicopathological and molecular characteristics of early-onset stage III colon adenocarcinoma: an analysis of the ACCENT database. J Natl Cancer Inst 2021;113:1693–704.
10. Eng C, Jacome AA, Agarwal R, Hayat MH, Byndloss MX, Holowatyj AN, et al. A comprehensive framework for early-onset colorectal cancer research. Lancet Oncol 2022;23:e116–e28.
11. Holowatyj AN, Perea J, Lieu CH. Gut instinct: a call to study the biology of early-onset colorectal cancer disparities. Nat Rev Cancer 2021;21:339–40.
12. AACR Project GENIE Consortium. AACR project GENIE: powering precision medicine through an international consortium. Cancer Discov 2017;7:818–31.
13. Pugh TJ, Bell JL, Bruce JP, Doherty GJ, Galvin M, Green MF, et al. AACR project GENIE: 100,000 cases and beyond. Cancer Discov 2022;12:2044–57.
14. Campbell BB, Light N, Fabrizio D, Zatzman M, Fuligni F, de Borja R, et al. Comprehensive analysis of hypermutation in human cancer. Cell 2017;171:1042–56.
15. Holowatyj AN, Ruterbusch JJ, Rozek LS, Cote ML, Stoffel EM. Racial/ethnic disparities in survival among patients with young-onset colorectal cancer. J Clin Oncol 2016;34:2148–56.
16. Theuer CP, Wagner JL, Taylor TH, Brewster WR, Tran D, McLaren CE, et al. Racial and ethnic colorectal cancer patterns affect the cost-effectiveness of colorectal cancer screening in the United States. Gastroenterology 2001;120:848–56.
17. Hein DM, Deng W, Bleile M, Kazmi SA, Rhead B, De L, et al. Racial and ethnic differences in genomic profiling of early onset colorectal cancer. J Natl Cancer Inst 2022;114:775–8.
18. Liu Z, Yang C, Li X, Luo W, Roy B, Xiong T, et al. The landscape of somatic mutation in sporadic Chinese colorectal cancer. Oncotarget 2018;9:27412–22.
19. Eng C, Holowatyj AN. Colorectal cancer genomics by genetic ancestry. Cancer Discov 2022;12:1187–8.
20. Klein SL, Flanagan KL. Sex differences in immune responses. Nat Rev Immunol 2016;16:626–38.
21. Cai Y, Rattray NJW, Zhang Q, Mironova V, Santos-Neto A, Hsu K-S, et al. Sex differences in colon cancer metabolism reveal a novel subphenotype. Sci Rep 2020;10:4905.
22. AACR project GENIE. [cited 2022 Jul 1]. Available from: https://www.aacr.org/wp-content/uploads/2022/03/GENIE_data_guide_11.0-public.pdf.
23. Gagan J, Van Allen EM. Next-generation sequencing to guide cancer therapy. Genome Medicine 2015;7:80.
24. Chen Z, Wen W, Beeghly-Fadiel A, Shu XO, Díez-Obrero V, Long J, et al. Identifying putative susceptibility genes and evaluating their associations with somatic mutations in human cancers. Am J Hum Genet 2019;105:477–92.
