Abstract
Purpose: Patients with inflammatory bowel diseases, that is, ulcerative colitis and Crohn's disease (CD), face an increased risk of developing colorectal cancer (CRC). Evidence, mainly from ulcerative colitis, suggests that TP53 mutations represent an initial step in the progression from inflamed colonic epithelium to CRC. However, the pathways involved in the evolution of CRC in patients with CD are poorly characterized.
Experimental Design: Here, we analyzed 73 tissue samples from 28 patients with CD-CRC, including precursor lesions, by targeted next-generation sequencing of 563 cancer-related genes and array-based comparative genomic hybridization. The results were compared with 24 sporadic CRCs with similar histomorphology (i.e., mucinous adenocarcinomas), and to The Cancer Genome Atlas data (TCGA).
Results: CD-CRCs showed somatic copy-number alterations (SCNAs) similar to sporadic CRCs with one notable exception: the gain of 5p was significantly more prevalent in CD-CRCs. CD-CRCs had a distinct mutation signature: TP53 (76% in CD-CRCs vs. 33% in sporadic mucinous CRCs), KRAS (24% vs. 50%), APC (17% vs. 75%), and SMAD3 (3% vs. 29%). TP53 mutations and SCNAs were early and frequent events in CD progression, while APC, KRAS, and SMAD2/4 mutations occurred later. In four patients with CD-CRC, at least one mutation and/or SCNAs were already present in non-dysplastic colonic mucosa, indicating occult tumor evolution.
Conclusions: Molecular profiling of CD-CRCs and precursor lesions revealed an inflammation-associated landscape of genome alterations: 5p gains and TP53 mutations occurred early in tumor development. Detection of these aberrations in precursor lesions may help predict disease progression and distinguish CD-associated from sporadic colorectal neoplasia. Clin Cancer Res; 24(20); 4997–5011. ©2018 AACR.
This article is featured in Highlights of This Issue, p. 4913
The distinction between Crohn's disease (CD)-associated and sporadic colorectal neoplasia is crucial for treatment decisions but extremely difficult based on endoscopy and histology alone. Our findings indicate a possible approach to distinguish CD–associated dysplasia from sporadic adenoma based on TP53 mutations and gains of chromosome arm 5p as molecular biomarkers. In addition, the detection of these aberrations in non-dysplastic precursor lesions may help predict progression to CRC in patients with CD.
Introduction
Crohn's disease (CD), when associated with chronic inflammation of the large intestine, considerably increases the risk of developing colorectal cancer (CRC), to a degree comparable with ulcerative colitis (UC; ref. 1). In both CD and UC, the risk of developing CRC depends on disease duration and on the extent and severity of colorectal inflammation (2). CD-CRCs are predominantly located in the distal colorectum (40%–50%), followed by the cecum/ascending colon (20%–30%), and can also occur in anorectal fistulae (3, 4). Compared with sporadic CRCs, CD-CRCs develop at an earlier age, are usually diagnosed at more advanced stages, and are therefore associated with a poorer prognosis (5). The histomorphology of CD-CRCs often resembles a mucinous and/or signet ring cell phenotype, which is associated with poor prognosis in sporadic CRC (3, 6). Patients with CD may be diagnosed with multifocal CRCs (10%) and frequently show synchronous dysplastic lesions (30%–50%), which may be the consequence of extended chronic inflammation causing so-called field cancerization (4, 7).
Inflammation-induced CRCs arise in a stepwise fashion from dysplastic precursor lesions, comparable with the development of sporadic CRCs from adenomas (8). The development of sporadic CRCs is caused by the sequential accumulation of cancer gene mutations and specific chromosomal copy-number changes (9, 10). For instance, inactivating mutations of the tumor suppressor gene APC and gains of chromosome 7 occur before the development of invasive disease, and are maintained during tumorigenesis, while mutations of TP53 and copy-number increases of chromosome arm 20q manifest later in the progression of sporadic CRC (9, 11, 12). However, the genetic events that define CD-CRCs, in particular the dynamics of their development from histologically undetectable precursor lesions to invasive disease, remain largely elusive. Most studies on inflammation-related colorectal carcinogenesis focused on UC-CRCs or “colitis-associated cancer” (CAC) subsuming UC-CRCs and CD-CRCs without differentiating. Two recent studies using next-generation sequencing suggest differences in the mutational landscape not only between sporadic CRCs and CACs, but also between CD-CRCs and UC-CRCs (13, 14). While TP53 mutations occurred at a slightly higher frequency in CACs than in sporadic CRCs, APC and KRAS mutations were less common in CACs compared with sporadic CRCs (13–15). Yaeger and colleagues reported that IDH1 mutations were significantly more frequent in CD-CRCs compared with both UC-CRCs and sporadic CRCs (14). Neither of these studies investigated genome-wide copy-number changes, nor the sequence of mutational events during CD–related tumorigenesis. Molecular data regarding precursor lesions of CD-CRCs, including both inflamed and dysplastic epithelium, are sparse. By targeted sequencing of TP53, CDKN2A, and KRAS in individual crypts, Galandiuk and colleagues showed frequent and occasionally very extensive field cancerization in the chronically inflamed bowel of five patients with CD (16).
In this study, we aimed (i) to characterize the landscape of somatic gene mutations and chromosomal copy-number alterations of CD-CRCs compared with histomorphologically similar sporadic mucinous CRCs, (ii) to elucidate the genetic pathways of tumor development in CD, and (iii) to explore the dynamics of the development of genome alterations in CD–associated colorectal neoplasia by analyzing multiple CD–related lesions at different stages of development from individual patients. To this end, we analyzed 73 samples from 28 patients with CD-CRC, including carcinomas and lymph node metastases, dysplastic lesions, inflamed mucosa, and histologically normal colonic mucosa, by targeted next-generation sequencing of 563 cancer-related genes and array-based comparative genomic hybridization (aCGH). As a control collective, we investigated 24 sporadic mucinous CRCs and matched normal colonic mucosa, and compared our results to CRC data from The Cancer Genome Atlas (TCGA).
Materials and Methods
Patients and tissue samples
We collected 73 formalin-fixed, paraffin-embedded (FFPE) tissue samples from 28 patients with CD-CRC diagnosed between 2003 and 2016 from the archives of the Institutes of Pathology in Mannheim and Regensburg, including primary CD-CRCs, lymph node metastases, dysplastic lesions, inflamed mucosa, and normal colonic mucosa (Supplementary Fig. S1, top). CD–related etiology of a CRC was assumed (i) if the patient had long-standing CD with colonic involvement at the time of CRC diagnosis, (ii) if inflammation in the colon involved the large bowel segment, in which the CRC was located, and (iii) in resection specimens: if inflammation and/or chronic inflammatory changes in adjacent mucosa were visible. In addition, 24 sporadic CRCs (histologically mucinous adenocarcinomas with and without signet ring cells) and corresponding normal mucosa were collected, diagnosed between 2004 and 2015 (Supplementary Fig. S1, bottom). All sporadic mucinous CRCs were mismatch repair proficient. In addition to the initial diagnosis, all samples were re-evaluated by two pathologists (T. Gaiser and D. Hirsch). Tumor staging was performed according to the current American Joint Committee on Cancer/Union for International Cancer Control staging system. All experiments were conducted in accordance with the Declaration of Helsinki and approved by the institutional review boards (2016-819R-MA, OHSRP#13229/MTA#41436) that waived the need for informed consent for this retrospective and anonymized analysis of archival samples.
Histopathologic criteria
Histopathologic classification was performed according to the World Health Organization by two pathologists (T. Gaiser and D. Hirsch; ref. 17). Mucinous colorectal adenocarcinoma diagnosis in the sporadic control group was confirmed if the tumors were composed of >50% extracellular mucin. Tumors were also evaluated for the presence of signet ring cells. In CD-CRCs, we reviewed for the presence or absence of extracellular mucin (mucinous component) and/or signet ring cells (signet ring cell component). All intestinal tissue samples were screened for inflammation, regenerative (i.e., inflammatory) changes, and dysplasia according to the histopathologic criteria defined by the Inflammatory Bowel Disease-Dysplasia Morphology Study Group (18).
DNA isolation
DNA was isolated from FFPE tissue after histopathologic determination of tumor areas on H&E sections, avoiding foci of inflammation as well as foci of high mucin and low cellular content, as published previously (12). DNA concentration was measured by fluorometric quantitation (Qubit 3.0 Fluorometer, Life Technologies, Thermo Fisher Scientific) using the Qubit dsDNA HS (High Sensitivity) Assay Kit (Life Technologies). DNA integrity was evaluated on the basis of Bioanalyzer traces (2100 Bioanalyzer Instrument, Agilent Technologies) using the High Sensitivity DNA Kit (Agilent).
Microsatellite PCR
Tumor DNA and corresponding normal DNA were subjected to microsatellite PCR using a panel of five mononucleotide markers (BAT25, BAT26, NR-21, NR-24, and MONO-27; cf. MSI Analysis System, Promega), and a panel of two mononucleotide (BAT25 and BAT26) and three dinucleotide markers (D5S346, D2S123, and D17S250; so-called Bethesda panel; ref. 19). Briefly, 10 ng of DNA was used to amplify the microsatellite loci in multiplex PCR reactions. PCR products were separated by capillary electrophoresis using an ABI 3130 Genetic Analyzer (Applied Biosystems). A tumor was classified as high-level microsatellite instable (MSI-H) when two or more markers of the Bethesda panel and/or of the Promega panel showed an allelic size variation (i.e., a band shift compared with corresponding normal DNA), as low-level microsatellite instable (MSI-L) when one marker of either panel showed an allelic size variation, and as microsatellite stable (MSS) when no marker showed an allelic size variation.
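The classification rule above can be sketched in a few lines. This is only an illustrative reading of the rule, assuming that shifted markers are counted separately per panel; the function name `classify_msi` and its parameters are hypothetical and not part of the original workflow.

```python
def classify_msi(bethesda_shifted: int, promega_shifted: int) -> str:
    """Classify microsatellite status from the number of markers per panel
    that show an allelic size variation versus matched normal DNA.

    Assumption: counts are evaluated per panel, as the text suggests.
    """
    if bethesda_shifted < 0 or promega_shifted < 0:
        raise ValueError("marker counts must be non-negative")
    # MSI-H: two or more shifted markers in the Bethesda and/or Promega panel
    if bethesda_shifted >= 2 or promega_shifted >= 2:
        return "MSI-H"
    # MSI-L: exactly one shifted marker in either panel
    if bethesda_shifted == 1 or promega_shifted == 1:
        return "MSI-L"
    # MSS: no shifted marker in any panel
    return "MSS"
```

For example, a tumor with three shifted Promega markers would be returned as "MSI-H", and a tumor without any shifted marker as "MSS".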
Array-based comparative genomic hybridization
aCGH was performed as previously described using ULS labeling (Agilent) and SurePrint G3 CGH 4 × 180K microarrays (Agilent; ref. 20). Data were visualized and analyzed using Nexus Copy Number software version 8.0 (BioDiscovery, Inc.). Arm-level somatic copy-number alterations (SCNAs) were defined as a single alteration or an aggregate of alterations encompassing half or more (≥50%) of a chromosome arm (21). Array-based CGH data have been deposited in the Gene Expression Omnibus (GEO) database (data accession number: GSE113015).
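The arm-level definition can be illustrated with a minimal sketch. It assumes that altered segments of one direction (gain or loss) are given as base-pair intervals on a single arm and that overlapping segments are merged before the covered fraction is computed; the function name and the coordinates in the usage note are hypothetical.

```python
from typing import List, Tuple


def is_arm_level(segments: List[Tuple[int, int]], arm_start: int, arm_end: int,
                 min_fraction: float = 0.5) -> bool:
    """Return True if the altered segments cover >= min_fraction of the arm."""
    arm_length = arm_end - arm_start
    if arm_length <= 0:
        raise ValueError("arm_end must be greater than arm_start")
    # Clip segments to the arm boundaries and sort them by start position
    clipped = sorted(
        (max(s, arm_start), min(e, arm_end))
        for s, e in segments
        if e > arm_start and s < arm_end
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Close the previous merged interval and start a new one
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Overlapping or adjacent segment: extend the merged interval
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / arm_length >= min_fraction
```

With a hypothetical 48-Mb arm, a single 30-Mb gain qualifies as arm-level, whereas two overlapping gains that together span 20 Mb do not.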
Fluorescence in situ hybridization
A bacterial artificial chromosome contig centering on the 5p14.3 region (CDH12) was assembled in UCSC Genome Browser (http://genome.ucsc.edu), and the two overlapping bacterial artificial chromosome clones were labeled with DY-505-dUTP (Dyomics) using nick translation. The centromere probe (CEP 10) was obtained from Abbott Molecular (Vysis CEP10 SpectrumOrange, catalog No. 06J36–090). Pretreatment, denaturation, hybridization, and detection were done using the ZytoLight FISH-Tissue Implementation Kit (catalog No. Z-2028–20, ZytoVision GmbH) according to the manufacturer's instructions with slight modifications. Slides were analyzed using an Olympus BX41 fluorescence microscope (Olympus Deutschland GmbH) connected to an F-View II CCD-Camera (Soft Imaging System GmbH). Between 40 and 100 non-overlapping nuclei were counted per sample. We used a ratio (number of target locus signals/number of centromere signals) of ≥1.2 as threshold for copy-number gain of the target locus as published previously (22).
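The ratio threshold described above can be written down as a short sketch. It assumes that the ratio is computed from the summed signal counts over all scored nuclei; the function name and the example counts are hypothetical.

```python
from typing import Sequence, Tuple


def fish_copy_number_gain(target_signals: Sequence[int],
                          centromere_signals: Sequence[int],
                          threshold: float = 1.2) -> Tuple[float, bool]:
    """Return (ratio, is_gain) from per-nucleus FISH signal counts.

    Assumption: ratio = total target locus signals / total centromere signals.
    """
    if len(target_signals) != len(centromere_signals):
        raise ValueError("one target and one centromere count per nucleus")
    total_centromere = sum(centromere_signals)
    if total_centromere == 0:
        raise ValueError("no centromere signals counted")
    ratio = sum(target_signals) / total_centromere
    # A ratio at or above the threshold is called a copy-number gain
    return ratio, ratio >= threshold
```

In practice the input lists would hold counts for the 40 to 100 nuclei scored per sample; a toy input of four nuclei with 12 target and 8 centromere signals yields a ratio of 1.5 and is called a gain.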
Targeted next-generation sequencing
The targeted sequence capture approach, named OncoVar, was designed to span coding exons of 563 cancer-related genes (see Supplementary Table S1 for gene list). Briefly, library construction was done with the KAPA Hyper Prep Kits for Illumina (https://www.kapabiosystems.com), and the resulting paired-end libraries were sequenced on NextSeq 500 systems (Illumina). The mean read depth for targeted regions (mean coverage) was 185×. The data processing and variant calling procedure mainly followed the Best Practices workflow recommended by the Broad Institute (http://www.broadinstitute.org/gatk/guide/best-practices). Briefly, the raw sequencing reads were mapped to human genome build 19 by Burrows-Wheeler Aligner (23), followed by local realignment using the GATK suite from the Broad Institute. Duplicate reads were marked by Picard tools (http://picard.sourceforge.net). Somatic variant calling was performed on sequencing reads of matched tumor–normal samples by the Strelka somatic variant caller (version 1.0.15; ref. 24), and by MuTect (version 2; ref. 25). Germline variant calling was done with the UnifiedGenotyper from the Broad Institute. SnpEff (26) was used to annotate and predict the effects of variants with multiple annotation databases, including dbNSFP (27), dbSNP 147 (NCBI; https://www.ncbi.nlm.nih.gov/projects/SNP/), ESP6500 (NHLBI Exome Sequencing Project; http://evs.gs.washington.edu/EVS/), and the COSMIC database (Catalogue of Somatic Mutations in Cancer; http://cancer.sanger.ac.uk/cosmic; ref. 28). The following filtering criteria were used for germline variation calls: (i) passed variant caller filters; (ii) read depth ≥10 reads with fraction of alternative reads of ≥0.25; (iii) MAPQ score of >55; (iv) Exome Aggregation Consortium (ExAC) filter (ExAC AF) ≤0.001; (v) annotation impact “high” or “moderate”; and (vi) candidate variants were examined in HGMD (Human Gene Mutation Database; http://www.hgmd.cf.ac.uk/ac/index.php; ref. 27).
The following filtering criteria were used for somatic variation calls: (i) passed variant caller filters; (ii) fraction of alternative reads in the matched normal is ≤0.01; (iii) read depth in the tumor ≥10 reads with fraction of alternative reads ≥0.1 (≥0.05 for samples C01CA2, C13CA, C15CA, C16CA, C16LR, C25CA, C27CA, M10CA, and M24CA because tumor cell content was very low due to extensive extracellular mucin); (iv) MAPQ score of >49 (for Strelka only); (v) ExAC filter (ExAC AF) ≤0.001; and (vi) annotation impact “high” or “moderate”. Visual inspection of SNVs and indels was done using Integrative Genomics Viewer (IGV, Broad Institute, Cambridge, MA; refs. 29, 30). A supplementary Excel file is provided containing the list of mutations used for all analyses in this study. Those mutations comprise (i) mutations called by both the MuTect and Strelka algorithms, (ii) mutations in genes known to be significantly mutated in non-hypermutated CRC according to TCGA data (15), that is, APC, TP53, KRAS, PIK3CA, FBXW7, SMAD4, NRAS, TCF7L2, SMAD2, CTNNB1, and ACVR1B, which were called by either MuTect or Strelka and validated by an alternative method, and (iii) mutations called by either MuTect or Strelka that were present in more than one neoplastic lesion of an individual patient. In addition, as histologically normal mucosa does not necessarily represent germline, we inversely analyzed normal mucosa samples for mutations that were present in normal mucosa but absent in the tumor. These mutations could be germline mutations, sequencing artifacts, or real mutations in normal mucosa. Germline mutations seem unlikely because one would expect germline mutations to be present in both tumor and normal mucosa. To ensure that we did not call sequencing artifacts, mutations in normal mucosa were only reported if (i) they could be verified by an alternative method and (ii) they were listed in COSMIC and predicted to be pathogenic.
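The somatic filtering criteria (i) to (vi) listed at the start of this paragraph can be expressed as a single predicate. The sketch below is an illustration only: the `SomaticCall` record, its field names, and the function name are hypothetical and do not reflect the actual pipeline code; thresholds and sample identifiers are taken from the text.

```python
from dataclasses import dataclass

# Samples with very low tumor cell content, for which the text relaxes the
# minimum fraction of alternative reads from 0.1 to 0.05
LOW_TUMOR_CONTENT = {"C01CA2", "C13CA", "C15CA", "C16CA", "C16LR",
                     "C25CA", "C27CA", "M10CA", "M24CA"}


@dataclass
class SomaticCall:
    sample: str                   # tumor sample identifier
    caller: str                   # "strelka" or "mutect"
    passed_caller_filters: bool   # criterion (i)
    normal_alt_fraction: float    # alternative read fraction in matched normal
    tumor_depth: int              # read depth in the tumor
    tumor_alt_fraction: float     # alternative read fraction in the tumor
    mapq: float                   # mapping quality score
    exac_af: float                # ExAC allele frequency
    impact: str                   # annotation impact, e.g. "high", "moderate"


def passes_somatic_filters(call: SomaticCall) -> bool:
    """Apply criteria (i)-(vi) for somatic calls as described in the text."""
    min_alt_fraction = 0.05 if call.sample in LOW_TUMOR_CONTENT else 0.1
    return (
        call.passed_caller_filters                              # (i)
        and call.normal_alt_fraction <= 0.01                    # (ii)
        and call.tumor_depth >= 10                              # (iii)
        and call.tumor_alt_fraction >= min_alt_fraction         # (iii)
        and (call.caller.lower() != "strelka" or call.mapq > 49)  # (iv)
        and call.exac_af <= 0.001                               # (v)
        and call.impact.lower() in {"high", "moderate"}         # (vi)
    )
```

A call with an alternative read fraction of 0.06 would therefore pass in sample C13CA but fail in a sample that is not on the low-tumor-content list.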
Copy-number analysis from targeted sequencing data was done with the CNVkit from Eric Talevich (31). Sequencing data have been deposited in the Sequence Read Archive (SRA) database (data accession number: SRP140665).
TCGA data retrieval and processing
Clinicopathologic data of TCGA colorectal adenocarcinoma cohort were downloaded from the NCI's Genomic Data Commons (GDC) Data Portal (https://portal.gdc.cancer.gov/) and cBioPortal for Cancer Genomics (http://www.cbioportal.org/; refs. 15, 32, 33). Data of the 276 patients included were processed as follows: (i) samples without sequencing data were removed (n = 52); (ii) samples classified as MSI-H (n = 28), MSI-L (n = 36) or with not evaluable microsatellite status (n = 1) were removed; (iii) remaining hypermutated samples were removed (n = 5); (iv) remaining samples without copy-number data were removed (n = 7); and (v) histopathologic designation as provided by TCGA was verified for all samples by inspection of the deposited pathology reports and tissue images, and samples with unclear or inconclusive histology (intestinal type vs. mucinous) were removed (n = 3). This approach resulted in a total of 144 microsatellite stable, non-hypermutated CRC samples with available sequencing and copy-number data, of which 15 showed a mucinous histology, and 129 were of intestinal type. Those 144 samples were used for comparison with our data.
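The exclusion steps above can be checked with simple arithmetic. The following sketch replays the reported exclusion counts in order; the variable and function names are illustrative only.

```python
# Exclusion steps and counts as reported in the text, in order of application
EXCLUSIONS = [
    ("no sequencing data", 52),
    ("MSI-H", 28),
    ("MSI-L", 36),
    ("microsatellite status not evaluable", 1),
    ("hypermutated", 5),
    ("no copy-number data", 7),
    ("unclear or inconclusive histology", 3),
]


def remaining_after_exclusions(start, steps):
    """Return the final cohort size and the running total after each step."""
    remaining = start
    trail = []
    for reason, n_removed in steps:
        remaining -= n_removed
        if remaining < 0:
            raise ValueError(f"more samples removed than available at: {reason}")
        trail.append((reason, remaining))
    return remaining, trail


final_n, trail = remaining_after_exclusions(276, EXCLUSIONS)
for reason, left in trail:
    print(f"after removing '{reason}': {left} samples")
```

Starting from 276 patients, the steps leave 144 samples, which matches the 15 mucinous plus 129 intestinal-type CRCs used for comparison.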
Curated pathway analysis
Analogous to TCGA's study of CRC, we performed a focused analysis of pathways known to be frequently altered in CRC, namely the WNT, TGFβ, PI3K, RTK-RAS, and TP53 signaling pathways (15). For all pathway analyses, we used the set of microsatellite stable/non-hypermutated CD-CRC samples (n = 29) and sporadic mucinous CRC samples (n = 24). Samples were stratified by etiology of CRC (inflammation associated vs. sporadic). The approach relies on the general abstraction of gene alterations per sample, which were assigned to manually curated pathways based on TCGA's study of CRC (15), on studies by Robles and colleagues (13) and Yaeger and colleagues (14), and on KEGG PATHWAY database. Pathways were composed of the following genes: WNT (APC, ARID1A, AXIN2, CTNNB1, FBXW7, and TCF7L2), TGFβ (ACVR1B, ACVR2A, SMAD2, SMAD3, SMAD4, TGFBR1, and TGFBR2), PI3K (AKT1, AKT2, AKT3, PIK3CA, PIK3R1, PTEN, TSC1, and TSC2), RTK/RAS (BRAF, EGFR, ERBB2, ERBB3, FGFR1, FGFR2, KRAS, MET, NF1, and NRAS), and TP53 (ATM and TP53). A pathway was considered altered in a given sample, if at least one gene in the pathway was altered. A particular gene in a specific sample was considered altered if it was either altered (i) by mutation (non-synonymous, somatic mutation in a protein-coding region) or (ii) by high-level copy-number amplification (aCGH, log2 ratio >1.0). The ERBB2 and MYC amplifications (ERBB2: C11CA; MYC: C04CA1, C08CA, C26CA, and M10CA) detected by aCGH were verified by FISH using ZytoLight SPEC ERBB2/CEN 17 Dual Color Probe (catalog No. Z-2015, ZytoLight) and ZytoLight SPEC MYC/CEN 8 Dual Color Probe (catalog No. Z-2092, ZytoLight). A gene was assumed to be a likely oncogene if it was primarily altered by missense mutations or high-level copy-number amplification, and to be a likely tumor suppressor gene if it was primarily affected by truncating mutations.
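The pathway rule above (a pathway counts as altered if at least one member gene is mutated or highly amplified) can be sketched as follows. The gene sets and the log2 ratio threshold are taken from the text; the function names and the example sample are hypothetical.

```python
# Curated pathway gene sets as listed in the text
PATHWAYS = {
    "WNT": {"APC", "ARID1A", "AXIN2", "CTNNB1", "FBXW7", "TCF7L2"},
    "TGFβ": {"ACVR1B", "ACVR2A", "SMAD2", "SMAD3", "SMAD4", "TGFBR1", "TGFBR2"},
    "PI3K": {"AKT1", "AKT2", "AKT3", "PIK3CA", "PIK3R1", "PTEN", "TSC1", "TSC2"},
    "RTK/RAS": {"BRAF", "EGFR", "ERBB2", "ERBB3", "FGFR1", "FGFR2", "KRAS",
                "MET", "NF1", "NRAS"},
    "TP53": {"ATM", "TP53"},
}

AMPLIFICATION_LOG2_THRESHOLD = 1.0  # aCGH log2 ratio must exceed this value


def altered_genes(mutated_genes, log2_ratios):
    """Genes altered by non-synonymous somatic mutation or high-level amplification."""
    amplified = {gene for gene, ratio in log2_ratios.items()
                 if ratio > AMPLIFICATION_LOG2_THRESHOLD}
    return set(mutated_genes) | amplified


def altered_pathways(mutated_genes, log2_ratios):
    """Map each pathway to True if at least one of its genes is altered."""
    genes = altered_genes(mutated_genes, log2_ratios)
    return {name: bool(members & genes) for name, members in PATHWAYS.items()}


# Hypothetical sample: TP53 and KRAS mutated, ERBB2 amplified (log2 ratio 1.4)
example = altered_pathways({"TP53", "KRAS"}, {"ERBB2": 1.4, "EGFR": 0.3})
```

In this toy sample, the TP53 and RTK/RAS pathways are flagged as altered, while WNT, TGFβ, and PI3K are not.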
Phylogenetic analysis
Phylogenetic trees representing the evolutionary relationship between the tissue samples sequenced from each patient were inferred by comparing lists of mutations in each lesion as described by Izumchenko and colleagues (34). Briefly, a lesion that contained all mutations present in another lesion was considered its ancestor. If there was no such lesion, putative precursors were inferred from the set of mutations common to multiple lesions. Lesions with no genetic alterations were considered parallel branches, although an alternative phylogenetic tree could have been created if these lesions were considered ancestors of lesions with mutations. All phylogenetic trees were drawn with a common stem (trunk), which represents the normal (i.e., diploid) genome.
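A minimal sketch of this set-based inference is given below. It assumes one reading of the rule: lesion A is treated as an ancestor of lesion B when every mutation found in A is also found in B and B carries additional mutations. Function names, lesion names, and mutation labels are hypothetical; the published method (34) may differ in detail.

```python
from typing import Dict, List, Set


def infer_ancestors(lesions: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """For each lesion, list the lesions whose mutations form a proper subset.

    Lesions without mutations are never reported as ancestors, which mirrors
    their treatment as parallel branches in the text.
    """
    ancestors: Dict[str, List[str]] = {name: [] for name in lesions}
    for child, child_muts in lesions.items():
        for other, other_muts in lesions.items():
            if other == child or not other_muts:
                continue
            if other_muts < child_muts:  # proper subset test
                ancestors[child].append(other)
    return ancestors


def putative_precursor(lesions: Dict[str, Set[str]], names: List[str]) -> Set[str]:
    """Mutations shared by all named lesions, i.e., an inferred common precursor."""
    sets = [set(lesions[name]) for name in names]
    return set.intersection(*sets) if sets else set()


# Toy example with placeholder mutation labels
toy = {
    "normal": set(),
    "inflamed": {"mutA"},
    "carcinoma_1": {"mutA", "mutB"},
    "carcinoma_2": {"mutB", "mutC"},
}
toy_ancestors = infer_ancestors(toy)
toy_shared = putative_precursor(toy, ["carcinoma_1", "carcinoma_2"])
```

Here the inflamed mucosa is placed upstream of carcinoma 1, the mutation-free normal mucosa forms its own branch, and the two carcinomas share a putative precursor carrying only `mutB`.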
Immunohistochemistry
Immunohistochemistry (IHC) was performed using the following primary antibodies: MLH1 (1:25; clone ES05, catalog No. M3640, Dako, Agilent Pathology Solutions, Agilent), MSH2 (ready-to-use; clone FE11, catalog No. IR085, Dako), MSH6 (ready-to-use; clone EP49, catalog No. IR086, Dako), PMS2 (1:50; clone EP51, catalog No. M3647, Dako) and TP53 (1:50; clone DO-7, catalog No. M7001, Dako). Heat-induced antigen retrieval was performed in Target Retrieval Solution, pH 9.0 (Tris/EDTA, 1:10; catalog No. S2367, Dako) in a water bath at 95 °C for 20 minutes. Detection was done using the EnVision Detection System, Peroxidase/DAB, Rabbit/Mouse (catalog No. K5007, Dako) according to the manufacturer's instructions with slight modifications. All stainings were validated by internal and/or external positive controls as well as negative control specimens. IHC stainings were evaluated by two pathologists (D. Hirsch and T. Gaiser). Tumor samples lacking nuclear staining for MLH1, MSH2, MSH6, and/or PMS2 were considered microsatellite instable while tumor samples with retained expression of MLH1, MSH2, MSH6, and PMS2 in the tumor cells were considered microsatellite stable. TP53 staining could be classified into four positive staining patterns, plus a negative category, according to nuclear staining intensity and distribution of positive cells as published by Sato and colleagues (35): (i) sporadic: only a few weakly positive cells were sporadically dispersed in a tubule; (ii) scattered: a small number of weakly positive cells were focused in a tubule; (iii) nested: moderately to strongly positive cells were aggregated in restricted areas of a tubule; (iv) diffuse: strongly positive cells existed in most areas of tubules; and (v) negative: the tubule did not contain a single TP53-positive nucleus.
Alternatively, TP53 immunoreactivity could be classified into three basic patterns according to nuclear staining intensity and distribution of positive cells as published by Noffsinger and colleagues (36): (i) isolated immunoreactive cells in the crypt bases, (ii) strongly positive cells confined to the basal half of the glands, and (iii) diffusely staining cells. Microscopy images were acquired with a digital microscope and scanner M8 (PreciPoint GmbH).
Statistical analysis
Statistical analysis was performed using GraphPad Prism software version 7.03 (GraphPad Software; www.graphpad.com). Differences in clinicopathologic variables were estimated by unpaired t test (age), Fisher's exact test (sex), or χ2 test (stage and location). For all statistical tests involving molecular data, only microsatellite stable, non-hypermutated CRCs were considered. Differences in SCNAs between CD-CRC and sporadic mucinous CRC were estimated using the Mann–Whitney test (number of SCNAs, fraction of genome altered by SCNAs) or Fisher's exact test with correction for multiple comparisons via false discovery rate (FDR; arm-level SCNA frequencies). Differences in gene mutation frequencies between CD-CRC and sporadic mucinous CRC were estimated using Fisher's exact test; no multiplicity adjustment was done here because significant differences in mutation rates were restricted to those genes with the highest mutation frequencies in sporadic mucinous CRC (APC, TP53, and SMAD3) rather than being distributed uniformly across the whole gene list, indicating their disease-related nature.
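The text does not state which FDR procedure was applied to the arm-level comparisons. As an illustration only, the sketch below implements the Benjamini-Hochberg step-up adjustment, one commonly used FDR method; the function name and the input P values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted in ascending order
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value and enforce monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values from four arm-level comparisons
raw_p = [0.01, 0.04, 0.03, 0.5]
adj_p = benjamini_hochberg(raw_p)
```

Each adjusted value is at least as large as its raw value, so a comparison that is significant before adjustment may no longer be significant afterwards.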
Results
Clinicopathologic characteristics of CD-CRCs
Our cohort comprised 28 patients with CD who developed CRC (Table 1; Supplementary Table S2). Patients with CD were diagnosed with CRC at a relatively young age (median 50 years) and had a long history of CD (median 24 years). The majority of CD-CRCs (53%) were located in the rectum, followed by the right hemicolon (34%; Supplementary Fig. S2A). Mucinous and/or signet ring cell histology was observed in two-thirds of the CD-CRCs (Fig. 1A; Supplementary Figs. S2B–S2D and S3). Two CD-CRCs were high-level microsatellite instable (MSI-H), and one CD-CRC was microsatellite stable but showed a POLE mutation resulting in hypermutation (Supplementary Figs. S4 and S5). The reference group of sporadic mucinous CRCs differed significantly from CD-CRCs in terms of age at CRC diagnosis (P = 0.0001, unpaired t test) and CRC location (P = 0.04; χ2 test), reflecting the different etiology of CRC (Table 1; Supplementary Table S3). The two CRC groups of different etiology were matched in terms of sex (P = 0.57; Fisher's exact test) and tumor stage at diagnosis (P = 0.31; χ2 test).
Variable | CD–associated CRC (n = 28 patients) | Sporadic mucinous CRC (n = 24 patients) |
---|---|---|
Age at CRC diagnosis, years | | |
Mean ± SD | 50 ± 11 | 65 ± 13 |
Median (range) | 50 (28–76) | 65.5 (33–87) |
Duration of CD at CRC diagnosisᵃ | | |
Mean ± SD | 23 ± 9 | N/A |
Median (range) | 24 (5–40) | N/A |
Sex | | |
Male | 16 | 16 |
Female | 12 | 8 |
AJCC stage at diagnosis | | |
I | 5 | 4 |
II | 8 | 4 |
III | 7 | 9 |
IV | 5 | 7 |
Data not available | 3 | 0 |
Site of primary carcinomaᵇ | | |
Right hemicolon | 11 | 12 |
Left hemicolon | 4 | 7 |
Rectum | 17 | 5 |
Synchronous colorectal carcinoma(s) | | |
Present | 3 | 0 |
Absent | 25 | 24 |
Carcinoma associated with (anorectal) fistula | | |
Yes | 9 | 0 |
No | 23 | 24 |
Histology | | |
Mucinous with signet ring cells | 9 | 7 |
Mucinous without signet ring cells | 11 | 17 |
Intestinal type | 12 | 0 |
Microsatellite status | | |
Microsatellite stable | 30 | 24 |
Microsatellite instable | 2 | 0 |
Abbreviations: CD, Crohn's disease; N/A, not applicable.
ᵃ On the basis of n = 23 patients because for n = 5 patients no exact data on duration of CD were available.
ᵇ Right hemicolon was defined as cecum, ascending, and transverse colon; left hemicolon as descending and sigmoid colon.
Landscape of genomic imbalances in CD-CRCs versus sporadic mucinous CRCs
To determine genome alterations that characterize CD-CRCs, we performed aCGH and sequence analysis of a panel of 563 cancer-related genes (Supplementary Table S1 for gene list). As control groups, we used histomorphologically similar sporadic mucinous adenocarcinomas, and the TCGA CRC dataset (15). The number of SCNAs per tumor tended to be higher in CD-CRCs compared with sporadic mucinous CRCs (P = 0.07; Mann–Whitney test; Fig. 1B), while the fraction of the genome subject to SCNAs was similar (P = 0.88; Mann–Whitney test; Fig. 1C). Array-CGH derived patterns of SCNAs in CD-CRC were similar to sporadic mucinous CRC and to previously published data on sporadic CRC, including gains of chromosomes and chromosome arms 7, 8q, 13q, and 20q, and losses of 5q, 8p, 17p, and 18q (10, 37). However, we observed a significant difference: gain of chromosome arm 5p occurred in 20 of 29 CD-CRCs (69%) compared with five of 24 (21%) sporadic mucinous CRCs (FDR adjusted P = 0.03; Fisher's exact test; Fig. 1D). In addition, a loss of 5p was never observed in CD-CRCs, in contrast to 2 of 24 (8.3%) sporadic mucinous CRCs. This is in line with TCGA data on intestinal type (n = 129) and mucinous (n = 15) CRCs that rarely showed a 5p copy-number gain (20 of 129, 16% and 1 of 15, 7%, respectively) or loss (10 of 129, 8% and 0 of 15, 0%, respectively; Supplementary Fig. S6). The gain of 5p in CD-CRCs was confirmed by interphase FISH on tumor sections (Fig. 1E; Supplementary Table S4), and by copy-number profiles derived from targeted sequencing data (Supplementary Fig. S7).
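The 5p comparison above can be reproduced in outline with a two-sided Fisher's exact test. The sketch below uses only the standard library and the counts reported in the text (20 of 29 CD-CRCs vs. 5 of 24 sporadic mucinous CRCs). It yields an unadjusted P value, so it will not equal the FDR-adjusted value reported above, and the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    at most as likely as the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x events in the first row
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_observed * (1 + 1e-7))
    return min(1.0, total)


# 5p gain: 20 of 29 CD-CRCs vs. 5 of 24 sporadic mucinous CRCs
p_5p_gain = fisher_exact_two_sided(20, 9, 5, 19)
print(f"unadjusted two-sided P = {p_5p_gain:.4g}")
```

The same function can be applied to each chromosome arm in turn, after which the resulting P values would be adjusted for multiple comparisons.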
Landscape of somatic mutations in CD-CRCs versus sporadic mucinous CRCs
For sequencing data analysis, the applied Strelka and MuTect variant calling algorithms resulted in a total of 814 variant calls with a core fraction of 84% of concordant calls (Supplementary Fig. S8). The highest mutation counts were observed in the POLE-mutated CD-CRC (C21CA2) followed by MSI-H tumors (C20 and C28; Supplementary Fig. S9A). Those samples with the highest mutation counts had the lowest amount of SCNAs (Supplementary Fig. S9B and S9C). The base substitution patterns in CD-CRCs and sporadic mucinous CRCs were very similar (Supplementary Fig. S9D) and typical for colonic tissue (38). Somatic mutations in CD-CRCs involved genes known to be significantly altered in CRC; however, mutation frequencies were different compared with sporadic mucinous CRC (Fig. 2A): TP53 (76% in CD-CRCs vs. 33% in sporadic mucinous CRCs), KRAS (24% vs. 50%), APC (17% vs. 75%), and SMAD3 (3% vs. 29%). While TP53 missense mutations showed a strongly increased nuclear positivity for TP53 compared to the wild-type staining pattern, a TP53 truncating mutation was associated with a complete absence of staining (Fig. 2B). Both in CD-CRCs and sporadic mucinous CRCs, the majority of TP53 mutations occurred as missense mutations, located predominantly in the DNA binding domain of the protein (Supplementary Fig. S10). Mutations in KRAS, the second most commonly mutated gene in both CRC entities, occurred exclusively as missense mutations and involved the known mutation hotspots, independent of CRC etiology (Supplementary Fig. S11). In contrast to TP53 and KRAS mutations, APC mutations in both CD-CRC and sporadic mucinous CRC were truncating mutations. 
While the distribution of APC mutations in sporadic mucinous CRCs reflected the distribution expected from TCGA data on sporadic CRC and involved high-frequency mutations according to the COSMIC database, the few APC mutations detected in CD-CRCs were primarily located at positions infrequently mutated in sporadic CRC, which was reflected by the reported low frequency of these mutations in COSMIC (Supplementary Fig. S12). In summary, CD-CRCs demonstrated major mutational differences compared to sporadic CRCs including TCGA CRC data (Table 2). Of note is the high frequency of TP53 mutations in inflammation-associated CRCs compared with sporadic intestinal type CRCs and in particular compared with sporadic mucinous CRCs.
Altered pathways in CD-CRC versus sporadic mucinous CRC
To better understand the consequences of mutations and how pathways were altered in CD-CRCs compared with sporadic mucinous CRC, we integrated mutation and copy-number data to analyze alterations in the WNT, TGFβ, PI3K, RTK-RAS, and TP53 signaling pathways (Fig. 3; Supplementary Figs. S13 and S14). While genetic alterations in the TP53 signaling pathway were predominant in CD-CRCs, the WNT, TGFβ, and, to a lesser degree, RTK-RAS signaling pathways were more often affected in sporadic mucinous CRC.
Analysis of CD–associated precursor lesions and matched primary CD-CRCs: tumor evolution and sequence of genetic events
Our collection of samples included 11 patients (patients C01, C02, C04, C05, C06, C07, C11, C12, C18, C21, and C26), from which we could analyze multiple, synchronous lesions at different stages of tumor development accompanying the CD-CRC (Supplementary Fig. S1 for an overview of samples for analysis per patient). In general, the mutation load increased during disease progression, and so did the number of SCNAs and the fraction of the genome affected by SCNAs (Supplementary Fig. S15). Across all synchronous lesions from the above-mentioned patients (except patient C21, who was excluded due to POLE mutation in one of his carcinomas), the gain of 5p was the most frequent SCNA present in three of 11 non-dysplastic colonic mucosa samples, two of three dysplastic lesions, 11 of 13 carcinomas, and two of two lymph node metastases. TP53 was the most frequently mutated gene occurring in four of 14 non-dysplastic colonic mucosa samples, two of three dysplastic lesions, 13 of 13 carcinomas, and two of two lymph node metastases (Supplementary Fig. S16 for an overview of all mutations in patients with CD-CRC with multiple lesions). Interestingly, TP53 mutations in synchronous lesions from separate sites, both synchronous carcinomas and synchronous precursor lesions/carcinomas, were distinct. For instance, in patient C04 both carcinomas showed copy-number increases of 5p and mutations of TP53, yet different ones, while inflamed mucosa harbored three mutations (FGF23, PLAG1, and SMARCA4) but no SCNAs, and no genome alterations were present in normal mucosa (Fig. 4A). Similarly, three carcinomas and histologically normal colonic mucosa from patient C01, all spatially separated, were affected by distinct TP53 mutations. Only the inflamed mucosa samples and the dysplasia adjacent to carcinoma 1 shared their TP53 mutation with carcinoma 1 (Fig. 4B). The gain of 5p was present in all lesions from patient C01, except carcinoma 2. 
Of note, carcinomas 1 and 3 from patient C01 shared an identical APC mutation (p.T1556fs), while the respective TP53 mutations (p.R175H in carcinoma 1 and p.E286K in carcinoma 3), among other mutations, were distinct. The TP53 p.R175H mutation, in contrast to the APC p.T1556fs mutation, could also be detected in the inflamed mucosa samples adjacent to carcinoma 1, indicating that the TP53 p.R175H mutation had occurred before the APC p.T1556fs mutation. However, the TP53 p.R175H mutation, despite equivalent or higher mutant allele fractions, could not be detected in carcinoma 3, which instead harbored a distinct TP53 p.E286K mutation. On the basis of these observations, we assumed that carcinoma 1 and carcinoma 3 evolved independently, although the presence of an identical mutation in a tumor suppressor gene in two independent lesions is generally considered very unlikely. Patient C12 had one carcinoma and one dysplastic lesion, both of which harbored a 5p gain and a TP53 mutation, yet the mutations were different (Supplementary Fig. S17). In patient C26, the carcinoma had a TP53 mutation and a gain of 5p, while the dysplastic lesion was TP53 wild-type but had mutations in APC and KRAS, and no 5p gain (Supplementary Fig. S18). Patient C18 had TP53 mutations both in the carcinoma and in histologically normal colonic mucosa; however, they were distinct from each other (Supplementary Fig. S19). A gain of 5p could be detected in the carcinoma but not in histologically normal colonic mucosa despite the presence of SCNAs. The carcinoma of patient C05 showed a TP53 mutation and a 5p gain, while, interestingly, the inflamed mucosa, histologically without signs of dysplasia, showed five mutations, including an FBXW7 mutation, but neither a TP53 mutation nor SCNAs (Supplementary Fig. S20). Patients C02 and C11 each had a TP53-mutated carcinoma with a 5p gain, while no aberrations were observed in normal or inflamed mucosa (Supplementary Figs. S21 and S22).
In patient C07, the inflamed mucosa sample showed no genetic aberrations, whereas the carcinoma harbored a TP53 mutation and SCNAs but no 5p gain (Supplementary Fig. S23). In patient C06, mutation load and SCNAs increased from the CRC to its corresponding lymph node metastasis, though both harbored similar changes, including identical TP53 mutations and the gain of 5p (Supplementary Fig. S24).
To further delineate the sequence of genetic events underlying tumor development in CD, we analyzed mutant allele fractions. Potential founder mutations (and/or mutations providing a great selective advantage) would be expected at fractions of 0.28 ± 0.13 (mean tumor cell content divided by 2 ± 2 × SD; Supplementary Table S5) for CRCs (n = 53), and 0.29 ± 0.06 (mean epithelial cell content divided by 2 ± 2 × SD; Supplementary Table S6) for precursor lesions (n = 12), while higher allele fractions indicate an additional loss of heterozygosity, and lower allele fractions indicate subclonal mutations (39). Applying these criteria, clonal and thus potential founder mutations in CD-CRCs mainly occurred in TP53, sometimes accompanied by an allelic loss, while in sporadic mucinous CRCs clonal mutations were distributed among APC, KRAS, and TP53 (Supplementary Fig. S25A). Interestingly, in precursor lesions TP53 mutations were virtually all clonal, in a few samples accompanied by additional allelic loss, while mutations in other genes were mostly subclonal (Supplementary Fig. S25B). Of note, the majority of these subclonal mutations occurred in genes that did not show alterations in CD-CRCs, indicating that these mutant clones do not necessarily have the potential to progress and may disappear over time.
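The allele-fraction criteria described above can be illustrated with a minimal sketch. The two expected ranges (0.28 ± 0.13 for CRCs, 0.29 ± 0.06 for precursor lesions) are taken from the text; applying them as hard cut-offs, the names `ExpectedFraction` and `classify`, and the example allele fractions are assumptions made for illustration only and do not reproduce the analysis pipeline used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpectedFraction:
    """Expected allele fraction of a heterozygous clonal mutation."""
    center: float      # mean cell content divided by 2
    half_width: float  # 2 x SD

    @property
    def lower(self) -> float:
        return self.center - self.half_width

    @property
    def upper(self) -> float:
        return self.center + self.half_width


# Ranges quoted in the text.
CRC = ExpectedFraction(center=0.28, half_width=0.13)
PRECURSOR = ExpectedFraction(center=0.29, half_width=0.06)


def classify(allele_fraction: float, expected: ExpectedFraction) -> str:
    """Label a mutation by where its allele fraction falls (hard cut-offs are an assumption)."""
    if allele_fraction < expected.lower:
        return "subclonal"
    if allele_fraction > expected.upper:
        return "clonal_with_loh"  # higher fraction suggests additional allelic loss
    return "clonal"


if __name__ == "__main__":
    # Hypothetical allele fractions, not data from the study.
    for af in (0.10, 0.30, 0.60):
        print(af, classify(af, CRC), classify(af, PRECURSOR))
```

In this sketch, a mutation at an allele fraction of 0.20 would count as clonal in a carcinoma but as subclonal in a precursor lesion, because the expected range for precursor lesions is narrower.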
On the basis of our analyses, we could reconstruct the putative sequence of genetic events of CD–related colorectal carcinogenesis, which differs from sporadic adenoma-to-carcinoma progression (Fig. 5A; refs. 9, 10). In CD, TP53 mutation and 5p gain occurred early and were frequent, while APC and KRAS mutations occurred later and were rare.
Detection of mutant clones at non-dysplastic sites distinct from carcinoma: evidence for occult tumor evolution in patients with CD
Our observation of genome alterations in normal and inflamed colonic mucosa without histologic evidence for dysplasia prompted further investigation. In 19 of 28 patients with CD-CRC, we analyzed samples of non-dysplastic mucosa comprising histologically normal colonic mucosa (n = 16) and inflamed mucosa (n = 7; Supplementary Fig. S1). None of the normal and inflamed colonic mucosa samples showed evidence for MSI (Supplementary Table S7). By aCGH, we detected SCNAs in two normal colonic mucosa samples (C01NOR and C18NOR), and two inflamed mucosa samples (C01INF1 and C01INF2; Fig. 4B; Supplementary Fig. S19). Sequencing revealed mutations in TP53 in four of the normal and inflamed mucosa samples (C01INF1, C01INF2, C01NOR, and C18NOR), and mutations in genes other than TP53 in two other inflamed mucosa samples (C04INF and C05INF; Fig. 4; Supplementary Figs. S19 and S20). Thus, in total, four of 19 (21%) patients with CD-CRC harbored at least one mutation and/or SCNAs in normal or inflamed, that is, non-dysplastic, colonic mucosa (six of 23 samples), pointing to an occult tumor evolution in patients with CD. Interestingly, all normal and inflamed mucosa samples with SCNAs had a TP53 mutation, while normal and inflamed colonic mucosa with wild-type TP53 status showed no SCNAs (Supplementary Table S8). TP53 mutation status could be visualized by IHC, showing a characteristic “nested” or “diffuse” staining pattern that can be confined to the basal half of the crypts, as originally described in TP53-mutated, non-dysplastic mucosa from patients with UC (Supplementary Figs. S26–S29; refs. 35, 36).
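The per-patient and per-sample counts in this paragraph can be re-derived from the sample identifiers it lists. The short sketch below only re-tallies what the text states; the dictionary layout, the variable names, and the assumption that the first three characters of a sample identifier denote the patient are illustrative choices, not part of the study.

```python
# Non-dysplastic mucosa samples reported with alterations (from the text).
altered_samples = {
    "C01INF1": {"tp53": True, "scna": True},
    "C01INF2": {"tp53": True, "scna": True},
    "C01NOR": {"tp53": True, "scna": True},
    "C18NOR": {"tp53": True, "scna": True},
    "C04INF": {"tp53": False, "scna": False},  # mutations in genes other than TP53
    "C05INF": {"tp53": False, "scna": False},  # mutations in genes other than TP53
}

TOTAL_PATIENTS = 19     # patients with non-dysplastic mucosa analyzed
TOTAL_SAMPLES = 16 + 7  # normal + inflamed mucosa samples

# Assumption: the patient ID is the first three characters of the sample ID.
affected_patients = {sample_id[:3] for sample_id in altered_samples}

patient_pct = round(100 * len(affected_patients) / TOTAL_PATIENTS)
scna_implies_tp53 = all(
    info["tp53"] for info in altered_samples.values() if info["scna"]
)

if __name__ == "__main__":
    print(f"{len(affected_patients)} of {TOTAL_PATIENTS} patients ({patient_pct}%)")
    print(f"{len(altered_samples)} of {TOTAL_SAMPLES} samples")
    print("every sample with SCNAs carries a TP53 mutation:", scna_implies_tp53)
```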
Development and progression of CRC in patients with CD through clonal sweeps and clonal mosaicism
The distribution of mutations across multiple samples along the colorectum at different stages of tumor development from 11 individual patients suggests two distinct mechanisms for CRC development and progression in patients with CD: (i) clonal sweeps, that is, a clone with a specific mutation expands to cover large colonic segments, from which further clones with additional mutations can emerge; and (ii) clonal mosaicism, that is, multiple clones with distinct (founder) mutations arise independently at different locations of the colorectum (Fig. 5B; ref. 40). There was a predilection for TP53 mutations, and, interestingly, all synchronous CRCs from individual patients showed distinct TP53 mutations, as did spatially unrelated precursor lesions.
Discussion
Our study of CD-CRCs and corresponding precursor lesions analyzed by targeted NGS and aCGH revealed a significant difference in the pattern of mutations and SCNAs between CD-CRC and sporadic CRC. According to data from TCGA, APC mutations are the major event in sporadic non-hypermutated CRC (observed in 81% of cases), followed by TP53 (60%) and KRAS mutations (43%; ref. 15). In contrast, in CD-CRC we found a much lower frequency of APC mutations (17%) and a higher rate of TP53 mutations (76%). The genetic differences suggest distinct pathways of carcinogenesis. TP53 mutations likely play an initiating role in CRC associated with chronic inflammation of the large intestine, whereas in sporadic CRC, they are more important for the progression from late adenoma to invasive carcinoma (41). The high frequency of TP53 mutations in non-dysplastic and dysplastic colonic mucosa of patients with UC has already been reported, and for diagnostic purposes, TP53 status can be determined by IHC as a surrogate for the detection of TP53 mutations (35, 36). This is far less well studied in CD and CD-CRC. One study of a cohort of 14 patients with CD revealed that IHC TP53 positivity was associated with dysplasia and could predict progression to CRC in some cases (42). In contrast to sporadic CRC, we detected TP53 mutations in both dysplasia and non-dysplastic mucosa from patients with CD-CRC. As a side note, IHC for TP53 confirmed the mutational status and could therefore be helpful in evaluating the progression risk of colonic biopsies from patients with CD.
We observed distinct TP53 mutations in different, spatially unrelated lesions in four patients in our cohort, a genetic phenomenon termed “clonal mosaicism.” This finding further supports the critical role of TP53 signaling in CD–associated tumorigenesis. Inactivation of TP53 signaling was a nearly ubiquitous event in CD-CRC in our cohort, while, in contrast to sporadic CRC, WNT pathway activation by mutation was rare. However, this does not exclude WNT pathway activation through epigenetic or microenvironmental influences.
In general, the mutational spectrum for CD-CRC confirms findings from previous studies (13, 14). However, by studying the largest cohort of CD-CRCs and precursor lesions to date, distinct features emerged. While Robles and colleagues found similar or slightly lower mutation rates within CD-CRCs compared with our study by whole exome sequencing, Yaeger and colleagues detected a higher TP53 (94% vs. 76%) and IDH1 (28% vs. 3%) mutation rate within CD-CRCs using a targeted sequencing approach covering some 300 genes (13, 14). Overall, our mutation frequencies match the average of the two above-mentioned studies. However, we could not confirm the high IDH1 mutation rate in CD-CRC reported by Yaeger and colleagues (14). In line with previous studies and in contrast to sporadic CRC, BRAF mutations were absent from CD-CRCs (overall mutation rate 19% in sporadic CRC; 3% in non-hypermutated CRC and 46% in hypermutated CRC; 37% in mucinous CRC and 6% in nonmucinous CRC; refs. 13, 14, 43, 44).
In our cohort of CD-CRCs, we detected MSI at a similar frequency (2 of 32, 6.3%) as Lennerz and colleagues (2 of 33, 6.1%; ref. 43). This frequency is slightly lower than that in sporadic CRC and that reported by Svrcek and colleagues (7 of 50, 14%; ref. 45). This slight difference might reflect the small sample numbers in both studies. In contrast to UC, where MSI usually occurs early (46), we did not detect MSI in CD–related precursor lesions. Consistently, Svrcek and colleagues did not detect MSI in any of the 14 dysplastic CD samples included in their study (45).
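To illustrate how small sample numbers limit the comparison of these MSI frequencies, the following sketch computes approximate 95% Wilson score intervals for the three proportions quoted above. The counts come from the text; the choice of the Wilson interval, the function name `wilson_interval`, and the fixed z value of 1.96 are assumptions for illustration and were not part of the original analysis.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# MSI counts quoted in the text: (MSI-positive, total).
cohorts = {
    "this study": (2, 32),
    "Lennerz et al.": (2, 33),
    "Svrcek et al.": (7, 50),
}

if __name__ == "__main__":
    for name, (k, n) in cohorts.items():
        lo, hi = wilson_interval(k, n)
        print(f"{name}: {k}/{n} = {k / n:.1%}, interval {lo:.1%} to {hi:.1%}")
```

With cohorts of this size the intervals are wide and overlap, which is consistent with the interpretation that the difference between the reported frequencies may be due to sample size.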
As summarized by Ullman and Itzkowitz, the development of inflammation-related carcinoma appears to progress through a sequence consisting of inflamed mucosa, dysplasia, and carcinoma, in contrast to the classical progression of a discrete focus of neoplasia from a polypoid adenoma to an invasive carcinoma in sporadic CRC (8). However, patients with CD can also develop sporadic adenomas, and the endoscopic and histologic distinction between an inflammation-/CD–associated lesion and a sporadic adenoma arising in an inflamed colon segment is extremely difficult. The distinction is critical: while a sporadic lesion offers the therapeutic option of safe local removal followed by surveillance, in the case of an inflammation-/CD–associated dysplastic lesion and possible field cancerization, proctocolectomy should be considered as treatment (47). As shown recently, the detection of aneuploidy could be useful in identifying patients with UC and CD with an increased progression risk toward CRC (48). Similarly, the presence of SCNAs in gastric intestinal metaplasia has recently been identified as a molecular determinant of progression to dysplasia or gastric cancer (49).
Our findings confirm aneuploidy as an early event in CD progression; however, we identified a specific significant difference in the pattern of SCNAs, namely the gain of chromosome arm 5p, in CD-CRCs compared with sporadic tumors. The copy-number increase of 5p is particularly interesting because it was not only frequently present in CD-CRC, but also a common finding in precursor lesions, including colonic mucosa that was still non-dysplastic. In contrast, sporadic colorectal adenomas very rarely show a gain of 5p. In our own previously published cohort of hyperplastic polyps, tubulovillous adenomas and serrated polyps (n = 84), we did not detect any 5p gain (50). Adenomatous regions of malignant polyps did not harbor a 5p gain either (n = 13; ref. 12). Richter and colleagues found extra copies of 5p only in non-polypoid dysplastic lesions (2 of 23, 9%), while polypoid neoplasia never showed this gain (0 of 28, 0%; ref. 51). Interestingly, they observed gains of the entire chromosome 5, while in our CD–associated lesions the gain was restricted to the short arm of the chromosome, sometimes extending to the 5q pericentromeric region, and sometimes accompanied by a loss of 5q indicating isochromosome formation. Our findings are consistent with CGH data from 13 UC-CRCs, which revealed a gain of 5p in about 50% of samples and a concomitant loss of 5q in about 25% (52). This is particularly intriguing because in that cohort the 5p gain was also an early event that could already be observed in ulcerative colitis–associated dysplastic lesions. Interestingly, while early sporadic colorectal lesions are often characterized by gains of chromosome 7 (10, 11), 5p gains appear to be the distinctive feature of inflammation-associated, and in particular, CD–associated intestinal neoplasia.
Gains of 5p have previously been implicated in progression of lung and cervical cancer, but it appeared difficult to relate this SCNA to a specific candidate gene (53, 54). Nevertheless, the TERT gene (5p15.33) encodes one of the main functional subunits of the telomerase enzyme and high TERT expression was shown to be associated with progression and unfavorable outcome of CRC (55, 56). Another study demonstrated that increased expression of TERT can promote antiapoptotic response through inactivation of TP53 via induction of basic fibroblast growth factor (57). Furthermore, CDH12 (5p14.3) was reported to enhance proliferation and tumorigenicity of CRC cells and to increase progression by promoting epithelial–mesenchymal transition (58, 59).
In conclusion, our study of CD–related colorectal lesions indicates a new biomarker, that is, the gain of 5p, which, in combination with TP53 IHC or TP53 mutation analysis, could assist in the assessment of CD precursor lesions that might progress to CRC. This intriguing finding should be pursued in further clinical validation studies, in particular in the context of occult tumor evolution in patients with CD.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: D. Hirsch, D.C. Edelman, P. Kienle, C. Galata, K. Horisberger, T. Ried, T. Gaiser
Development of methodology: D. Hirsch, D.C. Edelman, C. Galata
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): D. Hirsch, D. Wangsa, D.C. Edelman, P.S. Meltzer, C. Ott, P. Kienle, K. Horisberger
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): D. Hirsch, D. Wangsa, Y.J. Zhu, Y. Hu, P.S. Meltzer, K. Heselmeyer-Haddad, P. Kienle, T. Ried, T. Gaiser
Writing, review, and/or revision of the manuscript: D. Hirsch, D. Wangsa, Y.J. Zhu, Y. Hu, D.C. Edelman, P.S. Meltzer, K. Heselmeyer-Haddad, C. Ott, P. Kienle, C. Galata, K. Horisberger, T. Ried, T. Gaiser
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): D. Hirsch, D. Wangsa, D.C. Edelman, P.S. Meltzer, K. Heselmeyer-Haddad, C. Ott, P. Kienle, C. Galata, K. Horisberger, T. Ried, T. Gaiser
Study supervision: D. Hirsch, P. Kienle, T. Ried, T. Gaiser
Acknowledgments
The authors would like to thank David Petersen (Molecular Genetics Section, Genetics Branch, CCR, NCI, and NIH) for help with library preparation, Bao Tran (Sequencing Facility, CCR, NCI-Frederick, NIH) for performing sequencing, Yonca Ceribas, Alexandra Eichhorn and Romina Laegel (Institute of Pathology, University Medical Center Mannheim) for technical assistance, Ferdinand Hofstaedter and Matthias Evert (Institute of Pathology, University of Regensburg) for administrative/material support, Buddy Chen for help with figures and IT-related support, and Reinhard Ebner for critical comments on the manuscript. This study was supported in part by the Intramural Research Program of the NIH, NCI, and by a grant from the Manfred Stolte-Foundation (to D. Hirsch and T. Gaiser). D. Hirsch received an intramural research scholarship from the Medical Faculty Mannheim, Heidelberg University.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.