Human colorectal cancer cell lines are used widely to investigate tumor biology, experimental therapy, and biomarkers. However, to what extent these established cell lines represent and maintain the genetic diversity of primary cancers is uncertain. In this study, we profiled 70 colorectal cancer cell lines for mutations and DNA copy number by whole-exome sequencing and SNP microarray analyses, respectively. Gene expression was defined using RNA-Seq. Cell line data were compared with those published for primary colorectal cancers in The Cancer Genome Atlas. Notably, we found that exome mutation and DNA copy-number spectra in colorectal cancer cell lines closely resembled those seen in primary colorectal tumors. Similarities included the presence of two hypermutation phenotypes, as defined by signatures for defective DNA mismatch repair and DNA polymerase ϵ proofreading deficiency, along with concordant mutation profiles in the broadly altered WNT, MAPK, PI3K, TGFβ, and p53 pathways. Furthermore, we documented mutations enriched in genes involved in chromatin remodeling (ARID1A, CHD6, and SRCAP) and histone methylation or acetylation (ASH1L, EP300, EP400, MLL2, MLL3, PRDM2, and TRRAP). Chromosomal instability was prevalent in nonhypermutated cases, with similar patterns of chromosomal gains and losses. Although paired cell lines derived from the same tumor exhibited considerable mutation and DNA copy-number differences, in silico simulations suggest that these differences mainly reflected a preexisting heterogeneity in the tumor cells. In conclusion, our results establish that human colorectal cancer lines are representative of the main subtypes of primary tumors at the genomic level, further validating their utility as tools to investigate colorectal cancer biology and drug responses. Cancer Res; 74(12); 3238–47. ©2014 AACR.

Colorectal cancer is a leading cause of cancer-related morbidity and mortality (1). Human colorectal cancer cell lines are an important, commonly used preclinical model system for studying this disease, and have provided essential insights into tumor molecular and cell biology. Cell lines are a fundamental tool used in the discovery of new antitumor compounds and for the discovery of drug sensitivity, resistance, and toxicity biomarkers, with molecular markers of response to conventional chemotherapies and targeted agents showing clinical utility in patients (2–7). Examples include the relationship between tumor microsatellite instability-high (MSI-H) status and lack of 5-fluorouracil (5FU) response (2), and mutations in KRAS, BRAF, and PIK3CA exon 20 and resistance to anti-EGFR antibody therapy (3). However, to what extent colorectal cancer cell lines represent and maintain the genetic diversity of primary cancers remains controversial.

Over the past two decades, major cancer genes and pathways central to colorectal cancer development have been delineated, including the WNT, MAPK, PI3K, TGFβ, and p53 pathways. Two broad molecular subtypes of colorectal cancer have emerged, characterized by MSI-H (∼15%) or chromosomal instability (CIN, ∼60%; refs. 8–10). MSI-H colorectal cancers occur predominantly in the proximal colon, and often show poor differentiation, mucinous histology, and increased peritumoral lymphocytic infiltration (11). These tumors exhibit hypermutation caused by defective DNA mismatch repair (MMR), tend to be near-diploid and to have a CpG island methylator phenotype (CIMP), and harbor mutations in a distinct set of driver genes, including BRAF, PTEN, and TGFBR2 (12–14). In contrast, chromosomally unstable tumors are more common in the distal colorectum and tend to develop along the classical genetic pathway of colorectal tumorigenesis, with mutations in APC, KRAS, SMAD4, and TP53 (15).

Recent cancer genome sequencing studies have revealed additional details of the genetic profiles of human colorectal cancer, highlighting their diversity. An initial whole-exome sequencing study on 11 microsatellite stable (MSS) colorectal cancers demonstrated that such tumors harbor ∼80 coding sequence mutations with a small number of commonly mutated driver genes and a large number of “private” mutations (16). Subsequently, The Cancer Genome Atlas (TCGA) Network reported comprehensive data on 223 unselected sporadic colorectal cancers (17). Hypermutation was identified in ∼15% of carcinomas, with three quarters of these displaying the expected MSI-high (MSI-H) phenotype associated with epigenetic silencing and/or mutation of MMR genes. However, one-quarter of hypermutated tumors did not show MSI-H and seemed to be associated with DNA polymerase ϵ (POLE) mutations, perhaps representing a distinct colorectal cancer subtype. Twenty-four genes were identified as significantly mutated in colorectal cancer, including several novel candidate genes such as ATM, ARID1A, TCF7L2, SOX9, and FAM123B. A number of recurrent DNA copy-number alterations were reported comprising potentially drug-targetable amplifications of ERBB2 and IGF2. A similar study on 74 pairs of primary human colon tumors reported highly concordant results, and in addition identified low frequency fusion transcripts involving R-spondin family members RSPO2 and RSPO3 (18).

Here, we report the first comprehensive whole-exome and DNA copy-number analyses of 70 of the most widely used colorectal cancer cell lines, and undertake a detailed comparison of genetic alterations with those reported for TCGA-analyzed primary cancers. We demonstrate that the spectra of mutations and DNA copy-number aberrations in colorectal cancer cell lines are representative of primary tumors, including hypermutation phenotypes and targeting of major signaling pathways. Our data further highlight novel aspects of colorectal cancer biology, including significant enrichment of mutated genes involved in chromatin remodeling and histone methylation or acetylation. Our results verify established colorectal cancer cell lines as a useful preclinical model system, and provide a comprehensive genomic data resource enabling informed choices when selecting cell lines for functional and pharmacogenomics research.

Colorectal cancer cell lines and TCGA-analyzed primary cancers

The 70 colorectal cancer cell lines studied were obtained from a range of sources, listed below, over a period spanning several years (Supplementary Table S1). C10, C106, C125, C135, C32, C70, C80, C84, and C99 were generated by the W.F. Bodmer laboratory; CACO2, COLO201, COLO205, COLO320-DM, DLD1, HCC2998, HCT116, HCT15, HCT8, HT29, LOVO, LS174T, LS180, LS411, LS513, NCI-H716, NCI-H747, RKO, SKCO-1, SNU-C1, SNU-C2B, SW1116, SW1417, SW1463, SW403, SW48, SW480, SW620, SW837, SW948, and T84 were obtained from the American Type Culture Collection; KM12 was obtained from DCTD; COLO678 and CX-1 were obtained from DSMZ; Gp2D, Gp5D, HT115, and HT55 were obtained from ECACC; CCK81 and CoCM-1 were obtained from HSRRB; VACO10 and VACO4S were provided by Dr. J.A. McBain (University of Wisconsin School of Medicine, Madison, WI); SNU-175, SNU-283, and SNU-C4 were obtained from KCLB; LIM1215, LIM1899, LIM2099, LIM2405, and LIM2551 were obtained from the Ludwig Institute for Cancer Research; SW1222 was provided by Prof. M. Herlyn (The Wistar Institute, Philadelphia, PA); HDC101, HDC54, HDC82, HDC87, and HDC90 were provided by Prof. M. Schwab (DKFZ, Germany); HCA46, HCA7, and HRA19 were provided by Dr. S.C. Kirkland (Royal Postgraduate Medical School, United Kingdom); and RW2982 and RW7213 were provided by Dr. P. Calabresi (Roger Williams General Hospital, Providence, RI). Cells were cultured with Dulbecco's Modified Eagle Medium and 10% FBS at 37°C and 5% CO2. Cell line authentication was performed using the Promega StemElite ID System at the Queensland Institute of Medical Research (QIMR, Queensland, Australia) DNA Sequencing and Fragment Analysis Facility (January 2013). Cells were tested for mycoplasma by the MycoAlert Mycoplasma Detection Kit (Lonza). Published exome-capture sequencing data on 223 patients with colorectal cancer were retrieved from the TCGA (19). SNP array data were available for 213 of these patients. Mutation data were filtered for exons and splice sites (±3 bp).

Exome-capture sequencing

Exome-capture was performed using the TruSeq Exome Enrichment Kit (Illumina) and 100 bp paired-end read sequencing performed on an Illumina HiSeq 2000 at the Australian Genome Research Facility (AGRF). Variant detection was implemented using a modified GATK Best Practice protocol; variants were filtered against known germline variations and those detected in 114 in-house normal colorectal tissues (Supplementary Methods). Regions of known germline chromosomal segmental duplications were excluded (20).

Transcriptome sequencing

RNA-Seq analysis was performed at the AGRF on an Illumina HiSeq 2000 to a depth of >100 million paired reads. Alignment to the human reference genome (hg19) was achieved using TopHat (21), and RPKM values were calculated. Absence of gene expression was defined as an RPKM value of <1 (Supplementary Methods).
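The expression cut-off described above can be illustrated with a minimal Python sketch. It uses the standard RPKM definition (reads per kilobase of transcript per million mapped reads); the function names and the example read counts are hypothetical and are not taken from the study's pipeline.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def is_expressed(rpkm_value: float, threshold: float = 1.0) -> bool:
    """Apply the cut-off from the text: an RPKM below 1 counts as absent expression."""
    return rpkm_value >= threshold


# Hypothetical gene: 500 reads on a 2,000 bp transcript in a library of 100 million reads.
value = rpkm(500, 2_000, 100_000_000)
print(value, is_expressed(value))  # 2.5 True
```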

SNP microarray analysis

SNP assays were performed by the AGRF using Illumina Human610-Quad BeadChips (GSE55832) and processed using GenomeStudio (Illumina). SNPs showing germline alterations, based on 637 normal samples, were excluded. DNA copy-number segmentation was performed using OncoSNPv2.18 (22). Regions of significantly altered genome were identified using GISTIC2.0 (23).

Microsatellite instability analysis

MSI analysis was performed for the National Cancer Institute recommended microsatellite marker panel BAT25, BAT26, D2S123, D5S346, and D17S250 using fluorescently labeled primers on a 3130xl Genetic Analyzer (Applied Biosystems; ref. 24). MSI-H was diagnosed if instability was evident at 2 or more markers.
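As a minimal sketch of the decision rule above, the following Python function labels a sample MSI-H when at least 2 of the 5 panel markers are unstable. The dictionary layout and function name are illustrative assumptions; how instability is scored at each marker is outside the scope of this example.

```python
NCI_PANEL = ("BAT25", "BAT26", "D2S123", "D5S346", "D17S250")


def classify_msi(unstable: dict[str, bool], min_unstable: int = 2) -> str:
    """Return 'MSI-H' when instability is seen at >= min_unstable panel markers."""
    missing = [marker for marker in NCI_PANEL if marker not in unstable]
    if missing:
        raise ValueError(f"no call for markers: {missing}")
    n_unstable = sum(bool(unstable[marker]) for marker in NCI_PANEL)
    return "MSI-H" if n_unstable >= min_unstable else "non-MSI-H"


# Hypothetical sample with instability at the two mononucleotide markers only.
calls = {"BAT25": True, "BAT26": True, "D2S123": False, "D5S346": False, "D17S250": False}
print(classify_msi(calls))  # MSI-H
```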

CIMP marker and MLH1 promoter methylation analysis

MethyLight real-time PCR was performed for the Weisenberger and colleagues CIMP marker panel (IGF2, SOCS1, RUNX3, CACNA1G, NGN1), MLH1, and the reference ALU (C-4; ref. 25). The percentage of methylated reference (PMR) was calculated as the GENE:ALU ratio of template amount in a sample divided by the GENE:ALU ratio of template amount in methylated reference DNA, expressed as a percentage. Samples with a PMR greater than 10 for ≥3 CIMP markers were classified as CIMP-positive.
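The PMR calculation and CIMP call can be sketched in a few lines of Python. The template amounts in the example are invented for illustration, and the function and variable names are assumptions rather than part of the published protocol.

```python
CIMP_MARKERS = ("IGF2", "SOCS1", "RUNX3", "CACNA1G", "NGN1")


def pmr(gene_sample: float, alu_sample: float, gene_ref: float, alu_ref: float) -> float:
    """Percentage of methylated reference: 100 x (GENE/ALU in sample) / (GENE/ALU in reference)."""
    if alu_sample <= 0 or gene_ref <= 0 or alu_ref <= 0:
        raise ValueError("ALU and reference template amounts must be positive")
    return 100.0 * (gene_sample / alu_sample) / (gene_ref / alu_ref)


def is_cimp_positive(pmr_by_marker: dict[str, float],
                     pmr_cutoff: float = 10.0, min_markers: int = 3) -> bool:
    """CIMP-positive when PMR exceeds the cut-off for at least min_markers panel markers."""
    n_methylated = sum(pmr_by_marker[marker] > pmr_cutoff for marker in CIMP_MARKERS)
    return n_methylated >= min_markers


# Hypothetical sample: three of the five markers exceed a PMR of 10.
sample = {"IGF2": 45.0, "SOCS1": 2.0, "RUNX3": 80.0, "CACNA1G": 12.5, "NGN1": 0.0}
print(is_cimp_positive(sample))  # True
```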

Statistical analysis

Statistical analyses were conducted using the R statistical computing software (http://www.R-project.org/). Differences between groups were assessed using Fisher exact test or χ² test for categorical variables and the Student t test for continuous variables. Correlation between RNA-Seq and microarray gene expression data was assessed using the Spearman rank correlation coefficient. Overlap of gene lists was assessed using a hypergeometric test. GISTIC2.0 analysis for significantly altered focal regions of chromosomal deletion or gain/amplification was adjusted for multiple testing, implementing a BH-FDR adjusted Q-value cut-off of 0.05. All statistical analyses were 2-sided and considered significant when P < 0.05.
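The analyses themselves were run in R. As a language-neutral illustration of two of the procedures named above, the following standard-library Python sketch implements a 2-sided Fisher exact test for a 2 × 2 table and the generic Benjamini-Hochberg adjustment. GISTIC2.0 reports its own Q-values, so the adjustment here only shows the underlying procedure. The first example uses the MSI-H counts reported in the Results (19/22 hypermutated cell lines vs. 28/35 hypermutated cancers).

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact test for the table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return BH-adjusted values in the original order of p_values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# MSI-H among hypermutated cases: 19 of 22 cell lines vs. 28 of 35 cancers.
print(round(fisher_exact_two_sided(19, 3, 28, 7), 3))  # 0.725
```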

Exome mutation profiles in 70 colorectal cancer cell lines

Seventy commonly used colorectal cancer cell lines, comprising 63 unique lines and 7 duplicate lines, were analyzed for mutations in protein-coding exons and exon–intron boundaries by whole-exome capture sequencing (Supplementary Table S2). In all cases, >20-fold coverage of >80% of targeted exons was achieved. As matched normal tissue for these cell lines was not available, putative somatic mutations were identified by annotation against databases of known human germline variants. Regions of known germline chromosomal segmental duplications were excluded to reduce the possibility of false-positive variants caused by read mismapping. Mean pipeline sensitivity and specificity for nonsilent variants were shown to be 79.2% and 98.6% in an analysis of 10 primary colorectal cancers with and without paired normal tissue, with the majority (93.4%) of false-negative calls resulting from somatic mutations mimicking annotated germline variants (Supplementary Data). Accuracy of mutation calling was verified by Sanger resequencing for 12 selected genes (APC, CTNNB1, KRAS, BRAF, NRAS, PIK3CA, PTEN, SMAD2, SMAD3, SMAD4, FBXW7, and TP53) on 43 cell lines, with validation of 97.6% (165/169) of mutations (Supplementary Table S3).

For the 63 unique cell lines, the total number of putative somatic mutations ranged from 219 to 8,657 (mean = 1,498.3). Similar to the primary colorectal cancers reported by TCGA (17), the prevalence of mutations varied substantially, ranging from 6.6 to 260.0 per 10⁶ bases, with evidence for nonhypermutated and hypermutated groups (Fig. 1). Cell lines with a mutation prevalence of >25 per 10⁶ bases were designated as hypermutated. Hypermutation was observed in 34.9% (22/63) of cell lines as compared with 15.7% (35/223) of the TCGA cancers (P < 0.001). Of the hypermutated cell lines, 86.4% (19/22) exhibited MSI-H, similar to the proportion found in cancers (80.0%, 28/35; P = 0.725). MSI-H status in cell lines was associated with MLH1 hypermethylation, mutations in MMR genes and presence of CIMP (P < 0.009; Fig. 1). Consistent with the established association between MSI-H and proximal tumor location in colorectal cancer, all MSI-H cell lines with available site details originated from the proximal colon (Supplementary Table S1). There was no evidence of imbalance in site distribution between cell lines and cancers (P = 0.937).
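The hypermutation call reduces to a rate and a threshold, sketched below in Python. The size of the callable territory is not stated in the text; the value of about 33.3 Mb used here is an assumption back-calculated from the reported extremes (219 mutations at 6.6 per 10⁶ bases and 8,657 mutations at 260.0 per 10⁶ bases).

```python
def mutations_per_megabase(n_mutations: int, callable_bases: int) -> float:
    """Mutation prevalence expressed per 10^6 bases."""
    if callable_bases <= 0:
        raise ValueError("callable_bases must be positive")
    return n_mutations / callable_bases * 1e6


def is_hypermutated(prevalence_per_mb: float, cutoff: float = 25.0) -> bool:
    """Threshold from the text: more than 25 mutations per 10^6 bases."""
    return prevalence_per_mb > cutoff


# ASSUMPTION: ~33.3 Mb of callable sequence, inferred from the reported range, not stated in the text.
CALLABLE_BASES = 33_300_000

for n in (219, 8657):
    rate = mutations_per_megabase(n, CALLABLE_BASES)
    print(n, round(rate, 1), is_hypermutated(rate))
# 219 6.6 False
# 8657 260.0 True
```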

Figure 1.

Mutation frequencies in 70 human colorectal cancer cell lines. Cell lines were separated into distinct hypermutated and nonhypermutated cases. MSI-H, microsatellite instability-high; CIMP, CpG island methylator phenotype; MLH1 meth., MLH1 promoter hypermethylation; MMR mut., DNA mismatch repair gene mutation; inset, mutations in MMR genes among the hypermutated samples, with cases in the same order as in the main graph.

Mutation spectra in hypermutated colorectal cancer cell lines

Compared with nonhypermutated cell lines, MSI-H cell lines displayed a higher number of InDels (41.1-fold; P < 0.001) and SNVs (9.7-fold; P < 0.001) as observed in the TCGA cancers (InDels: 18.3-fold, SNVs: 10.8-fold; P < 0.001; Fig. 2A and B). As expected, InDels in MSI-H cases tended to occur at nucleotide-repeat (≥N5 or ≥NN3) sequences (MSI-H vs. nonhypermutated: cell lines 82.7% vs. 17.9%, cancers 73.4% vs. 38.4%; P < 0.001), a bias not observed for the SNVs (cell lines 2.9% vs. 3.7%, cancers 2.3% vs. 2.9%; Supplementary Table S2). The SNV spectrum differed significantly between MSI-H and nonhypermutated cases, with an increased proportion of A>G/T>C transitions (cell lines 1.5-fold, cancers 1.8-fold; P < 0.011) and decreased C>G/G>C transversions (cell lines −4.5-fold, cancers −5.2-fold; P < 0.001; Fig. 2C).

Figure 2.

Types of mutations for 70 human colorectal cancer cell lines and 223 TCGA-analyzed primary cancers. A–C, counts of SNVs and InDels (A and B), and proportions of nucleotide transitions and transversions (C). Paired cell lines originating from the same tumor are indicated by identical colored stars. MSI-H, microsatellite instability-high; NSHP, nucleotide-substitution hypermutator phenotype; POLE EDM, missense mutation in the POLE exonuclease domain.

A second type of hypermutated tumor was observed among cell lines and the TCGA cancers. These cases were MSS and compared with nonhypermutated cases exhibited a nucleotide-substitution hypermutator phenotype (NSHP) characterized by a substantial increase of SNVs (cell lines 16.8-fold, cancers 56.8-fold; P < 0.005; Fig. 2A and B). The SNV spectrum in 2 of 3 NSHP cell lines (HT115, HCC2998) and 7 of 7 NSHP cancers further displayed increased proportions of C>A/G>T transversions (NSHP vs. nonhypermutated: cell lines 1.5-fold, cancers 2.4-fold) and A>C/T>G transversions (cell lines 3.8-fold, cancers 2.0-fold), as well as decreased C>G/G>C transversions (cell lines −13.6-fold, cancers −22.3-fold; Fig. 2C). Combining cell lines and cancers, this mutator phenotype was significantly associated with the presence of POLE mutations (NSHP: 90.0%, 9/10 vs. MSI-H: 29.8%, 14/47 vs. nonhypermutated: 1.3%, 3/229; P < 0.001). In addition, all 9 NSHP cases with POLE mutation had at least 1 missense mutation in the POLE exonuclease domain (EDM), as compared with only 1 of 17 non-NSHP cases with POLE mutation (P < 0.001; Supplementary Table S4). Of the POLE-mutated non-NSHP samples, 82.4% (14/17) were MSI-H, and there was no evidence that the non-EDM POLE mutations in these cases modified the MSI-H phenotype, with similar SNV frequencies and transition/transversion spectra compared with MSI-H cancers without POLE mutation (P > 0.05 for all comparisons).

A single NSHP cell line, HT55, exhibited a distinct bias to A>T/T>A transversions (49.0% vs. nonhypermutated mean 4.6%) and was wild type for POLE, but a similar case was not present among the TCGA cancers (Fig. 2C). This cell line harbored a heterozygous truncating mutation in the MMR gene PMS1, which may be related to this mutator phenotype, although Pms1-deficient mice have been reported to show no evidence of tumor predisposition or hypermutation (26).

Chromosomal and subchromosomal aberrations

DNA copy-number alterations in cell lines were profiled using Illumina Human 610-Quad BeadChips. As anticipated, DNA copy-number profiles were similar between cell lines and primary cancers (Fig. 3A). Hypermutated MSI-H and NSHP cases showed stable profiles with a modal chromosome copy-number of 2n. In contrast, nonhypermutated cases tended to exhibit unstable profiles with modal chromosome copy numbers ranging from 2n to 4n in cell lines and 2n to 6n in cancers. For nonhypermutated groups, the most common deleted chromosome arms were 8p, 17p (including TP53), and 18q (including SMAD4), and the most common gained regions were chromosome 7, 8q (including MYC), 13, and 20q. Some differences were apparent between cell lines and primary cancers, including more frequent chromosome 4 deletions in nonhypermutated and chromosome 7 gains and 18q deletions in hypermutated cell lines.

Figure 3.

Genome-wide DNA copy-number aberrations in 63 unique colorectal cancer cell lines and 213 TCGA primary cancers stratified into nonhypermutated and hypermutated cases. A, absolute DNA copy-number gains and losses relative to diploid status (2n) deduced using oncoSNP. B and C, minimal significant regions of chromosomal loss (blue) and gain/amplification (red) deduced using GISTIC2.0. Selected candidate genes in overlapping regions are indicated and total numbers of genes are shown in brackets.

Focal regions significantly targeted by DNA copy-number alterations in both cell lines and cancers were deduced using GISTIC2.0 software (23). Eleven and 4 significant focal regions overlapped between cell lines and cancers in the nonhypermutated and hypermutated groups, respectively (Fig. 3B and C and Supplementary Table S5). For nonhypermutated cases, overlapping regions included gain/amplification of MYC, a known key mediator of colorectal tumorigenesis (27), and deletion of the candidate cancer genes PARK2 and WRN, detected in 29.5% and 36.3% of cell lines and 21.9% and 31.6% of primary cancers, respectively. The 4 recurrent regions identified in hypermutated cases, which were also present in the nonhypermutated group, were all known fragile sites (encompassing FHIT, A2BP1, WWOX, and MACROD2), the functional relevance of which is uncertain.

Cancer genes and pathways

Gene mutation profiles, excluding silent mutations, were compared between colorectal cancer cell lines and primary cancers. Only genes with well-defined expression in cell lines [reads per kilobase per million reads mapped (RPKM) >1 in at least 3/13 colorectal cancer cell lines analyzed by RNA-Seq] were considered (n = 20,702 genes; Supplementary Table S6). NSHP cases were excluded from this comparison because of the small sample size.

Mutation landscapes of cell lines markedly resembled those of primary cancers. In both cohorts, nonhypermutated cases displayed a small number of distinct mutation peaks, with APC, TP53, and KRAS being the most common targets, whereas hypermutated MSI-H cases showed frequent mutations in multiple genes, with the anticipated bias to genes comprising nucleotide repeats (Supplementary Table S7). Significant overlap was observed for the top 5% of mutated genes (based on the proportion of affected samples), with 54 and 62 genes intersecting for nonhypermutated and MSI-H groups, respectively (P < 0.001; Supplementary Table S8). Overlapping genes included the expected members of the WNT, MAPK, PI3K, TGFβ, and p53 pathways. For nonhypermutated cases, these comprised APC, CTNNB1, FBXW7, KRAS, BRAF, PIK3CA, SMAD4, and TP53, as well as the candidate cancer genes AXIN2, BCL9L, FAT1, SOX9, ERBB3, PIK3C2B, TIAM1, and ATM. For MSI-H cases, these encompassed APC, FBXW7, PIK3CA, and TGFBR2, along with the candidates CREBBP, TCF7L2, RNF43, and ACVR2A. Mutation distributions across colorectal cancer-associated signaling pathways were also similar between nonhypermutated cell lines and cancers, with the same pathway members tending to show the highest mutation frequencies (Fig. 4). Greater variability in mutation frequencies between pathway members was observed for MSI-H cases, at least partly related to the higher mutation background and smaller sample size. A notable difference for MSI-H cases was a differential prevalence of CTNNB1 mutations, which were frequent in cell lines (47%) but not reported in primary cancers (P < 0.001).
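The gene-list overlap test mentioned above can be sketched as an upper-tail hypergeometric probability. The sizes of the gene universe and of the two top-5% lists behind the reported P values are not given in this section, so the example below uses small hypothetical numbers only; the function name is likewise an assumption.

```python
from math import comb


def hypergeom_overlap_pvalue(universe: int, list_a: int, list_b: int, overlap: int) -> float:
    """P(X >= overlap) when list_b genes are drawn at random from a universe
    that contains list_a genes of interest."""
    if not (0 <= list_a <= universe and 0 <= list_b <= universe):
        raise ValueError("list sizes must lie between 0 and the universe size")
    max_overlap = min(list_a, list_b)
    denom = comb(universe, list_b)
    tail = sum(
        comb(list_a, k) * comb(universe - list_a, list_b - k)
        for k in range(max(overlap, 0), max_overlap + 1)
    )
    return tail / denom


# Hypothetical example: two 50-gene lists from a 1,000-gene universe sharing 10 genes
# (about 2.5 shared genes would be expected by chance).
print(hypergeom_overlap_pvalue(1000, 50, 50, 10) < 0.001)  # True
```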

Figure 4.

Mutation frequencies for WNT, MAPK, PI3K, TGFβ, and p53 pathway members as well as chromatin regulators identified among the top 5% of mutated genes overlapping between colorectal cancer cell lines and TCGA-analyzed primary cancers for nonhypermutated or hypermutated MSI-H samples.

In addition to the established colorectal cancer-associated pathways, significant enrichment was observed for mutations in chromatin-state regulators (P < 0.001; gene ontology enrichment analysis). These comprised ASH1L, CHD6, PRDM2, and MLL3 in nonhypermutated and ARID1A, EP300, EP400, MLL2, CHD6, PRDM2, TRRAP, and SRCAP in MSI-H cases (Supplementary Table S8). Mutations across these candidates were carried by approximately 49% of nonhypermutated cell lines and 19% of nonhypermutated primary cancers, and by 100% of MSI-H cell lines and 93% of MSI-H primary cancers.

Integrating mutation and DNA copy-number data across cell lines and primary cancers showed the anticipated tumor suppressor signatures for APC, TP53, SMAD4, and SOX9, with a significant overrepresentation of 2 mutational hits (2+ mutations or 1 mutation and loss of heterozygosity; P < 0.023). There was further an association of mutation in BRAF, ERBB3, PIK3CA, and KRAS with copy-number gain of ≥5 at the respective chromosomal regions (P < 0.018). Trends to mutual exclusivity between pathway member mutations were observed for KRAS and BRAF, and for TP53 and ATM (P < 0.005).

Mutation and DNA copy-number differences in paired colorectal cancer cell lines

Included in our colorectal cancer cell line panel were 5 pairs/triplets originally derived from the same tumor (COLO201/COLO205, CX-1/HT29, Gp2D/Gp5D, LS174T/LS180, and DLD1/HCT8/HCT15), and 1 pair derived from a primary tumor and subsequent lymph node metastasis (SW480/SW620). LS174T and LS180 were established by trypsin treatment and scraping of primary cultures from the same tumor, respectively (28), and have been shown to differ with respect to E-cadherin expression and cell adhesion, with LS174T displaying complete loss of E-cadherin protein (2, 29).

Assessment of the overlap between mutations detected in paired cell lines identified substantial discrepancies, with 63 and 356 mutational differences in the nonhypermutated pairs COLO201/COLO205 and CX-1/HT29, and 2,763, 480, 6,503, 5,377, and 6,369 mutational differences in the MSI-H pairs Gp2D/Gp5D, LS174T/LS180, DLD1/HCT8, DLD1/HCT15, and HCT8/HCT15 (Fig. 5). Eight hundred and forty-nine mutations differed in the nonhypermutated primary/metastasis pair SW480/SW620. Nonsilent and silent mutations contributed in similar proportions to these discrepancies (54.4% vs. 56.6%), suggesting no selection for these differential alterations.
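Counting mutational differences between paired lines amounts to a set comparison of their variant calls. The sketch below assumes each variant is keyed by chromosome, position, reference allele, and alternate allele; this keying and the toy variants are illustrative and not taken from the study's data.

```python
# A variant is identified by (chromosome, position, reference allele, alternate allele).
Variant = tuple[str, int, str, str]


def compare_pair(line_a: set[Variant], line_b: set[Variant]) -> dict[str, int]:
    """Count shared and discordant variant calls between two paired cell lines."""
    return {
        "shared": len(line_a & line_b),
        "only_a": len(line_a - line_b),
        "only_b": len(line_b - line_a),
        "differences": len(line_a ^ line_b),  # symmetric difference
    }


# Hypothetical toy call sets for two lines derived from the same tumor.
calls_a = {("chr5", 1000, "C", "T"), ("chr12", 2000, "G", "A"), ("chr17", 3000, "G", "T")}
calls_b = {("chr5", 1000, "C", "T"), ("chr12", 2000, "G", "A"), ("chr3", 4000, "A", "G")}
print(compare_pair(calls_a, calls_b))
# {'shared': 2, 'only_a': 1, 'only_b': 1, 'differences': 2}
```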

Figure 5.

Overlap of mutations and DNA copy-number states between paired colorectal cancer cell lines. COLO201/COLO205, CX-1/HT29, Gp2D/Gp5D, LS174T/LS180, and DLD1/HCT8/HCT15 were derived from the same tumor; SW480/SW620 were derived from a primary tumor and subsequent lymph node metastasis.

DNA copy-number profiles showed multiple differences for nonhypermutated pairs, with 41.1% and 53.8% of the genome varying for COLO201/COLO205 and CX-1/HT29 (Fig. 5). In contrast, discrepancies were limited in MSI-H pairs with 0.2%, 0.4%, 6.6%, 2.9%, and 4.3% of the genome differing between Gp2D/Gp5D, LS174T/LS180, DLD1/HCT8, DLD1/HCT15, and HCT8/HCT15, respectively. In the nonhypermutated primary/metastasis pair SW480/SW620, 47.5% of the genome differed.

Notably, mutation differences between paired cell lines did not obscure known driver genes. For the established colorectal cancer genes APC, TP53, SMAD4, PIK3CA, KRAS, and BRAF, 29 of 33 (87.9%) nonsilent mutations were concordant between cell line pairs.

In this study we show that the mutation and DNA copy-number landscapes determined for 70 colorectal cancer cell lines closely resemble those of primary tumors, underscoring the utility of cell lines as an appropriate model system for this malignancy. The 3 molecular subtypes of colorectal cancer recently defined by the TCGA (nonhypermutated, hypermutated with microsatellite instability, and hypermutated without microsatellite instability) were faithfully captured in the cell line panel. As expected, MSI-H cell lines exhibited hypermethylation of the MLH1 promoter and/or mutations in MMR genes, whereas hypermutated lines without microsatellite instability were instead characterized by mutations in the exonuclease domain of POLE. Consistent with defective DNA POLE proofreading function, the latter tumors showed a nucleotide substitution hypermutator phenotype with a bias to C>A/G>T and A>C/T>G transversions. Germline missense mutations in the POLE exonuclease domain have recently been associated with familial predisposition to colorectal cancer (30), and elevated tumor predisposition and base-substitution mutations have been observed in Pole exonuclease-mutant mice (31). Similarly, somatic missense mutations in the POLE exonuclease domain have been reported in 7% of endometrial cancers, with good evidence of associated hypermutation (32). Although differences in prognosis and chemotherapy response have been extensively documented for nonhypermutated versus MSI-H tumors (2–4), clinical characteristics of POLE-mutant NSHP tumors are unknown. Our identification of 2 cell lines representative of this colorectal cancer subtype will facilitate an investigation of the specific aspects of the biology of these tumors, particularly in relation to identifying therapeutics that may exploit their unique genomic instability.

Consistent with previous lower-resolution data (33), DNA copy-number profiles were similar between colorectal cancer cell lines and primary tumors, with nonhypermutated cases tending to exhibit chromosomal instability, whereas both MSI-H and NSHP cases had overall stable copy-number profiles. Patterns of whole and partial chromosome gains and losses were largely concordant. Besides recurrent gain/amplification of the established colorectal cancer gene MYC, recurrent regions of deletion in nonhypermutated cases included the candidate cancer genes PARK2 and WRN. WRN has important roles in homologous recombination repair, MUTYH-mediated repair of oxidative DNA damage, and telomeric recombination (34–36), and WRN germline mutations are associated with chromosomal instability and cancer predisposition (37). PARK2 deletion has been previously reported in sporadic colorectal cancer, and Park2 deficiency has been shown to accelerate intestinal adenoma development in Apc mutant mice (38).

The mutation landscapes in nonhypermutated and hypermutated MSI-H cases showed close resemblance in cell lines and primary tumors, with the expected alterations in the WNT, MAPK, PI3K, p53, and TGFβ pathways. Besides these main colorectal cancer–associated pathways, multiple chromatin-state regulators were enriched among the top 5% of mutated genes, including proteins involved in chromatin remodeling (ARID1A, CHD6, and SRCAP) and histone methylation or acetylation (ASH1L, EP300, EP400, MLL2, MLL3, PRDM2, and TRRAP). Overall, ∼24% of nonhypermutated and ∼96% of MSI-H cases harbored mutations in these genes. Trends for mutations to cluster in chromatin regulators have been reported in multiple other cancer types, including liver cancers (39), gastric adenocarcinoma (40), and transitional cell carcinoma of the bladder (41), suggesting potential pathogenicity of these alterations.

Paired cell lines originating from the same tumor showed considerable differences for both mutations and DNA copy-number profiles. Nonhypermutated pairs differed for up to 356 mutations and ∼54% of genome copy number, whereas MSI-H pairs differed for up to 6,503 mutations and ∼6.6% of genome copy number. The two main possible reasons for these differences are preexisting heterogeneity between the cells from the original tumor grown out to establish these paired lines, or acquisition of alterations as a result of long-term culture. Assuming a normal mutation rate of 10⁻⁸ (per bp per cell generation) for nonhypermutated and 100-fold elevated rate for hypermutated pairs and absence of selection (42), cell lines established from tumor cells separated by an average of 328 and 66 replications would be expected to exhibit the observed mutational differences with a >99% probability, respectively (Supplementary Data and Table S9). In contrast, mutations acquired during serial passaging in culture are not anticipated to reach a detectable 10% level before 9,034 and 4,266 replications, respectively (Supplementary Data and Fig. 6). Although serial cell passaging may have been commonplace at the time when these paired cell lines were established >20 years ago, stock-keeping practices to limit the number of replications were soon introduced. Preexisting mutation heterogeneity in the original tumor therefore seems highly likely to account for the majority of the detected differences. Numbers of replications may be expected to be larger between cells from a primary tumor and subsequent metastasis, and accordingly our corresponding nonhypermutated cell line pair showed a ∼4-fold higher number of mutational differences than the other nonhypermutated pairs. Importantly, mutational differences across paired cell lines did not obscure known driver genes, with ∼88% of nonsilent mutations in established colorectal cancer genes concordant between pairs.
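The flavor of this reasoning can be conveyed with a deliberately simplified neutral model: if two cells are separated by a given number of replications, the number of mutations distinguishing them is roughly Poisson-distributed with mean equal to mutation rate × bases surveyed × replications. This is not the authors' simulation (which is described in the Supplementary Data) and is not expected to reproduce their figures; the surveyed sequence length and the example inputs below are hypothetical.

```python
from math import exp, lgamma, log


def expected_differences(mutation_rate: float, bases: int, replications: int) -> float:
    """Mean number of neutral mutations separating two cells under the simple model."""
    return mutation_rate * bases * replications


def prob_at_least(k: int, mean: float) -> float:
    """P(X >= k) for X ~ Poisson(mean), with terms computed in log space for stability."""
    if k <= 0:
        return 1.0
    if mean <= 0:
        return 0.0
    cdf = sum(exp(i * log(mean) - mean - lgamma(i + 1)) for i in range(k))
    return max(0.0, 1.0 - cdf)


# Hypothetical inputs: rate of 1e-8 per bp per replication, 30 Mb surveyed, 100 replications.
mean = expected_differences(1e-8, 30_000_000, 100)
print(round(mean, 1))                 # 30.0
print(prob_at_least(20, mean) > 0.9)  # True
```

Raising the mutation rate 100-fold in this model scales the expected number of differences by the same factor, which is the qualitative pattern seen for the MSI-H pairs.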

Figure 6.

Simulation of the acquisition of mutations in cell culture. The process of serial passage was modeled with cell numbers repeatedly increasing from 1 × 10⁵ to 2 × 10⁶ cells. Proportions of cells containing the most frequent mutation by passage number for five simulations, using fixed mutation probabilities of 10⁻⁸ per base per cell replication for nonhypermutated (A) and 10⁻⁶ for hypermutated (B) MSI-H tumors. The black diagonal trendline is the least squares fit on the log–log scale and the small vertical lines are the 99% prediction intervals. The horizontal red line corresponds to a sequencing mutation detection threshold of 10%.
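To give a sense of scale for the replication thresholds quoted in the text, the following minimal Python sketch converts them into approximate passage numbers using the seed and harvest cell counts from the Figure 6 legend. It is an illustrative back-of-envelope calculation, not the simulation described in the Supplementary Data. It assumes that one "replication" corresponds to one population doubling and that cells double synchronously without cell death; the function names are hypothetical.

```python
import math

# Cell numbers per passage, as stated in the Figure 6 legend.
SEED_CELLS = 1e5      # cells at the start of each passage
HARVEST_CELLS = 2e6   # cells at the end of each passage

# Replications before a culture-acquired mutation is anticipated to reach
# the 10% detection threshold, as quoted in the text.
REPLICATIONS_TO_DETECT = {"nonhypermutated": 9034, "hypermutated MSI-H": 4266}


def doublings_per_passage(seed=SEED_CELLS, harvest=HARVEST_CELLS):
    """Population doublings needed to grow from seed to harvest.

    Assumption: synchronous doubling with no cell death.
    """
    return math.log2(harvest / seed)


def passages_required(replications, seed=SEED_CELLS, harvest=HARVEST_CELLS):
    """Passages needed to accumulate the given number of replications.

    Assumption: one replication equals one population doubling.
    """
    return replications / doublings_per_passage(seed, harvest)


if __name__ == "__main__":
    print(f"doublings per passage: {doublings_per_passage():.2f}")
    for label, reps in REPLICATIONS_TO_DETECT.items():
        print(f"{label}: ~{passages_required(reps):.0f} passages")
```

Under these assumptions each passage spans about 4.3 doublings, so the quoted thresholds correspond to roughly 2,100 and 1,000 passages for nonhypermutated and hypermutated lines, respectively.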

Despite the high level of similarity between colorectal cancer cell lines and primary tumors, there were a number of differences. These included overall higher mutation and DNA copy-number frequencies, as well as a greater proportion of detected InDels in cell lines. These discrepancies may be related to differences in exome-capture and sequencing platforms, bioinformatics pipelines, the presence of contaminating normal tissue in primary cancers, and accuracy of assigning somatic alterations in cell lines. Cell lines further showed a higher proportion of hypermutated cases, and exhibited differences in the prevalence of aberrations for certain genes and genomic regions such as CTNNB1 mutations in MSI-H cases. These latter findings may be a reflection of preferential growing out of cell lines from primary tumors (or their respective subclones) that contain these aberrations, a contention supported by the observation that only ∼10% to 15% of colorectal cancers give rise to cell lines (43). Another possibility is that some of the mutations have been acquired and selected for in tissue culture.

A caveat to our analysis of protein-coding genes is that we could not report on untranslated exonic regions (UTRs), as these were inconsistently covered by our study and the TCGA. UTR mutations can affect RNA splicing, stability, or translation, as previously highlighted for MSI-H colorectal cancer (44). In the comparison of gene mutation profiles, we chose to exclude silent mutations (other than those affecting splice sites), although some of these may similarly have functional consequences (45).

In conclusion, our comparative analysis of the genomic landscapes of human colorectal cancer cell lines and TCGA-analyzed primary cancers identified cell lines representative of the three main mutational colorectal cancer subtypes. Within these molecular subtypes, although some differences were evident, cell lines showed globally similar genetic alterations to primary cancers, including genome-wide mutation, DNA copy number, and driver gene mutation profiles. Accordingly, gene expression profiles of colorectal cancer cell lines have previously been shown to broadly represent those of primary tumors (46). Our genomic data significantly expand on cancer cell line characterization efforts by the major cancer genome centers, such as the Cancer Cell Line Encyclopedia project, which currently reports mutation data for 1,500 selected genes on 62 colorectal cancer cell lines (5). Our data will help to inform investigations of the molecular basis of colorectal cancer pathogenesis, inherent and acquired drug resistance, and exploration of novel treatment modalities for this malignancy.

No potential conflicts of interest were disclosed.

Conception and design: A.W. Burgess, R.L. Strausberg, J.M. Mariadason, O.M. Sieber

Development of methodology: D. Mouradov, R.N. Jorissen, S. Li, D. Bicknell, O.M. Sieber

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): D. Arango, D. Buchanan, S. Wormald, D. Bicknell, J.M. Mariadason, O.M. Sieber

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): D. Mouradov, C. Sloggett, R.N. Jorissen, C.G. Love, A.W. Burgess, D. Arango, D. Buchanan, L. O'Connor, J.L. Wilding, W.F. Bodmer, J.M. Mariadason, O.M. Sieber

Writing, review, and/or revision of the manuscript: D. Mouradov, C.G. Love, A.W. Burgess, D. Buchanan, J.L. Wilding, I.P.M. Tomlinson, W.F. Bodmer, J.M. Mariadason, O.M. Sieber

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): D. Mouradov, S. Li, S. Wormald

Study supervision: J.M. Mariadason, O.M. Sieber

The authors thank the Victorian Cancer BioBank for patient specimens and Prof. M. Schwab at the DKFZ for providing cell lines.

This work was supported by Cancer Australia through a Project Grant (APP1030098; O.M. Sieber), Ludwig Institute for Cancer Research (J.M. Mariadason, A.W. Burgess, O.M. Sieber), NHMRC Overseas Postdoctoral Fellowship (519795; S. Wormald), and a Victorian State Government Operational Infrastructure Support grant. J.M. Mariadason holds an Australian Research Council Future Fellowship.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Jemal A, Bray F, Center MM, Ferlay J, Ward E, Forman D. Global cancer statistics. CA Cancer J Clin 2011;61:69–90.
2. Bracht K, Nicholls AM, Liu Y, Bodmer WF. 5-Fluorouracil response in a large panel of colorectal cancer cell lines is associated with mismatch repair deficiency. Br J Cancer 2010;103:340–6.
3. Ashraf SQ, Nicholls AM, Wilding JL, Ntouroupi TG, Mortensen NJ, Bodmer WF. Direct and immune mediated antibody targeting of ERBB receptors in a colorectal cancer cell-line panel. Proc Natl Acad Sci U S A 2012;109:21046–51.
4. Weickhardt AJ, Price TJ, Chong G, Gebski V, Pavlakis N, Johns TG, et al. Dual targeting of the epidermal growth factor receptor using the combination of cetuximab and erlotinib: preclinical evaluation and results of the phase II DUX study in chemotherapy-refractory, advanced colorectal cancer. J Clin Oncol 2012;30:1505–12.
5. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 2012;483:603–7.
6. Basu A, Bodycombe NE, Cheah JH, Price EV, Liu K, Schaefer GI, et al. An interactive resource to identify cancer genetic and lineage dependencies targeted by small molecules. Cell 2013;154:1151–61.
7. Fogli S, Caraglia M. Genotype-based therapeutic approach for colorectal cancer: state of the art and future perspectives. Expert Opin Pharmacother 2009;10:1095–108.
8. Aaltonen LA, Salovaara R, Kristo P, Canzian F, Hemminki A, Peltomaki P, et al. Incidence of hereditary nonpolyposis colorectal cancer and the feasibility of molecular screening for the disease. N Engl J Med 1998;338:1481–7.
9. Lengauer C, Kinzler KW, Vogelstein B. Genetic instability in colorectal cancers. Nature 1997;386:623–7.
10. Miyazaki M, Furuya T, Shiraki A, Sato T, Oga A, Sasaki K. The relationship of DNA ploidy to chromosomal instability in primary human colorectal cancers. Cancer Res 1999;59:5283–5.
11. Alexander J, Watanabe T, Wu TT, Rashid A, Li S, Hamilton SR. Histopathological identification of colon cancer with microsatellite instability. Am J Pathol 2001;158:527–35.
12. Duval A, Reperant M, Hamelin R. Comparative analysis of mutation frequency of coding and non coding short mononucleotide repeats in mismatch repair deficient colorectal cancers. Oncogene 2002;21:8062–6.
13. Kambara T, Simms LA, Whitehall VL, Spring KJ, Wynter CV, Walsh MD, et al. BRAF mutation is associated with DNA methylation in serrated polyps and cancers of the colorectum. Gut 2004;53:1137–44.
14. Toyota M, Ahuja N, Ohe-Toyota M, Herman JG, Baylin SB, Issa JP. CpG island methylator phenotype in colorectal cancer. Proc Natl Acad Sci U S A 1999;96:8681–6.
15. Rowan A, Halford S, Gaasenbeek M, Kemp Z, Sieber O, Volikos E, et al. Refining molecular analysis in the pathways of colorectal carcinogenesis. Clin Gastroenterol Hepatol 2005;3:1115–23.
16. Sjoblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, et al. The consensus coding sequences of human breast and colorectal cancers. Science 2006;314:268–74.
17. The Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 2012;487:330–7.
18. Seshagiri S, Stawiski EW, Durinck S, Modrusan Z, Storm EE, Conboy CB, et al. Recurrent R-spondin fusions in colon cancer. Nature 2012;488:660–4.
19. Koo BK, Spit M, Jordens I, Low TY, Stange DE, van de Wetering M, et al. Tumour suppressor RNF43 is a stem-cell E3 ligase that induces endocytosis of Wnt receptors. Nature 2012;488:665–9.
20. Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, Schwartz S, et al. Recent segmental duplications in the human genome. Science 2002;297:1003–7.
21. Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 2009;25:1105–11.
22. Yau C, Mouradov D, Jorissen RN, Colella S, Mirza G, Steers G, et al. A statistical approach for detecting genomic aberrations in heterogeneous tumor samples from single nucleotide polymorphism genotyping data. Genome Biol 2010;11:R92.
23. Mermel CH, Schumacher SE, Hill B, Meyerson ML, Beroukhim R, Getz G. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol 2011;12:R41.
24. Boland CR, Thibodeau SN, Hamilton SR, Sidransky D, Eshleman JR, Burt RW, et al. A National Cancer Institute Workshop on Microsatellite Instability for cancer detection and familial predisposition: development of international criteria for the determination of microsatellite instability in colorectal cancer. Cancer Res 1998;58:5248–57.
25. Weisenberger DJ, Siegmund KD, Campan M, Young J, Long TI, Faasse MA, et al. CpG island methylator phenotype underlies sporadic microsatellite instability and is tightly associated with BRAF mutation in colorectal cancer. Nat Genet 2006;38:787–93.
26. Prolla TA, Baker SM, Harris AC, Tsao JL, Yao X, Bronner CE, et al. Tumour susceptibility and spontaneous mutation in mice deficient in Mlh1, Pms1 and Pms2 DNA mismatch repair. Nat Genet 1998;18:276–9.
27. Sansom OJ, Meniel VS, Muncan V, Phesse TJ, Wilkins JA, Reed KR, et al. Myc deletion rescues Apc deficiency in the small intestine. Nature 2007;446:676–9.
28. Rutzky LP, Kaye CI, Siciliano MJ, Chao M, Kahan BD. Longitudinal karyotype and genetic signature analysis of cultured human colon adenocarcinoma cell lines LS180 and LS174T. Cancer Res 1980;40:1443–8.
29. Efstathiou JA, Liu D, Wheeler JM, Kim HC, Beck NE, Ilyas M, et al. Mutated epithelial cadherin is associated with increased tumorigenicity and loss of adhesion and of responsiveness to the motogenic trefoil factor 2 in colon carcinoma cells. Proc Natl Acad Sci U S A 1999;96:2316–21.
30. Palles C, Cazier JB, Howarth KM, Domingo E, Jones AM, Broderick P, et al. Germline mutations affecting the proofreading domains of POLE and POLD1 predispose to colorectal adenomas and carcinomas. Nat Genet 2012;45:136–44.
31. Albertson TM, Ogawa M, Bugni JM, Hays LE, Chen Y, Wang Y, et al. DNA polymerase epsilon and delta proofreading suppress discrete mutator and cancer phenotypes in mice. Proc Natl Acad Sci U S A 2009;106:17101–4.
32. Church DN, Briggs SE, Palles C, Domingo E, Kearsey SJ, Grimes JM, et al. DNA polymerase ϵ and δ exonuclease domain mutations in endometrial cancer. Hum Mol Genet 2013;22:2820–8.
33. Douglas EJ, Fiegler H, Rowan A, Halford S, Bicknell DC, Bodmer W, et al. Array comparative genomic hybridization analysis of colorectal cancer cell lines and primary carcinomas. Cancer Res 2004;64:4817–25.
34. Prince PR, Emond MJ, Monnat RJ Jr. Loss of Werner syndrome protein function promotes aberrant mitotic recombination. Genes Dev 2001;15:933–8.
35. Kanagaraj R, Parasuraman P, Mihaljevic B, van Loon B, Burdova K, Konig C, et al. Involvement of Werner syndrome protein in MUTYH-mediated repair of oxidative DNA damage. Nucleic Acids Res 2012;40:8449–59.
36. Mendez-Bermudez A, Hidalgo-Bravo A, Cotton VE, Gravani A, Jeyapalan JN, Royle NJ. The roles of WRN and BLM RecQ helicases in the alternative lengthening of telomeres. Nucleic Acids Res 2012;40:10809–20.
37. Lauper JM, Krause A, Vaughan TL, Monnat RJ Jr. Spectrum and risk of neoplasia in Werner syndrome: a systematic review. PLoS ONE 2013;8:e59709.
38. Poulogiannis G, McIntyre RE, Dimitriadi M, Apps JR, Wilson CH, Ichimura K, et al. PARK2 deletions occur frequently in sporadic colorectal cancer and accelerate adenoma development in Apc mutant mice. Proc Natl Acad Sci U S A 2010;107:15145–50.
39. Fujimoto A, Totoki Y, Abe T, Boroevich KA, Hosoda F, Nguyen HH, et al. Whole-genome sequencing of liver cancers identifies etiological influences on mutation patterns and recurrent mutations in chromatin regulators. Nat Genet 2012;44:760–4.
40. Zang ZJ, Cutcutache I, Poon SL, Zhang SL, McPherson JR, Tao J, et al. Exome sequencing of gastric adenocarcinoma identifies recurrent somatic mutations in cell adhesion and chromatin remodeling genes. Nat Genet 2012;44:570–4.
41. Gui Y, Guo G, Huang Y, Hu X, Tang A, Gao S, et al. Frequent mutations of chromatin remodeling genes in transitional cell carcinoma of the bladder. Nat Genet 2011;43:875–8.
42. Tomlinson IP, Novelli MR, Bodmer WF. The mutation rate and cancer. Proc Natl Acad Sci U S A 1996;93:14800–3.
43. Liu Y, Bodmer WF. Analysis of P53 mutations and their expression in 56 colorectal cancer cell lines. Proc Natl Acad Sci U S A 2006;103:976–81.
44. Wilding JL, McGowan S, Liu Y, Bodmer WF. Replication error deficient and proficient colorectal cancer gene expression differences caused by 3′UTR polyT sequence deletions. Proc Natl Acad Sci U S A 2010;107:21058–63.
45. Kudla G, Murray AW, Tollervey D, Plotkin JB. Coding-sequence determinants of gene expression in Escherichia coli. Science 2009;324:255–8.
46. Schlicker A, Beran G, Chresta CM, McWalter G, Pritchard A, Weston S, et al. Subtypes of primary colorectal tumors correlate with response to targeted treatment in colorectal cell lines. BMC Med Genomics 2012;5:66.

Supplementary data