Abstract
The gut microbiome is plausibly associated with colorectal cancer risk; however, previous studies mostly investigated this association cross-sectionally. We investigated cross-sectional and prospective associations of the rectal tissue microbiome with adenoma recurrence in the Polyp Prevention Trial (PPT).
PPT is a 4-year randomized clinical trial of the effect of a dietary intervention on adenoma recurrence among community members. We extracted DNA from rectal biopsies at baseline, end of year 1, and end of year 4 among 455 individuals and sequenced the V4 region of the 16S rRNA gene. At each timepoint, we investigated associations of alpha diversity, beta diversity, and presence and relative abundance of select taxa with adenoma recurrence using multivariable logistic regression.
Variation in beta diversity was primarily explained by subject and minimally by year of collection or time between biopsy and colonoscopy. Cross-sectionally, year 4 alpha diversity was strongly, inversely associated with adenoma prevalence [OR for tertile 3 vs. tertile 1 of the Shannon index = 0.40 (95% confidence interval, CI: 0.21–0.76)]. Prospective alpha diversity associations (i.e., baseline/year 1 alpha diversity with adenoma recurrence 3–4 years later) were weak or null, as were cross-sectional and prospective beta diversity–adenoma associations. Bacteroides abundance was more strongly, positively associated with adenoma prevalence cross-sectionally than prospectively.
Rectal tissue microbiome profiles may be associated with prevalent adenomas, with little evidence supporting prospective associations.
Additional prospective studies, with serial fecal and tissue samples, to explore microbiome-colorectal cancer associations are needed. Eventually, it may be possible to use microbiome characteristics as intervenable risk factors or screening tools.
Introduction
In the United States, colorectal cancer is the second leading cause of cancer death among men and women combined. The majority of colorectal cancers arise from adenomatous polyps (1), particularly advanced adenomas, which have a higher risk for progression to a carcinoma (2, 3).
The human colon is host to trillions of microbes that comprise the gut microbiome. There is strong biological plausibility for a role of the gut microbiome in the initiation and progression of the adenoma-carcinoma sequence, for example through its involvement in inflammatory signaling pathways, genetic mutations, and epigenetic regulation (4–6). During the progression of dysplasia, the epithelial barriers that separate the microbiome from the immune cells in the lamina propria begin breaking down, driving a loss of homeostasis and a proneoplastic inflammatory environment (7, 8). Compared with its luminal (fecal) counterpart, the mucosal microbiome may be more reflective of the colonic bacterial community that plays a direct role in the etiology of adenomas, as these bacteria adhere to the surface polysaccharide matrices and can stimulate the mucosal immune system (9).
Multiple case–control studies suggested associations of the microbiome with colorectal neoplasms. For example, a recent meta-analysis was conducted of four case–control studies comparing 16S rRNA gene sequenced tumor tissue microbiome from colorectal cancer cases and normal tissue from healthy controls, and of eight studies comparing colorectal cancer and paired normal tissue. They found that abundance of Bacteroides fragilis, Fusobacterium nucleatum, Parvimonas micra, and Peptostreptococcus stomatis—the latter three being known oral pathogens—was higher among the colorectal cancer tumor samples (10). To date, most human studies investigating associations of the microbiome with colorectal neoplasms were case–control studies measuring the microbiome and colorectal neoplasm presence cross-sectionally (i.e., at one point in time). Almost no studies have investigated prospective associations with adenoma recurrence or compared cross-sectional and prospective associations to inform prior findings. Furthermore, case–control studies of the tissue microbiome and adenoma are subject to unique biases, such as issues with timing of colonoscopy in relation to biopsy collection and intraindividual variability of the tissue microbiome over time. Herein, we present an investigation of the cross-sectional and prospective associations of the rectal tissue microbiota with adenoma recurrence in the 4-year Polyp Prevention Trial (PPT). Our study also addresses multiple methodologic issues present in the microbiome field with the goal of informing future studies of the microbiome and disease.
Materials and Methods
Study population
The PPT has been described in detail previously (1, 11–13). Briefly, the PPT was a 4-year randomized, multicenter, nutritional intervention trial to investigate the effect of a high-fiber (≥4.30 g/MJ or 18 g/1,000 kcal), high-fruit and vegetable (≥0.84 servings/MJ or 3.5 servings/1,000 kcal), and low-fat (≤20% of energy) diet on colorectal adenoma recurrence. Men and women ages 35 years or older with at least one histologically confirmed colorectal adenoma removed in the past 6 months were randomized at baseline to the dietary intervention or control group for 4 consecutive years of follow-up. Intervention compliance was assessed by patient self-report on food frequency questionnaires (FFQ). At the conclusion of the trial, the average daily intakes of fat, fiber, and fruits and vegetables among participants in the intervention arm were (i) 24% of total calories; (ii) 17 g/1,000 kcal; and (iii) 3.4 servings/1,000 kcal, respectively; the dietary intervention was not found to modify colorectal adenoma recurrence risk (12). All participants provided written informed consent and the study was approved by the Institutional Review Boards at the NCI and participating centers (OH91C0159-B), with the original trial registered under identifier: NCT00339625.
Colorectal adenoma recurrence
Within 6 months prior to randomization, all individuals in the study were diagnosed with at least one histologically confirmed colorectal adenoma via colonoscopy, during which all adenomas were removed. At the end of the first year and at the end of the 4-year trial, the participants had another colonoscopy to identify and remove any polypoid lesions for histologic examination. The 1-year colonoscopy was conducted 180 days to 2 years after randomization to remove any lesions missed at baseline. After the 1-year colonoscopy, any colorectal adenoma identified was considered a recurrent colorectal adenoma, our primary endpoint. Those with no recurrent adenoma after the 1-year colonoscopy were considered controls for our primary analyses. For our primary analyses, hyperplastic polyps diagnosed after the 1-year colonoscopy were included in the control group. For the secondary analyses by adenoma and/or polyp type, we included hyperplastic polyps as a separate histology, combined advanced adenomas (lesions with a maximal diameter of ≥1 cm, ≥25% villous elements, or evidence of high-grade dysplasia, including carcinoma) and ≥2 adenomas into a “high-risk” category, and categorized only individuals without any colorectal adenoma or hyperplastic polyps after the 1-year colonoscopy as controls.
Data and biopsy collection
At the clinical centers (Memorial Sloan Kettering, NY; University of Buffalo, NY; University of Pittsburgh, PA; Edward Hines Jr. Hospital, IL; Wake Forest University, NC; Walter Reed Medical Center, VA; Kaiser Foundation Research Institute, Oakland, CA; or University of Utah, UT), rectal tissue biopsies were obtained and frozen immediately without a fixative at baseline, at the end of the first year, and at the end of the 4-year trial. Time between rectal tissue biopsy and colonoscopy was documented. All individuals with available rectal biopsy tissue (N = 455) from any of the three timepoints were selected for this study (N = 333 samples at baseline, N = 369 at year 1, and N = 328 at year 4). At baseline and at the end of each year of the 4-year trial, participants completed an interviewer-administered questionnaire about demographic and clinical characteristics and medication and supplement use, as well as an FFQ querying diet during the previous year.
DNA extraction and sequencing
For full details, see Supplementary Materials and Methods. Laboratory personnel observed safe laboratory practices, including appropriate personal protective equipment usage, during all procedures. Briefly, biopsies were batched such that tissues from one individual were included in the same batch and each batch included replicate quality controls (QC; ref. 14). Tissues were lysed using an enzymatic cocktail, homogenized in a Bead Ruptor (Omni International, Inc.), and centrifuged. The Animal Tissue DNA Extraction Kit (AutoGen) was used for DNA extraction.
The 16S rRNA gene PCR amplification and sequencing was performed as described previously (15). To account for low microbial content in tissue-derived DNA, the input for amplification was increased to 100 ng, at a concentration of 5 ng/μL, based on Quant-iT PicoGreen double-stranded DNA (Thermo Fisher Scientific) quantitation. The V4 region of the 16S rRNA gene was PCR amplified for 30 cycles and 2 × 250 bp paired end sequencing was performed on the Illumina MiSeq v2 using the 500 cycle kit (Illumina).
Bioinformatics
Using the Divisive Amplicon Denoising Algorithm 2 (DADA2) pipeline, version 1.2.1 (16), sequence variant tables and phylogenetic trees were generated from the paired-end sequence reads. For quality filtering, the first 10 bases were trimmed from forward and reverse reads. Forward and reverse reads were truncated at 240 and 220 bases, respectively. Then, the reads were merged using the default “mergePairs” DADA2 function. After merging and error correction, amplicon sequence variants (ASV; i.e., 100% operational taxonomic units, or OTU) were identified. After removal of chimeras using the “removeBimeraDenovo” function, 85% of the sequence reads were retained. Taxonomy was assigned to the resulting ASVs using the SILVA v123 database. A total of 436 nonbacterial sequences were filtered out.
Observed ASVs, Shannon index, and Faith phylogenetic diversity (PD) were computed using Quantitative Insights Into Microbial Ecology (QIIME) 1.9.1. Beta diversity measures were calculated on the basis of Bray–Curtis, weighted UniFrac, and unweighted UniFrac distance matrices. We selected taxa a priori based on prior meta-analyses of the associations of the mucosal and/or fecal microbiome with colorectal cancer (10, 17). These taxa included: Bacteroides, Fusobacterium, Porphyromonas, Parvimonas, Peptostreptococcus, Gemella, Prevotella, Solobacterium, Dialister, and order Clostridiales. For exploratory relative abundance analyses, we restricted to genera present in 50% of the population at a mean relative abundance of >0.1% (N = 85); for exploratory presence/absence analyses, we restricted to taxa present in 20% to 80% of the population (N = 144 taxa). On the basis of rarefaction curves for alpha diversity, we rarefied samples to 8,000 reads before computing the alpha and beta diversity metrics. Of the 1,059 total samples, 1,030 were retained after rarefaction. All participants excluded from alpha and/or beta diversity analyses were similarly excluded from the relative abundance and presence/absence analyses. From the 1,030 samples included in our analysis, a median of 56,539 reads were generated per sample and 16,206 sequence variants were identified.
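As an illustration of the rarefaction and alpha diversity steps described above, the following is a minimal sketch in plain Python, not the QIIME implementation used in the study. The function names (`rarefy`, `observed_asvs`, `shannon_index`) and the toy counts are hypothetical, and the base-2 logarithm for the Shannon index is an assumption made for this example.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from one sample's ASV counts.

    Returns None when the sample has fewer than `depth` reads, mirroring the
    exclusion of shallow samples described in the text.
    """
    if sum(counts) < depth:
        return None
    # Expand the count vector into one entry per read, labeled by ASV index.
    reads = [asv for asv, n in enumerate(counts) for _ in range(n)]
    picked = Counter(random.Random(seed).sample(reads, depth))
    return [picked.get(asv, 0) for asv in range(len(counts))]


def observed_asvs(counts):
    """Number of ASVs with at least one read."""
    return sum(1 for n in counts if n > 0)


def shannon_index(counts, base=2.0):
    """Shannon index; the log base is an assumption of this sketch."""
    total = sum(counts)
    props = [n / total for n in counts if n > 0]
    return -sum(p * math.log(p, base) for p in props)


# Toy sample (hypothetical counts), rarefied to the 8,000-read depth from the text.
toy_sample = [5000, 3000, 2000, 500, 0]
rarefied = rarefy(toy_sample, depth=8000)
print(observed_asvs(rarefied), round(shannon_index(rarefied), 2))
```

Faith PD is omitted because it additionally requires the phylogenetic tree.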
For QC analysis, the taxonomic composition of the artificial community was compared with the known composition and was similar based on visual inspection. For artificial community samples placed in separate batches, average interbatch alpha diversity coefficients of variation were 10.02%, 1.45%, and 5.75% for observed ASVs, Shannon index, and Faith PD, respectively.
Statistical analysis
We summarized and compared participant characteristics by recurrent adenoma case/control status using χ2 tests for categorical variables, ANOVA for normally distributed continuous variables, and Kruskal–Wallis tests for non-normally distributed continuous variables. We compared alpha diversity in biopsies collected before and after colonoscopy using ANOVA and general linear regression models, and summarized differences visually using boxplots.
To assess stability over time, we calculated intraclass correlation coefficients (ICC) for the alpha diversity metrics, the first three principal coordinates of the Bray–Curtis and weighted and unweighted UniFrac distance matrices, a priori colorectal cancer-associated bacteria, the top three most abundant phyla, and the top 10 most abundant genera. To do this, we used linear mixed effects models with a random effect for subject clustered by center and a fixed effect for timepoint. We added covariates (for example, age, sex, adenoma status, colonoscopy timing, and randomization group) to the linear mixed effects models to assess the potential influence of these variables on the ICCs; however, the estimates were unchanged with or without the covariates, so no covariates were included in the final mixed effects model. We also transformed the taxonomic counts via the centered log ratio transformation, using the clr function of the compositions package in R (18), to account for the compositional nature of the data. We next estimated the percentage of variability (R²) in the beta diversity matrices explained by subject, visit, time between biopsy and colonoscopy, baseline age, sex, center, education level, intervention status, and adenoma status at baseline, year 1, and year 4 using the adonis function in the vegan package in R (19).
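The centered log ratio transformation mentioned above can be sketched as follows. This is a plain-Python illustration rather than the `clr` function of the R compositions package; the function name and the pseudocount used to handle zero counts are assumptions of this example, since the text does not state how zeros were treated.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    Each value becomes log(x) minus the mean of log(x) over all taxa in the
    sample. The pseudocount (hypothetical choice) avoids log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Toy counts for four taxa in one sample (hypothetical values).
print([round(v, 3) for v in clr([120, 30, 0, 850])])
```

The transformed values sum to zero within a sample and, without a pseudocount, are unchanged when all counts are multiplied by a constant, which is why the transformation suits compositional data.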
For all alpha diversity analyses, we categorized participants into timepoint-specific tertiles of the alpha diversity metrics based on the distribution among the controls (those without a recurrent adenoma). To test for trend, we assigned each participant the median value of their tertile and modeled the variable continuously in the logistic regression model. We estimated the associations of the microbiome metrics with adenoma recurrence using multivariable logistic regression models. As a sensitivity analysis, we removed those with hyperplastic polyps from the control group. We additionally estimated cross-sectional associations of alpha diversity with adenoma characteristics at baseline [advanced/multiple vs. early (i.e., nonadvanced) adenoma] and at year 1 (early adenoma, advanced/multiple, or hyperplastic vs. no polyps). Finally, we estimated cross-sectional and prospective associations of alpha diversity with adenoma characteristics (early, advanced/multiple, or hyperplastic polyp vs. no polyp) using polytomous logistic regression. We calculated a P value for heterogeneity by subtype using a case-only multivariable logistic regression analysis with subtype as the dependent variable and the microbiome metric and additional covariates as the independent variables.
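The tertile categorization and trend variable described above might look like the following sketch. The helper names and toy values are hypothetical, and the quantile convention (Python's default "exclusive" method) and the use of all supplied values to compute tertile medians are assumptions; the text specifies only that cut points came from the control distribution.

```python
import statistics


def tertile_cutpoints(control_values):
    """Two cut points that split the control distribution into thirds."""
    low, high = statistics.quantiles(control_values, n=3)
    return low, high


def assign_tertile(value, cutpoints):
    """Return 1, 2, or 3 according to the control-based cut points."""
    low, high = cutpoints
    if value <= low:
        return 1
    if value <= high:
        return 2
    return 3


def trend_scores(values, cutpoints):
    """Replace each value with the median of its tertile, for a trend test."""
    tertiles = [assign_tertile(v, cutpoints) for v in values]
    medians = {
        t: statistics.median([v for v, k in zip(values, tertiles) if k == t])
        for t in set(tertiles)
    }
    return tertiles, [medians[t] for t in tertiles]


# Toy Shannon index values among controls (hypothetical).
controls = [4.1, 4.8, 5.0, 5.3, 5.6, 5.9, 6.2, 6.5, 6.8]
cuts = tertile_cutpoints(controls)
tertiles, scores = trend_scores(controls, cuts)
print(tertiles, scores)
```

The resulting score would then enter the logistic regression model as a single continuous term, and the P value for that term serves as the P for trend.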
To test for overall differences in microbiome composition by recurrent adenoma status, we conducted the microbiome regression-based kernel association test (MiRKAT; ref. 20), based on 10,000 permutations, to calculate P values based on kernel similarity matrices for Bray–Curtis and unweighted and weighted UniFrac distance, individually and overall.
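Kernel-based tests of this kind start by converting a distance matrix into a similarity (kernel) matrix. The sketch below shows that conversion for Bray–Curtis distance using double centering, K = -1/2 J D² J with J = I - (1/n)11ᵀ. It is not the MiRKAT implementation: the function names and toy counts are hypothetical, and the positive semidefinite correction and the permutation test itself are omitted.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors."""
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def distance_matrix(samples):
    """Pairwise Bray-Curtis distances for a list of count vectors."""
    n = len(samples)
    return [[bray_curtis(samples[i], samples[j]) for j in range(n)] for i in range(n)]


def gower_kernel(dist):
    """Double-center the squared distances: K = -1/2 * J * D^2 * J."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    # D is symmetric, so row means and column means coincide.
    row_means = [sum(row) / n for row in sq]
    grand_mean = sum(row_means) / n
    return [
        [-0.5 * (sq[i][j] - row_means[i] - row_means[j] + grand_mean) for j in range(n)]
        for i in range(n)
    ]


# Toy counts for three samples and three taxa (hypothetical values).
samples = [[10, 0, 5], [8, 2, 5], [0, 12, 3]]
kernel = gower_kernel(distance_matrix(samples))
print([[round(k, 3) for k in row] for row in kernel])
```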
Covariates for adjustment included baseline age, sex, randomization group (control arm or intervention arm), baseline body mass index (BMI; kg/m2), baseline adenoma characteristics (early adenoma or advanced/multiple adenomas), year 1 adenoma characteristics (early adenoma, advanced/multiple, hyperplastic, or no polyp), education status (college graduate or less or post-college/university education), center grouped by geography (Memorial Sloan Kettering, NY/University of Buffalo, NY/University of Pittsburgh, PA/Edward Hines Jr. Hospital, IL/Wake Forest University, NC/Walter Reed Medical Center, VA; Kaiser Foundation Research Institute, Oakland, CA; or University of Utah, UT), baseline smoking status (never smoker, former smoker, or current smoker), family history of colorectal cancer (yes or no), and regular aspirin or other NSAID use (yes or no). We also included a variable for time between biopsy collection and colonoscopy, set to zero if colonoscopy occurred after biopsy, in the pertinent models (e.g., for baseline alpha diversity models, a variable for baseline time between biopsy collection and colonoscopy was included in the model).
Statistical analyses were conducted using R, version 4.1.0. We accounted for multiple testing using Bonferroni-corrected alpha thresholds based on the number of comparisons (e.g., for a priori microbiome analyses the alpha was 0.05/10).
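The Bonferroni threshold in the example above can be reproduced with a short calculation. The helper names below are hypothetical and serve only to make the arithmetic explicit.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def is_significant(p_value, alpha=0.05, n_tests=1):
    """True when the P value falls below the corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Ten a priori taxa give a per-test threshold of 0.05 / 10.
print(bonferroni_threshold(0.05, 10))
print(is_significant(0.01, n_tests=10))
```

Under this threshold, a nominal P value of 0.01 for one of the 10 a priori taxa would not be considered statistically significant.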
Data availability
The sequencing and clinical data that support the findings of this study are publicly available in the NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/bioproject/PRJNA810087; bioproject ID PRJNA810087).
Results
Participant characteristics
Characteristics of recurrent adenoma cases and controls are shown in Table 1. Recurrent adenoma cases were more likely to be male and to have an advanced adenoma or multiple adenomas at baseline and year 1. The two groups did not differ on other characteristics, such as BMI, any microbiome metrics, and time between rectal biopsy and colonoscopy at baseline, year 1, or year 4. Of the recurrent adenoma cases, 56.4% had early adenomas and 43.6% had advanced or multiple adenomas.
. | Recurrent cases (N = 179): N (%) | Recurrent cases (N = 179): Mean (SD) | Controls (N = 253): N (%) | Controls (N = 253): Mean (SD) | Pa
---|---|---|---|---|---
Assigned intervention arm at randomization | 87 (48.6) | | 136 (53.8) | | 0.34
Demographics | | | | |
Age at randomization | | 62.09 (9.37) | | 60.53 (10.06) | 0.10
Center | | | | | 0.82
CA | 37 (20.7) | | 56 (22.1) | |
NY/PA/IL/NC/VA | 94 (52.5) | | 125 (49.4) | |
UT | 48 (26.8) | | 72 (28.5) | |
Male | 134 (74.9) | | 162 (64.0) | | 0.02
≥Postgraduate-level education | 68 (38.0) | | 70 (27.7) | | 0.03
Family history of colorectal cancer | 51 (28.5) | | 71 (28.1) | | 1.00
Aspirin and NSAID use | 68 (38.0) | | 98 (38.7) | | 0.96
Smoking status at baseline | | | | | 0.66
Current smoker | 27 (15.1) | | 31 (12.3) | |
Former smoker | 76 (42.5) | | 107 (42.3) | |
Never regular smoker | 76 (42.5) | | 115 (45.5) | |
BMI, kg/m2 | | 28.04 (3.71) | | 27.73 (4.20) | 0.49
Adenoma information | | | | |
Adenoma characteristics, baseline | | | | | 0.07
Advanced/Multiple | 110 (61.5) | | 132 (52.2) | |
Early adenoma | 69 (38.5) | | 121 (47.8) | |
Adenoma characteristics, year 1 | | | | | <0.001
No polyps | 66 (36.9) | | 148 (58.5) | |
Advanced/Multiple | 56 (31.3) | | 26 (10.3) | |
Early adenoma | 32 (17.9) | | 49 (19.4) | |
Hyperplastic polyps | 25 (14.0) | | 30 (11.9) | |
Biopsy collected after colonoscopy, baseline | | | | | NA
Yes | 132 (100.0) | | 184 (100.0) | |
Missing | 47 | | 69 | |
Biopsy collected after colonoscopy, year 1 | | | | | 0.75
Yes | 126 (85.1) | | 179 (86.9) | |
Missing | 31 | | 47 | |
Biopsy collected after colonoscopy, year 4 | | | | | 0.87
Yes | 89 (72.4) | | 121 (70.8) | |
Missing | 56 | | 82 | |
Adenoma location at baseline | | | | | 0.01
Right colon | 74 (44.6) | | 58 (31.0) | |
Left colon | 67 (40.4) | | 104 (55.6) | |
Rectum or rectosigmoid | 25 (15.1) | | 25 (13.4) | |
Missing | 13 | | 66 | |
Microbiome metrics | | | | |
Alpha diversity metrics at baseline | | | | |
Shannon index | | 5.38 (1.06) | | 5.47 (0.99) | 0.41
Faith PD | | 21.82 (3.80) | | 21.96 (4.89) | 0.79
Observed ASVs | | 191.98 (62.69) | | 199.84 (63.21) | 0.28
Relative abundance of top three phyla | | | | |
Firmicutes | | 53.26 (18.74) | | 54.57 (17.68) | 0.53
Bacteroidetes | | 21.27 (11.02) | | 21.43 (10.33) | 0.90
Proteobacteria | | 10.29 (14.33) | | 10.27 (13.73) | 0.99
Relative abundance of select genus at baseline | | | | |
Bacteroides | | 14.09 (10.71) | | 14.13 (10.14) | 0.97
Fusobacterium | | 0.73 (3.63) | | 0.56 (1.56) | 0.57
Porphyromonas | | 0.34 (0.70) | | 0.28 (0.64) | 0.42
Parvimonas | | 0.05 (0.16) | | 0.06 (0.23) | 0.75
Peptostreptococcus | | 0.06 (0.19) | | 0.08 (0.28) | 0.50
Gemella | | 0.23 (0.83) | | 0.30 (0.98) | 0.52
Prevotella | | 0.63 (1.74) | | 0.48 (1.36) | 0.39
Solobacterium | | 0.04 (0.12) | | 0.05 (0.24) | 0.42
Dialister | | 0.36 (0.74) | | 0.32 (0.73) | 0.66
Clostridiales (order) | | 46.81 (19.16) | | 46.90 (19.29) | 0.97
Abbreviations: ASV, amplicon sequence variant; BMI, body mass index; CA, California; IL, Illinois; NC, North Carolina; NSAID, non-steroidal anti-inflammatory drug; NY, New York; PA, Pennsylvania; PD, phylogenetic diversity; UT, Utah; VA, Virginia.
aP values were calculated using χ2 test for categorical variables, ANOVA for normally distributed continuous variables, and Kruskal–Wallis test for non-normally distributed continuous variables.
The number of samples per visit, time between colonoscopy and biopsy, and average alpha diversity measures by visit are summarized in Table 2. In total, 225 participants had samples for all three timepoints, 125 had samples for two timepoints, and 105 had samples for only one visit. All baseline biopsies were collected after colonoscopy and most biopsies were collected after colonoscopy at year 1 and year 4. Compared with year 1 biopsies, year 4 biopsies were, on average, collected within a shorter timeframe after colonoscopy. Alpha diversity was consistently slightly lower for biopsies collected after colonoscopy (Supplementary Fig. S1). For example, comparing biopsies taken before and after colonoscopy at year 4, mean Faith PD was 22.20 and 19.95, respectively (P = 0.0003). In linear regression models estimating associations of days between colonoscopy and biopsy with alpha diversity, no associations were statistically significant at either year 1 or year 4 (all betas ≤ 0.01 and all P values ≥ 0.56).
. | Baseline . | Year 1 . | Year 4 . |
---|---|---|---|
Number of samples | 333 | 369 | 328 |
Days between colonoscopy and biopsya, mean (SD) | 129.17 (52.63) | 51.83 (76.22) | 12.80 (43.67) |
Colonoscopy performed before biopsy, n (%) | 333 (100.0) | 319 (86.7) | 210 (71.4) |
Alpha diversity metrics | |||
Shannon index, mean (SD) | 5.46 (1.01) | 5.23 (1.03) | 5.17 (1.03) |
Observed ASVs, mean (SD) | 197.58 (62.68) | 184.24 (61.09) | 178.28 (62.26) |
Faith PD, mean (SD) | 21.93 (4.42) | 20.86 (4.85) | 20.52 (4.79) |
Abbreviations: ASV, amplicon sequence variant; PD, phylogenetic diversity.
aIf the biopsy was collected before colonoscopy, the number of days was set to zero.
Rectal tissue microbiome temporal stability
Stability of the rectal tissue microbiome, as measured by ICCs, over the three timepoints (baseline, year 1, year 4) is presented in Supplementary Table S1. Stability was generally low for most microbiome metrics across the three timepoints. Alpha diversity tended to decrease incrementally over time. Alpha diversity ICCs were lowest for the Shannon index [ICC (95% confidence interval, CI) = 0.15 (0.07–0.20)] and highest for observed ASVs [ICC (95% CI) = 0.45 (0.38–0.52)]. ICCs were similarly low for the first three principal coordinates of all beta diversity matrices, ranging from 0.13 to 0.74. The ICC for the relative abundance of Fusobacterium, a bacterium previously strongly associated with colorectal cancer progression, was moderate [ICC (95% CI) = 0.37 (0.31–0.43)]. Among the other a priori selected and most abundant genera and phyla, ICCs were highest for Gemella [ICC (95% CI) = 0.49 (0.43–0.54)] and lowest for Solobacterium [ICC (95% CI) = 0.07 (0.00–0.15)]. ICCs were generally similar across strata of participant characteristics, including adenoma recurrence status, intervention arm, and whether the participant's tissue was collected before or after colonoscopy at each timepoint.
Percent of variability in beta diversity explained by participant characteristics is presented in Fig. 1. Variation was primarily explained by subject, and minimally by center, visit, time between biopsy and colonoscopy, adenoma status at each visit, age, or other participant characteristics. For example, 63%, 0.27%, 0.12%, and 3.6% of the variation in Bray–Curtis distance was explained by subject, visit, time between biopsy and colonoscopy, and center, respectively.
Associations of alpha and beta diversity with adenoma recurrence
We present the cross-sectional and prospective associations of alpha diversity with adenoma recurrence in Table 3. Overall, the alpha diversity-adenoma recurrence associations were stronger at year 4 than at baseline and year 1 (i.e., the cross-sectional associations were stronger than the prospective associations). For example, at year 4, those in the highest tertile of the Shannon index had 0.40 (95% CI: 0.21–0.76) times the odds of having a prevalent recurrent adenoma, compared with those in the lowest tertile (Ptrend = 0.01). The associations for the other alpha diversity metrics at year 4 were similar, although the trends were not statistically significant. The associations of trajectory measures of alpha diversity with adenoma recurrence were generally stronger among women than among men. At baseline and year 1, the alpha diversity associations were variable and not statistically significant (all P values > 0.05), with some slightly positive associations at year 1, both overall and when stratified by sex. These associations were slightly stronger, particularly at year 4, upon removing controls with hyperplastic polyps diagnosed around year 4 (Supplementary Table S2). Alpha diversity was most strongly, inversely associated with high-risk adenomas [e.g., the OR was 0.33 (95% CI: 0.11–0.82) comparing those in the highest relative to the lowest year 4 Shannon index tertile]. However, the associations across various adenoma characteristics did not differ statistically significantly (Pheterogeneity > 0.05; Supplementary Table S3).
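To make the relationship between a reported OR and its confidence interval concrete, the sketch below back-calculates the log-odds coefficient and an approximate standard error from the year 4 Shannon index estimate, then rebuilds the interval. It assumes a Wald-type interval (coefficient ± 1.96 standard errors on the log scale), which the text does not state explicitly; the function names are hypothetical.

```python
import math


def log_or_and_se(or_point, ci_low, ci_high, z=1.96):
    """Recover the log-odds coefficient and an approximate standard error.

    Assumes the interval is symmetric on the log scale (Wald interval).
    """
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def wald_ci(beta, se, z=1.96):
    """Rebuild the OR confidence interval from the coefficient and SE."""
    return math.exp(beta - z * se), math.exp(beta + z * se)


# Reported year 4 estimate: OR = 0.40 (95% CI: 0.21-0.76).
beta, se = log_or_and_se(0.40, 0.21, 0.76)
low, high = wald_ci(beta, se)
print(round(beta, 3), round(se, 3), round(low, 2), round(high, 2))
```

The rebuilt interval agrees with the reported one to two decimal places, which is consistent with an interval that is symmetric on the log-odds scale.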
Timepoint/alpha diversity metric (range) | Continuous: OR (95% CI) | Tertile 1: N | Tertile 1: OR (95% CI) | Tertile 2: N | Tertile 2: OR (95% CI) | Tertile 3: N | Tertile 3: OR (95% CI) | Ptrendb |
---|---|---|---|---|---|---|---|---|
Baseline (N = 316) | ||||||||
Shannon index (1.44, 6.95) | 0.93 (0.73–1.19) | 109 | 1.00 | 101 | 0.87 (0.48–1.57) | 106 | 0.85 (0.46–1.58) | 0.58 |
Observed ASVs (22.00, 362.00) | 1.00 (0.99–1.00) | 114 | 1.00 | 101 | 0.91 (0.50–1.64) | 101 | 0.68 (0.36–1.25) | 0.22 |
Faith PD (6.09, 33.81) | 1.01 (0.96–1.07) | 106 | 1.00 | 109 | 1.18 (0.66–2.11) | 101 | 0.91 (0.49–1.66) | 0.75 |
Year 1 (N = 354) | ||||||||
Shannon index (0.29, 6.82) | 0.98 (0.78–1.23) | 111 | 1.00 | 129 | 1.38 (0.80–2.41) | 114 | 1.06 (0.59–1.93) | 0.68 |
Observed ASVs (13.00, 348.00) | 1.00 (0.99–1.00) | 119 | 1.00 | 122 | 0.96 (0.55–1.67) | 113 | 0.86 (0.48–1.52) | 0.60 |
Faith PD (2.97, 36.37) | 1.03 (0.98–1.08) | 108 | 1.00 | 125 | 1.45 (0.82–2.57) | 121 | 1.24 (0.70–2.22) | 0.51 |
Year 4 (N = 294) | ||||||||
Shannon index (0.74, 7.25) | 0.80 (0.62–1.01) | 104 | 1.00 | 102 | 0.82 (0.45–1.50) | 88 | 0.40 (0.21–0.76) | 0.01 |
Observed ASVs (9.00, 384.00) | 1.00 (0.99–1.00) | 98 | 1.00 | 103 | 0.88 (0.49–1.60) | 93 | 0.56 (0.29–1.05) | 0.07 |
Faith PD (2.93, 34.39) | 0.96 (0.91–1.02) | 106 | 1.00 | 82 | 0.49 (0.25–0.94) | 106 | 0.67 (0.36–1.23) | 0.22 |
Trajectoryc (N = 202) | ||||||||
Shannon index (−0.88, 1.16) | 0.73 (0.20–2.73) | 71 | 1.00 | 57 | 0.80 (0.36–1.78) | 74 | 0.86 (0.39–1.88) | 0.71 |
Observed ASVs (−0.96, 2.09) | 0.79 (0.30–2.06) | 70 | 1.00 | 63 | 0.54 (0.24–1.21) | 69 | 0.78 (0.34–1.76) | 0.61 |
Faith PD (−0.86, 0.79) | 0.45 (0.09–2.17) | 64 | 1.00 | 76 | 1.09 (0.52–2.32) | 62 | 0.96 (0.42–2.19) | 0.93 |
Abbreviations: ASV, amplicon sequence variant; CI, confidence interval; OR, odds ratio; PD, phylogenetic diversity.
aCovariates in multivariable logistic regression models included: age at randomization, sex, randomization group (intervention or control arm), body mass index (kg/m2) at baseline, adenoma characteristics at baseline [high-risk adenomas (i.e., lesions with a maximal diameter of ≥ 1 cm, ≥ 25% villous elements, evidence of high-grade dysplasia, including carcinoma or ≥ 2 adenomas) or early adenoma], adenoma/polyp characteristics at year 1 (no adenoma/polyps, high-risk adenomas, early adenoma, or hyperplastic polyp), education level (post-college graduate or ≤ college graduate), center (Memorial Sloan Kettering/University of Buffalo/University of Pittsburgh/Edward Hines Jr. Hospital, Wake Forest University/Walter Reed Medical Center, Kaiser Foundation Research Institute, or University of Utah), smoking status at baseline (current, former or never regular), family history of colorectal cancer (yes or no), regular aspirin or NSAID use (yes or no), and days between colonoscopy and biopsy at the pertinent timepoint (days were set to zero if biopsy collected before colonoscopy); patients with missing data were excluded from model.
bAfter correcting for multiple testing via Bonferroni correction, significant P values were considered to be P < 0.02.
cTrajectory calculated as [αY4 − mean (αBaseline, αY1)]/mean (αBaseline, αY1); if participant was missing either baseline or year 1 measurements, the nonmissing measurement was used; if participant was missing year 4 measurements or both baseline and year 1 measurements they were excluded from the trajectory analysis.
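The trajectory definition in footnote c can be sketched in a few lines of Python. This is a minimal illustration of the stated formula and missing-data rules, not the authors' analysis code; the function name `alpha_trajectory` and the use of `None` to mark a missing measurement are assumptions made for the example.

```python
from statistics import mean
from typing import Optional


def alpha_trajectory(
    baseline: Optional[float],
    year1: Optional[float],
    year4: Optional[float],
) -> Optional[float]:
    """Relative change in an alpha diversity metric from the early
    timepoints (baseline, year 1) to year 4, following footnote c.

    Returns None when the participant would be excluded from the
    trajectory analysis. None marks a missing measurement (an
    assumption of this sketch).
    """
    # Excluded if the year 4 measurement is missing.
    if year4 is None:
        return None
    # Use whichever early measurements are available.
    early = [value for value in (baseline, year1) if value is not None]
    # Excluded if both baseline and year 1 are missing.
    if not early:
        return None
    reference = mean(early)
    return (year4 - reference) / reference


# Hypothetical Shannon index values, for illustration only.
print(alpha_trajectory(4.0, 6.0, 5.5))   # both early measurements present
print(alpha_trajectory(None, 4.0, 3.0))  # baseline missing, year 1 used alone
print(alpha_trajectory(4.0, 6.0, None))  # year 4 missing, excluded
```

A negative value indicates that alpha diversity at year 4 was lower than the participant's early average, and a positive value indicates that it was higher.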
We further investigated cross-sectional associations of alpha diversity with prevalence of adenoma characteristics at baseline and at year 1 (Supplementary Table S4). At baseline, alpha diversity was strongly inversely associated with prevalent advanced and/or multiple adenomas (vs. early adenomas). For example, comparing individuals in the highest and lowest tertiles of Faith PD, the OR (95% CI) was 0.49 (0.28, 0.86; Ptrend = 0.01). At year 1, considering those with no adenomas as the reference group, the associations of alpha diversity with adenoma prevalence characteristics were variable and unstable.
In MiRKAT tests assessing multivariate differences in overall microbiome composition (Table 4), no consistent, statistically significant beta diversity associations were observed at any of the timepoints.
Timepoint | Bray–Curtis P | Unweighted UniFrac P | Weighted UniFrac P | Omnibus Pb |
---|---|---|---|---|
Baseline | 0.43 | 0.09 | 0.05 | 0.12 |
Year 1 | 0.88 | 0.03 | 0.41 | 0.08 |
Year 4 | 0.71 | 0.60 | 0.58 | 0.86 |
Abbreviation: MiRKAT, Microbiome regression-based analysis tests.
aMiRKAT models adjusted for age, sex, and intervention arm (intervention or control arm).
bAfter correcting for multiple testing via Bonferroni correction, significant P values were considered to be P < 0.02.
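As a rough check on the threshold in footnote b, the sketch below shows how a Bonferroni-corrected cutoff of about 0.02 could arise. The number of tests is not stated in this section; three tests per family (for example, three distance matrices or three alpha diversity metrics) and a family-wise alpha of 0.05 are assumptions chosen because they reproduce the reported value after rounding.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumed values: family-wise alpha of 0.05 and three tests per family.
threshold = bonferroni_threshold(0.05, 3)
print(round(threshold, 4))  # 0.0167, which rounds to the reported 0.02
```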
Associations of bacteria with adenoma recurrence
Associations of relative abundance and presence of a priori selected bacteria at year 4 with adenoma recurrence are presented in Fig. 2 (see associations at baseline and year 1 in Supplementary Fig. S2; see ORs and 95% CIs in Supplementary Table S5). Among the a priori selected bacteria, relative abundance of Bacteroides was slightly more strongly, positively associated with adenoma prevalence cross-sectionally than prospectively [OR (95% CI) per 1% increase in year 4 Bacteroides abundance = 1.03 (1.00–1.05); P = 0.03]. No other a priori selected bacteria were statistically significantly associated with adenoma recurrence.
Among the 85 genera included for exploratory relative abundance analyses and 144 taxa for exploratory presence/absence analyses (Supplementary Table S6), no bacteria at any timepoint were statistically significantly associated with adenoma recurrence at the Bonferroni threshold. However, at nominal significance, baseline presence of Lachnospiraceae uncultured genus level group-001 was inversely associated with adenoma recurrence [OR (95% CI) = 0.48 (0.28–0.80); P = 0.01]; year 1 presence of Corynebacterium 1 and of Staphylococcus were each positively associated with adenoma recurrence [ORs (95% CIs) = 1.81 (1.13–2.90) and 1.89 (1.16–3.10), respectively; both P = 0.01]; and year 4 Akkermansia presence was inversely associated with adenoma prevalence [OR (95% CI) = 0.47 (0.27–0.83); P = 0.01].
Discussion
In this 4-year prospective study of 455 individuals with a history of colorectal adenomas, microbial profiles of rectal tissue samples collected at baseline or within the first year were not prospectively associated with adenoma recurrence. However, microbial profiles of rectal tissue collected at the fourth year, primarily alpha diversity, were cross-sectionally associated with adenoma prevalence. This suggests that microbial characteristics of the rectal tissue may not reflect adenoma recurrence risk but may instead result from physiologic changes that follow the development of an adenoma sometime between year 1 and years 3 and/or 4. In addition, we found that the rectal tissue microbiome was not stable over time, and that most of the variability in the microbial communities was due to interindividual differences and not to other measured factors, including year of collection, adenoma status, or time between biopsy and colonoscopy.
We found that rectal tissue alpha diversity was inversely associated, and Bacteroides abundance positively associated, with prevalent adenoma but that beta diversity was not associated with prevalent adenoma. One previous study investigated the association of the rectal tissue microbiome with prevalent adenoma among 33 adenoma cases and 38 adenoma-free controls undergoing colonoscopy screening at the University of North Carolina. Using 454 pyrosequencing of the V1-V2 region of the 16S rRNA gene, that study found, in contrast to our findings, that microbial richness was higher in cases than in controls, with no observed differences in evenness. In the principal component analysis, statistically significant clustering by case status was observed. At the genus level, the relative abundance of 30 genera was higher in adenoma cases than in controls, including Akkermansia, while the relative abundance of one genus (Streptococcus) was higher in controls (21). The differences between the findings of that study and ours may have arisen, at least in part, from different timing of the rectal biopsy collection, different sequencing technologies and 16S rRNA gene target regions, or population differences.
Although no studies have investigated the prospective association of the gut microbiome with adenoma, a number of studies have investigated cross-sectional associations, primarily using fecal samples (22). In the largest of these previous studies, a total of 144 colorectal adenoma cases, 40 hyperplastic polyp cases, 33 sessile serrated adenoma cases, and 323 controls from Minnesota and New York provided fecal samples for 16S rRNA gene sequencing analysis. In comparisons with polyp-free controls, alpha diversity was inversely associated with colorectal adenoma, similar to our cross-sectional year 4 findings; in contrast, alpha diversity was slightly positively associated with hyperplastic polyps. Consistent with our beta diversity findings, colorectal adenoma and/or polyp cases did not appear to have significantly different bacterial communities compared with controls. In a differential abundance analysis, 25 OTUs were statistically significantly associated with prevalent adenoma, including multiple OTUs in the order Clostridiales (inversely associated) and Streptococcus (positively associated; ref. 23). It is not surprising that our results differ from those observed using fecal samples, as rectal tissue biopsies appear to have distinct microbial communities compared with fecal samples (24, 25).
We found that stability of the rectal tissue microbiome in our study was generally low over 4 years. No other studies have assessed stability of the rectal tissue microbiome; however, multiple studies found the stability of the fecal microbiome among relatively healthy individuals to be fairly high over periods of 6 months to 2 years (26–28). It is possible that the relatively low ICCs observed in our study were due to having an adenoma removed within 6 months prior to baseline or to repeated colonoscopies and bowel preparation, because bowel preparation has been shown to modify the fecal and colon tissue microbiome. However, most studies suggest that the microbiome is restored by 2 weeks after bowel preparation, and we found that alpha and beta diversity did not substantially vary between biopsies collected before and after colonoscopy (29–31). In addition, we were unable to identify factors strongly related to the microbial community variability in this population, which is common in many fecal microbiome studies (26, 32). Interestingly, in a study of 7,009 individuals from 14 districts within one province in China, the factor most strongly associated with fecal microbiome composition, outside of interindividual variability, was host location, which explained approximately 6% of the community variation (32). We found that, other than interindividual variability, the study center explained the most variability in rectal tissue bacterial community composition, which may be due to geographic differences or differences in tissue collection and/or processing.
This study has several strengths. It is the first study to collect rectal tissue samples longitudinally at multiple timepoints over 4 years. With this study design, it was possible to investigate not only cross-sectional but also prospective associations of the rectal tissue microbiome with recurrent adenoma. In addition, because all participants underwent an additional colonoscopy a year after the removal of baseline polyps, our recurrent adenoma category likely represents true recurrence rather than previously missed polyps. However, this study also has some limitations. The microbiome of rectal tissue may differ from that of colon tissue (where most adenomas were located) and likely differs from that of fecal samples, which have been more frequently studied in relation to adenomas and colorectal cancer. Nevertheless, rectal biopsies may be a useful specimen for studying the gut microbiome, and some research indicates that the microbial communities are relatively homogeneous across the colon and rectum (33–35), though the role of the gut microbiome in the etiology of colorectal neoplasms could differ by location (10, 36, 37). Tissue specimens have a relatively low bacterial biomass, so contamination may be of concern. In particular, the center of collection explained the greatest variability in the beta diversity matrices (ranging from 2.74% to 6.78%). We also did not have information on recent antibiotic use, an important modifier of the gut microbiome. These tissue samples were stored frozen for many years prior to DNA extraction, so it is possible that the bacterial communities changed over time. However, all samples and data were collected using the same procedure and without regard to adenoma recurrence status, so any contamination, missing data, and community changes due to storage are likely to be nondifferential, potentially attenuating our findings.
Because the power to detect statistically significant associations depends in part on the temporal stability of the microbial metrics (15, 28), and stability was low in this study, our study was likely underpowered to detect weak or moderate associations. Furthermore, we investigated associations of a number of microbiome metrics, and although we adjusted for multiple comparisons, there may still be concerns about chance findings. Finally, our findings may not apply to the general population because all participants had a history of adenomas. Future prospective studies with larger sample sizes and multiple biospecimens would be useful to investigate microbiome associations with incident and recurrent colorectal adenomas.
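The link between temporal stability and power can be illustrated with the standard regression dilution approximation, in which the observed log odds ratio is roughly the true log odds ratio multiplied by the reliability of the exposure measurement (approximated here by the ICC). The sketch below assumes a classical measurement-error model, which this study did not itself fit; the function name and all numeric inputs are hypothetical and are not values reported in this study.

```python
import math


def attenuated_odds_ratio(true_or: float, icc: float) -> float:
    """Approximate odds ratio expected from a single measurement of an
    exposure with reliability `icc`, under classical measurement error.

    Uses the regression dilution approximation:
        log(OR_observed) ~= icc * log(OR_true)
    """
    if not 0.0 < icc <= 1.0:
        raise ValueError("icc must be in the interval (0, 1]")
    if true_or <= 0.0:
        raise ValueError("true_or must be positive")
    return math.exp(icc * math.log(true_or))


# Hypothetical inputs: a true OR of 0.50 and a range of ICC values.
for icc in (1.0, 0.6, 0.3):
    print(icc, round(attenuated_odds_ratio(0.50, icc), 2))
```

Under these assumptions, lower ICCs pull the expected odds ratio toward 1.00, so a given sample size has less power to detect the same underlying association.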
In conclusion, we found that the rectal tissue microbiome may be more strongly associated with recurrent adenomas cross-sectionally than prospectively. These findings need replication in independent populations. Given the present lack of prospective studies of the microbiome and colorectal neoplasms, more prospective studies in humans are needed to understand whether the associations observed in prior case–control studies are related to the etiology of cancer or are due to reverse causation.
Authors' Disclosures
No disclosures were reported.
Authors' Contributions
D.A. Byrd: Conceptualization, data curation, formal analysis, visualization, methodology, writing–original draft, writing–review and editing. E. Vogtmann: Conceptualization, data curation, formal analysis, methodology, writing–original draft, writing–review and editing. A.M. Ortega-Villa: Formal analysis, methodology, writing–review and editing. Y. Wan: Formal analysis, methodology, writing–review and editing. M. Gomez: Data curation, validation, visualization, writing–review and editing. S. Hogue: Data curation, validation, writing–review and editing. A. Warner: Investigation, methodology, writing–review and editing. B. Zhu: Formal analysis, methodology, writing–review and editing. C. Dagnall: Investigation, methodology, writing–review and editing. K. Jones: Investigation, methodology, writing–review and editing. B. Hicks: Investigation, methodology, writing–review and editing. P.S. Albert: Formal analysis, methodology, writing–review and editing. G. Murphy: Conceptualization, formal analysis, writing–original draft, writing–review and editing. R. Sinha: Conceptualization, formal analysis, writing–original draft, writing–review and editing.
Acknowledgments
We thank the PPT Study Group and study participants for their contribution to this project.
This study was supported by funding from the Intramural Research Program of the NCI at the NIH. This project has been funded in whole or in part with Federal funds from the NCI, NIH, under contract no. 75N91019D00024. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.
The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Note: Supplementary data for this article are available at Cancer Epidemiology, Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).