Background:

The link between modifiable breast cancer risk factors and tumor genomic alterations remains largely unexplored. We evaluated the association of prediagnostic body mass index (BMI), cigarette smoking, and alcohol consumption with somatic copy number variation (SCNV), total somatic mutation burden (TSMB), seven single base substitution (SBS) signatures (SBS1, SBS2, SBS3, SBS5, SBS13, SBS29, and SBS30), and nine driver mutations (CDH1, GATA3, KMT2C, MAP2K4, MAP3K1, NCOR1, PIK3CA, RUNX1, and TP53) in a subset of The Cancer Genome Atlas (TCGA).

Methods:

Clinical and genomic data were retrieved from the TCGA database. Risk factor information was collected from four TCGA sites (n = 219 women), including BMI (1 year before diagnosis), cigarette smoking (smokers/nonsmokers), and alcohol consumption (current drinkers/nondrinkers). Multivariable regression analyses were conducted in all tumors and stratified according to estrogen receptor (ER) status.

Results:

Increasing BMI was associated with increasing SCNV in all women (P = 0.039) and among women with ER− tumors (P = 0.031). Smokers had higher SCNV and TSMB versus nonsmokers (P < 0.05 all women). Alcohol drinkers had higher SCNV versus nondrinkers (P < 0.05 all women and among women with ER+ tumors). SBS3 (defective homologous recombination-based repair) was exclusively found in alcohol drinkers with ER− disease. GATA3 mutation was more likely to occur in women with higher BMI. No association was significant after multiple testing correction.

Conclusions:

This study provides preliminary evidence that BMI, cigarette smoking, and alcohol consumption can influence breast tumor biology, in particular, DNA alterations.

Impact:

This study demonstrates a link between modifiable breast cancer risk factors and tumor genomic alterations.

Breast cancer is the most common cancer diagnosed in women in the United States (1). Understanding the molecular effects of epidemiological risk factors can provide insights into breast cancer etiology and progression, and may lead to better prevention efforts and treatment guidelines. Breast cancer risk factors can be classified as nonmodifiable [e.g., family history (2) and previous diagnosis of benign breast disease (3)] or modifiable [e.g., being overweight (4), low physical activity (5), high alcohol consumption (6), and cigarette smoking (7, 8)].

Using gene expression data, we previously evaluated the molecular influence of adiposity and alcohol consumption on breast cancer biology (9, 10). Among postmenopausal women, we found a positive association between body mass index (BMI) and cellular proliferation pathways in estrogen receptor positive (ER+) breast tumors, whereas there were positive associations with inflammation pathways in women with ER− disease (9). Breast tumors from women who consumed more than 10 g of alcohol per day displayed increased cellular proliferation compared with nondrinkers (10). Our work may partly explain why women with ER+ disease and high BMI have poorer prognosis (11, 12), and provide new insights into alcohol-related breast tumorigenesis. Some studies have reported the association of breast cancer risk factors with breast tumor genomic alterations. Germline BRCA1/2 mutation, a hereditary breast cancer risk factor, is associated with increased genomic instability (13). TP53 somatic mutations are more frequent in breast tumors of early parous women and current smokers (14, 15). However, the link between modifiable breast cancer risk factors and tumor genomic alterations remains largely unexplored.

The Cancer Genome Atlas (TCGA) is a large-scale effort by the NCI and National Human Genome Research Institute to understand the molecular basis of cancers (16). To date, over 1,200 breast tumors have been characterized in TCGA using genomic, transcriptomic, and proteomic profiling technologies (17). We obtained breast cancer risk factor information from a subset of 219 TCGA women. In this study, we evaluated the association of modifiable breast cancer risk factors with (i) somatic copy number variation (SCNV), (ii) total somatic mutation burden (TSMB), (iii) single base substitution (SBS) signatures, and (iv) breast cancer driver mutations.

Study population

The Committee on the Use of Human Subjects in Research at Brigham and Women's Hospital, Boston, MA, reviewed and approved this study (Protocol No.: 2010P001641). Breast cancer risk factor data were sought from participants contributed by collaborators at four of the largest TCGA sites [the University of Pittsburgh (UP), Roswell Park Comprehensive Cancer Center (RPCCC), the Mayo Clinic (MC), and Memorial Sloan Kettering Cancer Center (MSKCC)] as discussed previously (9, 10). Data were ultimately collected from 229 participants. Ten patients were subsequently excluded (1 male, 8 females redacted by TCGA, and 1 female with stage IV disease), resulting in 219 female breast cancer cases for this study.

Clinical and epidemiologic variables

Age at diagnosis, year of diagnosis, race, disease stage, tumor grade, menopausal status, and IHC results for ER, progesterone receptor (PR), and HER2 were retrieved from the TCGA clinical database. RPCCC and UP (53.4%) collected breast cancer risk data using a de novo self-administered questionnaire. Data from MC were previously collected as part of a case–control study (26.0%). Data from MSKCC were abstracted from patient medical records (20.5%). Three modifiable breast cancer risk factors (main exposures) were investigated in this study: BMI (kg/m2; one year before diagnosis), prediagnostic cigarette smoking (smokers/nonsmokers), and prediagnostic alcohol consumption (current drinkers/nondrinkers; Table 1).

Table 1.

Demographic and lifestyle characteristics of 219 TCGA women in this study.

| Characteristic | Category | n (%) |
| --- | --- | --- |
| Clinical characteristics | | |
| TCGA site | MC | 57 (26.0) |
| | MSKCC | 45 (20.5) |
| | UP | 44 (20.1) |
| | RPCCC | 73 (33.3) |
| Age at initial diagnosis | <45 years | 45 (20.5) |
| | ≥45 to <55 years | 69 (31.5) |
| | ≥55 to <65 years | 53 (24.2) |
| | ≥65 years | 52 (23.7) |
| Disease stage | I | 50 (22.8) |
| | II | 130 (59.4) |
| | III | 39 (17.8) |
| Tumor grade | 1 | 37 (16.9) |
| | 2 | 63 (28.8) |
| | 3 | 47 (21.5) |
| | Missing | 72 (32.9) |
| Race | White | 192 (87.7) |
| | Non-White | 20 (9.1) |
| | Missing | 7 (3.2) |
| Estrogen receptor status | Positive | 179 (81.7) |
| | Negative | 40 (18.3) |
| Progesterone receptor status | Positive | 158 (72.1) |
| | Negative | 61 (27.9) |
| HER2 status | Positive | 28 (12.8) |
| | Negative | 186 (84.9) |
| | Missing | 5 (2.3) |
| Modifiable risk factors | | |
| BMI | <25 kg/m2 | 71 (32.4) |
| | ≥25 to <30 kg/m2 | 53 (24.2) |
| | ≥30 kg/m2 | 77 (35.2) |
| | Missing | 18 (8.2) |
| Prediagnostic cigarette smoking | Smoker | 81 (37.0) |
| | Nonsmoker | 104 (47.5) |
| | Missing | 34 (15.5) |
| Prediagnostic alcohol consumption | Current drinker | 129 (58.9) |
| | Nondrinker | 46 (21.0) |
| | Missing | 44 (20.1) |

BMI data were available for 43 of 44 (97.7%) UP participants, 62 of 73 (84.9%) RPCCC participants, 42 of 46 (91.3%) MSKCC participants, and 55 of 57 (96.5%) MC participants. For RPCCC and UP participants, BMI was calculated using self-reported weight at 1 year prior to breast cancer diagnosis and present height (without shoes) from the questionnaire. For MSKCC participants, BMI was calculated using the weight (up to 1 year prior to breast cancer diagnosis) and height extracted from medical records. For MC participants, BMI was extracted from their previous case–control study database.

Smoking data were available for 44 of 44 (100%) UP participants, 63 of 73 (86.3%) RPCCC participants, 46 of 46 (100%) MSKCC participants, and 33 of 57 (57.9%) MC participants. At RPCCC and UP, participants self-reported their smoking status on the questionnaire as never smoked, quit before diagnosis, or smoking at diagnosis. MSKCC abstracted smoking information from medical records as never smoked, quit before diagnosis, or smoking at diagnosis. For MC, we abstracted smoking information from their case–control study database: smoking up to 1 year prior to diagnosis or smoking currently at diagnosis. In total, 81 women reported smoking: 69 (85.2%) were past smokers, 11 (13.6%) were current smokers, and 1 (1.2%) reported smoking but her current status was unknown. Cigarette smoking exposure was recategorized into smokers (n = 81) and nonsmokers (n = 104).

Alcohol data were available for 44 of 44 (100%) UP participants, 62 of 73 (84.9%) RPCCC participants, 36 of 46 (78.3%) MSKCC participants, and 33 of 57 (57.9%) MC participants. For RPCCC and UP, patients selected one out of seven categories on the questionnaire that reflected the average number of alcohol drinks they consumed at ages 18, 30, 45, and 60: none, <1 drink per week, 1 to 6 drinks per week, 1 drink per day, 2 to 3 drinks per day, >3 drinks per day, and not applicable. MSKCC abstracted alcohol consumption up to the year before breast cancer diagnosis from medical records. Alcohol data from MC were the total number of alcohol drinks consumed in the year before breast cancer diagnosis and recorded as a continuous number ranging from 0 to 14. Data from MC were recoded to match the categories in our questionnaire: 0 remained coded as none (n = 5), 0.5 drinks was recoded as <1 drink per week (n = 5), 1 to 5 drinks were recoded as 1 to 6 drinks per week (n = 18), 7 drinks were recoded as 1 drink per day (n = 2), 8 to 14 drinks were recoded as 2 to 3 drinks per day (n = 3), and 24 patients were coded as missing. In this manuscript, alcohol exposure was defined as the alcohol intake closest to but before diagnosis of breast cancer. The total numbers collected in the seven categories were: none (n = 46; 21.0%), <1 drink per week (n = 54; 24.7%), 1 to 6 drinks per week (n = 57; 26.0%), 1 drink per day (n = 10; 4.6%), 2 to 3 drinks per day (n = 6; 2.7%), >3 drinks per day (n = 2; 0.9%), and unknown (n = 44; 20.1%). Alcohol exposure was also recategorized into current drinkers (n = 129) and nondrinkers (n = 46).
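
The MC recoding described above can be sketched as a small lookup. This is an illustrative sketch only: the function names are hypothetical, and values that the text does not map (for example, 6 drinks) are left unclassified here rather than guessed.

```python
from typing import Optional


def recode_mc_drinks(drinks: Optional[float]) -> str:
    """Map the MC continuous drink count onto the questionnaire categories."""
    if drinks is None:
        return "missing"
    if drinks == 0:
        return "none"
    if drinks == 0.5:
        return "<1 drink per week"
    if 1 <= drinks <= 5:
        return "1 to 6 drinks per week"
    if drinks == 7:
        return "1 drink per day"
    if 8 <= drinks <= 14:
        return "2 to 3 drinks per day"
    # Assumption: values not covered by the text are not forced into a category.
    return "unclassified"


def is_current_drinker(category: str) -> Optional[bool]:
    """Collapse a category into current drinker (True) or nondrinker (False)."""
    if category in ("missing", "unclassified", "not applicable", "unknown"):
        return None
    return category != "none"


for value in (0, 0.5, 3, 7, 12, None):
    category = recode_mc_drinks(value)
    print(value, "->", category, "| current drinker:", is_current_drinker(category))
```

Collapsing every category other than "none" into current drinkers matches the totals reported above (54 + 57 + 10 + 6 + 2 = 129 current drinkers and 46 nondrinkers).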

Tumor genomic data

Tumor genomic data were downloaded from TCGA (http://cancergenome.nih.gov/): normalized SCNV (version 2024; mapped using GRCh38; Affymetrix Genome-Wide Human SNP Array 6.0) and somatic mutations (version gdc-1.0.0; derived from Illumina Genome Analyzer II whole-exome sequencing and MuTect somatic mutation calls; ref. 18). Of 219 participants, 218 had SCNV data and 192 had somatic mutation data. For SCNV, chromosomal segments that were quantitated by <10 probes were excluded and a cutoff of 0.2 was used to indicate a gain or loss segment (19); total SCNV was obtained by summing all the segments that passed the cutoff (i.e., absolute value ≥0.2). TSMB was calculated by summing all significant synonymous and nonsynonymous mutations for each case. SCNV and TSMB data were winsorized to 96%: data below the second percentile were set to the value at the second percentile and data above the 98th percentile were set to the value at the 98th percentile.
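
The SCNV tally and the 96% winsorization can be illustrated as follows. This is a minimal sketch under stated assumptions: each segment is represented as a hypothetical (number of probes, segment value) pair, "summing" is read as counting the segments that pass the cutoff (consistent with the mean counts reported in the Results), and percentiles use linear interpolation because the text does not name a percentile method.

```python
import math


def total_scnv(segments, min_probes=10, cutoff=0.2):
    """Count segments with at least min_probes probes and |value| >= cutoff."""
    return sum(
        1
        for n_probes, value in segments
        if n_probes >= min_probes and abs(value) >= cutoff
    )


def percentile(sorted_vals, p):
    """Percentile by linear interpolation (assumed method, not stated in the text)."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def winsorize(values, lower=2.0, upper=98.0):
    """Clamp values to the [lower, upper] percentile range (96% winsorization)."""
    s = sorted(values)
    lo, hi = percentile(s, lower), percentile(s, upper)
    return [min(max(v, lo), hi) for v in values]


# Toy segments: (probe count, segment value); not real TCGA data.
toy_segments = [(25, 0.35), (8, 0.90), (40, -0.05), (12, -0.20)]
print("total SCNV:", total_scnv(toy_segments))

toy_counts = list(range(1, 102))
clamped = winsorize(toy_counts)
print("range after winsorizing:", min(clamped), max(clamped))
```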

Twenty-five distinct SBS signatures derived using TCGA whole-exome sequencing data were available for 186 breast cancer cases (20–22). SBS signatures were derived by applying a previously established methodology (23) to the MC3 release of TCGA somatic mutations (24). Briefly, the activity of each of the consensus COSMICv3 SBS signatures (25) was quantified in each of the examined TCGA breast cancer samples. For each SBS signature, every participant is represented by a score that reflects the number of mutations per megabase attributed to that signature. The scores for each signature were winsorized to 96%. To reduce spurious findings, only the seven SBS signatures with at least 15 cases with nonzero scores were analyzed. The seven SBS signatures investigated were: SBS1 (cell-division/mitotic clock), SBS2 (hyperactivity of AID/APOBEC enzymes), SBS3 (defective homologous recombination-based DNA repair), SBS5 (ERCC2 mutations/cigarette smoking), SBS13 (similar to SBS2), SBS29 (putatively attributed to tobacco chewing in oral cancer), and SBS30 (deficiency in base excision repair due to inactivating mutations in NTHL1).
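
The signature filter (at least 15 cases with nonzero scores) and the cube root transformation used in the statistical analyses can be sketched as below. The signature names and scores are placeholders chosen for illustration, not study data, and the function names are hypothetical.

```python
import math


def cube_root(x):
    """Real cube root; copysign keeps the sign for negative inputs."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def select_signatures(scores, min_nonzero=15):
    """Keep signatures with at least min_nonzero cases that have nonzero scores."""
    return {
        name: values
        for name, values in scores.items()
        if sum(1 for v in values if v != 0) >= min_nonzero
    }


# Placeholder scores (mutations per megabase) for two made-up signatures.
toy_scores = {
    "SIG_A": [0.5] * 15 + [0.0] * 5,   # 15 nonzero cases: retained
    "SIG_B": [0.5] * 14 + [0.0] * 6,   # 14 nonzero cases: dropped
}
kept = select_signatures(toy_scores)
transformed = {name: [cube_root(v) for v in values] for name, values in kept.items()}
print(sorted(transformed))
```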

Forty-three somatic mutations implicated in breast cancer (i.e., driver genes) were first identified using Mutation Significance version 2 (26) and Genomic Identification of Significant Targets in Cancer (27): AFF2, AKT1, ARID1A, BRCA1, BRCA2, CASP8, CBFB, CDH1, CDKN1B, CTCF, CUL4B, ERBB2, FOXA1, GATA3, GPRIN2, GPS2, HIST1H3B, KMT2A, KMT2C, KRAS, MAP2K4, MAP3K1, MED23, MYB, NBL1, NCOR1, NF1, PIK3CA, PIK3R1, PTEN, PTPN22, PTPRD, RAB40A, RB1, RUNX1, SF3B1, SPEN, STAG2, TBL1XR1, TBX3, TMEM151B, TP53, and ZFP36L1 (28, 29). To reduce spurious findings, analyses were only performed in nine mutations that occurred in at least 10 cases for each exposure: CDH1, GATA3, KMT2C, MAP2K4, MAP3K1, NCOR1, PIK3CA, RUNX1, and TP53 (Supplementary Table S1).

Statistical analyses

SCNV and TSMB were log10 transformed. SBS scores were cube root transformed. The Mann–Whitney test, Kruskal–Wallis test, Fisher test, and Spearman rho were used to evaluate the relationships between clinical characteristics (i.e., age and year of diagnosis, race, ER/PR/HER2, grade, and stage) and SCNV, TSMB, or somatic mutations. To evaluate the relationship between the exposures (BMI, cigarette smoking, and alcohol intake) and SCNV, TSMB, or an SBS signature, multivariable linear regression models were run, adjusted for the following clinical characteristics: age and year of diagnosis (model 1); age and year of diagnosis, ER/PR/HER2, grade, stage, and TCGA site (model 2). Secondary analyses stratified the women by ER status. BMI (per 5 kg/m2 increase) was analyzed as a continuous variable; cigarette smoking and alcohol consumption were analyzed as categorical variables.
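
As an illustration of the model 1 structure (outcome regressed on the exposure plus age and year of diagnosis), the sketch below fits an ordinary least-squares model with the normal equations. It is not the R code used in this study: the data are synthetic, the year is centered at 2008 as an assumption for numerical stability, and standard errors and P values are omitted.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.

    X is a list of rows (with a leading 1.0 for the intercept); y is a list.
    """
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Synthetic rows: (BMI / 5, age at diagnosis, year of diagnosis - 2008).
rows = [(4.4, 45, -3), (5.0, 62, 0), (6.2, 51, 2),
        (5.6, 70, -1), (7.0, 58, 3), (4.8, 49, 1)]
X = [[1.0, bmi5, age, year] for bmi5, age, year in rows]
# Noise-free synthetic outcome standing in for log10 SCNV.
y = [2.0 + 0.08 * bmi5 + 0.01 * age - 0.02 * year for bmi5, age, year in rows]

coef = ols(X, y)
print("intercept, BMI per 5 kg/m2, age, year:", [round(c, 4) for c in coef])
```

Dividing BMI by 5 before fitting makes the BMI coefficient directly interpretable as the change in the outcome per 5 kg/m2 increase, which is how the estimates are reported.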

To evaluate the relationship between each exposure and a driver mutation, we performed multivariable logistic regression, adjusting for two sets of covariates. Model 1 adjusted for age and year of diagnosis as above; model 2 adjusted for age and year of diagnosis, ER/PR, stage, TSMB, and TCGA site. HER2 was not included due to small numbers of HER2+ cases. Secondary analyses were restricted to women with ER+ tumors only. Sensitivity analyses were conducted by restricting to the White population to determine whether the results were influenced by race.
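
Results of the logistic models are reported as odds ratios (OR) with 95% confidence intervals. The sketch below shows the usual conversion from a logistic regression coefficient and its standard error. The Wald interval with z = 1.96 is an assumption, because the text does not state how the intervals were computed, and the numeric inputs are illustrative rather than study estimates.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Illustrative inputs only: a log-odds change per 5 kg/m2 increase in BMI.
or_value, lower, upper = odds_ratio_ci(beta=0.36, se=0.17)
print(f"OR = {or_value:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```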

Data are presented in the text as mean counts ± standard deviation or β estimate ± standard error, unless otherwise stated. Analyses were conducted using R, version 3.4.2. Statistical significance tests were two-sided. Statistical significance was defined as P < 0.05. None of the reported P values were formally adjusted for multiple testing.

These 219 TCGA participants were diagnosed with invasive breast cancer between 2001 and 2011 (median year = 2008). The majority were White postmenopausal women between 45 and 55 years old with predominantly ER+/PR+/HER2− early-stage breast cancer (Table 1). They were also predominantly never smokers and current drinkers, and 35.2% were obese as defined by BMI ≥30. When stratified by TCGA site, patient and tumor characteristics were similar (Supplementary Table S2). Tumor grade was not available for the majority (88.9%) of MSKCC participants. The numbers of UP, MC, and MSKCC participants were similar across the three BMI categories. Most RPCCC participants either had a BMI < 25 (32.9%) or were obese (BMI ≥30; 41.1%). Most of the participants from MSKCC and UP were nonsmokers; there were more smokers among RPCCC participants; and the numbers of smokers and nonsmokers were similar among MC participants. Most of the participants from each site were current drinkers.

First, the association of clinical characteristics (i.e., age and year of diagnosis, race, ER/PR/HER2 status, grade, stage, and TCGA site) with SCNV counts or TSMB was evaluated. Year of diagnosis, ER/PR/HER2 status, and grade were significantly associated with SCNV and/or TSMB (P < 0.05; Supplementary Table S3). On the basis of these results and following the approach of Zhu and colleagues (19), we adjusted for age and year of diagnosis, ER/PR/HER2 status, grade, stage, and TCGA site in the multivariable regression analyses comparing exposures with SCNV or TSMB.

In all women, increasing BMI was significantly associated with increased SCNV after adjusting for age and year of diagnosis, ER/PR/HER2 status, tumor grade, and stage (model 2 P = 0.039; Table 2). Increasing BMI was also significantly associated with increased SCNV among women with ER− tumors after adjusting for age and year of diagnosis (model 1 P = 0.031; Table 2). When further stratified by menopausal status, the association between BMI and SCNV did not change appreciably among all postmenopausal women or among postmenopausal women with ER− tumors. There was no relationship between BMI and SCNV in premenopausal women (Supplementary Table S4).

Table 2.

Multivariable analyses with SCNV and TSMB.

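| Exposure and tumor group | Model | Log10 SCNV: n | Log10 SCNV: Estimate | Log10 SCNV: SE | Log10 SCNV: P value | Log10 TSMB: n | Log10 TSMB: Estimate | Log10 TSMB: SE | Log10 TSMB: P value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BMI per 5 kg/m2 increase | | | | | | | | | |
| All tumors | 1 | 200 | 0.060 | 0.032 | 0.062 | 178 | 0.006 | 0.019 | 0.761 |
| All tumors | 2 | 130 | 0.077 | 0.037 | **0.039** | 117 | 0.013 | 0.020 | 0.522 |
| ER+ tumors | 1 | 163 | 0.039 | 0.033 | 0.245 | 144 | −0.007 | 0.019 | 0.706 |
| ER+ tumors | 2 | 106 | 0.068 | 0.039 | 0.082 | 94 | 0.004 | 0.022 | 0.841 |
| ER− tumors | 1 | 37 | 0.212 | 0.094 | **0.031** | 34 | 0.091 | 0.054 | 0.103 |
| ER− tumors | 2 | 24 | 0.170 | 0.131 | 0.218 | 23 | 0.117 | 0.062 | 0.085 |
| Prediagnostic cigarette smoking | | | | | | | | | |
| All tumors | 1 | 184 | 0.152 | 0.087 | 0.082 | 159 | 0.135 | 0.052 | **0.010** |
| All tumors | 2 | 120 | 0.252 | 0.105 | **0.018** | 107 | 0.145 | 0.061 | **0.020** |
| ER+ tumors | 1 | 148 | 0.125 | 0.095 | 0.192 | 127 | 0.114 | 0.058 | 0.053 |
| ER+ tumors | 2 | 97 | 0.227 | 0.115 | 0.051 | 85 | 0.150 | 0.070 | **0.036** |
| ER− tumors | 1 | 36 | 0.262 | 0.208 | 0.218 | 32 | 0.150 | 0.114 | 0.199 |
| ER− tumors | 2 | 23 | 0.257 | 0.354 | 0.483 | 22 | 0.132 | 0.172 | 0.462 |
| Prediagnostic alcohol consumption | | | | | | | | | |
| All tumors | 1 | 174 | 0.261 | 0.100 | **0.010** | 155 | 0.027 | 0.065 | 0.671 |
| All tumors | 2 | 119 | 0.241 | 0.129 | 0.064 | 107 | 0.032 | 0.077 | 0.680 |
| ER+ tumors | 1 | 141 | 0.230 | 0.108 | **0.035** | 124 | 0.026 | 0.071 | 0.719 |
| ER+ tumors | 2 | 96 | 0.169 | 0.137 | 0.219 | 85 | 0.079 | 0.084 | 0.354 |
| ER− tumors | 1 | 33 | 0.364 | 0.253 | 0.161 | 31 | 0.091 | 0.140 | 0.519 |
| ER− tumors | 2 | 23 | 0.513 | 0.497 | 0.324 | 22 | −0.204 | 0.241 | 0.418 |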

Note: Model 1 adjusted for age and year of diagnosis. Model 2 adjusted for age and year of diagnosis, ER, PR, HER2 status, tumor grade, stage, and study site. Bolded values indicate significance (P < 0.05).
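
Because SCNV and TSMB were log10 transformed, each estimate in Table 2 can be back-transformed to a multiplicative change on the original scale. The sketch below applies the standard reading of a coefficient from a log-transformed outcome (a ratio of geometric means); this interpretation is offered as an aid and is not a result reported by the study. The three estimates are copied from Table 2.

```python
def fold_change(beta_log10):
    """Multiplicative change on the original scale for a log10-scale estimate."""
    return 10 ** beta_log10


# Estimates taken from Table 2 (log10 SCNV).
examples = [
    ("BMI per 5 kg/m2, all tumors, model 2", 0.077),
    ("Smokers vs. nonsmokers, all tumors, model 2", 0.252),
    ("Drinkers vs. nondrinkers, all tumors, model 1", 0.261),
]
for label, beta in examples:
    ratio = fold_change(beta)
    print(f"{label}: ratio = {ratio:.2f} ({(ratio - 1) * 100:.0f}% higher)")
```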

Smokers had higher SCNV compared with nonsmokers (mean counts 160.0 ± 150.7 vs. 147.3 ± 162.5; all women model 2 P = 0.018; Table 2); the magnitude of this association was similar among women with ER+ tumors. After further adjusting for alcohol consumption, the association between cigarette smoking and SCNV for all women did not change appreciably (β = 0.101 ± 0.092, P = 0.274 for model 1; β = 0.208 ± 0.109, P = 0.059 for model 2). Smokers also had higher TSMB compared with nonsmokers (mean counts 109.7 ± 116.9 vs. 73.9 ± 67.3; all women model 1 P = 0.010 and model 2 P = 0.020; Table 2). Likewise, the magnitude of this association was similar among women with ER+ tumors (model 1 P = 0.053 and model 2 P = 0.036). The association between cigarette smoking and TSMB remained significant after further adjusting for alcohol consumption (β = 0.136 ± 0.055, P = 0.014 for model 1; β = 0.145 ± 0.063, P = 0.022 for model 2).

Analyses were repeated by comparing never-smokers (n = 104) and past smokers (n = 69). The results were similar to smokers versus nonsmokers. Past smokers had higher SCNV compared with never smokers (mean counts 159.3 ± 148.0 vs. 147.3 ± 162.5; all women model 1 P = 0.054 and model 2 P = 0.013; women with ER+ tumors model 2 P = 0.030; Supplementary Table S5). Past smokers had higher TSMB compared with never smokers (mean counts 111.7 ± 119.0 vs. 73.9 ± 67.3; all women model 1 P = 0.007 and model 2 P = 0.005; women with ER+ tumors model 2 P = 0.005; Supplementary Table S5).

Alcohol drinkers had higher SCNV compared with nondrinkers (mean counts 158.0 ± 156.3 vs. 118.3 ± 163.6; all women model 1 P = 0.010; Table 2). Higher SCNV was also observed among drinkers with ER+ disease (mean counts 137.1 ± 141.3 vs. 104.0 ± 161.5; model 1 P = 0.035). The association between alcohol and SCNV did not change appreciably after accounting for cigarette smoking in the multivariable models (all women model 1 β = 0.230 ± 0.104, P = 0.029 and model 2 β = 0.177 ± 0.131, P = 0.182; ER+ tumors model 1 β = 0.204 ± 0.113, P = 0.072 and model 2 β = 0.105 ± 0.141, P = 0.457).

Neither BMI nor cigarette smoking was associated with any SBS signature (Supplementary Tables S6 and S7). Alcohol consumption was associated with SBS3 in women with ER− tumors (model 1 β = 2.13 ± 0.66, P = 0.004; model 2 β = 3.45 ± 0.84, P = 0.004; Supplementary Table S8). SBS3 was exclusively detected in alcohol drinkers with ER− tumors: 11/16 (68.8%) drinkers versus 0/8 nondrinkers (Fig. 1). BRCA1/2 mutations were not detected in the ER− tumors of these 16 drinkers.

Figure 1.

Alcohol consumption was associated with SBS signature 3 (SBS3) in women with ER− tumors (model 1: β = 2.13 ± 0.66, P = 0.004; model 2: β = 3.45 ± 0.84, P = 0.004). SBS3 was exclusively detected in alcohol drinkers: 11 of 16 (68.8%) drinkers versus 0 of 8 nondrinkers.

All clinical characteristics, except race and study site, were significantly associated with at least one driver mutation (Supplementary Table S9). Age and year of diagnosis, ER/PR status, stage, TSMB, and TCGA site were accounted for in the multivariable binary logistic regression analyses between each exposure and somatic mutation. Increasing BMI was associated with GATA3 mutation [all women model 2 OR = 1.43, 95% confidence interval (CI), 1.02–2.01; women with ER+ tumors model 2: OR = 1.43, 95% CI, 1.02–2.01; Supplementary Table S10]. The ORs for all women and for women with ER+ tumors were identical because GATA3 mutation was only detected in ER+ tumors (also refer to Supplementary Table S1 for frequency numbers). No driver mutation was associated with cigarette smoking or alcohol consumption.

Sensitivity analyses were conducted for the main findings by restricting the dataset to White participants (Supplementary Table S11). The associations of BMI with GATA3 mutation and of alcohol with SCNV and SBS3 remained significant. The relationships of BMI and smoking with SCNV and TSMB did not change appreciably.

Little is known about the molecular influence of breast cancer risk factors on tumor genomic profiles. Understanding the molecular impact of breast cancer risk factors, particularly modifiable risk factors, can enhance our knowledge of breast cancer etiology and point to new avenues for prevention strategies and treatment. In this pilot subset of 219 TCGA participants, we evaluated the association of three modifiable breast cancer risk factors with tumor genomic alterations. Higher BMI was positively associated with increased SCNV in all women and among women with ER− tumors. Higher BMI was also associated with GATA3 mutation among women with ER+ tumors. Cigarette smoking was positively associated with increased SCNV and TSMB in all women. Current alcohol consumption was positively associated with increased SCNV for all women and among women with ER+ tumors. Current alcohol drinkers with ER− disease exclusively expressed mutations associated with defective homologous recombination-based repair (SBS signature 3). The collective association of these exposures with elevated SCNV suggests that they enhance genomic instability in breast tumors. Our study is the first to provide a preliminary direct molecular link between modifiable risk factors (BMI, cigarette smoking, and alcohol consumption) and breast tumor biology at the DNA level.

The relationship between BMI and breast cancer risk varies according to menopausal and ER status. High BMI is consistently associated with decreased risk of premenopausal breast cancer (both ER+ and ER−) but increased risk of ER+ postmenopausal breast cancers (30–32). One possible mechanism linking high BMI and breast cancer risk is the exposure of breast tissues to high levels of estrogen produced by adipose tissues in overweight/obese women. High estrogen levels can increase cellular proliferation and initiate breast tumorigenesis (33). Indeed, we reported that tumors from postmenopausal women with higher BMI (versus lower BMI) were enriched for cellular proliferation (ER+ and ER− tumors), and IFNα and IFNγ pathways (ER− tumors; ref. 9). Gene networks involved in cellular proliferation were overexpressed in triple-negative breast cancers from premenopausal obese patients (34). GATA3 mutation, frequently observed in ER+ breast tumors, is associated with enhanced tumor growth (35, 36). Our current work suggests that the positive correlations between BMI and tumor proliferation may be in part driven by the presence of GATA3 mutation, and contributes additional evidence that BMI increases genomic instability.

Cigarette smoking is associated with an increased (10%–20%) breast cancer risk, especially among women who started smoking at a young age (37) or smoked for more than 10 years before their first birth (7, 8, 38). Current evidence suggests the association of cigarette smoking and breast cancer risk is limited to ER+ disease, not confounded by alcohol (8, 38), and the risk is proportional to smoking intensity (38). In our study, the observation of increased SCNV and TSMB in breast tumors of smokers (versus nonsmokers), unmodified by alcohol, is in line with tobacco carcinogenesis, in which induced DNA damage leads to misreplication and subsequently increases TSMB (20). We were unable to determine the association between prepregnancy smoking and TSMB as we did not collect adolescence and prepregnancy smoking information. However, our preliminary subanalysis suggests that past smoking versus never smoking was also associated with increased SCNV and TSMB, especially in ER+ tumors. Among our 81 smokers, 37 (45.7%) smoked <1 cigarette pack/day, 25 (30.9%) smoked ≥1 pack/day, and 19 (23.4%) responses were missing. Thus, we were unable to evaluate the link between SCNV or TSMB and smoking intensity in our study due to small numbers. In addition, our small sample size may have reduced our ability to observe an association between cigarette smoking and TP53 mutations as previously reported by Conway and colleagues (15).

Mutational signatures created using SBS, small insertion and deletion, or double base substitution have been characterized across a wide spectrum of human cancers to better understand the diversity of mutational processes underlying cancer development (20–23, 39, 40). Specifically, signature SBS4 represents mutations occurring in epithelial cells directly exposed to cigarette smoke whereas SBS5 represents smoking-associated mutations occurring in cells not directly exposed to cigarette smoke. SBS4 was not detected in any of our samples. SBS5 was not associated with cigarette smoking in breast cancer. These findings are consistent with previous work where smoking-associated mutational signatures were not enriched in breast cancer (20). It could also be speculated that cigarette smoking influences breast carcinogenesis in a different manner, or breast cancer may harbor an undiscovered smoking-specific mutational signature.

Alcohol consumption is a well-established breast cancer risk factor (6, 41, 42). Proposed mechanisms for alcohol-induced breast carcinogenesis include elevated oxidative stress (43–45), DNA damage (46, 47), and estrogen metabolism (48, 49). To the best of our knowledge, this is the first study to demonstrate a relationship between alcohol consumption and genome-wide SCNV in breast tumors. TP53 protects breast tumor cells from alcohol-induced DNA damage in vitro (47). Because we observed mutations related to DNA damage and repair (i.e., SBS3) exclusively in ER− tumors of alcohol drinkers, we would expect ER− tumors to be enriched for TP53 mutations. Indeed, 14 of 15 drinkers and 3 of 7 nondrinkers with ER− disease had TP53 mutations. More studies are warranted to investigate the link between alcohol consumption, TP53 mutation, dysfunctional DNA damage and repair, and ER− breast carcinogenesis to better understand the complex interplay of mechanisms attributed to alcohol consumption in breast tumors.

Limitations of our study include small sample size, especially for ER− and HER2+ disease, thus limiting analyses for dose–response assessments of exposure. We analyzed three exposures and performed subtype analyses by ER and menopausal status. The observed associations may reflect chance findings as no analysis survived multiple hypothesis testing after Bonferroni correction. We could not assess other mutational signatures derived using small insertion and deletion or double base substitution because only 23 of our participants had whole genome sequencing data. Finally, other modifiable risk factors such as physical activity were not collected.
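
The Bonferroni criterion mentioned above can be illustrated with the nominal P values for log10 SCNV in Table 2. Treating these 18 tests as the whole family is an assumption made only for this sketch; the study does not state how many tests were included in its correction, and the true family is likely larger.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold alpha / m and a pass flag for each P value."""
    m = len(p_values)
    threshold = alpha / m
    return threshold, [p < threshold for p in p_values]


# Nominal P values for log10 SCNV, copied from Table 2 (assumed family of 18 tests).
scnv_p = [0.062, 0.039, 0.245, 0.082, 0.031, 0.218,
          0.082, 0.018, 0.192, 0.051, 0.218, 0.483,
          0.010, 0.064, 0.035, 0.219, 0.161, 0.324]

threshold, passed = bonferroni(scnv_p)
print(f"per-test threshold = {threshold:.4f}; tests passing = {sum(passed)}")
```

Even under this small assumed family, no nominal P value falls below the per-test threshold, which agrees with the statement that no association survived correction.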

In conclusion, our study provides preliminary evidence that BMI, cigarette smoking, and alcohol consumption can influence breast tumor biology, in particular DNA alterations. A deeper understanding of the molecular impact of breast cancer risk factors will enhance our knowledge of breast cancer etiology and create opportunities for prevention strategies and treatment. Larger epidemiologic studies are required to confirm these findings.

F.J. Couch reports receiving a commercial research grant from GRAIL and speakers bureau honoraria from Qiagen. No potential conflicts of interest were disclosed by the other authors.

Conception and design: Y.J. Heng, F. Modugno, R.M. Tamimi, P. Kraft

Development of methodology: P. Kraft

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): S.E. Hankinson, L.B. Alexandrov, C.B. Ambrosone, V.P. de Andrade, A.M. Brufsky, F.J. Couch, T.A. King, F. Modugno, C.M. Vachon

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Y.J. Heng, S.E. Hankinson, J. Wang, L.B. Alexandrov, A.M. Brufsky, R.M. Tamimi, P. Kraft

Writing, review, and/or revision of the manuscript: Y.J. Heng, S.E. Hankinson, J. Wang, C.B. Ambrosone, V.P. de Andrade, A.M. Brufsky, F.J. Couch, T.A. King, F. Modugno, C.M. Vachon, A.H. Eliassen, R.M. Tamimi, P. Kraft

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): Y.J. Heng, V.P. de Andrade, F. Modugno

Study supervision: F. Modugno, P. Kraft

Funding for this project was provided by the Klarman Family Foundation (to Y.J. Heng), University of Pittsburgh School of Medicine Dean's Faculty Advancement Award (to F. Modugno), Susan G. Komen SAC110014 (to S.E. Hankinson), and the NIH Support Grant P30 CA016056 to Roswell Park Comprehensive Cancer Center. The data used in this study were in whole or in part based on the data generated by the TCGA Research Network: http://cancergenome.nih.gov/. We thank the TCGA participants and staff at the University of Pittsburgh, Roswell Park Comprehensive Cancer Center, the Mayo Clinic, and Memorial Sloan Kettering Cancer Center.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. American Cancer Society. Cancer facts & figures 2016. Atlanta, GA: American Cancer Society; 2016. Available from: https://www.cancer.org/research/cancer-facts-statistics/all-cancer-facts-figures/cancer-facts-figures-2016.html.
2. Collaborative Group on Hormonal Factors in Breast Cancer. Familial breast cancer: collaborative reanalysis of individual data from 52 epidemiological studies including 58,209 women with breast cancer and 101,986 women without the disease. Lancet 2001;358:1389–99.
3. Tamimi RM, Rosner B, Colditz GA. Evaluation of a breast cancer risk prediction model expanded to include category of prior benign breast disease lesion. Cancer 2010;116:4944–53.
4. Calle EE, Kaaks R. Overweight, obesity and cancer: epidemiological evidence and proposed mechanisms. Nat Rev Cancer 2004;4:579–91.
5. Thune I, Brenn T, Lund E, Gaard M. Physical activity and the risk of breast cancer. N Engl J Med 1997;336:1269–75.
6. Chen WY, Rosner B, Hankinson SE, Colditz GA, Willett WC. Moderate alcohol consumption during adult life, drinking patterns, and breast cancer risk. JAMA 2011;306:1884–90.
7. Gaudet MM, Gapstur SM, Sun J, Ryan Diver W, Hannan LM, Thun MJ. Active smoking and breast cancer risk: original cohort data and meta-analysis. J Natl Cancer Inst 2013;105:515–25.
8. Gaudet MM, Carter BD, Brinton LA, Falk RT, Gram IT, Luo J, et al. Pooled analysis of active cigarette smoking and invasive breast cancer risk in 14 cohort studies. Int J Epidemiol 2017;46:881–93.
9. Heng YJ, Wang J, Ahearn TU, Boyer S, Zhang X, Ambrosone CB, et al. Molecular mechanisms linking high body mass index to breast cancer etiology in post-menopausal tumor and tumor-adjacent tissues. Breast Cancer Res Treat 2019;173:667–77.
10. Wang J, Heng YJ, Eliassen AH, Tamimi RM, Hazra A, Carey VJ, et al. Alcohol consumption and breast tumor gene expression. Breast Cancer Res 2017;19:108.
11. Ewertz M, Jensen MB, Gunnarsdóttir, Højris I, Jakobsen EH, Nielsen D, et al. Effect of obesity on prognosis after early-stage breast cancer. J Clin Oncol 2011;29:25–31.
12. Sparano JA, Wang M, Zhao F, Stearns V, Martino S, Ligibel JA, et al. Obesity at diagnosis is associated with inferior outcomes in hormone receptor-positive operable breast cancer. Cancer 2012;118:5937–46.
13. Scully R, Livingston DM. In search of the tumour-suppressor functions of BRCA1 and BRCA2. Nature 2000;408:429–32.
14. Nguyen B, Venet D, Lambertini M, Desmedt C, Salgado R, Horlings HM, et al. Imprint of parity and age at first pregnancy on the genomic landscape of subsequent breast cancer. Breast Cancer Res 2019;21:25.
15. Conway K, Edmiston SN, Cui L, Drouin SS, Pang J, He M, et al. Prevalence and spectrum of p53 mutations associated with smoking in breast cancer. Cancer Res 2002;62:1987–95.
16. The Cancer Genome Atlas Research Network. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet 2013;45:1113–20.
17. The Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumors. Nature 2012;490:61–70.
18. Sougnez C, Gabriel S, Meyerson M, Lander ES, Cibulskis K, Lawrence MS, et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol 2013;31:213–9.
19. Zhu B, Mukherjee A, Machiela MJ, Song L, Hua X, Shi J, et al. An investigation of the association of genetic susceptibility risk with somatic mutation burden in breast cancer. Br J Cancer 2016;115:752–60.
20. Alexandrov LB, Ju YS, Haase K, Van Loo P, Martincorena I, Nik-Zainal S, et al. Mutational signatures associated with tobacco smoking in human cancer. Science 2016;354:618–22.
21. Alexandrov LB, Jones PH, Wedge DC, Sale JE, Campbell PJ, Nik-Zainal S, et al. Clock-like mutational processes in human somatic cells. Nat Genet 2015;47:1402–7.
22. Petljak M, Alexandrov LB. Understanding mutagenesis through delineation of mutational signatures in human cancer. Carcinogenesis 2016;37:531–40.
23. Alexandrov LB, Nik-Zainal S, Wedge DC, Campbell PJ, Stratton MR. Deciphering signatures of mutational processes operative in human cancer. Cell Rep 2013;3:246–59.
24. Ellrott K, Bailey MH, Saksena G, Covington KR, Kandoth C, Stewart C, et al. Scalable open science approach for mutation calling of tumor exomes using multiple genomic pipelines. Cell Syst 2018;6:271–81.
25. Alexandrov LB, Kim J, Haradhvala NJ, Huang MN, Ng AWT, Wu Y, et al. The repertoire of mutational signatures in human cancer. bioRxiv 2019;322859.
26. Lawrence MS, Stojanov P, Polak P, Kryukov GV, Cibulskis K, Sivachenko A, et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 2013;499:214–8.
27. Mermel CH, Schumacher SE, Hill B, Meyerson ML, Beroukhim R, Getz G. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol 2011;12:R41.
28. Heng YJ, Lester SC, Tse GMK, Factor RE, Allison KH, Collins LC, et al. The molecular basis of breast cancer pathological phenotypes. J Pathol 2017;241:375–91.
29. Ciriello G, Gatza ML, Beck AH, Wilkerson MD, Rhie SK, Pastore A, et al. Comprehensive molecular portraits of invasive lobular breast cancer. Cell 2015;163:506–19.
30. Schoemaker MJ, Nichols HB, Wright LB, Brook MN, Jones ME, O'Brien KM, et al. Association of body mass index and age with subsequent breast cancer risk in premenopausal women. JAMA Oncol 2018;4:e181771.
31. Yang XR, Chang-Claude J, Goode EL, Couch FJ, Nevanlinna H, Milne RL, et al. Associations of breast cancer risk factors with tumor subtypes: a pooled analysis from the Breast Cancer Association Consortium studies. J Natl Cancer Inst 2011;103:250–63.
32. Renehan AG, Tyson M, Egger M, Heller RF, Zwahlen M. Body-mass index and incidence of cancer: a systematic review and meta-analysis of prospective observational studies. Lancet 2008;371:569–78.
33. Yue W, Yager JD, Wang JP, Jupe ER, Santen RJ. Estrogen receptor-dependent and independent mechanisms of breast cancer carcinogenesis. Steroids 2013;78:161–70.
34. Mamidi TKK, Wu J, Tchounwou PB, Miele L, Hicks C. Whole genome transcriptome analysis of the association between obesity and triple-negative breast cancer in caucasian women. Int J Environ Res Public Health 2018;15:2338.
35. Gustin JP, Miller J, Farag M, Marc Rosen D, Thomas M, Scharpf RB, et al. GATA3 frameshift mutation promotes tumor growth in human luminal breast cancer cells and induces transcriptional changes seen in primary GATA3 mutant breast cancers. Oncotarget 2017;8:103415–27.
36. Takaku M, Grimm SA, Roberts JD, Chrysovergis K, Bennett BD, Myers P, et al. GATA3 zinc finger 2 mutations reprogram the breast cancer transcriptional network. Nat Commun 2018;9:1059.
37. Jones ME, Schoemaker MJ, Wright LB, Ashworth A, Swerdlow AJ. Smoking and risk of breast cancer in the Generations Study cohort. Breast Cancer Res 2017;19:118.
38. Andersen ZJ, Jørgensen JT, Grøn R, Brauner EV, Lynge E. Active smoking and risk of breast cancer in a Danish nurse cohort study. BMC Cancer 2017;17:556.
39. Nik-Zainal S, Davies H, Staaf J, Ramakrishna M, Glodzik D, Zou X, et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature 2016;534:1–20.
40. Helleday T, Eshtad S, Nik-Zainal S. Mechanisms underlying mutational signatures in human cancers. Nat Rev Genet 2014;15:585–98.
41. Singletary KW, Gapstur SM. Alcohol and breast cancer: review of epidemiologic and experimental evidence and potential mechanisms. JAMA 2001;286:2143–51.
42. Hirko KA, Chen WY, Willett WC, Rosner BA, Hankinson SE, Beck AH, et al. Alcohol consumption and risk of breast cancer by molecular subtype: prospective analysis of the Nurses' Health Study after 26 years of follow-up. Int J Cancer 2016;138:1094–101.
43. Wright RM, McManaman JL, Repine JE. Alcohol-induced breast cancer: a proposed mechanism. Free Radic Biol Med 1999;26:348–54.
44. Seitz HK, Pelucchi C, Bagnardi V, La Vecchia C. Epidemiology and pathophysiology of alcohol and breast cancer: update 2012. Alcohol Alcohol 2012;47:204–12.
45. Shen J, Platek M, Mahasneh A, Ambrosone CB, Zhao H. Mitochondrial copy number and risk of breast cancer: a pilot study. Mitochondrion 2010;10:62–8.
46. Brooks PJ. DNA damage, DNA repair, and alcohol toxicity: a review. Alcohol Clin Exp Res 2006;21:1073–82.
47. Zhao M, Howard EW, Guo Z, Parris AB, Yang X. P53 pathway determines the cellular response to alcohol-induced DNA damage in MCF-7 breast cancer cells. PLoS One 2017;12:e0175121.
48. Dorgan JF, Baer DJ, Albert PS, Judd JT, Brown ED, Corle DK, et al. Serum hormones and the alcohol-breast cancer association in postmenopausal women. J Natl Cancer Inst 2001;93:710–5.
49. Reichman ME, Judd JT, Longcope C, Schatzkin A, Clevidence BA, Nair PP, et al. Effects of alcohol consumption on plasma and urinary hormone concentrations in premenopausal women. J Natl Cancer Inst 1993;85:722–7.

Supplementary data