The relationship between germline genetic variation and breast cancer survival is largely unknown, especially in understudied minority populations. Traditional genome-wide association studies (GWAS) have interrogated survival associations in cohorts of predominantly European ancestry but are often underpowered due to subtype heterogeneity in breast cancer and a wide range of clinical covariates. Furthermore, these analyses tend to detect loci in non-coding regions, which require follow-up functional studies to interpret. Recent work in transcriptome-wide association studies (TWAS) has shown increased power in detecting functionally relevant, trait-associated loci by leveraging information from expression quantitative trait loci (eQTLs) in external reference panels in relevant tissues. However, race-specific reference panels for TWAS may be needed to draw correct inference in large, ethnically heterogeneous cohorts, and such panels for breast cancer are lacking.

In this work, we provide a framework for TWAS for breast cancer in diverse populations, using data from the Carolina Breast Cancer Study (CBCS), a population-based cohort that oversampled women self-identifying as African American. Using NanoString expression data from CBCS, we perform an eQTL analysis for 417 breast cancer-related genes to train race-stratified predictive models of tumor expression from germline genotypes. We use these models to impute expression in held-out samples from the CBCS and in The Cancer Genome Atlas, using a permutation method to assess predictive performance while accounting for sampling variability. We find that these race-stratified expression models are not always applicable across race, depending on the imputation cohort. Furthermore, their predictive performance varies by breast cancer subtype. Lastly, we conduct a small-scale TWAS for breast cancer mortality in CBCS (N = 3,828), controlling for covariates used in previous GWAS for breast cancer survival. At a false discovery rate-adjusted P-value less than 0.1, adjusting for self-reported race, we find hazardous associations near CAPN13 (2p23.1), VAV3 (1p13.3), and BLK (8p23.1) and a protective association near SERPINB5 (18q21.33) in TWAS that are underpowered in GWAS. This approach shows increased power for detection of survival-associated genomic loci, demonstrating the relative strength of TWAS over GWAS.
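The false discovery rate threshold used in the TWAS step can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure, which the abstract does not name, and the function name, gene labels, and P-values are hypothetical placeholders rather than study results.

```python
def bh_adjust(p_values):
    """Return Benjamini-Hochberg adjusted P-values, in the input order.

    Assumption: the abstract does not state which FDR procedure was used;
    Benjamini-Hochberg is shown here only as a common choice.
    """
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, scaling each by m / rank and
    # taking a running minimum so adjusted values stay monotone in rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene P-values (made up for illustration only).
raw_p = {"GENE_A": 0.01, "GENE_B": 0.04, "GENE_C": 0.03, "GENE_D": 0.005}
adjusted_p = dict(zip(raw_p, bh_adjust(list(raw_p.values()))))

# Genes passing the FDR-adjusted P < 0.1 threshold described in the text.
significant = [gene for gene, p in adjusted_p.items() if p < 0.1]
```

Under this assumption, a gene is reported when its adjusted P-value falls below 0.1, which bounds the expected proportion of false positives among the reported genes at 10%.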

A carefully implemented TWAS is an efficient alternative to GWAS for understanding the genetic architecture underpinning breast cancer outcomes in diverse human populations and across biologically distinct tumor types.

Citation Format: Arjun Bhattacharya, Montserrat García-Closas, Andrew F. Olshan, Charles M. Perou, Melissa A. Troester, Michael I. Love. A framework for transcriptome-wide association studies in breast cancer in diverse study populations [abstract]. In: Proceedings of the Twelfth AACR Conference on the Science of Cancer Health Disparities in Racial/Ethnic Minorities and the Medically Underserved; 2019 Sep 20-23; San Francisco, CA. Philadelphia (PA): AACR; Cancer Epidemiol Biomarkers Prev 2020;29(6 Suppl_2):Abstract nr B065.