Abstract
Women of sub-Saharan African descent have disproportionately higher incidence of triple-negative breast cancer (TNBC) and TNBC-specific mortality across all populations. Population studies show racial differences in TNBC biology, including higher prevalence of basal-like and quadruple-negative subtypes in African Americans (AA). However, previous investigations relied on self-reported race (SRR) of primarily U.S. populations. Due to heterogeneous genetic admixture and biological consequences of social determinants, the true association of African ancestry with TNBC biology is unclear. To address this, we conducted RNA sequencing on an international cohort of AAs, as well as West and East Africans with TNBC. Using comprehensive genetic ancestry estimation in this African-enriched cohort, we found expression of 613 genes associated with African ancestry and 2,000+ associated with regional African ancestry. A subset of African-associated genes also showed differences in normal breast tissue. Pathway enrichment and deconvolution of tumor cellular composition revealed that tumor-associated immunologic profiles are distinct in patients of African descent.
Our comprehensive ancestry quantification process revealed that ancestry-associated gene expression profiles in TNBC include population-level distinctions in immunologic landscapes. These differences may explain some differences in race–group clinical outcomes. This study shows the first definitive link between African ancestry and the TNBC immunologic landscape, from an African-enriched international multiethnic cohort.
See related commentary by Hamilton et al., p. 2496.
This article is highlighted in the In This Issue feature, p. 2483
INTRODUCTION
Breast cancer is the most frequently diagnosed cancer among women globally and the leading cause of cancer-related death among women (1, 2). Despite lower breast cancer incidence, most sub-Saharan African nations have among the highest breast cancer mortality rates worldwide. Although poorer survival is typically attributed to advanced-stage disease at presentation and limited access to treatment options in lower-middle-income countries (LMIC; ref. 2), triple-negative breast cancer (TNBC) accounts for approximately 33% of breast cancer diagnoses across African nations compared with less than 20% in other nations (3, 4), with higher incidence of TNBC in West African nations than in East African nations (3, 5, 6). Globally, overall breast cancer mortality and TNBC burden appear higher across the African diaspora at large, corresponding with a higher prevalence of TNBC disease among women with African ancestry (6), including those residing in nations throughout Europe (7, 8) and in South Africa, and those in admixed African American (AA) populations in the United States (5, 9, 10). We previously reported a higher risk of TNBC, compared with other types of breast cancer, associated with West African ancestry (5, 6). Therefore, we hypothesized that there may be genetic drivers associated with West African ancestry that predispose and/or lead to aggressive breast cancer, including TNBC.
TNBC continues to have the worst prognosis of breast cancer subtypes and the worst survival outcomes due to a lack of targeted therapy options for these tumors (11, 12). Given TNBC incidence rates across the African diaspora, our efforts in oncologic anthropology have shifted to a molecular focus to uncover and characterize the influence of African ancestry on breast cancer disease etiology and progression (4, 13, 14). Previous comparative breast cancer studies among patients of diverse race groups have focused on comparing tumors from AA and European American (EA) self-reported race (SRR) groups in the United States. Although there were inherent limitations of cohort size and heterogeneity of race, these approaches were useful in determining that broad biological differences do exist across diverse patient populations (15, 16). Some of these discoveries included race–group distinctions in genomic differences in frequencies of single-nucleotide variants (SNV; refs. 6, 17, 18), somatic tumor mutation signatures (19, 20), structural copy-number variations (CNV; ref. 21), and differences in DNA methylation patterns in both estrogen receptor–positive and estrogen receptor–negative tumors (22). Our work and that of others have uncovered racial differences in gene expression that revealed distinctions in immune response signatures, repeatedly across independent cohorts, implicating differences in the tumor microenvironment (TME; refs. 16, 23) as a possible cause of outcome disparities. The emerging promise of curative immunotherapies in overcoming treatment resistance in TNBC highlights an important opportunity to target the immune microenvironment. Given our previous and current findings that uncover race–group differences in tumor immune responses, these findings have increasing relevance to overcome disparities, particularly in regard to the potential of immunotherapies to improve treatment response (24, 25). 
However, there are limitations to using SRR in genomic studies, mainly due to complexity in genomic backgrounds of admixed groups.
Recently, our work was the first to use quantified genetic ancestry in admixed AA women to identify African ancestry–specific gene expression differences in TNBC tumors compared with EA women, which we also found had some overlap with SRR-associated gene networks (26). Of the African ancestry–associated genes, 48.1% were distinct from the SRR-associated genes, indicating the functional influence of the genetic ancestry background upon gene expression, apart from SRR alone. Similarly, a recent study by our collaborators characterized gene expression of TNBCs from the Bantu tribe from Kenya and found Bantu population–specific gene expression signatures as compared with TNBCs of AA and EA patients (14). However, the implications of ancestry are still untested because these studies lacked the contemporary and appropriate representative ancestry groups that are specific and relevant to the admixed patient groups.
Therefore, our current study utilized an African-enriched international cohort from the International Center for the Study of Breast Cancer Subtypes (ICSBCS; ref. 27), which will help to resolve a more precise understanding of genetic influences associated with African ancestry in race-associated gene signatures. We have measured the influence of African ancestry on TNBC tumor biology, derived from gene expression differences, that includes West African/Ghanaian and East African/Ethiopian women with TNBC compared with admixed AAs. Our rationale was based firmly on prior studies indicating that shared African ancestry harbors both the unique genetic risk of TNBC tumor etiology and the distinct gene signatures of TNBC among women of the African diaspora (6, 18, 26). We identified both African ancestry–associated gene expression signatures and TME cell-type differences from bulk RNA sequencing (RNA-seq) data. We demonstrate that the inclusion of native Africans with admixed AA patients, who share the same genetic ancestry, can overcome population complexity to help deduce the shared genetic drivers observed in race–group differences and discern these from environmental or other exogenous drivers of gene expression changes. We have identified subpopulation differences in gene expression between East versus West African ancestry lineages, which can be applied throughout population studies of the African diaspora in Europe (7, 8), the United States (5, 9, 10), and abroad (i.e., Afro-Latinx and Afro-Caribbean).
RESULTS
Characterization of Ancestry Profiles Reveals Complex Admixture in African and AA Cohort
We estimated the global genomic ancestry for each patient in our cohort to determine the varying levels of admixture based on the 1000 Genomes superpopulation and subpopulations (Supplementary Table S1; ref. 28). Our cross-sectional set of 146 TNBC cases includes 66 AAs and 41 European Americans (EA), enriched with 13 West African Ghanaians and 22 East African Ethiopians, with four individuals who declined to report SRR (Supplementary Fig. S1). African (AFR) ancestry comparisons indicated significant differences in African ancestry across our cohort (ANOVA P < 0.001), with Ghanaian patients having the highest levels of AFR ancestry (median 97.3%) and AAs having approximately 15 percentage points less AFR ancestry (median 82.6%). Ethiopian patients had a surprisingly lower AFR ancestry (median 43.0%), with nearly equal amounts of European (EUR) ancestry (median 43.5%), which is consistent with previous anthropologic studies (refs. 29–31; Fig. 1A; Supplementary Table S2). Our EA patients generally showed exceptionally low levels of AFR ancestry (median 2.4%); however, three self-identified EA patients had between 30% and 80% AFR ancestry.
For more precise ancestry estimations that reflect regional origins, we estimated genetic ancestry for five African subpopulations, including four populations representing West African ancestry: Esan in Nigeria (ESN), Yoruba in Ibadan, Nigeria (YRI), Gambian in Western Divisions in the Gambia (GWD), and Mende in Sierra Leone (MSL). There was only one population representing East Africa in 1000 Genomes: Luhya in Webuye, Kenya (LWK; Fig. 1B; Supplementary Table S1). As anticipated, AA patients presented with AFR ancestry primarily of West African origin, which included ESN (median 36.1%) and MSL (median 19.7%) ancestry, with less than 10% estimated East African ancestry (LWK median 7.5%). Interestingly, the African origin of AAs is more heterogeneous and wide-ranging than that of Ghanaians or Ethiopians: among AAs, the amount of a specific subpopulation ancestry can range from 0% to 90% for a given individual, indicating the complex diversity of African admixture in AAs. African patients’ subpopulation ancestry is highly concordant with their regions of origin: Ghanaian ancestry is overwhelmingly enriched with the West African reference groups YRI (median 66.0%) and MSL (median 24.1%), and the AFR ancestry of Ethiopian patients is almost exclusively East African, represented as LWK (median 43.0%). Relatedness of patients’ estimated genetic ancestry shows separation of Ghanaian and AA patients from Ethiopian and EA patients (Fig. 1C), and AFR and EUR ancestry were significantly inversely correlated among patients (Fig. 1D). Interestingly, the EUR ancestry in our Ethiopian patients was primarily Italian [Toscani in Italia (TSI), median 41.2%]. Ethiopian patients also showed substantial levels of East and South Asian ancestry (EAS median 1.9%, SAS median 9.0%), with more SAS compared with other SRR groups.
All of these admixture revelations are consistent with the social histories of each SRR group and reflect the diverse complexity of ancestry across the African diaspora (29–33).
Influence of Ancestry in Gene Expression Profiles of TNBC Tumors Results in Ancestry-Associated Differential Immune Signatures
To investigate AFR ancestry–specific gene expression profiles in our ICSBCS TNBC samples, we isolated our analyses to patients with significant (>35%) AFR ancestry. As previously described (26), we performed gene-by-gene linear regression, using genetic ancestry as a continuous variable, which determined ancestry-associated gene expression. We identified gene signatures associated with AFR (n = 613) and EUR (n = 345) ancestry (P < 0.001), with 293 genes shared between these gene signatures (Fig. 2A; Supplementary Table S3). Given the significant inverse correlation of AFR versus EUR ancestry in our patient cohort (Fig. 1D), we compared the polarity of gene expression levels of the 293 overlapping genes and found that genes upregulated in association with AFR ancestry are conversely downregulated in association with EUR ancestry (Fig. 2B). This may represent genes that have expression drivers that are ancestral informative variants, which are isolated to certain ancestry groups.
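The gene-by-gene regression described above can be illustrated with a minimal sketch. It assumes a simple one-predictor model (expression regressed on AFR ancestry fraction) with no covariates; the gene names and values are hypothetical, and the model used in the study (ref. 26) may include additional terms. Converting the t statistic to a P value would require a t distribution, which is omitted here.

```python
from math import sqrt

def ols_slope_t(x, y):
    """Simple linear regression of y on x; returns (slope, t_statistic)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, with n - 2 degrees of freedom.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se = sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else float("inf")
    return slope, t_stat

# Hypothetical inputs: AFR ancestry fraction per patient (continuous variable)
# and log-scale expression per gene for the same patients.
afr = [0.40, 0.45, 0.60, 0.80, 0.90, 0.97]
expression = {
    "GENE_UP": [1.0, 1.3, 2.1, 3.0, 3.4, 3.9],
    "GENE_DOWN": [5.0, 4.6, 4.1, 3.0, 2.7, 2.2],
    "GENE_FLAT": [2.0, 2.1, 1.9, 2.0, 2.1, 1.9],
}

# One regression per gene; the sign of the slope gives the direction of association.
results = {gene: ols_slope_t(afr, values) for gene, values in expression.items()}
for gene, (slope, t_stat) in results.items():
    print(f"{gene}: slope={slope:.2f}, t={t_stat:.2f}")
```

Because AFR and EUR ancestry are inversely correlated in this cohort, a gene with a positive slope against AFR ancestry would be expected to show a negative slope against EUR ancestry, which matches the opposite polarity reported for the overlapping genes.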
Unsupervised hierarchical clustering of the 613 AFR-associated genes separated patients into two distinct clusters correlating with levels of AFR ancestry, which we denote as a low AFR cluster, including primarily Ethiopian TNBC patients, and a high AFR cluster, including primarily AA and Ghanaian patients (Fig. 2C). AFR ancestry is significantly higher among the high AFR cluster (median 86.13%, mean 85.87%) compared with the low AFR cluster (median 43.08%, mean 44.55%; P < 0.0001; Supplementary Fig. S2). The high AFR subcluster includes two subnodes: one representing ESN, MSL, and LWK ancestry (6/9 AAs and 1/6 Ghanaians) and a second representing YRI, GWD, and ESN ancestry (3/9 AAs and 4/6 Ghanaians; Fig. 2C, red asterisk; Fig. 2D, red box). The subnodes reflect differences in the origin of AFR ancestry composition observed among West Africans and AAs in our cohort.
We analyzed the AFR-associated genes (Fig. 2D) for functional pathway enrichment, using the log2 fold change between the distinct high AFR and low AFR clusters to measure differential expression (Fig. 2E). Top canonical pathways included some previously implicated processes in race–group comparisons, such as RNA posttranscriptional modification through spliceosomal cycle pathway enrichment (P = 0.0002, z-score = 3.804; ref. 34), cell-to-cell and extracellular matrix interactions in the integrin signaling pathway (P = 0.004, z-score = 0; ref. 35), and chronic inflammation in atherosclerosis signaling (P = 0.006, no z-score predicted; ref. 36). Upregulation of WNT family member genes drives enrichment in a colorectal cancer metastasis signaling pathway (P = 0.004, z-score = −0.302; ref. 37) and HOTAIR regulatory pathway (P = 0.006, z-score = −0.707). One of the top enriched functions identified was immune cell trafficking (P value range of subterms 0.000502–0.0119; Fig. 2F and G). Specifically, there was a predicted upregulation of signals relating to immune cell movement and migration, but conversely a predicted inhibition of signals relating to immune cell activation. This finding was of particular interest, given our previous findings related to DARC-regulated immune cell infiltration, which is associated with race groups (38).
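As an illustration of the differential expression measure mentioned above, the following sketch computes a log2 fold change between a high AFR and a low AFR group. The pseudocount, the use of group means, and the gene names and values are assumptions for illustration, not the study's pipeline.

```python
from math import log2
from statistics import mean

def log2_fold_change(high, low, pseudocount=1.0):
    """log2 ratio of mean expression (high vs. low group), with a pseudocount."""
    return log2(mean(high) + pseudocount) - log2(mean(low) + pseudocount)

# Hypothetical normalized expression values per gene in each cluster.
high_afr = {"GENE_A": [30.0, 32.0, 31.0], "GENE_B": [3.0, 3.0, 3.0]}
low_afr = {"GENE_A": [7.0, 7.0, 7.0], "GENE_B": [15.0, 15.0, 15.0]}

lfc = {gene: log2_fold_change(high_afr[gene], low_afr[gene]) for gene in high_afr}
for gene, value in lfc.items():
    # Positive values: higher in the high AFR cluster; negative: lower.
    print(f"{gene}: log2FC={value:+.2f}")
```

A positive value indicates higher expression in the high AFR cluster; pathway tools typically use the sign and magnitude of such values to predict activation or inhibition z-scores.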
Our results thus far establish AFR ancestry–associated differential gene expression in TNBC tumors; however, this differential regulation could be due to a diverse baseline biological context. Therefore, we investigated the expression of our core set of 613 AFR ancestry–associated genes in normal mammary tissue data using the Genotype-Tissue Expression (GTEx) cohort. Out of the 396 GTEx breast/mammary samples with ancestry, 47 were of AFR ancestry (Supplementary Fig. S3), and only 20 of those were from females. We found that 17 of the 613 genes (2.8%) were significantly associated with AFR ancestry (Fig. 2H; Supplementary Table S4). Of these 17 genes, seven had a positive expression correlation with AFR ancestry in both normal and tumor tissue, whereas the other 10 showed opposite correlations of expression levels with AFR ancestry: eight were positively correlated (upregulated) with respect to AFR ancestry in normal tissue but negatively correlated in tumor, and two were negatively correlated in normal tissue and positively correlated in tumor (Fig. 2H and I). Tumor subtype–agnostic survival analysis using The Cancer Genome Atlas (TCGA) BRCA cohort revealed that lower expression of AVPR2 and CYBA and higher expression of AGMAT and SNORA53 were associated with a survival benefit among AA patients, but not EA patients; this association was significant for CYBA in AAs (P = 0.0302; Fig. 2I). Pathway enrichment analysis suggests that these 17 genes were related to the inflammatory response (P = 2.5 E-07) and growth of mammary tumor (P = 7.3 E-07) disease and function terms (Fig. 2J). This suggests that already established biological mechanisms, which are altered in the course of malignancy from normal to tumor, have divergent or more inflated regulatory drivers in populations of African descent.
Resolution of Subpopulation AFR Ancestry Influence on Gene Expression Signatures
We determined a higher resolution of African subpopulation origins to harness the shared genetic diversity and identify subpopulation-associated gene signatures. Specifically, we utilized the 1000 Genomes African subpopulation ancestry (Fig. 1) estimates for West African (YRI, ESN, GWD, and MSL) and East African (LWK) populations (Supplementary Table S1). By repeating the gene-by-gene statistical model with African subpopulations, we identified a combined 2,567 genes associated with the five African population groups (Fig. 3A). African subpopulation–specific gene associations included 338 YRI genes, 643 ESN genes, 201 GWD genes, 146 MSL genes, and 1,229 LWK genes (Supplementary Table S5). These gene lists included, but extended beyond, the genes identified in the African superpopulation ancestry analysis (Fig. 3A). Surprisingly, there were no subpopulation-specific genes shared among all five populations, suggesting that there are unique gene expression drivers from each ancestry group. However, a small fraction of each set of individual West African subpopulation genes were shared with the East African LWK population (total n = 210). As may be anticipated, we found that the largest overlap of African subpopulation–associated genes was shared between ancestry groups that are geographically adjacent nations (29.0% of YRI and 48.8% of GWD shared between these populations). However, the closest West African groups, YRI and ESN of Nigeria, did not share any associated genes.
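The overlap percentages above can be reproduced with simple set arithmetic. The sketch below uses placeholder gene identifiers; the shared count of 98 is not stated in the text and is an assumption inferred from the reported percentages and list sizes (338 YRI genes, 201 GWD genes).

```python
def percent_shared(a, b):
    """Percent of set a and percent of set b that fall in their intersection."""
    shared = a & b
    return 100 * len(shared) / len(a), 100 * len(shared) / len(b)

# Placeholder gene IDs; only the set sizes mirror the reported gene lists.
ASSUMED_SHARED = 98  # inferred, not reported in the text
shared_genes = {f"shared_{i}" for i in range(ASSUMED_SHARED)}
yri = shared_genes | {f"yri_{i}" for i in range(338 - ASSUMED_SHARED)}
gwd = shared_genes | {f"gwd_{i}" for i in range(201 - ASSUMED_SHARED)}

pct_yri, pct_gwd = percent_shared(yri, gwd)
print(f"{pct_yri:.1f}% of YRI genes, {pct_gwd:.1f}% of GWD genes shared")
```

The same intersection appears as a different percentage of each list because the denominators differ, which is why the text reports two values for one overlap.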
Therefore, we considered which SRR/nationality groups carried the specific subpopulation ancestry and therefore in which groups these gene signatures may be found. The East African population with the largest set of associated genes (n = 1,229), LWK, predominantly represented our Ethiopian patients, with a small portion of AA ancestry represented by LWK (median ∼8%). Pathway analysis predicted a decrease in immune response–related function in East African ancestry gene sets, including inhibition of CSF1 and various interleukins, CD28 and lymphopoiesis, and the canonical Tec kinase signaling pathway (Fig. 3B). This inhibitive effect of LWK ancestry on these functions distinguishes tumor biology between West and East Africans and suggests important differences in immune cell development, response, and activation. MSL ancestry was predominantly found among AAs and Ghanaians, and its associated genes (n = 146) also involved immune response signals, in this case suggesting activation of immune-related functions (Fig. 3C). Specifically, there was activation of REL and IL21, both of which play important roles in immune response regulation. To validate our findings, we conducted a reanalysis of the AA subset of our previously published cohort and identified African subpopulation–associated gene signatures enriched for pathways that involve immune function (Supplementary Fig. S4A–S4H).
Immune Cell Enrichment Expression Signatures Are Associated with AFR Ancestry
We estimated immune cell populations and overall tumor-associated leukocyte (TAL) abundance with the deconvolution and cell-type enrichment methods CIBERSORTx (39) and xCell (40), respectively. Absolute scores, the sum of all estimated immune populations, were significantly higher in patients with high AFR ancestry compared with patients with low AFR ancestry (Fig. 4A; P = 0.0076). The immune cell types accounting for the bulk of the AFR ancestry–associated infiltrating cells included naïve B cells, CD8+ T cells, helper T cells, regulatory T cells (Treg), and activated mast cells (Fig. 4B). Linear association testing showed that the increasing proportions of immune cells directly correlate with increasing AFR ancestry (Fig. 4C), suggesting a direct genetic dose response. The largest proportion of AFR-associated immune cells are naïve B cells, a nonactivated immune cell population. These findings concur with the preceding expression pathway analysis, indicating AFR ancestry–associated gene expression signatures are enriched for stimulated immune cell “migration/movement” (i.e., infiltration) and simultaneous repression of “cell-type activation” (i.e., naïve cells). Conversely, the “activated” cell population, such as “activated mast cell,” is more prominent in tumors of patients with low AFR ancestry. To verify this finding with an independent algorithm, we used the xCell cell-type enrichment analysis (40). The results replicated the CIBERSORTx findings, showing AFR-associated immune cell infiltration, and further discerned that the specific T-cell populations associated with AFR ancestry are CD8+ T cells and CD8+ T effector memory cells (Supplementary Fig. S5; P < 0.05), building on the observation of CD8+ T cells from CIBERSORTx (Fig. 4C; P < 0.05).
To investigate immune-suppressive versus immune-stimulating TME marker associations (41) with African ancestry, we compared the relative expression of several well-known immune-checkpoint genes, including CD274 (PD-L1 marker), CTLA4, and PDCD1 (PD-1 marker; Fig. 4D). We found that PDCD1 was significantly associated with AFR ancestry and SRR (ANOVA P < 0.01), with both Ghanaian (mean 12.42) and AA (mean 13.72) patients’ tumors having more than 3× higher expression than Ethiopian patients’ tumors (mean 3.68). To ensure these immunosuppressive marker gene expression patterns were derived from the immune cell population within the bulk tumor, we tested the correlation of specific immune cell estimates from CIBERSORTx with the CTLA4, CD3D, and PDCD1 markers. We found that each correlated with the abundance of relevant T-cell subtypes, indicating these cells are the likely source of the RNA expression (Fig. 4E).
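The marker-versus-cell-estimate check described above can be sketched as a correlation test. The example below uses a Pearson coefficient on hypothetical values; the correlation statistic used in the study is not specified in this section, so this choice is an assumption, as are the variable names and the control gene.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-tumor values: estimated CD8+ T-cell fraction from
# deconvolution, a checkpoint marker, and an unrelated control gene.
cd8_fraction = [0.02, 0.05, 0.08, 0.12, 0.15, 0.20]
pdcd1_expr = [2.1, 4.0, 6.5, 9.8, 11.9, 15.2]
control_expr = [10.0, 9.0, 11.0, 10.0, 9.0, 11.0]

r_marker = pearson_r(cd8_fraction, pdcd1_expr)
r_control = pearson_r(cd8_fraction, control_expr)
print(f"marker r={r_marker:.2f}, control r={r_control:.2f}")
```

A marker whose expression tracks the estimated T-cell fraction much more closely than a control gene does is consistent with T cells being the source of that transcript in bulk data, although correlation alone cannot exclude tumor-cell expression.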
To validate the RNA-seq–based immune cell estimations, we used two validation cohorts to quantify immune cell populations from protein-level data. Our first validation set represents clinical-grade IHC marker assays to score the infiltration of immune cells in an independent set of ICSBCS TNBC patients (n = 40), distributed across each ethnicity group represented in the RNA-seq cohort. We found similar immune cell infiltration trends across race/ethnic groups, with Ghanaian and AA tumors having higher counts of CD3+ and FOXP3+ cells compared with Ethiopian and EA tumors (Fig. 4F). CD3+ cells showed significant variation across all race/ethnic groups (ANOVA P = 0.0102; Fig. 4G), with significant pair-wise differences between Ghanaians and Ethiopians (P = 0.0457) and between AAs and EAs (P = 0.0379). Our second validation set represents a pilot cohort of multiplexed imaging data of TNBC patients representing AA (n = 2) and EA (n = 2) patients using the GeoMx platform. After segmentation was completed across ∼5 regions of interest (ROI) per patient, stromal and tumor cellular subsets were segregated, and immune cell abundances were determined. In the stromal compartment, we observed significantly higher levels of plasma cells (P = 0.0083), CD4+ and CD8+ memory T cells (P = 0.0344 and P = 0.0041, respectively), and Tregs (P = 0.0232) among AA patients/ROIs compared with EA (Fig. 4H). In the tumor compartment, CD8+ memory T cells were also significantly higher among AAs (P = 0.0126), and we also observed borderline significantly higher levels of Tregs (P = 0.0601; Fig. 4I).
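The group comparison of IHC counts reported above relies on one-way ANOVA. The following sketch computes the F statistic by hand for hypothetical CD3+ counts in four groups; the counts are invented for illustration, and obtaining a P value would additionally require the F distribution (for example from a statistics library).

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA across a list of value lists."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    k, n = len(groups), len(all_values)
    group_means = [sum(group) / len(group) for group in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    ss_within = sum(
        (v - m) ** 2 for group, m in zip(groups, group_means) for v in group
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical CD3+ cell counts per tumor, grouped by self-reported group.
cd3_counts = {
    "Ghanaian": [30, 32, 34],
    "AA": [28, 30, 32],
    "Ethiopian": [10, 12, 14],
    "EA": [12, 14, 16],
}

f_stat = one_way_anova_f(list(cd3_counts.values()))
print(f"F = {f_stat:.1f}")
```

A significant omnibus F test only indicates that at least one group mean differs, which is why the text follows it with pair-wise comparisons between specific groups.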
FOXP3 expression in Tregs is correlated with a suppressive immune TME, suggesting patients with higher AFR ancestry may have a TME that is more suppressive versus stimulating compared with patients of EUR ancestry. The RNA-based findings in our cohort matched both IHC and multiplexed immunofluorescent protein staining among SRR groups in independent cohorts (Fig. 4D). We found that the most significant differences in immune cell infiltration were found when comparing populations by AFR ancestry rather than across SRR groups, emphasizing the importance of genetic ancestry in immunologic differences. Taken together, this further suggests an immune-suppressive tumor environment being associated specifically with West African ancestry, as opposed to East African or EUR ancestry.
TNBC Subtyping Reveals Ancestry Bias in Composition of Mosaic Heterogeneity
Gene expression signatures are used to categorize TNBC tumors into subtypes that have been shown to predict clinical outcomes (42, 43). The landmark report of these subtypes, the well-known Vanderbilt TNBCtype tool, correlated gene expression signatures of TNBC tumors to their tumor training set, which initially designated tumors into six distinct subtypes, including basal-like 1 (BL1), basal-like 2 (BL2), luminal androgen receptor (LAR), mesenchymal (M), mesenchymal stem–like (MSL), and immunomodulatory (IM). An additional “subtype” category of exclusion harbors any tumors with “unsure calls” (UNS), which masked tumors with either multiple subtype correlations or no positive correlations with any of the established phenotypes from the training set. Further consideration of the histologic context of these tumor subtypes determined that the MSL and IM subtypes are stromal and tumor-associated immune-derived rather than distinct phenotypes, and the tool now recommends that such tumors be manually reassigned to their secondary correlation call (43). Correlated subtypes called from the Vanderbilt TNBCtype tool showed that tumors in our cohort from Ghanaian and Ethiopian patients were more often BL1, and tumors from AA patients were more often IM subtype (Fig. 5A, top). Interestingly, all IM tumors were from the high AFR ancestry Ghanaian and AA individuals, indicating the strong tumor immune signatures in these ancestry groups. After the suggested reassignment of the IM and MSL subtypes, AAs had a predominance of UNS calls, indicating an unresolved heterogeneity that would not allow designation of a single subtype (Fig. 5A, middle). Therefore, to ascribe a biological phenotype to these tumors, we used a previously described median ranking (26), which excludes the confounding influence of the immune signature genes by only including the gene expression signatures of the BL1, BL2, M, and LAR TNBC subtypes.
Our results indicate that BL1 is the predominant subtype for Ghanaians and that M is the predominant subtype for both AAs and Ethiopians (Fig. 5A, bottom).
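The general idea of a median-rank subtype call can be sketched as follows. This is a simplified illustration under stated assumptions, not the published procedure from ref. 26: it ranks genes within one tumor, takes the median rank of each subtype's signature genes, and assigns the subtype with the highest median. The gene names, signature memberships, and expression values are hypothetical.

```python
from statistics import median

def rank_genes(expr):
    """Map each gene to its within-tumor expression rank (1 = lowest)."""
    ordered = sorted(expr, key=expr.get)
    return {gene: rank for rank, gene in enumerate(ordered, start=1)}

def median_rank_call(expr, signatures):
    """Return (best_subtype, scores) using the median rank of signature genes."""
    ranks = rank_genes(expr)
    scores = {
        subtype: median(ranks[g] for g in genes if g in ranks)
        for subtype, genes in signatures.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical signatures restricted to the four non-immune subtypes.
signatures = {
    "BL1": ["g1", "g2", "g3"],
    "BL2": ["g4", "g5", "g6"],
    "M": ["g7", "g8", "g9"],
    "LAR": ["g10", "g11", "g12"],
}
# Hypothetical expression values for a single tumor.
tumor = {
    "g1": 5, "g2": 6, "g3": 7, "g4": 1, "g5": 2, "g6": 3,
    "g7": 10, "g8": 11, "g9": 12, "g10": 4, "g11": 8, "g12": 9,
}

call, scores = median_rank_call(tumor, signatures)
print(call, scores)
```

Restricting the signatures to BL1, BL2, M, and LAR mirrors the exclusion of immune signature genes described in the text; retaining the full score dictionary rather than only the top call is what would allow mixed-subtype compositions to be examined.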
Given the previously reported heterogeneity of TNBC tumors among AA patients (26), we used our previously established Triple-Negative Hetero Fluid (TNHF) subtyping method (26). In brief, by utilizing both TNBCtype correlations and our median ranks from ∼4,000 genes, we are able to determine heterogeneous categories of mixed TNBC subtypes. This assignment of multiple subtypes in a tumor's composition more appropriately informs us of the diverse cell types present in each patient's tumor. Our method also considers the exclusion of certain subtypes to establish unique combinations of subtype composition. Unsupervised hierarchical clustering of these mixed subtype categories resolved into five distinct nodes (Fig. 5B and C). Cluster 1 is the largest node, composed of BL1+/M+/BL2−/LAR− tumors, which were derived from Ghanaians and Ethiopians. Cluster 2 is BL2+/M− and is composed exclusively of AFR-high cases. Cluster 3 tumors are BL2+ and M+ tumors originating from AA and Ethiopian patients. Cluster 4 includes BL1+/BL2− with only AFR-high cases. Lastly, cluster 5 includes LAR+ tumors derived from all of the patient groups (Fig. 5C–E). Interestingly, the tumors in clusters 2 and 4 would also be classified as IM (with one UNS) and are exclusively derived from high AFR patients (Fig. 5E). We investigated the immune deconvolution CIBERSORTx signatures of the IM/cluster 2 and 4 tumors and found that the TAL populations show enrichment in specific immune cell populations that included B cells, CD8+ T cells, M2 macrophages, and natural killer (NK) cells (Fig. 5F). Overall, we find that the majority of these TNBC tumors in our African-enriched cohort are basal-like combined with mesenchymal, showing a distinction of BL1 among Ghanaians and Ethiopians (BL1+/M+) and BL2 among Ghanaians and AAs (BL2+ and/or M+; Fig. 5D). The clusters with high immune cell infiltration (2 and 4) are exclusively patients with substantial West African ancestry (Ghanaian and AA).
Genes Associated with SRR Are Involved in Comorbidity Pathways
In the interest of determining any distinct impact of racial social constructs on the biological phenotypes of tumors, we also investigated the functional pathway enrichment of SRR-associated genes that were not associated with genetic ancestry. We hypothesized that functional pathways of gene signatures associated with SRR would differ from our ancestry-associated gene signatures, noting that the social construct of race is not a comprehensively reliable assessment of factors related to the implicit bias and systemic racism correlated with SRR. Nevertheless, the SRR-associated comparison model identified 1,071 genes as differentially expressed across SRR groups, and these were compared with our 613 AFR-associated genes and 345 EUR-associated genes (Fig. 6A). Of the 1,071 SRR genes, 320 overlapped with ancestry-associated genes (AFR and/or EUR), whereas 751 were uniquely associated with SRR. These distinctions of race- versus ancestry-associated signatures demonstrate the importance of both individual genetic ancestry and SRR in the context of tumor gene expression profiles.
Upon investigation of the 1,071 genes associated with SRR, unsupervised hierarchical clustering grouped patients by ancestry, where Ethiopian cases cluster distinctly from Ghanaian and AA cases (Fig. 6B), which is likely due to the influence of the overlapping ancestry-associated genes included in the analysis. However, unsupervised clustering of the 751 genes unique to the SRR analysis drastically changed the grouping (Fig. 6C), with AA cases diverging from the node containing all African cases. The African cluster also separated into two subnodes, but the divergence was largely driven by an upregulated gene signature found among AA and not seen among our African patients. We hypothesized that this signature represents distinct environmental influences that are unique to AA patients. Strikingly, signature pathway analyses determined that several known canonical pathways were enriched in AAs compared with Ghanaians and Ethiopians (P < 0.05), including several related to comorbidities that may reflect unknown comorbidity health status in our patient cohort. Specifically, comorbidity pathways related to cardiac function, adiposity/obesity, diabetes, and insulin signaling were found to be activated among AA patients, but not activated in Ghanaian and Ethiopian patients (Fig. 6D). This supports the premise of social determinants linked to racial constructs, which are known to be associated with these diseases. These comorbidities potentially influence the race-specific biological differences of TNBC microenvironments, which we detected in our cohort (Supplementary Fig. S6A and S6B). Although these comorbidities are known risk factors for negative outcomes, they could be addressed with interventions that target these pathways in tumors.
DISCUSSION
This study is the first comparative RNA-seq study of TNBC that utilized an African-enriched cohort of East and West Africans along with AAs to discern the influence of genetic ancestry on TNBC tumor biology related to racial disparities. Our findings support the emerging concept that the inclusion of multiethnic patient groups in genomic research increases rigor and can have a transformative impact on cancer discoveries and overcoming disparities. Previous approaches have attempted to uncover the causal variables of disparate mortality and TNBC incidence but used SRR as a proxy for genetics or used global ancestry threshold cutoffs to categorize patient groups, which missed the broad range of genetic admixture we have detected here (AFR ancestry ranging from 16.64% to 99.99%) as well as the variability of ancestral origin among AAs due to unique social histories across the African diaspora (33). Even European admixture among AA patients varies regionally, where less European admixture was reported among AA individuals in the Southeast United States compared with the Northeast or Pacific Northwest (32, 33). The unique composition of patient ancestry from the African diaspora in our international ICSBCS cohort gave us a novel power and perspective to show that using linear regression models with estimated genetic ancestry as a continuous variable can unveil a distinct set of African ancestry–associated genes. The high level of European admixture we detected in our Ethiopian patients, nearly equal to AFR ancestry, combined with a significant proportion of SAS admixture is a key example of the power of our oncologic anthropology approach (13). The non-African ancestral origins in Ethiopians were previously reported in genetic anthropology studies as a significant proportion of mitochondrial DNA haplotypes (29) and Y chromosome haplotypes (30) with up to 50% of non-African ancestry (31).
Further, our subcontinental ancestry estimates revealed that although the predominant AFR ancestry of AA and Ghanaian patients is generally West African, the regional origin of African ancestry was distinct between these groups. Specifically, Ghanaians were primarily represented by YRI ancestry with less than 0.01% ESN ancestry, and AAs were primarily represented by ESN, with only two of 66 AAs reporting YRI ancestry over 30% (AA median YRI 0.00%). This is a relevant distinction for genome-wide association studies (GWAS) that attempt to identify AA-specific risk alleles utilizing a YRI reference genome for a genotype imputation template. Our work indicates that the YRI genetic background is less appropriate for our AA patients, and any genetic risk study that utilizes a single inappropriate AFR genome reference could adversely affect the relevance and rigor/reproducibility of the findings. Interestingly, the shared West African origins between Ghanaian and AA patients correspond with MSL ancestry (medians: 24.1% and 19.7%, respectively), which is rarely cited as a genome reference background.
We found that TNBC gene expression signatures associated with the superpopulation AFR ancestry estimates can be distinct from the signatures associated with regional/national AFR ancestry. Interestingly, no overlapping gene signatures were found between YRI and ESN, the predominant African origin of Ghanaians and AAs, respectively. Only YRI- and GWD-associated genes overlap among the West African population–associated genes. The lack of gene overlap suggests that there is a significant population-level divergence of genetic drivers that direct gene expression signatures even within African nations. This was surprising given that the YRI and ESN populations are geographically closer and presumably would have more similar genetic backgrounds that would likely lead to significant gene signature overlaps. However, the only source of ESN ancestry in our cohort is derived from AA patients. These findings all suggest that AFR subpopulations harbor functional/regulatory population-specific genetic drivers of gene expression that are relevant to disease pathology. Of note, AA patients also have unique environmental influences mediating genetic impact on gene signatures (36) compared with African patients, and a lack of gene expression signature overlap may represent the influence of mediating environmental factors.
The 600+ AFR-associated gene signatures effectively separated the node of Ethiopian patients from the Ghanaian and AA patients, revealing a broad distinction of tumor biological traits. The predominant functions of these genes are mechanisms of immune cell trafficking and activation of migration signals, and this was validated in independent cohorts with classic IHC methods, establishing a higher infiltration of tumor-associated leukocytes in the TME of patients with West African ancestry. Our findings are in agreement with previous studies that indicated higher inflammation and immune cell enrichment in race groups of African descent (16, 23). In addition, human evolution studies on general immunologic responses support our TNBC findings, where distinct immune expression signatures have been reported in response to pathogens, including differential immune cell activity between European and African populations (44, 45). We hypothesize that our previous observations, related to TME inflammation in the context of the African-specific Duffy-null blood group status (6, 18), may further account for these ancestry-specific clinical consequences. Interestingly, the heightened immune response was specifically associated with the MSL subgroup ancestry (Fig. 4C–E and Fig. 5B, respectively), implicating a shared MSL ancestral origin of our cohort as the likely source of a genetic factor that modifies immunologic responses. Further studies are needed to untangle the actual alleles that may be MSL-specific and functionally involved in immune responses.
Although these analyses primarily focused on the proof of principle of ancestry conveying an influence on tumor biology, cancer is well known to be influenced by the intersection of genetic and environmental variables (36, 46). Therefore, we also demonstrated the importance of a bimodal approach to cancer disparities research (47) that engages both social and biological variables. To investigate any social determinant effect, we modeled SRR associations with gene expression and were able to detect biological differences that were not associated with genetic ancestry. Although SRR alone does not capture certain social determinants, SRR-associated gene signatures involved canonical pathways that are known to be influenced by individual and area deprivation, suggesting SRR captures, at least in part, the influence of social structure on tumor biology. Our findings emphasize once again a need to assess social constructs and quantify the related racial discrimination practices, such as issues with marginalization and community redlining (48), to establish the impact of these practices on the underlying biology that determines treatment outcomes. It is equally important to note that beyond just characterizing the causal factors of disparities, finding novel biological traits presents an opportunity for therapeutic targets or interventions that incorporate these signatures in clinical treatment decision-making to improve outcomes.
Uncovering AFR-associated gene expression in normal breast tissue that overlaps with gene signatures in tumors indicates that cancer-related gene networks may have baseline differences across the diverse population even before tumors develop (20, 47). We have shown systemic differences in inflammation among patients with cancer and healthy controls that are associated with African-specific alleles (i.e., Duffy-null; ref. 38), which are related to both normal and tumor enrichment of inflammatory response gene expression. It is imperative to understand the extent of population differences in the regulation of gene network baselines that may ultimately sustain tumorigenesis pathways. Several of the baseline AFR-ancestry gene expression changes we detected in normal tissue were switched to opposing directions in tumor tissue. This finding provides preliminary evidence that the ancestry-specific differences in tumor expression could be initiated in response to malignancy as opposed to normal biological variation. For example, the AVPR2 gene is negatively correlated with AFR ancestry in normal tissue but switches to be positively correlated with AFR ancestry in tumors, showing a population-specific pattern of drastic upregulation in TNBC tumors. The function of AVPR2 was defined as maintaining homeostatic levels of water and electrolytes in renal cells but has been detected in multiple cancer types, including breast cancer (49), with both pro- and antimalignant actions depending on the tumor type (49–52). The apparent clinical associations with survival trends appear to only emerge in AAs (Fig. 2I) compared with EA patients and further suggest population-private functionality that could be implicated in disease prognosis or novel therapeutic opportunities.
Similarly, the FNDC3B gene, previously known as the factor for adipocyte differentiation, is an RNA binding protein that has been shown to be predominantly expressed in white adipose tissue and increases expression to play a role in early stages of adipocyte differentiation. It has a normal to tumor tissue expression transition from positive to negative correlation with AFR ancestry, respectively. This effective downregulation could represent an important loss of metabolic pathway regulation, as variants of FNDC3B have been GWAS hits in hemoglobin A1C measurements, body mass index, and waist–hip ratio. Interestingly, there is a higher expression of FNDC3B in TNBC (TCGA) compared with non-TNBC and persistently lower expression in AA compared with EA patients, with a subtype-agnostic association with survival that differs among race groups. Specifically, we observe better survival trending with high FNDC3B expression in EA patients, but low expression is associated with better survival in AA patients. Intriguingly, this suggests that SRR-associated gene network changes could be derived from distinct normal breast biology. However, one key limitation to these observations is the relatively low number of diverse patients in publicly available datasets with normal tissue expression.
As we continue to uncover the genetic regulation and environmental cues that influence TNBC differences across patient populations, our findings provide benchmarks for candidate genes for effective interventions to reduce mortality disparities and even for cancer prevention. Ultimately, the unique tumor traits that hinge upon ancestry-associated gene expression signatures described here represent an important opportunity to fully characterize functional differences in tumor biology and open a path to novel theragnostic options for these highly aggressive tumors.
METHODS
Ancestry Patient Cohort
ICSBCS Patient Cohort.
The ICSBCS biorepository represents the efforts of an international consortium of breast cancer clinicians and researchers with the goal to characterize breast cancer disease in diverse populations worldwide. We have prospectively recruited patients with breast cancer since 2006, from whom formalin-fixed, paraffin-embedded (FFPE) tumor tissue has been collected. Across all institutions, written informed consent was obtained from the patients, and the work has been conducted in accordance with recognized ethical guidelines. Institutional Review Board (IRB) approval for the utilization of biorepository samples was obtained from participating sites in the United States (Weill Cornell Medical College, New York, NY; Henry Ford Health System, Detroit, MI; and University of Michigan, Ann Arbor, MI) and our international African partnering institutions (Komfo Anokye Teaching Hospital, Kumasi, Ghana, and the Millennium Medical College St. Paul's Hospital, Addis Ababa, Ethiopia). In the present study, TNBC tumor tissue was obtained from a total of 45 patients, including nine AAs, three EAs, 12 Ghanaians, and 21 Ethiopians (Supplementary Fig. S1). Confirmation of TNBC diagnosis by IHC was completed for Ghanaian and Ethiopian cases at our ICSBCS U.S. site locations in Michigan (University of Michigan, Henry Ford Health System) and New York (Weill Cornell Medical College). Samples collected in this cohort were used in both the ancestry (n = 61) and gene expression analyses (n = 26; Supplementary Fig. S1; Supplementary Table S6).
University of Alabama at Birmingham Patient Cohort.
The University of Alabama at Birmingham (UAB) TNBC cohort has been previously described (26) and consists of a convenience cohort of retrospective FFPE TNBC tissue collected between 2000 and 2012 at the UAB. Samples were collected and used under the UAB IRB. In the present study, samples were analyzed from 74 patients, including 42 AA and 32 EA patients. Samples in the UAB patient cohort were used in ancestry comparisons and as a validation cohort for gene expression analyses findings (Supplementary Fig. S1).
Englander Institute for Precision Medicine Patient Cohort.
All samples were collected and used under the Weill Cornell Medical College IRB. In the present study, we have estimated ancestry from TNBC tissue of 13 patients, including one AA, six EA, two Asian, and four patients who responded “other” or declined to provide race/ethnicity information (Supplementary Fig. S1).
RNA Extraction from Archival FFPE Tissue
RNA was extracted from archival FFPE tissue using a modified Qiagen RNeasy FFPE kit protocol. Briefly, prior to deparaffinization of the FFPE tissue, the samples were incubated with 1× acidic antigen retrieval solution at 90°C for 5 minutes. Following incubation, samples were cooled to room temperature and any excess paraffin was removed from the tube. We then proceeded through the standard kit protocol. RNA yield was quantified using the Qubit RNA Broad Range kit and Qubit 4.0 fluorometer.
RNA Library Preparation and Sequencing
The quality of each RNA sample was assessed using RNA High-Sensitivity Screen on TapeStation (Agilent Technologies). For RNA sequencing, 100 ng of total RNA was used to construct libraries using the Illumina TruSeq RNA Exome Library Prep Kit following the manufacturer's protocols. The final libraries were then quantified using Agilent D1000 Screen Tape and sequenced with an Illumina MiSeq V2 Micro Kit to assess insert sizes and integrity before sequencing on a high-throughput sequencer. Each library was normalized to 4 nmol/L, pooled, and sequenced on an Illumina NextSeq500 with a High Output Kit (Illumina). All sequencing reads were converted to industry-standard FASTQ files using BCL2FASTQ (version 1.8.4).
RNA-seq Data Processing and Quality Control of Samples
Raw RNA-seq reads were assessed with FastQC (version 0.11.8; https://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and Trimmomatic (ref. 53; version 0.36) was utilized for read trimming and adapter removal. Reads were aligned using HISAT2 (ref. 54; version 2.0.4) with the GRCh37 reference genome. Picard tools (version 2.18.3; https://broadinstitute.github.io/picard/) were used to pull alignment metrics for the samples, in which a number of samples were found to have high levels of read duplication. Duplicate reads were removed using Picard, and only samples that had at least 10M reads after deduplication were utilized in subsequent gene expression analyses.
Ancestry Estimation Using Variants Called from RNA-seq Alignments
Ancestry proportions were determined using the ADMIXTURE v1.3.0 (55) software, which uses a maximum likelihood–based method to estimate the proportion of reference population ancestries in a sample. We genotyped the reference markers generated from 1,964 unrelated 1000 Genomes project (28) samples directly on the RNA-seq samples using GATK pileup. Individuals from populations MXL (Mexican ancestry from Los Angeles in United States), ACB (African Caribbean in Barbados), and ASW (African ancestry in the Southwest United States) were excluded from the reference due to being putatively admixed. The reference was further filtered by using only SNP markers with a minimum minor allele frequency of 0.01 overall and 0.05 in at least one 1000 Genomes superpopulation. Variants were additionally linkage disequilibrium–pruned using PLINK v1.9 (56) with a window size of 500 kb, a step size of 250 kb, and an r² threshold of 0.2, resulting in 122,377 markers remaining. The analysis results in a proportional breakdown of each sample into five superpopulations [AFR, American (AMR), EAS, EUR, and SAS] and 23 subpopulations (Supplementary Table S1).
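The minor allele frequency criterion described above can be illustrated with a short Python sketch. This is not the pipeline used in the study: the function names, the input layout (an overall alternate-allele frequency plus one frequency per superpopulation), and all marker names and frequencies are hypothetical and chosen only to show how the two thresholds combine.

```python
def minor_allele_frequency(alt_freq):
    """Fold an alternate-allele frequency into a minor-allele frequency."""
    return min(alt_freq, 1.0 - alt_freq)


def passes_maf_filter(overall_alt_freq, superpop_alt_freqs,
                      overall_min=0.01, superpop_min=0.05):
    """Keep a marker if MAF >= overall_min overall AND MAF >= superpop_min
    in at least one superpopulation (thresholds as stated in the text)."""
    if minor_allele_frequency(overall_alt_freq) < overall_min:
        return False
    return any(minor_allele_frequency(freq) >= superpop_min
               for freq in superpop_alt_freqs.values())


# Hypothetical markers; every frequency below is invented for illustration.
markers = {
    "snp_common": (0.20, {"AFR": 0.35, "AMR": 0.18, "EAS": 0.10,
                          "EUR": 0.15, "SAS": 0.22}),
    "snp_rare": (0.004, {"AFR": 0.02, "AMR": 0.0, "EAS": 0.0,
                         "EUR": 0.0, "SAS": 0.0}),
    "snp_diffuse": (0.02, {"AFR": 0.03, "AMR": 0.02, "EAS": 0.01,
                           "EUR": 0.02, "SAS": 0.02}),
}

kept = [name for name, (overall, by_pop) in markers.items()
        if passes_maf_filter(overall, by_pop)]
print(kept)
```

In this toy input, only the marker that is common overall and in at least one superpopulation is retained; the linkage disequilibrium pruning step is not reproduced here.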
Gene Expression Quantification and Differential Gene Analysis
StringTie (ref. 54; version 1.3.3) was used to quantify gene expression from our deduplicated aligned reads. Quantified genetic ancestry and SRR groups were used to identify ancestry- or SRR-associated genes in our cohort: linear regression was used to test associations between gene expression and each continuous ancestry variable, and ANOVA was used to test associations with the categorical SRR variable. Genes with P < 0.01 were included in further analyses. Unsupervised hierarchical clustering of our gene lists was completed using JMP Pro 16 (SAS Institute Inc.).
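The continuous-ancestry arm of this analysis can be sketched in Python as follows. Several elements are assumptions made for illustration: the gene names and all values are invented, and the P value is obtained from a permutation test on the least-squares slope (with a fixed seed) rather than from the standard regression t test, purely to keep the example free of external dependencies. The ANOVA arm for SRR is not shown.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation P value for the slope.

    Stand-in for the usual t test: expression values are shuffled across
    patients and the slope is recomputed for each shuffle.
    """
    rng = random.Random(seed)
    observed = abs(ols_slope(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(ols_slope(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical AFR ancestry fractions (one per patient) and expression values.
afr = [0.17, 0.25, 0.40, 0.48, 0.55, 0.62, 0.78, 0.85, 0.93, 0.99]
expression = {
    "gene_up": [1.0, 1.4, 2.1, 2.3, 2.9, 3.0, 3.9, 4.2, 4.8, 5.0],
    "gene_flat": [2.0, 3.1, 1.9, 3.0, 2.1, 2.9, 2.0, 3.0, 2.1, 3.0],
}

results = {gene: (ols_slope(afr, values), permutation_p_value(afr, values))
           for gene, values in expression.items()}

# Apply the P < 0.01 inclusion threshold stated in the text.
associated = [gene for gene, (_, p) in results.items() if p < 0.01]
print(associated)
```

Treating ancestry as a continuous predictor, rather than binning patients by a threshold, is what lets a gene such as the hypothetical `gene_up` be detected across the full admixture range.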
Network Analyses of Differentially Expressed Genes
Ingenuity Pathway Analysis software (Qiagen, version 01-16) was used to determine the involvement of our gene lists in various canonical pathways, to determine upstream regulators, and to draw de novo networks involving our gene lists. For each analysis and gene list, the log fold change was calculated based on the resulting node structure of the samples when the gene lists underwent unsupervised hierarchical clustering, as our ancestry-associated versus SRR gene lists resulted in different clustering patterns of our samples.
Gene Expression Analysis in Nondiseased Breast/Mammary Tissue
To explore our AFR-associated gene signatures in normal mammary tissue, we obtained ancestry estimations and gene expression values [transcripts per million (TPM)] from the GTEx cohort (https://gtexportal.org/home/datasets). African ancestry was determined by principal coordinates as described by Gay and colleagues (57), where PC1 was associated primarily with AFR ancestry. Using PC1 <−0.04 as a threshold for AFR ancestry, we identified 47 individuals of AFR ancestry. Twenty of the 47 individuals of AFR ancestry identified as female, expressed the female-specific long noncoding RNA X-inactive–specific transcript, and expressed no genes from chromosome Y; these individuals were used in subsequent analyses.
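The donor selection described above amounts to a conjunction of simple filters, sketched here in Python. The record layout, field names, and donor values are hypothetical, and the expression cutoffs (XIST TPM above zero, chromosome Y TPM equal to zero) are assumptions, because the exact cutoffs are not stated.

```python
PC1_AFR_THRESHOLD = -0.04  # threshold stated in the text


def select_afr_female_donors(donors):
    """Return IDs of donors with PC1 below the AFR threshold whose reported
    sex is female, who express XIST, and who express no chromosome Y genes."""
    selected = []
    for donor in donors:
        is_afr = donor["pc1"] < PC1_AFR_THRESHOLD
        is_female = donor["sex"] == "female"
        has_xist = donor["xist_tpm"] > 0                      # assumed cutoff
        no_chr_y = all(tpm == 0 for tpm in donor["chr_y_tpm"].values())
        if is_afr and is_female and has_xist and no_chr_y:
            selected.append(donor["id"])
    return selected


# Hypothetical donors; all values are invented for illustration.
donors = [
    {"id": "D1", "pc1": -0.08, "sex": "female", "xist_tpm": 42.0,
     "chr_y_tpm": {"geneY1": 0.0, "geneY2": 0.0}},
    {"id": "D2", "pc1": -0.07, "sex": "male", "xist_tpm": 0.0,
     "chr_y_tpm": {"geneY1": 15.2, "geneY2": 9.8}},
    {"id": "D3", "pc1": 0.01, "sex": "female", "xist_tpm": 38.5,
     "chr_y_tpm": {"geneY1": 0.0, "geneY2": 0.0}},
]

selected = select_afr_female_donors(donors)
print(selected)
```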
Survival Analysis of AFR Ancestry–Associated Genes in TNBC Tumors and Nondiseased Breast/Mammary Tissues
The TCGA BRCA cohort was used to determine if there was any prognostic benefit associated with our AFR ancestry–associated genes found in both TNBC and normal breast tissue (GTEx). Survival data (58) were obtained from https://gdc.cancer.gov/about-data/publications/pancanatlas. We used an upper quartile cutoff to separate patients into high- and low-expression categories, in which differences in survival outcomes were visualized by fitting Kaplan–Meier curves. P values from log-rank tests were reported where significant differences were found.
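The upper quartile split and the Kaplan–Meier estimate can be sketched in Python as below. This is a minimal illustration under stated assumptions: patient identifiers, expression values, and follow-up data are invented; "high" is taken to mean strictly above the 75th percentile (the handling of values equal to the cutoff is not specified in the text); and the log-rank test is omitted.

```python
import statistics


def split_by_upper_quartile(expression):
    """Label each patient 'high' if expression exceeds the 75th percentile."""
    values = list(expression.values())
    q3 = statistics.quantiles(values, n=4, method="inclusive")[2]
    return {pid: ("high" if value > q3 else "low")
            for pid, value in expression.items()}


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up durations; events: 1 if death observed, 0 if censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve


# Hypothetical cohort: expression per patient and (months, event) follow-up.
expression = {f"P{i}": float(i) for i in range(1, 9)}
follow_up = {"P1": (10, 1), "P2": (24, 0), "P3": (30, 1), "P4": (36, 1),
             "P5": (40, 0), "P6": (48, 1), "P7": (12, 1), "P8": (60, 0)}

groups = split_by_upper_quartile(expression)
curves = {}
for label in ("high", "low"):
    ids = [pid for pid, group in groups.items() if group == label]
    curves[label] = kaplan_meier([follow_up[p][0] for p in ids],
                                 [follow_up[p][1] for p in ids])
print(groups)
print(curves)
```

Each curve is a step function that drops only at observed event times, with censored patients reducing the number at risk without lowering the survival estimate.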
Tumor-Associated Immune Cell Abundance in Tumors Using RNA-seq Deconvolution and Enrichment Methods
To determine estimated abundance of tumor-associated immune cell populations, we used the online CIBERSORTx (39) platform (https://cibersortx.stanford.edu/) with our gene expression values as input. The LM22 signature matrix file was used as a reference, and the estimation was completed with quantile normalization disabled (as recommended for RNA-seq data) with 500 permutations. Only CIBERSORTx output that was determined to be significant (P < 0.05) was included in our analyses.
We have additionally used xCell for deconvolution of immune and other cell populations from our bulk RNA-seq data (40). Normalized TPM expression was used as input for the xCell algorithm.
IHC of CD3 and FOXP3
FFPE tumor blocks were obtained from the ICSBCS biorepository. Slide preparations were conducted through the Henry Ford Health System Histology Core using standard operating protocols. From the FFPE blocks, 4-μm sections were obtained. Multiplex staining was done using FOXP3 at a 1:100 dilution (BioLegend, cat. no. 320101) with CD3 predilute (Agilent, IR503) as the antibody diluent. Clinical attributes of the IHC cohort are reported in Supplementary Table S7.
Tumor-Infiltrating Leukocyte Analysis from IHC
Tumor-infiltrating leukocyte markers from multiplex IHC staining were analyzed using HALO software (V3, Indica Labs). Stained slides were electronically scanned using the Leica Aperio scanner and transferred into the HALO program. Positively stained tumor cells were annotated from hematoxylin and eosin staining and matched to a serial section with FOXP3 and CD3 multiplex staining. A custom algorithm optimized to detect color differences between the two markers was used to determine the number of positively stained cells for each marker. Positive tumor cells for each marker were divided by the total number of tumor cells and converted to a percent for subsequent data analysis.
Protein-level Immune Cell Deconvolution of TNBC Cases
The NanoString GeoMx Digital Spatial Profiler was used to analyze immune profiles of four TNBC cases representing two AA and two EA patients. Patients were consented at Tuskegee University under IRB approval. FFPE tissue was sectioned and stained with fluorescently labeled antibodies specific for the epithelial cell marker PanCK, pan-leukocyte marker CD45, and macrophage marker CD68. Data from four to five ROIs were captured per patient, in which segmentation was performed on each ROI to distinguish the stromal and tumor compartments. Immune cell quantifications for each segmented stromal and tumor ROI were determined with the R/Bioconductor package spatialdecon (59).
TNBC Subtyping
To determine TNBC subtypes of our samples, we input gene expression values into the Vanderbilt TNBCtype online tool (https://cbc.app.vumc.org/tnbc/; ref. 42). The TNBC subtypes IM and MSL have been determined to primarily represent infiltrating immune cells and tumor-associated stroma, respectively; therefore, these calls were reassigned to their second most correlated significant call (43). UNS calls arise where multiple correlations are significantly associated with a tumor gene expression profile, and in our cohort, these could be resolved after disregarding IM and MSL calls.
As a supplementary validation method to the gene expression correlation-based Vanderbilt TNBC classification tool, a summarized ranks measure was computed using the original TNBC subtype signatures for all samples using normalized RNA-seq expression data. TNBC subtype signatures were obtained from Lehmann and colleagues (42). Across all samples, all expressed genes were ranked from low to high expression using the rank function in R statistical software, with the minimum rank method used to resolve tied expression values. For each sample, ranks for each gene in the given subtype signature were extracted, and a representative median of ranks for the gene signature was calculated to estimate the overall regulation of the signature with respect to the total expression. The TNBC subtype signature with the maximum median signature rank per sample was the assigned TNBC subtype for the sample.
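The summarized ranks measure can be sketched in Python as follows. The original analysis used the rank function in R; this sketch mirrors its minimum-rank tie handling (`ties.method = "min"`). It assumes that genes are ranked within each sample, which the text does not state explicitly. The gene names, expression values, and signature gene lists are hypothetical placeholders, not the published subtype signatures.

```python
import statistics


def min_ranks(values):
    """Rank values from low to high; tied values share the lowest rank."""
    first_position = {}
    for position, value in enumerate(sorted(values), start=1):
        first_position.setdefault(value, position)
    return [first_position[value] for value in values]


def assign_subtype(sample_expression, signatures):
    """Return (assigned subtype, median rank per signature) for one sample."""
    genes = list(sample_expression)
    ranks = dict(zip(genes, min_ranks([sample_expression[g] for g in genes])))
    medians = {
        subtype: statistics.median(ranks[g] for g in sig_genes if g in ranks)
        for subtype, sig_genes in signatures.items()
    }
    # The signature with the maximum median rank is the assigned subtype.
    return max(medians, key=medians.get), medians


# Hypothetical sample and signatures; values are invented for illustration.
sample = {"g1": 0.5, "g2": 3.0, "g3": 3.0, "g4": 8.0, "g5": 1.0, "g6": 6.0}
signatures = {
    "BL1": ["g4", "g6", "g2"],
    "M": ["g1", "g5", "g3"],
}

subtype, medians = assign_subtype(sample, signatures)
print(subtype, medians)
```

Because the measure depends only on ranks, it is insensitive to the absolute scale of the normalized expression values, which is one reason a rank summary is a reasonable cross-check on a correlation-based classifier.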
Data Availability Statement
Sequencing data from our gene expression cohort can be accessed in the Gene Expression Omnibus (GSE211167).
Authors’ Disclosures
C.C. Yates reports grants from the NCI (U54 CA118623), the NIH/National Institute on Minority Health and Health Disparities (NIMHD; U54-MD007585-26), and the Department of Defense (PC170315P1 and W81XWH-18-1-0589) during the conduct of the study; personal fees from Riptide Biosciences, QED Therapeutics, and Amgen and other support from Riptide Biosciences outside the submitted work; and a patent for peptides having immunomodulatory properties issued and licensed to Riptide Biosciences and Aurinia Pharmaceuticals and a patent for three peptides having anti-inflammatory properties issued to Riptide Biosciences. U. Manne reports grants from the NIH/NCI (5U54CA118948) during the conduct of the study. O. Elemento reports other support from Owkin, Freenome, and OneThree Bio, personal fees and other support from Volastra Therapeutics and Pionyr Immunotherapeutics, and personal fees from Champions Oncology during the conduct of the study. J.D. Carpten reports grants from Susan G. Komen for the Cure and the NCI during the conduct of the study and is a member of the Board of Directors for the American Association for Cancer Research. M.B. Davis reports grants from the NCI and Weill Cornell Medical College during the conduct of the study; grants from Genentech outside the submitted work; and a patent for inflammation and immune markers in cancer pending. No disclosures were reported by the other authors.
Authors’ Contributions
R. Martini: Conceptualization, data curation, formal analysis, investigation, methodology, writing–original draft, writing–review and editing. P. Delpe: Formal analysis, investigation, methodology. T.R. Chu: Data curation, formal analysis, writing–original draft. K. Arora: Formal analysis, investigation, writing–review and editing. B. Lord: Formal analysis, validation, investigation, visualization, writing–original draft. A. Verma: Formal analysis, investigation, methodology, writing–original draft. D. Bedi: Data curation, methodology. B. Karanam: Data curation, methodology. I. Elhussin: Data curation, methodology. Y. Chen: Data curation, validation, investigation. E. Gebregzabher: Resources, investigation. J.K. Oppong: Resources, investigation, project administration. E.K. Adjei: Resources, investigation, methodology, project administration. A. Jibril Suleiman: Resources, investigation. B. Awuah: Resources, investigation, project administration. M.B. Muleta: Resources, investigation, project administration. E. Abebe: Resources, data curation, project administration. I. Kyei: Resources, data curation, investigation. F.S. Aitpillah: Resources, data curation. M.O. Adinku: Resources, data curation. K. Ankomah: Resources, data curation, investigation. E.B. Osei-Bonsu: Resources, investigation. D.A. Chitale: Resources, data curation, investigation, methodology. J.M. Bensenhaver: Resources, data curation, supervision, investigation, project administration. D.S. Nathanson: Resources, funding acquisition, investigation, project administration, writing–review and editing. L. Jackson: Resources, methodology. L.F. Petersen: Resources, investigation, writing–review and editing. E. Proctor: Resources, investigation. B. Stonaker: Resources, data curation, project administration. K.K. Gyan: Resources, project administration. L.D. Gibbs: Formal analysis, investigation, methodology. Z. Manojlovic: Formal analysis, supervision, investigation, methodology, writing–original draft.
R.A. Kittles: Methodology, writing–review and editing. J. White: Resources, validation, investigation, methodology. C.C. Yates: Conceptualization, funding acquisition, validation, writing–review and editing. U. Manne: Resources, funding acquisition, validation, writing–review and editing. K. Gardner: Conceptualization, writing–review and editing. N. Mongan: Writing–review and editing. E. Cheng: Resources, data curation, investigation, methodology. P. Ginter: Resources, data curation, investigation, methodology. S. Hoda: Resources, data curation, investigation, methodology. O. Elemento: Resources, supervision, methodology, writing–review and editing. N. Robine: Data curation, software, formal analysis, supervision, investigation, writing–review and editing. A. Sboner: Data curation, formal analysis, supervision, investigation. J.D. Carpten: Resources, formal analysis, supervision, investigation, methodology, writing–review and editing. L. Newman: Conceptualization, resources, funding acquisition, project administration, writing–review and editing. M.B. Davis: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing.
Acknowledgments
This work was supported by funding from Susan G. Komen (awarded to L. Newman) and R01 CA259396-01 (awarded to M.B. Davis). This work was also supported by U54-MD007585-26 (NIH/NIMHD), U54 CA118623 (NIH/NCI), and Department of Defense grants (PC170315P1 and W81XWH-18-1-0589) awarded to C.C. Yates. These studies were partly supported by 5U54CA118948 (NIH/NCI) and by institutional funds (Department of Pathology and School of Medicine of the UAB) awarded to U. Manne. We acknowledge the help provided by the UAB Tissue Biorepository Shared Facility grant of the UAB O'Neal Comprehensive Cancer Center, P30CA013148. This work was also supported by U54CA233465 for L.D. Gibbs and J.D. Carpten and P30CA014089 for L.D. Gibbs, Z. Manojlovic, and J.D. Carpten. This research was further supported by the University of Southern California Institute of Translational Genomics Keck Genomics Platform. We thank all the members of the International Center for the Study of Breast Cancer Subtypes, from the United States and Africa, for their dedication to our mission. We also extend our most sincere gratitude to all the patients and their families for their contribution and trust in this work. The GTEx Project was supported by the Common Fund of the Office of the Director of the NIH and by the NCI, the National Human Genome Research Institute (NHGRI), the National Heart, Lung, and Blood Institute (NHLBI), the National Institute on Drug Abuse (NIDA), the National Institute of Mental Health (NIMH), and the National Institute of Neurological Disorders and Stroke (NINDS). The data used for the analyses described in this article were obtained from the GTEx Portal in May 2022.
The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Note: Supplementary data for this article are available at Cancer Discovery Online (http://cancerdiscovery.aacrjournals.org/).