Abstract
Genome-wide association studies (GWAS) of cancer have identified more than 700 risk loci, of which approximately 80% were first discovered in European ancestry populations, approximately 15% in East Asians, 3% in multiethnic scans, and less than 1% in African and Latin American populations. These percentages closely mirror the distribution of samples included in the discovery phase of cancer GWAS to date (84% European, 11% East Asian, 4% African, and 1% Latin American ancestry). GWAS in non-European ancestry populations have provided insight into ancestry-specific variation in cancer and have pointed to regions of susceptibility that are of particular importance in certain populations. Uncovering and characterizing cancer risk loci in diverse populations is critical for understanding underlying biological mechanisms and developing future genetic risk prediction models in non-European ancestry populations. New GWAS and continued collaborations will be required to eliminate population inequalities in the number of studies, sample sizes, and variant content on GWAS arrays, and to better align genetic research in cancer to the global distribution of race/ethnicity. Cancer Epidemiol Biomarkers Prev; 27(4); 405–17. ©2018 AACR.
See all articles in this CEBP Focus section, “Genome-Wide Association Studies in Cancer.”
Introduction
Genome-wide association studies (GWAS) have been successful in identifying more than 700 risk loci for cancer. The success of GWAS has been the result of rapid technological advances in highly dense genotyping arrays, with contemporary arrays indirectly capturing over 10 million common variants through enhanced resources of publicly available genetic information such as the 1000 Genomes Project; analytic methods for large-scale imputation; large sample collections; and international collaborations and consortia, enabling meta-analyses of GWAS of tens of thousands of individuals (1). Despite the large number of individuals included in GWAS, these studies have remained largely centered on individuals of European ancestry. There has been growing recognition of the inadequate representation of diverse racial/ethnic populations in GWAS, with concern regarding the translational impact of GWAS findings on populations globally (2, 3). For instance, a recently developed complex trait coalescent-based simulation framework showed that the prediction accuracy of GWAS-identified polygenic risk scores was highly dependent on the genetic similarity to the discovery population (3). Extensive genetic diversity exists across populations, and a substantial fraction of genetic variants are not being examined in GWAS of non-European ancestry populations.
There are stark differences in the burden of certain cancers across racial/ethnic populations. For instance, African American men have roughly a 67% higher incidence rate of prostate cancer than European American men (4). Also, Asian/Pacific Islander men and women experience a 70% and 95% higher incidence rate of liver cancer, respectively, than European American men and women (4). These differences in the burden of cancer across racial/ethnic groups have been attributed to an interplay of genetic, environmental, and social factors. There is growing evidence of population-specific genetic variation that may contribute to the observed racial/ethnic differences in the risk of cancer and cancer-related phenotypes. This is seen with the risk alleles at 8q24 and the higher burden of prostate cancer in African Americans (5, 6) and the missense variant in CREBRF and high prevalence of obesity in Samoans (7).
Here, we describe the novel findings from GWAS of cancer in non-European ancestry populations that have contributed to our understanding of cancer susceptibility. We also highlight the inequality in the availability of cancer GWAS data for non-European versus European ancestry populations, and discuss the next steps needed to further our understanding of cancer susceptibility across all racial/ethnic populations, and the scientific and public health gains in studying diverse groups.
Materials and Methods
We compiled a list of GWAS-discovered cancer risk loci in populations of non-European ancestry, using several approaches. First, we queried the NHGRI-EBI GWAS Catalog (http://www.ebi.ac.uk/gwas/) for all cancer risk loci. Second, we searched PubMed for GWAS of each cancer site (using the terms “GWAS,” “genome-wide association study,” and the cancer site) and reviewed published reports. We included reports published through December 15, 2016, as well as summary information from OncoArray-GWAS of the most common cancers in the U19 GAME-ON Consortium (8). For this review, cancer risk loci were those first discovered through GWAS that included non-European ancestry samples in the discovery stage and met genome-wide statistical significance (P ≤ 5 × 10⁻⁸). If multiple correlated variants were reported in a region, only the most statistically significant signal was included. In risk regions previously identified from GWAS in European ancestry populations, independent signals in non-European ancestry populations were determined through review of the literature (e.g., where investigators reported conditional analyses with the index risk variants and/or by evaluating the correlations between risk variants using data from HapMap or the 1000 Genomes Project). In the cancer site–specific sections below, we describe the scans conducted in non-European ancestry populations as well as the novel cancer risk loci that were discovered. These risk loci and their minor allele frequencies (minor allele as defined by the combined global population) are listed in Table 1. We also present discoveries from GWAS that included a multiethnic discovery stage; in most cases, the non-European ancestry sample represented a small fraction (<10%) of the overall sample. For each cancer site, we also provide a summary (Fig. 1) of the total number of cases included in the discovery stage for four main population groups: African, East Asian, Latin American, and European ancestry.
Every attempt was made to include all GWAS in each population; however, due to the large number of GWAS conducted in European ancestry populations, it is possible that some were missed. Thus, the number of European ancestry cases is likely to be underestimated compared with the other groups.
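The locus selection rule described above (genome-wide significance, then one index variant per set of correlated variants) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used for this review: the `Association` record, the `signal_id` field (a label assumed to have been assigned beforehand to each group of correlated variants, for example from conditional analyses or linkage disequilibrium review), and the toy records are all hypothetical.

```python
from dataclasses import dataclass

# Genome-wide significance threshold stated in the text (P <= 5 x 10^-8).
GENOME_WIDE_P = 5e-8


@dataclass(frozen=True)
class Association:
    """One reported variant-cancer association (hypothetical record layout)."""
    signal_id: str   # assumed label shared by correlated variants in a region
    snp: str
    p_value: float


def select_index_variants(associations, threshold=GENOME_WIDE_P):
    """Keep the most significant genome-wide significant variant per signal."""
    best = {}
    for assoc in associations:
        if assoc.p_value > threshold:
            continue  # does not meet genome-wide significance
        current = best.get(assoc.signal_id)
        if current is None or assoc.p_value < current.p_value:
            best[assoc.signal_id] = assoc
    # Return the retained index variants, strongest association first.
    return sorted(best.values(), key=lambda a: a.p_value)


# Toy records with made-up identifiers and P values, for illustration only.
toy_records = [
    Association("signal_1", "snpA", 3e-9),
    Association("signal_1", "snpB", 1e-12),  # correlated with snpA, stronger
    Association("signal_2", "snpC", 4e-8),
    Association("signal_3", "snpD", 6e-8),   # misses the threshold
]
index_variants = select_index_variants(toy_records)
```

Independent signals within the same chromosomal region would carry different `signal_id` values under this assumption, so each would be retained.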
Cancer site / Chr | SNP ID | Minor allele | Minor allele frequency (MAF) (a), global | MAF, African ancestry | MAF, Latin American ancestry | MAF, East Asian ancestry | MAF, European ancestry | Discovery population (c) | OR (b) | P (b) | Reference | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Prostate | ||||||||||||
1q32.1 | rs1775148 | T | 0.49 | 0.30 | 0.43 | 0.51 | 0.64 | ME | 1.06 | 3.80E−08 | 9 | |
3p11.2 | rs2055109 | C | 0.15 | 0.13 | 0.14 | 0.07 | 0.23 | EA | 1.20 | 3.90E−08 | 14 | |
5p15.33 | rs12653946 | T | 0.44 | 0.40 | 0.50 | 0.35 | 0.45 | EA | 1.26 | 3.90E−18 | 13 | |
6q14.1 | rs9443189 | G | 0.33 | 0.59 | 0.14 | 0.38 | 0.14 | ME | 1.08 | 3.90E−08 | 9 | |
6q22.1 | rs339331 | C | 0.32 | 0.24 | 0.26 | 0.40 | 0.31 | EA | 1.22 | 1.60E−12 | 13 | |
6q25.3 | rs4646284 | TG | 0.30 | 0.39 | 0.26 | 0.30 | 0.28 | ME | 1.16 | 5.40E−12 | 18 | |
8q24.21 | rs111906932 | A | 0.01 | 0.03 | 0.00 | 0.00 | 0.00 | AFR | 1.79 | 1.39E−13 | 5 | |
8q24.21 | rs114798100 | G | 0.01 | 0.05 | 0.00 | 0.00 | 0.00 | AFR | 2.31 | 2.38E−40 | 5 | |
9q31.2 | rs817826 | C | 0.20 | 0.35 | 0.11 | 0.12 | 0.14 | EA | 1.41 | 5.50E−14 | 15 | |
10q26.12 | rs2252004 | A | 0.35 | 0.59 | 0.30 | 0.32 | 0.09 | EA | 1.16 | 2.00E−08 | 14 | |
11p15.4 | rs12791447 | G | 0.10 | 0.01 | 0.20 | 0.12 | 0.06 | EA | 1.23 | 3.60E−08 | 16 | |
11q12.1 | rs1938781 | G | 0.28 | 0.39 | 0.25 | 0.29 | 0.20 | EA | 1.16 | 1.10E−10 | 14 | |
13q22.1 | rs9600079 | T | 0.46 | 0.52 | 0.38 | 0.45 | 0.47 | EA | 1.18 | 2.80E−09 | 13 | |
14q23.2 | rs58262369 | T | 0.05 | 0.06 | 0.08 | 0.08 | 0.00 | EA | 1.28 | 6.10E−10 | 16 | |
14q23.1 | rs7153648 | C | 0.24 | 0.39 | 0.12 | 0.18 | 0.08 | ME | 1.11 | 2.00E−09 | 9 | |
16q22.2 | rs12051443 | A | 0.41 | 0.25 | 0.52 | 0.73 | 0.33 | ME | 1.06 | 3.00E−08 | 9 | |
17q21.33 | rs7210100 | A | 0.02 | 0.06 | 0.00 | 0.00 | 0.00 | AFR | 1.51 | 3.40E−13 | 11 | |
19q13.42 | rs103294 | T | 0.22 | 0.07 | 0.24 | 0.51 | 0.18 | EA | 1.28 | 5.30E−16 | 15 | |
20q13.13 | rs12480328 | C | 0.10 | 0.16 | 0.06 | 0.09 | 0.07 | ME | 1.13 | 4.60E−11 | 9 | |
21q22.3 | rs1041449 | G | 0.34 | 0.39 | 0.49 | 0.20 | 0.41 | ME | 1.06 | 2.80E−08 | 9 | |
22q11.21 | rs2238776 | A | 0.21 | 0.02 | 0.27 | 0.43 | 0.20 | ME | 1.08 | 1.80E−08 | 9 | |
Breast | ||||||||||||
1p22.3 | rs12118297 | T | 0.23 | 0.22 | 0.13 | 0.27 | 0.20 | EA | 1.10 | 4.50E−08 | 29 | |
1q32.1 | rs4951011 | G | 0.20 | 0.05 | 0.33 | 0.31 | 0.15 | EA | 1.09 | 8.80E−09 | 28 | |
2q34 | rs13393577 | C | 0.11 | 0.15 | 0.09 | 0.04 | 0.10 | EA | 1.53 | 8.80E−14 | 24 | |
3q26.31 | rs13074711 | C | 0.18 | 0.39 | 0.10 | 0.14 | 0.11 | AFR | 1.29 | 1.80E−08 | 31 | |
5q14.3 | rs10474352 | T | 0.34 | 0.39 | 0.19 | 0.49 | 0.14 | EA | 1.09 | 1.67E−09 | 28 | |
5p15.33 | rs10069690 | T | 0.35 | 0.66 | 0.22 | 0.17 | 0.28 | ME | 1.18 | 1.00E−10 | 32 | |
6q14.1 | rs17530068 | C | 0.18 | 0.04 | 0.24 | 0.25 | 0.23 | ME | 1.12 | 1.10E−09 | 35 | |
6q25.1 | rs140068132 | G | 0.02 | 0.00 | 0.12 | 0.00 | 0.00 | LA | 1.67 | 9.00E−18 | 34 | |
6q25.1 | rs2046210 | A | 0.41 | 0.66 | 0.27 | 0.36 | 0.32 | EA | 1.29 | 2.00E−15 | 26 | |
6q25.1 | rs9485372 | A | 0.24 | 0.20 | 0.19 | 0.42 | 0.16 | EA | 1.11 | 3.80E−12 | 27 | |
10q21.2 | rs10822013 | T | 0.42 | 0.18 | 0.52 | 0.50 | 0.50 | EA | 1.10 | 5.90E−09 | 144 | |
15q26.1 | rs2290203 | A | 0.38 | 0.44 | 0.30 | 0.51 | 0.21 | EA | 1.08 | 4.25E−08 | 28 | |
20q11.22 | rs2284378 | T | 0.20 | 0.13 | 0.35 | 0.17 | 0.29 | ME | 1.16 | 1.10E−08 | 35 | |
21q22.12 | rs16992204 | C | 0.05 | 0.06 | 0.02 | 0.14 | 0.00 | EA | 1.13 | 4.60E−08 | 29 | |
22q13.1 | APOBEC3 deletion | | 0.00 | 0.00 | 0.00 | 0.31 | 0.10 | EA | 1.35 | 9.60E−22 | 25, 145 | |
Colorectal | ||||||||||||
5q23.3 | rs12522693 | A | 0.11 | 0.03 | 0.10 | 0.14 | 0.15 | EA | 1.31 | 2.08E−08 | 47 | |
5q31.1 | rs647161 | C | 0.46 | 0.48 | 0.37 | 0.70 | 0.34 | EA | 1.11 | 1.22E−10 | 45 | |
6p21.1 | rs4711689 | G | 0.23 | 0.06 | 0.29 | 0.20 | 0.46 | EA | 1.11 | 3.92E−08 | 46 | |
8q23.3 | rs2450115 | C | 0.26 | 0.18 | 0.25 | 0.46 | 0.19 | EA | 1.12 | 1.24E−12 | 46 | |
8q23.3 | rs6469656 | G | 0.19 | 0.17 | 0.21 | 0.33 | 0.11 | EA | 1.11 | 2.03E−12 | 46 | |
10q24.32 | rs4919687 | A | 0.19 | 0.11 | 0.20 | 0.25 | 0.30 | EA | 1.14 | 7.82E−12 | 46 | |
10q25.2 | rs12241008 | C | 0.19 | 0.21 | 0.07 | 0.30 | 0.10 | ME | 1.13 | 1.40E−09 | 44 | |
10q25.2 | rs11196172 | A | 0.26 | 0.05 | 0.16 | 0.65 | 0.12 | EA | 1.14 | 1.04E−12 | 38 | |
10q22.3 | rs704017 | G | 0.45 | 0.56 | 0.38 | 0.28 | 0.56 | EA | 1.10 | 2.07E−08 | 38 | |
11q12.2 | rs174537 | T | 0.30 | 0.02 | 0.59 | 0.57 | 0.35 | EA | 1.16 | 9.22E−21 | 38 | |
12p13.31 | rs10849432 | C | 0.17 | 0.30 | 0.20 | 0.19 | 0.10 | EA | 1.14 | 5.81E−10 | 38 | |
12p13.31 | rs11064437 | T | 0.16 | 0.31 | 0.11 | 0.28 | 0.01 | EA | 1.12 | 4.48E−11 | 46 | |
12p13.32 | rs10774214 | T | 0.45 | 0.66 | 0.46 | 0.32 | 0.38 | EA | 1.09 | 3.06E−08 | 45 | |
17p13.3 | rs12603526 | C | 0.05 | 0.00 | 0.02 | 0.20 | 0.01 | EA | 1.10 | 3.42E−08 | 38 | |
17q12 | rs17836917 | A | 0.05 | 0.01 | 0.07 | 0.11 | 0.02 | EA | 0.75 | 4.55E−08 | 47 | |
18q21.1 | rs7229639 | A | 0.13 | 0.21 | 0.11 | 0.13 | 0.10 | EA | 1.22 | 2.93E−11 | 42 | |
19q13.2 | rs1800469 | A | 0.37 | 0.22 | 0.46 | 0.55 | 0.31 | EA | 1.09 | 1.17E−08 | 38 | |
20p12.3 | rs2423279 | C | 0.36 | 0.39 | 0.29 | 0.33 | 0.27 | EA | 1.10 | 6.64E−09 | 45 | |
Lung | ||||||||||||
3q28 | rs4488809 | C | 0.42 | 0.28 | 0.36 | 0.52 | 0.49 | EA | 1.26 | 7.00E−26 | 58 | |
3q28 | rs10937405 | T | 0.32 | 0.27 | 0.29 | 0.31 | 0.43 | EA | 1.31 | 7.26E−12 | 58 | |
3q29 | rs2131877 | A | 0.30 | 0.21 | 0.25 | 0.57 | 0.19 | EA | 1.30 | 2.00E−08 | 146 | |
5p15.33 | rs465498 | G | 0.36 | 0.59 | 0.31 | 0.19 | 0.44 | EA | 1.33 | 1.20E−20 | 59 | |
5p15.33 | rs2736100 | C | 0.48 | 0.47 | 0.43 | 0.41 | 0.50 | EA | 1.27 | 2.91E−11 | 58 | |
6p21.32 | rs3817963 | C | 0.26 | 0.17 | 0.33 | 0.23 | 0.28 | EA | 1.18 | 2.70E−10 | 56 | |
6p21.32 | rs2395185 | T | 0.29 | 0.19 | 0.41 | 0.31 | 0.32 | EA | 1.17 | 1.00E−08 | 62 | |
6q22.1 | rs9387478 | C | 0.36 | 0.10 | 0.56 | 0.50 | 0.50 | EA | 1.18 | 4.00E−10 | 62 | |
9p21.3 | rs72658409 | T | 0.13 | 0.19 | 0.05 | 0.08 | 0.07 | EA | 0.77 | 1.41E−10 | 61 | |
10q25.2 | rs7086803 | A | 0.26 | 0.64 | 0.16 | 0.29 | 0.03 | EA | 1.28 | 4.00E−18 | 62 | |
12q13.13 | rs11610143 | G | 0.22 | 0.04 | 0.52 | 0.28 | 0.21 | EA | 0.89 | 4.96E−09 | 61 | |
12q23.1 | rs12296850 | G | 0.14 | 0.11 | 0.07 | 0.26 | 0.06 | EA | 1.28 | 1.19E−10 | 60 | |
13q12.12 | rs753955 | G | 0.44 | 0.14 | 0.57 | 0.34 | 0.63 | EA | 1.18 | 2.00E−12 | 59 | |
17q24.2 | rs7216064 | A | 0.32 | 0.18 | 0.27 | 0.69 | 0.21 | EA | 1.20 | 7.40E−11 | 56 | |
22q12.2 | rs17728461 | G | 0.22 | 0.07 | 0.26 | 0.19 | 0.32 | EA | 1.20 | 1.10E−11 | 59 | |
22q12.2 | rs36600 | T | 0.18 | 0.09 | 0.30 | 0.09 | 0.27 | EA | 1.29 | 6.00E−13 | 59 | |
Ovarian | ||||||||||||
9q22.33 | rs1413299 | G | 0.42 | 0.27 | 0.47 | 0.41 | 0.60 | EA | 1.20 | 1.88E−08 | 64 | |
10p11.21 | rs1192691 | G | 0.24 | 0.14 | 0.11 | 0.61 | 0.08 | EA | 1.23 | 2.62E−08 | 64 | |
Cervical | ||||||||||||
4q12 | rs13117307 | T | 0.15 | 0.12 | 0.12 | 0.10 | 0.26 | EA | 1.26 | 9.69E−09 | 72 | |
17q12 | rs8067378 | G | 0.43 | 0.56 | 0.41 | 0.28 | 0.51 | EA | 1.18 | 2.00E−08 | 72 | |
Bladder | ||||||||||||
5q12.3 | rs2042329 | T | 0.34 | 0.56 | 0.41 | 0.10 | 0.41 | EA | 1.40 | 4.61E−11 | 73 | |
15q24.1 | rs11543198 | A | 0.14 | 0.04 | 0.36 | 0.25 | 0.03 | EA | 1.14 | 4.03E−09 | 74 | |
Hepatocellular | ||||||||||||
1p36.22 | rs17401966 | G | 0.23 | 0.06 | 0.32 | 0.29 | 0.31 | EA | 1.64 | 2.00E−18 | 80 | |
2q32.3 | rs7574865 | T | 0.26 | 0.12 | 0.36 | 0.35 | 0.23 | EA | 1.21 | 2.00E−10 | 81 | |
6p21.32 | rs9275572 | A | 0.34 | 0.40 | 0.31 | 0.25 | 0.40 | EA | 1.30 | 6.00E−09 | 83 | |
6p21.32 | rs9275319 | G | 0.15 | 0.11 | 0.29 | 0.13 | 0.16 | EA | 1.49 | 3.00E−17 | 81 | |
6p21.32 | rs9272105 | G | 0.48 | 0.42 | 0.47 | 0.53 | 0.45 | EA | 1.28 | 5.00E−22 | 82 | |
6p21.33 | rs2596542 | T | 0.42 | 0.54 | 0.52 | 0.27 | 0.40 | EA | 1.39 | 4.00E−13 | 83 | |
21q21.3 | rs455804 | A | 0.29 | 0.42 | 0.23 | 0.31 | 0.24 | EA | 1.19 | 5.00E−10 | 82 | |
Gastric | ||||||||||||
1q22 | rs4072037 | C | 0.37 | 0.40 | 0.29 | 0.24 | 0.45 | EA | 1.33 | 4.00E−07 | 85 | |
1q22 | rs80142782 | C | 0.03 | 0.00 | 0.00 | 0.13 | 0.00 | EA | 1.61 | 2.00E−19 | 90 | |
3q13.31 | rs9841504 | G | 0.19 | 0.33 | 0.18 | 0.15 | 0.07 | EA | 1.32 | 2.00E−09 | 84 | |
5p13.1 | rs13361707 | T | 0.36 | 0.36 | 0.25 | 0.55 | 0.29 | EA | 1.41 | 8.00E−29 | 84 | |
5q14.3 | rs7712641 | T | 0.45 | 0.67 | 0.45 | 0.46 | 0.36 | EA | 1.19 | 1.00E−11 | 90 | |
6p21.1 | rs2294693 | C | 0.29 | 0.34 | 0.26 | 0.27 | 0.18 | EA | 1.14 | 7.00E−08 | 88 | |
6p21.1 | rs2494938 | A | 0.48 | 0.56 | 0.38 | 0.27 | 0.52 | EA | 1.18 | 5.00E−09 | 89 | |
8q24.3 | rs2976392 | A | 0.41 | 0.37 | 0.51 | 0.34 | 0.45 | EA | 1.62 | 1.11E−09 | 86 | |
10q23.33 | rs2274223 | G | 0.30 | 0.38 | 0.21 | 0.24 | 0.34 | EA | 1.31 | 8.40E−09 | 85 | |
Pancreatic | ||||||||||||
5p13.1 | rs2255280 | C | 0.10 | 0.00 | 0.10 | 0.40 | 0.00 | EA | 1.23 | 4.00E−10 | 93 | |
10q26.11 | rs12413624 | A | 0.30 | 0.10 | 0.41 | 0.37 | 0.43 | EA | 1.23 | 5.00E−11 | 93 | |
21q21.3 | rs372883 | T | 0.43 | 0.37 | 0.46 | 0.54 | 0.49 | EA | 1.27 | 2.00E−13 | 93 | |
21q22.3 | rs1547374 | G | 0.36 | 0.36 | 0.34 | 0.46 | 0.31 | EA | 1.27 | 4.00E−13 | 93 | |
22q13.32 | rs5768709 | G | 0.31 | 0.31 | 0.25 | 0.23 | 0.39 | EA | 1.25 | 1.00E−10 | 93 | |
Esophageal | ||||||||||||
2q33.1 | rs13016963 | A | 0.36 | 0.42 | 0.51 | 0.31 | 0.38 | EA | 1.29 | 7.63E−10 | 100 | |
4q23 | rs1229984 | T | 0.16 | 0.00 | 0.06 | 0.70 | 0.03 | EA | 1.79 | 8.00E−24 | 96 | |
5q31.2 | rs7447927 | G | 0.43 | 0.08 | 0.51 | 0.38 | 0.69 | EA | 1.18 | 8.00E−20 | 99 | |
5q11.2 | rs10052657 | A | 0.15 | 0.11 | 0.19 | 0.13 | 0.17 | EA | 1.49 | 2.00E−19 | 97 | |
6p21.1 | rs10484761 | C | 0.27 | 0.48 | 0.24 | 0.13 | 0.24 | EA | 1.33 | 7.00E−12 | 97 | |
10q23.33 | rs2274223 | G | 0.30 | 0.38 | 0.21 | 0.24 | 0.34 | EA | 1.47 | 2.81E−09 | 98 | |
10q23.33 | rs3765524 | T | 0.31 | 0.46 | 0.21 | 0.24 | 0.31 | EA | 1.35 | 2.00E−09 | 85 | |
12q24.12 | rs671 | A | 0.04 | 0.00 | 0.00 | 0.17 | 0.00 | EA | 1.67 | 3.00E−24 | 96 | |
12q24.13 | rs2074356 | A | 0.03 | 0.00 | 0.00 | 0.13 | 0.00 | EA | 1.56 | 2.00E−31 | 97 | |
16q12.1 | rs4785204 | T | 0.13 | 0.10 | 0.10 | 0.26 | 0.07 | EA | 1.24 | 2.00E−20 | 99 | |
17p13.1 | rs1642764 | T | 0.42 | 0.24 | 0.47 | 0.59 | 0.52 | EA | 1.14 | 3.00E−13 | 99 | |
21q22.12 | rs2014300 | A | 0.26 | 0.55 | 0.29 | 0.12 | 0.15 | EA | 1.43 | 8.00E−22 | 97 | |
22q12 | rs738722 | T | 0.30 | 0.45 | 0.34 | 0.23 | 0.29 | EA | 1.3 | 1.00E−08 | 85 | |
Oral and/or oropharynx | ||||||||||||
4q23 | rs1229984 | T | 0.16 | 0.00 | 0.06 | 0.70 | 0.03 | ME | 0.56 | 2.29E−15 | 101 | |
6p21.32 | rs3828805 | T | 0.18 | 0.27 | 0.14 | 0.11 | 0.22 | ME | 1.28 | 3.35E−13 | 101 | |
10q26.13 | rs201982221 (d) | A | | | | | | ME | 1.67 | 1.58E−09 | 101 | |
11p15.4 | rs1453414 | C | 0.13 | 0.04 | 0.12 | 0.23 | 0.11 | ME | 1.19 | 4.78E−08 | 101 | |
2p23.3 | rs6547741 | G | 0.38 | 0.47 | 0.32 | 0.09 | 0.50 | ME | 0.83 | 3.97E−08 | 101 | |
5p15.33 | rs10462706 | T | 0.20 | 0.01 | 0.31 | 0.40 | 0.15 | ME | 0.74 | 5.54E−10 | 101 | |
9p21.3 | rs8181047 | A | 0.15 | 0.01 | 0.14 | 0.11 | 0.31 | ME | 1.24 | 3.8E−09 | 101 | |
9q34.12 | rs928674 | G | 0.17 | 0.22 | 0.09 | 0.15 | 0.14 | ME | 1.33 | 2.09E−08 | 101 | |
Nasopharyngeal | ||||||||||||
3p22.2 | rs189897 | A | 0.10 | 0.01 | 0.15 | 0.06 | 0.18 | EA | 3.18 | 6.85E−08 | 105 | |
3q26.2 | rs6774494 | A | 0.48 | 0.63 | 0.56 | 0.60 | 0.32 | EA | 0.84 | 1.34E−08 | 103 | |
5p15.33 | rs31489 | A | 0.32 | 0.49 | 0.28 | 0.19 | 0.41 | EA | 0.81 | 6.30E−13 | 108 | |
6p22.1 | rs417162 | C | 0.32 | 0.30 | 0.23 | 0.31 | 0.37 | EA | 0.63 | 1.05E−11 | 102 | |
9p21.3 | rs1412829 | G | 0.18 | 0.01 | 0.19 | 0.09 | 0.41 | EA | 0.8 | 2.80E−08 | 108 | |
13q12.12 | rs9510787 | G | 0.18 | 0.04 | 0.16 | 0.32 | 0.26 | EA | 1.20 | 1.53E−09 | 103 | |
16p13.13 | rs6498114 | G | 0.28 | 0.11 | 0.16 | 0.57 | 0.24 | EA | 0.87 | 4.01E−09 | 107 | |
Laryngeal squamous cell | ||||||||||||
6p21.33 | rs2857595 | A | 0.32 | 0.44 | 0.29 | 0.50 | 0.16 | EA | 0.78 | 2.43E−15 | 109 | |
11q12.2 | rs174549 | A | 0.28 | 0.02 | 0.57 | 0.55 | 0.30 | EA | 0.73 | 1.00E−20 | 109 | |
12q24.21 | rs10492336 | A | 0.17 | 0.03 | 0.16 | 0.20 | 0.32 | EA | 0.71 | 4.48E−14 | 109 | |
Non-Hodgkin lymphoma | ||||||||||||
3q27.3 | rs6773854 | C | 0.20 | 0.30 | 0.16 | 0.17 | 0.21 | EA | 1.44 | 3.40E−13 | 114 | |
Acute lymphoblastic leukemia | ||||||||||||
10p12.2 | rs7088318 | C | 0.41 | 0.66 | 0.25 | 0.39 | 0.40 | ME | 1.42 | 1.10E−11 | 119 | |
16q23.1 | rs1121404 | C | 0.47 | 0.53 | 0.59 | 0.31 | 0.55 | EA | 1.38 | 5.30E−10 | 118 | |
T-cell lymphoma | ||||||||||||
6p21.32 | rs9277378 | G | 0.49 | 0.70 | 0.34 | 0.61 | 0.31 | EA | 1.84 | 4.21E−19 | 115 |
aThe minor alleles and frequencies for these alleles were extracted from the 1000 Genomes Project (https://mathgen.stats.ox.ac.uk/impute/1000GP_Phase3.html).
bThe ORs and P values presented relate to the discovery population and were extracted from the cited reference.
cAFR, African ancestry; EA, East Asians; LA, Latin Americans; ME, multiethnic.
drs201982221 is not in 1000 Genomes database; reported A allele frequency in controls = 0.06 (101).
Throughout this review, context for terms that can represent a place, a person, or an ancestry will be provided. In the rare instances where it is not specified, it can be assumed that the term is referring to ancestry and not geography. Also, the term “ancestry-specific variant” refers to variants that are extremely rare in other ancestral populations and are largely “population-private.”
Results
Prostate cancer
As the most common cancer in men, prostate cancer has been the subject of more and larger GWAS in non-European ancestry populations than many other cancer sites, with five scans including men of African ancestry, seven scans including men of East Asian ancestry, and three scans including men of Latin American ancestry. Of the more than 170 risk loci identified for prostate cancer (9, 10), the vast majority were initially reported in men of European ancestry, with the largest scan including 81,318 prostate cancer cases and 61,074 controls of European ancestry (OncoArray, personal communication, F. Schumacher). Twenty-one loci were first discovered in GWAS or in meta-analyses of GWAS that included one or more populations of non-European ancestry. Genetic studies in the African Ancestry Prostate Cancer (AAPC) Consortium (>4,853 prostate cancer cases and 4,678 controls) identified two African-specific variants at 8q24 and one at 17q21.33/ZNF652, in a region where a later GWAS in men of European ancestry found evidence of association with a separate variant (5, 10, 11). The only GWAS in exclusively African men was in a small sample of Ghanaians (474 prostate cancer cases and 458 controls) that reported loci specific for high- and low-risk prostate cancer that were not replicated in AAPC (12). GWAS of prostate cancer in Japanese men have discovered six loci at 3p11.2, 5p15.33, 6q22.1, 10q26.12, 11q12.1, and 13q22.1 (13, 14), whereas a GWAS in Chinese men reported two additional loci at 9q31.2 and 19q13.4 (15). A meta-analysis of GWAS in Japanese and Chinese men, which included these prior studies (3,000 prostate cancer cases and 4,394 controls), revealed two additional risk loci at 11p15.4 and 14q23.2 (16). The only Latin American-specific scan (1,043 prostate cancer cases and 1,057 controls) was underpowered to detect novel loci of modest risk (17). 
Multiethnic scans that included many of these populations were successful in discovering novel risk loci, with 7 loci (1q32.1, 6q14.1, 14q23.1, 16q22.2, 20q13.13, 21q22.3, and 22q11.21) identified in 43,303 prostate cancer cases and 43,737 controls (∼80% of European ancestry) in the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL)/Elucidating Loci Involved in Prostate Cancer Susceptibility (ELLIPSE) Consortium (9) and another locus at 6q25.3 identified in a sample of 7,783 cases and 38,595 controls (80% of European ancestry) in the Kaiser Permanente GWAS (18). Of the nearly 100,000 prostate cancer cases included in the discovery stage of GWAS, 85% have been of European ancestry, 11% of African ancestry, 3% of East Asian ancestry, and 1% of Latin American ancestry.
Breast cancer
More than 170 breast cancer risk loci have been identified in GWAS of overall breast cancer, or estrogen receptor-positive (ER+) and -negative (ER−) disease (19–21), with the most recent scan in the Breast Cancer Association Consortium (BCAC)/Discovery, Biology, and Risk of Inherited Variants in Breast Cancer (DRIVE) Consortium discovering approximately 40% (n = 71) of these loci (OncoArray, personal communication, D. Easton). Fourteen of these loci (8%) were first discovered in GWAS in non-European ancestry populations. GWAS of breast cancer in East Asians (Japanese, Chinese, and Koreans) have reported 11 novel loci (22–29), including a common deletion in the APOBEC3 gene (25), with the largest meta-analysis from the Asian Breast Cancer Consortium (7,619 breast cancer cases and 6,286 controls) reporting the two latest discoveries at 1p22.3 and 21q22.12 (29). Two GWAS of breast cancer in women of African ancestry have been conducted, in the African American Breast Cancer (AABC) Consortium and the GWAS in Breast Cancer in the African Diaspora (ROOT) Consortium (30, 31). In 2011, a GWAS meta-analysis of ER− breast cancer in AABC and, in women of European ancestry, the Triple-Negative Breast Cancer Consortium (TNBCC) discovered a variant near the TERT gene at 5p15.33 that is specific for ER− breast cancer and is more common in women of African than European ancestry (allele frequency = 0.6 vs. 0.3; ref. 32). A recent meta-analysis of AABC and ROOT (4,673 breast cancer cases and 4,774 controls), with replication in the African American AMBER Consortium (33), reported another locus for ER− disease at 3q26.31 (31). In the only scan of Latin Americans (3,140 breast cancer cases and 8,184 controls), a novel protective variant that originates from Native Americans was identified at 6q25.1 (34), near the Estrogen Receptor 1 gene (ESR1) in a region previously discovered in Chinese (26). 
This variant is common (>5%) in Latin Americans, is of low frequency (<1%) in East Asians, and is absent in populations of European and African ancestry. Multiethnic scans that have included many of these populations were also successful in discovering a novel locus at 6q14.1 for overall breast cancer and at 20q11.22 for ER− disease (35), although the latter region has not been replicated in subsequent scans. More than 130,000 breast cancer cases have been included in the discovery stage of GWAS, with 89% of European ancestry, 6% of East Asian ancestry, 3% of African ancestry, and 2% of Latin American ancestry.
Colorectal cancer
More than 80 risk loci for colorectal cancer have been identified by GWAS, with the majority of studies conducted among European ancestry populations (36–40). In the largest GWAS of colorectal cancer to date, based on a European ancestry discovery set (36,948 colorectal cancer cases and 30,864 controls), 11 new loci were identified (OncoArray, personal communication, S. Gruber). Among non-European ancestry populations, there have been several GWAS of colorectal cancer in East Asians (Chinese, Japanese, and Koreans; refs. 38, 41–47), with only one scan of African Americans (44) and another scan of Latin Americans (48). In East Asians, risk loci have been discovered at 5q31.1, 10q22.3, 10q25.2, 11q12.2, 12p13.3, 17p13.3, 18q21.1, 19q13.2, and 20p12.3 (38, 42, 44, 45, 47). The largest GWAS of colorectal cancer in East Asians was conducted in the Asia Colorectal Cancer Consortium, with 8,027 colorectal cancer cases and 22,557 controls in the discovery set and 11,044 colorectal cancer cases and 12,047 controls in the replication set; it identified four novel risk loci for colorectal cancer (6p21.1, 8q23.3, 10q24.3, and 12p13.3) as well as two new variants in previously identified risk regions (10q25.2 and 20q13.3; ref. 46). In a multiethnic GWAS of colorectal cancer that included a discovery sample of Japanese (2,627 colorectal cancer cases and 3,797 controls) and African American subjects (1,894 colorectal cancer cases and 4,703 controls), a novel risk locus was found at 10q25 (44). For Latin American ancestry populations, there has been one GWAS of colorectal cancer, which included 1,611 colorectal cancer cases and 4,300 controls (48). In this study, no novel risk loci reached the threshold of genome-wide significance, but suggestive associations (P < 10⁻⁶) were identified at 1p34.3, 2q23.3, 14q24.2, and 16q12.2. 
Close to 37,000 colorectal cancer cases have been included in the discovery stage of GWAS, with 80% of European ancestry, 4% of African ancestry, 16% of East Asian ancestry, and 4% of Latin American ancestry.
Lung cancer
More than 40 risk loci for overall lung cancer and/or lung cancer histologic cell types have been identified, with the first GWAS conducted in populations of European ancestry (49–54). To date, the largest scan was conducted in a study of 29,266 lung cancer cases and 56,450 controls of European ancestry, which identified 10 novel loci (OncoArray, personal communication, C. Amos). Among GWAS or GWAS meta-analyses of lung cancer in non-European ancestry groups, one was conducted in African Americans and seven were conducted in East Asians (Chinese, Japanese, and Koreans). In the only African American GWAS (1,737 lung cancer cases and 3,602 controls), a signal at 15q25.1 (rs2036527), in the region that encodes the alpha-nicotinic acetylcholine receptor (CHRNA3-CHRNB4-CHRNA5), and one at 5p15.33 (rs2853677), in the region that encodes telomerase reverse transcriptase (TERT; ref. 55), were identified, confirming the risk loci reported in earlier GWAS of European and East Asian ancestry populations (49, 51, 56). No novel loci were identified in this GWAS of African Americans (55). In East Asians, variants in the 15q25.1 CHRNA3-CHRNB4-CHRNA5 region were not associated with lung cancer at a genome-wide significance level. However, variant rs2736100 at 5p15.33 TERT, which was previously found to be a suggestive association in European ancestry populations (correlated with rs2853677 in East Asians; P > 10⁻⁷; ref. 57), reached genome-wide significance in a Japanese study with 1,026 adenocarcinoma cases (58). An additional novel independent signal at 5p15.33 TERT, rs465498, has also been identified for overall lung cancer in East Asian populations (59). In addition, GWAS in East Asians with discovery sample sizes ranging from 621 to 6,877 cases of overall lung cancer or lung cancer subtypes have identified 14 novel risk loci at 11 regions, including five risk loci at 6q22.1, 6p21.32 (rs2395185), 9p21.3, 10q25.2, and 12q13.13 among nonsmokers (56, 58–62). 
More than 43,000 lung cancer cases have been included in the discovery stage of GWAS, with 69% of European ancestry, 4% of African ancestry, and 27% of East Asian ancestry; no Latin American populations have been studied.
Ovarian cancer
GWAS of ovarian cancer have identified 39 risk loci, with the largest scan including 25,509 ovarian cancer cases and 40,941 controls of European ancestry (OncoArray, personal communication, P. Pharoah). All GWAS have been conducted in women of European ancestry (63), with the exception of a three-stage scan in Chinese women (1,044 ovarian cancer cases and 1,172 controls), which revealed a novel risk locus in an intron of the COL15A1 gene at 9q22.33 and a second locus upstream of the ANKRD30A gene at 10p11.21 (64).
Endometrial cancer
GWAS of endometrial cancer have been conducted primarily in women of European ancestry (65–68), with the exception of a small GWAS in Chinese women (832 endometrial cancer cases and 2,682 controls), which did not yield a novel discovery (69). Endometrial cancer GWAS have revealed eight risk loci, with the largest GWAS meta-analysis including 6,542 endometrial cancer cases of European ancestry (67).
Cervical cancer
Five cervical cancer risk loci have been identified in three GWAS of cervical cancer (70–72), two of which were conducted in East Asian women (71, 72). In the only scan in women of European ancestry, with a discovery set of 1,034 cervical cancer cases and 3,948 controls, three independent loci in the major histocompatibility complex (MHC) region at 6p21.3 were identified (70). In a Japanese scan that included a limited discovery sample of 226 cervical cancer cases and 186 controls, no genome-wide significant loci were identified (71). In a Chinese scan that included a discovery set of 1,364 cervical cancer cases and 3,028 controls, novel risk loci at 4q12 and 17q12 were identified and the 6p21.3 risk region was confirmed (70, 72).
Bladder cancer
GWAS of bladder cancer have identified 13 risk loci in nine studies of European ancestry populations and two risk loci in two GWAS of East Asians (73, 74). The largest study conducted in a European ancestry population included a discovery set of 6,911 bladder cancer cases and 11,818 controls. In a Japanese study, with a discovery set of 431 bladder cancer cases and 5,581 controls, a novel risk variant at the 15q24.1 risk locus was identified (74). In a Chinese study, with a discovery set of 599 bladder cancer cases and 4,645 controls, a novel risk variant at 5q12.3 was reported that did not replicate in European ancestry populations (73).
Renal cell carcinoma
There have been six risk loci identified for renal cell carcinoma, all of which were identified in GWAS in European ancestry populations (75–78). In the only GWAS conducted in a non-European ancestry population, of 225 African American renal cell carcinoma cases and 375 controls, a signal was detected at 12p11.23, 77 kb from a previously reported European-ancestry renal cell carcinoma risk variant; however, the association did not reach genome-wide significance (P = 1.2 × 10⁻⁷; ref. 77).
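The notion of genome-wide significance used throughout this review can be illustrated with a minimal sketch. It assumes the conventional threshold of P < 5 × 10⁻⁸ (roughly a Bonferroni correction of 0.05 for about one million independent common-variant tests); the review does not state the threshold explicitly, so both the constant and the function name below are assumptions made for illustration.

```python
# Conventional genome-wide significance threshold.
# Assumption: the review does not state the value; 5e-8 is the widely used convention.
GENOME_WIDE_ALPHA = 5e-8


def is_genome_wide_significant(p_value, alpha=GENOME_WIDE_ALPHA):
    """Return True when the P value falls below the genome-wide threshold."""
    return p_value < alpha


# P values quoted in the text and table of this review.
examples = {
    "12p11.23, renal cell carcinoma in African Americans": 1.2e-7,
    "6p21.33 (rs2857595), laryngeal squamous cell carcinoma": 2.43e-15,
}

for label, p in examples.items():
    status = "significant" if is_genome_wide_significant(p) else "suggestive only"
    print(f"{label}: P = {p:.2e} -> {status}")
```

Under this assumed threshold, the 12p11.23 renal cell carcinoma signal falls just short, which is consistent with how the text describes it.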
Hepatocellular carcinoma
Seven hepatocellular carcinoma risk loci have been identified, all of which were found in East Asian populations, where the incidence rates are the highest globally. Major risk factors for hepatocellular carcinoma include hepatitis B or C infections and exposure to dietary aflatoxin B1, which are prevalent in East Asian countries such as China, Japan, and Korea. The first GWAS, in a Korean population, included a discovery set of 180 hepatocellular carcinoma cases and 271 controls and did not detect an association at a genome-wide significance level (79). The second study, in 348 Chinese hepatitis B virus (HBV)-related hepatocellular carcinoma cases and 359 Chinese controls, reported a novel risk locus at 1p36.22 (80). In two subsequent studies, with increased sample sizes of over 1,000 hepatocellular carcinoma cases, four additional loci at 2q32.3, 6p21.32/HLA-DQ (two variants), and 21q21.3 were discovered (81, 82). Approximately 80% of hepatocellular carcinoma cases in China have a prior history of HBV infection, and the largest HBV-hepatocellular carcinoma GWAS was conducted among Chinese, with 1,538 HBV-related hepatocellular carcinoma cases (82). In the only GWAS of hepatitis C virus (HCV)-hepatocellular carcinoma, conducted in Japanese, the investigators found two novel risk loci at 6p21.32 and 6p21.33 (83).
Gastric adenocarcinoma
Approximately 40% of all gastric adenocarcinoma cases occur in East Asia, primarily in China. Of the seven GWAS of gastric adenocarcinoma, six were conducted in populations of East Asian ancestry. To date, 11 risk loci have been reported, with nine discovered in East Asian men and women [1q22 (two variants), 3q13.31, 5p13.1, 5q14.3, 6p21.1 (two variants), 8q24.3, and 10q23.33; refs. 84–86], with discovery GWAS sample sizes ranging from 188 to 2,350 gastric adenocarcinoma cases. In the only European ancestry GWAS, one risk locus and one risk haplotype were identified in a discovery sample of 2,500 Icelandic gastric cancer cases (∼90% of these cases were adenocarcinoma) and 205,652 controls (87). More recent GWAS in East Asians have characterized the genetic associations by sublocation: cardia gastric adenocarcinoma, which occurs in the top few centimeters of the stomach, or non-cardia adenocarcinoma, which occurs elsewhere in the stomach. Sublocation analyses suggest that the variants identified at 1q22, 5p13.1, 5q14.3, and 6p21.1 may be specifically associated with non-cardia gastric adenocarcinoma, although further work by sublocation is required to confirm these findings (88–90).
Pancreatic cancer
There have been 21 pancreatic cancer risk loci identified (91), with five discovered in GWAS in non-European ancestry populations (92, 93). To date, the majority of pancreatic cancer susceptibility loci have been identified from the PanScan study. The largest pancreatic scan is a collaboration between the PanScan study and the Pancreatic Cancer Cohort Consortium (PanC4), which examined 7,638 pancreatic cancer cases and 7,364 controls, of which >92% were of European ancestry (94). Only two GWAS of pancreatic cancer have been conducted in non-European ancestry samples: Japanese (991 pancreatic cancer cases and 5,209 controls; ref. 92) and Chinese (981 pancreatic cancer cases and 1,991 controls) men and women (93). The study in a Japanese population failed to detect genome-wide significant associations, whereas the study in a Chinese population identified five novel loci at 5p13.1, 10q26.11, 21q21.3, 21q22.3, and 22q13.32. This study also found a suggestive association at 2p13.3, which was later confirmed as a susceptibility locus in the mostly European PanScan-PanC4 (94).
Esophageal cancer
To date, 16 susceptibility loci for esophageal cancer have been reported from seven GWAS of esophageal cancer, which include two scans in populations of European ancestry (95) and five scans in East Asians (Japanese and Chinese; refs. 85, 96–100). The five GWAS of esophageal squamous cell carcinoma (ESCC) in East Asians had discovery sample sizes ranging from 182 to 5,337 ESCC cases and reported the discovery of 13 of the 16 risk loci at 2q33.1, 4q23, 5q11.2, 5q31.2, 6p21.1, 10q23.33 (two variants), 12q24.12, 12q24.13, 16q12.1, 17p13.1, 21q22.12, and 22q12. Variants at 12q24.12 (rs671) and 12q24.13 (rs2074356), which are in or near the ALDH2 gene, have been reported to have a potential multiplicative interaction with alcohol drinking (96, 97), a major risk factor for esophageal cancer. ALDH2 encodes one of the enzymes responsible for the metabolism of alcohol. The ALDH2 rs671 A allele, which is common in East Asian populations (30–50% frequency) and almost nonexistent in European ancestry populations, has been associated with a reduction in the enzyme's ability to metabolize acetaldehyde, the primary metabolite of alcohol and an established carcinogen.
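One common way to formalize a multiplicative gene-environment interaction of the kind described for ALDH2 and alcohol drinking is a logistic model with a product term. The formulation below is a generic sketch, not the specific model used in the cited studies; the symbols G (risk allele count), E (alcohol exposure indicator), and the β coefficients are introduced here for illustration only.

```latex
\operatorname{logit} \Pr(D = 1 \mid G, E)
  = \beta_0 + \beta_G G + \beta_E E + \beta_{GE}\,(G \times E)

% Odds ratio for joint exposure (G = 1, E = 1) relative to (G = 0, E = 0):
\mathrm{OR}_{GE} = e^{\beta_G + \beta_E + \beta_{GE}}
  = \mathrm{OR}_G \cdot \mathrm{OR}_E \cdot e^{\beta_{GE}}
```

Under this formulation, no interaction on the multiplicative scale corresponds to \beta_{GE} = 0, in which case the joint odds ratio is simply the product of the two individual odds ratios; a positive \beta_{GE} means that carriers who drink have a higher risk than that product would predict.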
Head and neck cancers
Oral and oropharynx.
There has been one GWAS of oral and pharynx cancer (oral, oropharynx, hypopharynx, and overlapping cancers), which was conducted in a multiethnic population of 6,034 cases, of which 986 were from Latin America (101). This study identified four loci at 4q23, 6p21.32, 10q26.13, and 11p15.4 for combined oral and oropharynx cancers. The variants at 4q23 and 6p21.32 were also associated with oropharyngeal cancer at a genome-wide significance level. For oral cancer, 2p23.3, 5p15.33, 9p21.3, and 9q34.12 were identified as risk loci.
Nasopharyngeal cancer.
To date, GWAS of nasopharyngeal cancer have been conducted exclusively in East Asian populations and have identified seven nasopharyngeal cancer risk loci. For the six GWAS of nasopharyngeal cancer in East Asian populations, the discovery samples ranged from 111 to 1,583 nasopharyngeal cancer cases among southern Chinese, Malaysian Chinese, and Taiwanese (102–107). In these GWAS, nasopharyngeal cancer risk loci were found in the MHC region at 6p22.1 and at 3p22.2, 3q26.2, 13q12.12, and 16p13.13. Despite the small discovery samples of two of the East Asian GWAS, with fewer than 300 nasopharyngeal cancer cases each, these studies were able to detect large effects for variants in HLA-A at 6p22.1, with odds ratios ranging from 1.86 to 2.57 (104, 106). In a recent meta-analysis of four GWAS of nasopharyngeal cancer that included 2,152 Chinese nasopharyngeal cancer cases and 3,740 controls, a novel risk variant was identified within the CLPTM1L/TERT region at 5p15.33, and a suggestive association at 9p21.3 (103) was confirmed, reaching genome-wide significance in this larger study (108).
Laryngeal cancer.
To date, there has been only one GWAS of laryngeal cancer, which was conducted among a Chinese population, with a discovery set of 993 laryngeal squamous cell carcinoma cases and 1,995 controls and three replication sets that included 2,398 cases and 2,804 controls. In this study, three risk loci were found at 6p21.33, 11q12.2, and 12q24.21 (109). No GWAS of laryngeal cancer has been reported in other populations.
Hematologic malignancies
The risk loci for hematological malignancies, identified primarily from scans in European ancestry populations, include 41 for non-Hodgkin lymphoma (NHL, all subtypes), eight for Hodgkin lymphoma (110, 111), 17 for multiple myeloma (112), and nine for childhood acute lymphoblastic leukemia (ALL; ref. 113). The number and size of GWAS in non-European ancestry populations have been extremely limited. For NHL, only a single small GWAS in Chinese (253 NHL cases and 1,438 controls) has been conducted, which reported a risk locus at 3q27.3/BCL6/LPP (114). No GWAS have been reported for Hodgkin lymphoma or multiple myeloma in non-European ancestry populations. A GWAS of extranodal natural killer T-cell lymphoma (ENNK/TCL) conducted in Chinese (189 cases and 957 controls) revealed a locus at 6p21.32/HLA-DPB1, which encodes the β1 subunit of HLA-DP that is involved in extracellular antigen presentation to CD4-positive T-cell lymphocytes (115). Two small GWAS (<210 cases) of ALL and chronic myeloid leukemia (CML) have been conducted in Koreans (116, 117), but neither reported a novel risk locus. More recently, a three-stage scan of ALL in Chinese (stage 1: 272 ALL cases, 1,045 controls; total: 1,184 ALL cases, 3,219 controls) reported a novel locus at 16q23.1/WWOX (118). African Americans (89 ALL cases) and Hispanics (305 ALL cases) have also been included in a multiethnic GWAS of childhood ALL (1,605 cases), which reported a novel susceptibility locus at BMI1-PIP4K2A on 10p12.2 (119).
Gallbladder cancer
The only GWAS of gallbladder cancer was conducted in a Japanese population with 41 gallbladder cancer cases and 866 controls in the discovery set and 30 gallbladder cancer cases and 898 controls in the replication set (120). A suggestive association between the 18q21.3 locus and gallbladder cancer risk was reported (P = 7.5 × 10⁻⁸). No GWAS of gallbladder cancer has been reported among European ancestry populations or among Native American populations, which have the highest incidence (121).
Glioma
GWAS of glioma have centered on populations of European ancestry and have identified over a dozen risk loci (122–126). The largest study included four European ancestry populations that yielded a total pooled sample of 4,147 glioma cases and 7,435 controls (126). Glioma is a rare cancer with a higher incidence among non-Hispanic whites compared to minority populations (127).
Skin-cutaneous melanoma, squamous cell carcinoma, and basal cell carcinoma
All GWAS for the three common skin cancers have been conducted in European ancestry populations, with 13 risk loci identified for melanoma (128), 11 for squamous cell carcinoma (129), and 35 for basal cell carcinoma (130). Skin cancers are common in populations of European ancestry, and the primary risk factors are ultraviolet (UV) exposure and a greater propensity to sunburn from UV exposure.
Thyroid cancer
Discussion
GWAS in populations of non-European ancestry have already begun to make important contributions towards understanding cancer mechanisms and genetic susceptibility to cancer. Of the >700 cancer risk loci revealed through GWAS, 82% of these loci were discovered in European ancestry populations, 14% in East Asians, 3% in multiethnic populations, and less than 1% in African ancestry and Latin American populations. As raised by Popejoy and colleagues (2) and others (3, 134), the under-representation of non-European ancestry populations in genetic studies may create a major translational gap in utilizing genetic information for preventing and treating diseases such as cancer across populations. The identification of genetic markers of cancer susceptibility in different racial/ethnic populations has the potential for widespread applicability and significance, leading to better defining the spectrum of allelic variation underlying population risk and disease heritability. Moreover, these identified cancer susceptibility markers may provide insight into novel biological pathways and environmental factors that are important in the cause of cancer, and aid in the development of risk models that account for allelic heterogeneity between populations and preventive strategies that target individuals at high risk. It is predicted that information from genome-wide investigations of common and rare variation will, over the coming years, play a significant role in shaping personalized preventive and therapeutic medicine.
For the cancer sites examined in this review, 20 sites included European ancestry samples in the discovery stage, 14 included samples of East Asian ancestry, 5 included samples of Latin American ancestry, and 5 included samples of African ancestry (Fig. 1). Three cancer sites have been investigated in only GWAS of East Asians (gallbladder, nasopharynx, and hepatic), whereas seven cancer sites were limited to GWAS of only European ancestry samples (thyroid, testicular, bladder, NHL, endometrium, multiple myeloma, brain/glioma, and skin/melanoma). For scans conducted in multiple populations, the size of GWAS in non-European ancestry populations has been substantially smaller than in European ancestry populations. This difference is most apparent for Latin American and African ancestry populations, whereas studies in East Asian populations have been considerably larger for sites with elevated incidence rates in this population, such as nasopharyngeal, hepatocellular, gastric, esophageal, and cervical cancers. We estimate that 84% of the samples included in the discovery stages were of European ancestry, whereas only 11% were of East Asian ancestry, 4% were of African ancestry, and 1% were of Latin American ancestry (Fig. 2). The emphasis on studying European ancestry populations has been partly a matter of convenience and partly a matter of choice. The early investments in establishing epidemiological studies of cancer in the United States and Europe focused on European ancestry populations. It was also assumed that homogeneity in ancestry was a prerequisite for improving the statistical power of GWAS. Thus, when genotyping technology advanced to allow large-scale genotyping in the mid-2000s, samples of European ancestry were prioritized.
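The kind of pooling behind such an overall estimate can be sketched with the approximate per-site figures quoted earlier in this review for prostate, breast, colorectal, and lung cancer. This is only an illustration under stated assumptions: it covers four sites rather than all sites, it counts cases rather than all samples, the case totals are rounded ("nearly 100,000", "close to 37,000"), and the colorectal percentages as printed sum to 104%, so the shares are renormalized. It is not the calculation the authors used for Fig. 2, and the function name is hypothetical.

```python
# Approximate discovery-stage case counts and ancestry percentages as quoted
# in the text for four cancer sites (rounded values; illustrative only).
SITES = {
    "prostate": (100_000, {"European": 85, "African": 11, "East Asian": 3, "Latin American": 1}),
    "breast": (130_000, {"European": 89, "African": 3, "East Asian": 6, "Latin American": 2}),
    "colorectal": (37_000, {"European": 80, "African": 4, "East Asian": 16, "Latin American": 4}),
    "lung": (43_000, {"European": 69, "African": 4, "East Asian": 27, "Latin American": 0}),
}


def pooled_ancestry_shares(sites):
    """Pool per-site case counts by ancestry and return renormalized shares."""
    totals = {}
    for n_cases, percentages in sites.values():
        for ancestry, pct in percentages.items():
            totals[ancestry] = totals.get(ancestry, 0.0) + n_cases * pct / 100
    grand_total = sum(totals.values())
    # Dividing by the pooled total makes the shares sum to 1 even though
    # the quoted percentages for one site do not sum to exactly 100.
    return {ancestry: count / grand_total for ancestry, count in totals.items()}


shares = pooled_ancestry_shares(SITES)
for ancestry, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{ancestry}: {share:.1%}")
```

Because large sites dominate the pooled total, the overall distribution is driven mainly by the ancestry composition of the breast and prostate cancer scans.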
Several important discoveries have been made in GWAS of cancer in non-European ancestry populations. For example, in a GWAS of breast cancer among Latinas, a risk variant at 6q25.1 near ESR1 was discovered that is specific to women with indigenous American ancestry, representing a novel signal distinct from the variants previously reported in this region for East Asian and European ancestry populations (34). GWAS of nasopharyngeal and hepatocellular cancer in East Asians revealed the significant role of the HLA region for these two cancer sites (81, 82, 102–104). A GWAS of lung cancer in nonsmoking East Asian women identified three novel loci at 10q25.2/VTI1A, 16p22.1/DCBLD, and 6p21.32/HLA-DRA (61, 62). In East Asian women with lung cancer, approximately 50% of these cases are nonsmokers (as opposed to 90% of lung cancer cases in European ancestral populations; ref. 135). The 8q24 region is well-known for harboring risk loci for many cancers, including those that are only found in populations of African ancestry (5).
Despite the many discoveries from the limited number of GWAS in non-European ancestry populations, the majority of these scans were conducted using SNP arrays that were not optimized for capturing common variation (directly or indirectly) in non-European ancestry populations. Likewise, only 20% to 30% of the discovery stages performed imputation to 1000 Genomes, which is the current gold standard for imputation. Moreover, studies have shown that the imputation quality in non-European ancestral populations is lower than in other populations (3, 136, 137). This is a result of both the European ancestry-centric array content and the available imputation reference panels. Next-generation arrays are currently being designed for studying genetic variation in ancestrally diverse populations, with content derived from whole-exome and whole-genome sequencing. These arrays, such as the Illumina Multi-Ethnic Genotyping Array (MEGA; ref. 138) and the Infinium Global Multi-Ethnic Array, aim to provide a more comprehensive assessment of common and rare variation in non-European ancestry populations for discovery and fine-mapping (138). Over the next few years, large-scale whole-genome sequencing efforts to capture genetic diversity in populations divergent from those of European ancestry, such as Latin American, Polynesian, East/South Asian, Middle Eastern, and continental African populations, will improve the size and representativeness of imputation reference panels as well as the power of genotyping arrays for GWAS in these populations.
There are several strengths of conducting GWAS of cancer in non-European ancestry populations. Foremost is the ultimate goal of identifying risk alleles that will translate into medical advances to improve cancer risk prediction in all populations as well as further our understanding of the biology of disease. As some risk alleles differ in frequency and occurrence by ancestral population, no single population is sufficient to uncover all of the underlying genetic markers of disease. The inclusion of environmental exposures may also strengthen GWAS associations (via interaction) and aid in the detection of risk alleles in exposed populations in geographical regions or within defined racial/ethnic groups. GWAS of cancer in non-European ancestry populations may also help to elucidate potential unmeasured or not yet identified environmental risk factors for which race/ethnicity may serve as a proxy. Variation in the burden of cancers and environmental risk factors across population groups also drives the rationale and ability to assemble large numbers of cancer cases required for risk locus discovery, as noted for the higher prevalence of Epstein-Barr Virus (EBV) and nasopharyngeal cancer, HCV and hepatocellular carcinoma, and Helicobacter pylori and gastric adenocarcinomas in East Asian populations (80–83, 88–90, 102–107).
A major challenge in conducting GWAS in non-European ancestry populations is sample size. Concerted consortia-type efforts have been needed to achieve pooled sample sizes of approximately 1,000 to 10,000 cases in individual non-European ancestry groups, with studies in East Asians reaching approximately 30,000 cases for common sites (Table 2). Such studies may also leverage “reusing” existing GWAS data (for “controls”) to enhance statistical power (139). However, for cancers with known environmental risk factors that influence genetic associations, using controls geographically different from cases may decrease the effectiveness of this approach. Moreover, improvements in data sharing would help to refine the precision of risk estimates and correct for false positives (99). One of the largest cancer GWAS in a non-European ancestry population, conducted by the African Ancestry Prostate Cancer Consortium (Table 2), has pooled GWAS data from 30 studies globally, established over the past 20 years, to achieve a GWAS discovery sample of 10,000 African ancestry prostate cancer cases and 10,000 controls. This scan, for a cancer that is nearly twice as common in men of African than European ancestry, is approximately 10% of the size of GWAS in men of European ancestry (which include almost 100,000 cases). Progress in studying the genetic basis of cancer in non-European ancestry populations will require substantial investments in patient enrollment and sample collection, which will likely take years (if not decades) to reach the GWAS sample sizes of European ancestry populations observed today. The critical importance of large sample sizes is reinforced by recent discoveries in European ancestry groups of risk loci for cancer subtypes, such as estrogen receptor-negative breast cancer (19), ovarian cancer subtypes (e.g., mucinous tumors; ref. 140), and non-Hodgkin lymphoma subtypes (141). 
The shift towards studying rare and population-specific variation with modest effects will also demand larger study sizes. Continued expansion of consortium efforts will be needed particularly for less common cancer sites in diverse racial/ethnic groups.
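The dependence of discovery on sample size and allele frequency can be illustrated with a rough power calculation. The sketch below is not taken from any of the studies cited here; it assumes a simple allelic (two-proportion) test with a normal approximation, the conventional genome-wide significance threshold of 5 × 10⁻⁸, and hypothetical allele frequencies and odds ratios chosen only for illustration. The function name `allelic_test_power` and its parameters are likewise illustrative.

```python
from math import sqrt
from statistics import NormalDist


def allelic_test_power(control_freq, odds_ratio, n_cases, n_controls,
                       alpha=5e-8):
    """Approximate power of a two-sided allelic case-control test.

    Assumptions (illustrative, not from the article): alleles are
    compared between cases and controls with a two-proportion z-test,
    the variance is taken under the alternative (unpooled), and the
    contribution of the opposite tail is ignored.
    """
    # Risk-allele frequency in cases implied by the allelic odds ratio.
    case_freq = (odds_ratio * control_freq
                 / (1.0 - control_freq + odds_ratio * control_freq))

    # Each individual contributes two alleles.
    case_alleles = 2 * n_cases
    control_alleles = 2 * n_controls

    standard_error = sqrt(
        case_freq * (1.0 - case_freq) / case_alleles
        + control_freq * (1.0 - control_freq) / control_alleles
    )

    normal = NormalDist()
    critical_z = normal.inv_cdf(1.0 - alpha / 2.0)
    expected_z = abs(case_freq - control_freq) / standard_error
    return normal.cdf(expected_z - critical_z)


if __name__ == "__main__":
    # Hypothetical scenarios: a common and a rare variant, each with a
    # modest odds ratio of 1.1, at two study sizes.
    for freq in (0.20, 0.01):
        for n in (10_000, 100_000):
            power = allelic_test_power(freq, 1.1, n, n)
            print(f"freq={freq:.2f} n_cases={n:>7,} power={power:.3g}")
```

Under these assumptions, power for a given odds ratio rises steeply with the number of cases and falls as the risk allele becomes rarer, which is the pattern that motivates larger consortia for rare and population-specific variants.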
Consortium/study (acronym) | Ancestry | Size^a | Reference |
---|---|---|---|
Female Lung Cancer Consortium in Asia (FLCCA) | East Asian | 12,700 | 61 |
Esophageal Squamous Carcinoma in Chinese | East Asian | 15,000 | 99 |
Asian Colorectal Cancer Consortium (ACCC) | East Asian | 28,000 | 38 |
Asian Breast Cancer Consortium (ABCC) | East Asian | 30,000 | 28 |
African American (Ovarian) Cancer Epidemiology Study | African | 500 | 147 |
African American Multiple Myeloma Study (AAMMS) | African | 1,500 | 148 |
African Ancestry Prostate Cancer Consortium (AAPC) | African | 10,500 | 5 |
Hispanic Colorectal Cancer Study (HCCS) | Latin American | 2,000 | 48 |
^a Approximate number of cases.
Tumor and organ-specific tissue resources, such as The Cancer Genome Atlas (TCGA; ref. 142) and Genotype-Tissue Expression (GTEx; ref. 143), are being utilized to study biological function at GWAS risk regions through the characterization of chromatin marks, which signify regulatory sequence, and of gene expression profiles (i.e., eQTLs), to identify target genes. The vast majority of specimens in these resources are from individuals of European ancestry, which limits progress in understanding the mechanisms underlying GWAS risk regions discovered in (and perhaps limited to) populations of non-European ancestry. Future resource-building efforts in non-European ancestry populations will need to consider the collection of a wider range of biospecimens beyond germline DNA.
Although striking differences in the number and size of GWAS conducted in non-European versus European ancestry populations are roughly proportional to the ancestral population sizes within most countries that have funded the discovery work, they certainly do not align with ancestral population sizes globally. Prioritizing cancer genomics research in non-European ancestry populations will improve our understanding of cancer biology and risk susceptibility and, more importantly, broaden the impact of genomic medicine on human health.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant Support
This work was supported by the following grants: U01 HG007397 (to S.L. Park, I. Cheng, C.A. Haiman), U01 CA164973 (to S.L. Park, I. Cheng, C.A. Haiman), U19 CA148537 (to C.A. Haiman), R01 CA165862 (to C.A. Haiman), and R01 CA165038 (to C.A. Haiman).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.