Abstract
Identifying genetic variants with pleiotropic associations across multiple cancers can reveal shared biologic pathways. Prior pleiotropic studies have primarily focused on European-descent individuals. Yet population-specific genetic variation can occur, and potential pleiotropic associations among diverse racial/ethnic populations could be missed. We examined cross-cancer pleiotropic associations with lung cancer risk in African Americans.
We conducted a pleiotropic analysis among 1,410 African American lung cancer cases and 2,843 controls. We examined 36,958 variants previously associated (or in linkage disequilibrium) with cancer in prior genome-wide association studies. Logistic regression analyses were conducted, adjusting for age, sex, global ancestry, study site, and smoking status.
We identified three novel genomic regions significantly associated (FDR-corrected P <0.10) with lung cancer risk (rs336958 on 5q14.3, rs7186207 on 16q22.2, and rs11658063 on 17q12). On chromosome 16q22.2, rs7186207 was significantly associated with reduced risk [OR = 0.80; 95% confidence interval (CI), 0.73–0.89], and functional annotation using GTEx showed rs7186207 modifies DHODH gene expression. The minor allele at rs336958 on 5q14.3 was associated with increased lung cancer risk (OR = 1.47; 95% CI, 1.22–1.78), whereas the minor allele at rs11658063 on 17q12 was associated with reduced risk (OR = 0.80; 95% CI, 0.72–0.90).
We identified novel associations on chromosomes 5q14.3, 16q22.2, and 17q12, which contain the HAPLN1, DHODH, and HNF1B genes, respectively. SNPs within these regions have been previously associated with multiple cancers. This is the first study to examine cross-cancer pleiotropic associations for lung cancer in African Americans.
Our findings demonstrate novel cross-cancer pleiotropic associations with lung cancer risk in African Americans.
Introduction
Pleiotropy occurs when a genetic locus is associated with more than one trait (1) and has been observed across multiple phenotypes, including cancer (2–5). These shared associations suggest potential common biologic pathways. One example of pleiotropy occurs at the TERT gene region. Germline mutations in TERT are associated with telomere length (6), but have also been associated with phenotypes, such as pulmonary fibrosis (7), red blood cell count (8), and aplastic anemia (9). TERT mutations have also been identified across multiple cancer types (10), including breast (11), prostate (12), pancreatic (13), glioma (14), and lung (15–17). Identifying pleiotropy is a useful tool for genetic association studies to discover common biologic mechanisms, shared genetic architecture across diseases, and potentially new opportunities for therapeutic targets.
Cross-cancer pleiotropic analysis has been conducted for several cancer types (2, 3, 5, 18), including lung cancer (4, 5). Pleiotropic analysis of lung cancer has identified associations with LSP1, ADAM15/THBS3, CDKN2B-AS1, and BRCA2 genes in populations of predominantly European ancestry (4, 5). However, recent studies have shown the nontransferability of risk alleles across racial/ethnic groups (19, 20). It remains unknown whether pleiotropic associations of prior cancer variants are generalizable and associated with lung cancer among African American individuals. Furthermore, only a few studies (21) have accounted for either pleiotropy arising from variants in linkage disequilibrium or the ancestry of the discovery population. We used genome-wide genotyping data from five African American study populations to identify pleiotropic associations with lung cancer, incorporating variants in linkage disequilibrium and accounting for ancestry.
Methods
Study population
African American lung cancer cases and controls were selected from five study sites: the MD Anderson (MDA) Lung Cancer Epidemiology Study and Project CHURCH (Creating a Higher Understanding of Cancer Research & Community Health), NCI Lung Cancer Case–Control Study, the Northern California Lung Cancer Study from the University of California, San Francisco (UCSF), the Southern Community Cohort Study (SCCS), and three studies from the Karmanos Cancer Institute at Wayne State University (WSU), that is, Family Health Study III; Women's Epidemiology of Lung Disease Study; and Exploring Health, Ancestry and Lung Epidemiology Study. A detailed description of each study has been reported previously (22). All studies were approved by the Institutional Review Board at each institution and written informed consent was obtained from all participants.
Genotyping, quality control, and imputation
Samples were previously genotyped (22) on the Illumina Human Hap 1M Duo array at the NCI Cancer Genomics Research Laboratory (CGR) in the Division of Cancer Epidemiology and Genetics (DCEG) at the NCI. Genotyping data underwent strict quality control (Supplementary Fig. S1). Briefly, SNPs were excluded if they were nonautosomal or had a MAF <1%, <95% genotyping efficiency, or a Hardy–Weinberg Equilibrium (HWE) P < 1 × 10−6. Individuals were excluded if they had <95% genotyping efficiency. Pairwise identity-by-descent was examined to identify related individuals; for each genetically related pair, the individual with the lower genotyping efficiency was excluded (N = 147). No individuals were excluded because of inconsistencies between reported and genetic sex, but for four individuals with an “unknown” reported sex, sex was assigned on the basis of the calculated genetic sex. All quality control filtering was applied using PLINK (23).
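The SNP-level filters described above can be illustrated with a minimal Python sketch. It is not the PLINK procedure used in the study: the function names and the genotype encoding are hypothetical, and the HWE P value is approximated with a 1-df chi-square goodness-of-fit test, which is a simplification assumed here for brevity.

```python
import math

# Thresholds quoted in the text; all names below are illustrative.
MIN_MAF = 0.01        # drop SNPs with minor allele frequency < 1%
MIN_CALL_RATE = 0.95  # drop SNPs with < 95% genotyping efficiency
MIN_HWE_P = 1e-6      # drop SNPs with HWE P < 1e-6


def hwe_chi2_p(n_hom_ref, n_het, n_hom_alt):
    """Approximate HWE P value from a 1-df chi-square goodness-of-fit test."""
    n = n_hom_ref + n_het + n_hom_alt
    q = (n_het + 2 * n_hom_alt) / (2 * n)  # alternate-allele frequency
    if q in (0.0, 1.0):
        return 1.0  # monomorphic: no departure from HWE can be assessed
    expected = (n * (1 - q) ** 2, 2 * n * q * (1 - q), n * q ** 2)
    observed = (n_hom_ref, n_het, n_hom_alt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def snp_passes_qc(genotypes):
    """genotypes: alternate-allele counts (0, 1, 2) per person, None if missing."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    call_rate = len(called) / len(genotypes)
    counts = [called.count(k) for k in (0, 1, 2)]
    q = (counts[1] + 2 * counts[2]) / (2 * len(called))
    maf = min(q, 1 - q)
    return (call_rate >= MIN_CALL_RATE
            and maf >= MIN_MAF
            and hwe_chi2_p(*counts) >= MIN_HWE_P)
```

In this sketch a SNP is kept only if it passes all three thresholds; the autosomal restriction and the sample-level filters are omitted.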
Missing genotypes were imputed using IMPUTE2 (24), with prephasing performed using SHAPEIT (25–27). Haplotypes from the cosmopolitan 1,000 Genomes phase III population consisting of 2,504 individuals from 26 populations were used as a reference population. Imputed SNPs were excluded if they had an info score <0.4, MAF <0.01, or HWE P < 1 × 10−5.
Ancestry estimation
Supervised admixture analysis was performed using ADMIXTURE software (28) to obtain global estimates of African and European ancestry for all individuals. Admixture analysis was performed on observed genotypes merged with the CEU (CEPH Utah residents with Northern and Western European ancestry) and YRI (Yoruba from Ibadan, Nigeria) HapMap reference populations (29), and pruned to a set of unlinked variants (window size = 50, step size = 10, r2 > 0.1). A total of 140,591 variants remained for supervised (k = 2) admixture analysis.
Selection of variants for pleiotropic analysis
All variants previously associated with any cancer type were identified from the NHGRI-EBI GWAS catalog as of April 2016. Manual review excluded studies in which the outcome was not risk (e.g., survival, prognosis, toxicity, relapse). Studies assessing interactions were also excluded. SNPs from each of the remaining studies were aggregated into a single list, hereafter referred to as “reported SNPs.”
Because genotyping arrays are designed to capture genome-wide variation using as few SNPs as possible, most reported associations are not themselves causal, but rather are correlated (i.e., in linkage disequilibrium, LD) with the true causal variant (30, 31). In addition, it is established that LD patterns differ between racial/ethnic groups (32). Because the vast majority of the GWAS-reported SNPs were identified in European- or Asian-descent populations, we further expanded our list of reported SNPs based on the LD structure of the 1,000 Genomes (phase III) reference population (33, 34) most similar to the race/ethnicity of the population reported in the GWAS catalog study. For example, the CHB (Han Chinese in Beijing, China) reference population was used to identify SNPs in LD with variants reported in Asian-descent populations, whereas the CEU (CEPH Utah Residents with Northern and Western European Ancestry) reference population was used for SNPs reported in European-descent populations. For the admixed Latino/a and African American populations, we used relevant continental reference populations: CEU, CHB, YRI (Yoruba in Ibadan, Nigeria), and MXL (Mexican Ancestry from Los Angeles) for Latino/a and CEU, YRI, and ASW (Americans of African Ancestry in SW United States) for African Americans. For each reported SNP, we extracted all SNPs with an r2 > 0.6 and within ±100 kb of the reported SNP in the appropriate 1,000 Genomes reference population(s) using PLINK pairwise LD estimation. A final list of SNPs for analysis was generated, hereafter referred to as the “selected SNPs.”
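The LD-based expansion step can be sketched as follows. This is an illustration under stated assumptions rather than the PLINK command used in the study: r2 is computed here as the squared Pearson correlation of allele counts, and the `panel` layout, function names, and SNP identifiers are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length allele-count vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return sxy * sxy / (sxx * syy)


def expand_by_ld(reported_id, panel, r2_min=0.6, window_bp=100_000):
    """Return IDs of panel SNPs in LD with one reported SNP.

    panel maps SNP ID -> (chromosome, base-pair position, allele counts per
    reference individual); this layout is hypothetical.
    """
    chrom0, pos0, geno0 = panel[reported_id]
    selected = []
    for snp_id, (chrom, pos, geno) in panel.items():
        if snp_id == reported_id or chrom != chrom0:
            continue
        if abs(pos - pos0) > window_bp:  # outside the +/-100 kb window
            continue
        if r_squared(geno0, geno) > r2_min:
            selected.append(snp_id)
    return selected
```

Under this scheme, the reference panel passed to `expand_by_ld` would be chosen per reported SNP to match the ancestry of the discovery population (for example, CEU for European-descent studies).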
Statistical analysis
Logistic regression was performed for each additively coded effect allele using SNPTEST to account for imputation probabilities (35). Age, sex, smoking status (current/former/never), global African ancestry, and study site were included as covariates in the logistic regression models. A Benjamini–Hochberg FDR correction was applied to P values to account for multiple testing. Statistical significance was defined as an FDR-corrected P < 0.1. Exploratory strata-specific analyses were conducted by histologic subtype, smoking status (ever/never), and sex. A meta-analysis was also conducted summarizing results from each study (adjusting for age, sex, smoking status, and global African ancestry) using a fixed effect model in METAL (36).
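For readers less familiar with the Benjamini–Hochberg procedure, the adjustment can be sketched in a few lines of Python. The function name is illustrative and this is not the software used in the study. Each P value is multiplied by the number of tests divided by its rank, and monotonicity is then enforced from the largest P value downward.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so each adjusted value is the
    # minimum of p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

As a consistency check, the smallest unadjusted P value reported in Table 3 (6.27 × 10−10), ranked first among 36,958 tests, gives 6.27 × 10−10 × 36,958 ≈ 2.32 × 10−5, which agrees with the FDR-corrected value reported for rs17486278.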
Results
Descriptive characteristics
A total of 4,253 African American individuals remained following quality control, with 1,410 lung cancer cases and 2,843 controls (Table 1; Supplementary Table S1). Forty-five percent of all individuals were male and the mean age at diagnosis was 58 years. Lung cancer cases were on average 5 years older than controls. Among cases, 55% of participants were current smokers and 37% were former smokers, whereas 43% of controls were never smokers. The median global African ancestry was similar among cases and controls (84% and 83%, respectively). The most frequent histologic cell type among cases was adenocarcinoma (45%), followed by squamous cell carcinoma (24%). Descriptive characteristics by study and case/control status are presented in Table 1.
Characteristic | MDA (N = 1,374) | NCI (N = 583) | SCCS (N = 506) | UCSF (N = 986) | WSU (N = 804) | Total cases (N = 1,410) | Total controls (N = 2,843) | Total combined (N = 4,253) |
---|---|---|---|---|---|---|---|---|
Status | ||||||||
Cases, N (%) | 373 (27.1) | 208 (35.7) | 168 (33.2) | 325 (33.0) | 336 (41.8) | 1,410 (33.2) | ||
Controls, N (%) | 1,001 (72.9) | 375 (64.3) | 338 (66.8) | 661 (67.0) | 468 (58.2) | 2,843 (66.8) | ||
Sex | ||||||||
Male, N (%) | 525 (38.2) | 308 (52.8) | 300 (59.3) | 450 (45.6) | 317 (39.4) | 698 (49.5) | 1,202 (42.3) | 1,900 (44.7) |
Female, N (%) | 849 (61.8) | 275 (47.2) | 206 (40.7) | 536 (54.4) | 487 (60.6) | 712 (50.5) | 1,641 (57.7) | 2,353 (55.3) |
Mean age at diagnosis, y (SD) | 52.2 (13.5) | 64.4 (9.6) | 55.9 (8.9) | 63.5 (11.2) | 60.3 (11.3) | 61.5 (10.5) | 56.9 (13.2) | 58.4 (12.6) |
Smoking status | ||||||||
Current, N (%) | 339 (24.8) | 196 (33.7) | 276 (54.8) | 364 (38.1) | 358 (44.6) | 778 (55.3) | 755 (27.0) | 1,533 (36.4) |
Former, N (%) | 379 (27.7) | 250 (43.0) | 111 (22.0) | 361 (37.8) | 257 (32.0) | 520 (36.9) | 838 (29.9) | 1,358 (32.3) |
Never, N (%) | 649 (47.5) | 135 (23.2) | 117 (23.2) | 230 (24.1) | 187 (23.3) | 110 (7.8) | 1,208 (43.1) | 1,318 (31.3) |
Median African ancestry, % | 82.6 | 82.4 | 88.3 | 81.9 | 82.9 | 83.6 | 83.1 | 83.2 |
Histology | ||||||||
Adenocarcinoma, N (%) | 173 (46.4) | 97 (47.1) | 47 (31.3) | 145 (44.6) | 170 (50.6) | 632 (45.5) | — | 632 (45.5) |
Squamous cell, N (%) | 103 (27.6) | 53 (25.7) | 28 (18.7) | 82 (25.2) | 71 (21.1) | 337 (24.2) | — | 337 (24.2) |
Large cell, N (%) | 2 (0.5) | 5 (2.4) | 11 (7.3) | 4 (1.2) | 12 (3.6) | 34 (2.4) | — | 34 (2.4) |
Small cell, N (%) | 23 (6.2) | 2 (1.0) | 14 (9.3) | 20 (6.0) | 22 (6.8) | 81 (5.8) | — | 81 (5.8) |
Other, N (%) | 72 (19.3) | 49 (23.8) | 50 (33.3) | 72 (22.2) | 63 (18.8) | 306 (22.0) | — | 306 (22.0) |
Variant selection
A total of 266 unique studies were extracted from the NHGRI-EBI GWAS catalog based on the search term “neoplasm.” Forty-six studies were excluded after manual review (see Methods), resulting in 220 studies reporting associations for 959 unique SNPs (“reported SNPs”). Seventy-four percent (163/220) of studies were conducted in European-descent populations, followed by 26% (57/220) in Asian-descent populations (Table 2). The admixed Latino/a and African American populations accounted for only 3% and 5% of prior GWAS cancer studies, respectively (Table 2). Of all reported SNPs, 629 were directly observed in the genotype data and an additional 294 were imputed. Thirty-six reported SNPs were not present in the 1,000 Genomes reference populations used for LD–based selection of SNPs and were dropped from analysis. Application of the PLINK pairwise LD estimation method (r2 > 0.6 and ±100 kb) to all reported SNPs increased the number of SNPs from 923 to 39,010 (Table 2).
Race/ethnicity in GWAS catalog | Number of studies^a | Number of SNPs | 1,000 Genomes population used for LD selection | Number of SNPs after LD selection |
---|---|---|---|---|
European | 163 | 743 | CEU | 29,727 |
Asian | 57 | 217 | CHB | 10,625 |
Latino | 6 | 20 | CEU, YRI, MXL, CHB | 968 |
African American | 11 | 49 | CEU, YRI, ASW | 1,490 |
Total | 220 | 959 | | 39,010 |
^a Total number of studies is less than the sum of populations because 11 studies included two or more racial/ethnic populations and are counted once for each population.
Logistic regression analysis
Among the 39,010 selected SNPs, 1,772 were neither observed nor imputed in our African American population and 280 SNPs failed to meet postimputation quality control filtering, resulting in 36,958 selected SNPs for analysis. Logistic regression analysis revealed 40 SNPs that were significantly associated with lung cancer risk (Fig. 1; Table 3). The most statistically significant association was identified on chromosome 15q25.1 for rs17486278 [per allele OR = 1.41; 95% confidence interval (CI), 1.26–1.57; FDR-corrected P = 2.32 × 10−5], followed by chromosome 5p15.33 (rs2853677; OR = 1.27; 95% CI, 1.13–1.41; FDR-corrected P = 0.04; Table 3). Three additional SNPs, rs336958 (5q14.3), rs7186207 (16q22.2), and rs11658063 (17q12), also had significant associations with lung cancer risk. The T allele at rs336958 on 5q14.3 was associated with increased risk (OR = 1.47; 95% CI, 1.22–1.78; FDR-corrected P = 0.06). The association on chromosome 16q22.2 consisted of four SNPs with similar effect sizes, though only one, rs7186207, surpassed the 10% FDR correction threshold (OR = 0.80; 95% CI, 0.73–0.89; FDR-corrected P = 0.04). On chromosome 17q12, the C allele at rs11658063 was associated with a reduced risk of lung cancer (OR = 0.80; 95% CI, 0.72–0.90; FDR-corrected P = 0.10). Similar results were observed when study sites were combined using fixed effect meta-analysis (data not shown).
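To make the relationship between the reported ORs, their CIs, and a fixed effect pooled estimate concrete, the following Python sketch recovers a standard error from a 95% CI and combines per-site log ORs. Inverse-variance weighting is assumed here for illustration, since the weighting scheme used in METAL is not stated, and the per-site inputs in any usage are hypothetical because site-level results are not shown. Function names are illustrative.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% confidence interval


def se_from_ci(lower, upper):
    """Standard error of the log OR implied by a 95% CI on the OR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def fixed_effect_meta(estimates):
    """Pool (log_or, se) pairs with inverse-variance weights.

    Returns the pooled OR with its 95% CI as (or, lower, upper).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return (math.exp(pooled),
            math.exp(pooled - Z_95 * pooled_se),
            math.exp(pooled + Z_95 * pooled_se))
```

For example, the CI of 0.73–0.89 reported for rs7186207 implies a standard error of roughly 0.05 on the log OR scale. Pooling several sites with the same log OR leaves the point estimate unchanged and narrows the interval.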
SNP | Chr. | BP | Effect/ref. allele^a | Effect allele freq | Info score | OR (95% CI) | Unadjusted P | FDR-corrected P |
---|---|---|---|---|---|---|---|---|
rs17486278 | 15 | 78867482 | C/A | 0.3 | 1 | 1.41 (1.26–1.57) | 6.27E−10 | 2.32 × 10−5 |
rs55781567 | 15 | 78857986 | G/C | 0.28 | 1 | 1.37 (1.23–1.53) | 1.54E−08 | 2.84 × 10−4 |
rs2036527 | 15 | 78851615 | A/G | 0.23 | 1 | 1.36 (1.21–1.54) | 2.87E−07 | 2.89 × 10−4 |
rs58365910 | 15 | 78849034 | C/T | 0.27 | 0.98 | 1.35 (1.2–1.51) | 3.13E−07 | 2.89 × 10−4 |
rs147144681 | 15 | 78900908 | T/C | 0.18 | 1 | 1.36 (1.2–1.55) | 2.14E−06 | 0.01 |
rs576982 | 15 | 78870803 | T/C | 0.29 | 1 | 0.77 (0.69–0.86) | 2.74E−06 | 0.01 |
rs664172 | 15 | 78862762 | A/G | 0.28 | 1 | 0.76 (0.68–0.85) | 2.17E−06 | 0.01 |
rs667282 | 15 | 78863472 | C/T | 0.29 | 1 | 0.76 (0.68–0.85) | 1.91E−06 | 0.01 |
rs938682 | 15 | 78896547 | A/G | 0.72 | 1 | 1.3 (1.17–1.46) | 2.56E−06 | 0.01 |
rs569207 | 15 | 78873119 | T/C | 0.28 | 1 | 0.77 (0.69–0.86) | 3.71E−06 | 0.01 |
rs637137 | 15 | 78873976 | A/T | 0.29 | 1 | 0.77 (0.69–0.86) | 4.15E−06 | 0.01 |
rs11637630 | 15 | 78899719 | G/A | 0.29 | 0.99 | 0.77 (0.69–0.86) | 5.35E−06 | 0.01 |
rs2456020 | 15 | 78868398 | T/C | 0.4 | 1 | 0.79 (0.71–0.87) | 5.52E−06 | 0.01 |
rs55676755 | 15 | 78898932 | G/C | 0.17 | 1 | 1.35 (1.19–1.54) | 5.69E−06 | 0.01 |
rs7183604 | 15 | 78899213 | T/C | 0.28 | 0.99 | 0.77 (0.69–0.86) | 4.87E−06 | 0.01 |
rs12440014 | 15 | 78926726 | G/C | 0.22 | 0.91 | 0.75 (0.66–0.85) | 6.39E−06 | 0.01 |
rs3825845 | 15 | 78910258 | T/C | 0.23 | 1 | 0.76 (0.68–0.86) | 8.24E−06 | 0.02 |
rs503464 | 15 | 78857896 | A/T | 0.27 | 0.98 | 0.77 (0.69–0.87) | 1.00E−05 | 0.02 |
rs189218934 | 15 | 78903987 | T/C | 0.27 | 0.99 | 0.78 (0.69–0.87) | 1.08E−05 | 0.02 |
rs113931022 | 15 | 78901113 | T/C | 0.19 | 1 | 1.32 (1.16–1.5) | 2.18E−05 | 0.04 |
rs138544659 | 15 | 78900701 | G/T | 0.19 | 1 | 1.32 (1.16–1.49) | 2.21E−05 | 0.04 |
rs112878080 | 15 | 78900647 | G/A | 0.19 | 1 | 1.32 (1.16–1.49) | 2.21E−05 | 0.04 |
rs2853677 | 5 | 1287194 | G/A | 0.29 | 1 | 1.27 (1.13–1.41) | 2.28E−05 | 0.04 |
rs111704647 | 15 | 78900650 | T/C | 0.19 | 1 | 1.31 (1.16–1.49) | 2.50E−05 | 0.04 |
rs2735940 | 5 | 1296486 | G/A | 0.47 | 0.98 | 0.81 (0.73–0.89) | 2.60E−05 | 0.04 |
rs7186207 | 16 | 72035359 | T/C | 0.43 | 1 | 0.80 (0.73–0.89) | 2.66E−05 | 0.04 |
rs2853672 | 5 | 1292983 | A/C | 0.47 | 1 | 0.81 (0.73–0.89) | 2.85E−05 | 0.04 |
rs56077333 | 15 | 78899003 | A/C | 0.19 | 1 | 1.31 (1.15–1.49) | 3.24E−05 | 0.04 |
rs7170068 | 15 | 78912943 | A/G | 0.23 | 0.99 | 0.78 (0.69–0.88) | 3.93E−05 | 0.05 |
rs1051730 | 15 | 78894339 | A/G | 0.12 | 1 | 1.37 (1.18–1.6) | 5.00E−05 | 0.06 |
rs28491218 | 15 | 78267947 | C/T | 0.22 | 0.91 | 0.77 (0.68–0.87) | 5.43E−05 | 0.06 |
rs951266 | 15 | 78878541 | A/G | 0.11 | 0.99 | 1.39 (1.18–1.63) | 5.28E−05 | 0.06 |
rs12914385 | 15 | 78898723 | T/C | 0.2 | 1 | 1.29 (1.14–1.46) | 5.69E−05 | 0.06 |
rs336958 | 5 | 82973396 | T/C | 0.08 | 1 | 1.47 (1.22–1.78) | 5.90E−05 | 0.06 |
rs7172118 | 15 | 78862453 | A/C | 0.11 | 0.99 | 1.38 (1.18–1.62) | 6.73E−05 | 0.07 |
rs28360704 | 15 | 78268603 | T/C | 0.21 | 0.94 | 0.77 (0.68–0.88) | 8.70E−05 | 0.08 |
rs56390833 | 15 | 78877381 | A/C | 0.11 | 1 | 1.38 (1.17–1.62) | 8.10E−05 | 0.08 |
rs7180002 | 15 | 78873993 | T/A | 0.11 | 1 | 1.38 (1.17–1.61) | 8.42E−05 | 0.08 |
rs905739 | 15 | 78845110 | G/A | 0.25 | 0.98 | 0.79 (0.7–0.89) | 8.77E−05 | 0.08 |
rs11658063 | 17 | 36103872 | C/G | 0.39 | 0.88 | 0.80 (0.72–0.90) | 1.04E−04 | 0.10 |
rs16969968 | 15 | 78882925 | A/G | 0.07 | 1 | 1.49 (1.22–1.83) | 1.14E−04 | 0.10 |
rs8192482 | 15 | 78886198 | T/C | 0.07 | 1 | 1.49 (1.22–1.83) | 1.18E−04 | 0.10 |
^a Effect allele = minor allele.
Exploratory stratified analyses
To identify histologic subtype–specific associations for lung cancer risk, we examined adenocarcinoma cases (N = 632) and squamous cell carcinoma cases (N = 337) separately. No genetic variant was significantly associated with lung cancer risk in either histologic subtype (Supplementary Fig. S2). On chromosome 15q25.1, stratification by sex and smoking status revealed a sex-specific association among females (rs17486278; OR = 1.51; 95% CI, 1.30–1.76; FDR-corrected P = 4.29 × 10−3; Supplementary Fig. S3; Supplementary Table S2) and among ever smokers (rs17486278; OR = 1.41; 95% CI, 1.26–1.58; FDR-corrected P = 1.42 × 10−4; Supplementary Fig. S4). One additional SNP, rs7486184 on chromosome 12q21.32, was also significantly associated with lung cancer risk among females (OR = 0.74; 95% CI, 0.64–0.85; FDR-corrected P = 0.10; Supplementary Fig. S3; Supplementary Table S2). No SNPs were significant after an FDR correction in males (Supplementary Fig. S3). Among ever smoking African Americans, 33 SNPs on chromosomes 5p15.33 and 16q22.2 had FDR-corrected P ≤ 0.10 (Supplementary Fig. S4; Supplementary Table S3). No P values were statistically significant after FDR correction in never smokers (Supplementary Fig. S4).
Discussion
The current analysis sought to identify cross-cancer pleiotropic genetic associations for lung cancer risk in African Americans. The two most significant associations were on chromosomes 15q25.1 and 5p15.33, both of which have been previously associated with lung cancer in African Americans (37–39) and recently validated in a nonindependent African American consortium study (22) that included cases and controls utilized in this study. We also identified three novel associations on chromosomes 5q14.3, 16q22.2, and 17q12. The 16q22.2 association was also observed among ever smokers, and an additional region on 12q21.32 was specific to women. Although they did not meet our threshold for statistical significance, 5p15.33 and 15q25.1 showed suggestive evidence of association among never smokers. These results are consistent with prior research indicating 5p15.33 is associated with lung cancer risk among never smokers (15, 40) and contribute to the ongoing debate as to whether 15q25.1 is directly associated with lung cancer or mediated by smoking (41–45).
Excluding established risk loci 5p15.33 and 15q25.1, the most significant association was for rs7186207 on chromosome 16q22.2 (FDR-corrected P = 0.04). All four SNPs in this region (rs7186207, rs8051239, rs7195958, and rs3213422) had similar ORs and were in strong LD (r2 > 0.68) with each other in both African (YRI and ASW) and European (CEU) 1,000 Genomes reference populations (46). Given the high degree of correlation between variants, it is unsurprising that effect allele frequencies were similar among the four SNPs, ranging from 0.38 to 0.44. None of the four SNPs were among the reported SNPs extracted from the GWAS catalog; they were selected because of their strong LD (r2 = 0.75–0.78) with rs12597458, a variant previously associated with prostate cancer risk (12). The 16q22.2 region has been previously associated with prostate cancer (47). SNPs rs7186207, rs8051239, and rs7195958 are intergenic, located between PKD1L3 and DHODH, and do not appear to be located at sites with regulatory potential based on histone modification marks (Supplementary Fig. S5). However, GTEx data (48) reveal that rs7186207 is significantly associated with DHODH gene expression in lung (P = 2.1 × 10−7; Fig. 2) and other tissues. The remaining SNP at this locus, rs3213422, is located within the first exon of DHODH (Supplementary Fig. S5) and encodes a missense mutation, although SIFT (49) and PolyPhen-2 (50) both predict the mutation to be tolerated/benign. Given its close proximity to the exon boundary within DHODH, variation at rs3213422 could also affect exon splicing, and GTEx data (48) reveal that rs3213422 is a splice QTL for DHODH in aortic artery tissue (P = 2.7 × 10−6). ENCODE data reveal H3K4me3 and H3K27ac marks surrounding rs3213422, indicative of active promoters and regulatory elements, as well as evidence for transcription factor binding (Supplementary Fig. S5).
The DHODH (dihydroorotate dehydrogenase) gene encodes a 43-kDa enzyme localized to the inner mitochondrial membrane, where it interacts with the mitochondrial respiratory chain and catalyzes a rate-limiting step in de novo pyrimidine biosynthesis (51–53). Mutations within DHODH have been linked with Miller Syndrome, a recessive disorder characterized by malformations of the limbs and eyes, among other symptoms (54–57). DHODH has also been investigated for a role in cancer, including melanoma (58) and acute myeloid leukemia (59), and decreased expression of DHODH was associated with breast cancer risk (60). Several other studies have examined the utility of DHODH inhibitors in cancer, which act by inducing cell-cycle arrest and apoptosis in cancer cells (59, 61–66). Although DHODH has not been previously associated with lung cancer risk, the abundance of biological evidence for its pleiotropic role in cancer lends credibility to the association.
We also identified significant associations on chromosomes 5q14.3 and 17q12. The 5q14.3 SNP rs336958 is an intronic variant in HAPLN1 (hyaluronan and proteoglycan link protein 1), which has been shown to play a role in cell adhesion and extracellular matrix structure. SNP rs336958 is in LD (r2 = 0.97) with rs4466137, which has been associated with prostate cancer risk (67). The larger 5q14.3 region has also been associated with prostate cancer (12), breast cancer (7), and Wilms tumors (68), and allelic imbalance in this region has been associated with multiple cancers, including lung (69–71).
The most significant SNP on chromosome 17q12 is rs11658063, a variant located in the first intron of HNF1B. HNF1 homeobox B (HNF1B) encodes a transcription factor and has been shown to play a role in cell development. SNPs in the HNF1B gene region have been previously associated with pancreatic (72), prostate (12, 73–79), ovarian (80–82), testicular (83), and endometrial (84–86) cancers. Expression of HNF1B has been associated with prognosis in hepatocellular carcinoma (87) and renal cell carcinoma (88). Furthermore, methylation of HNF1B has been observed in prostate (89), ovarian (82, 89–91), and lung (92) cancers and may have utility as a biomarker in ovarian cancer (90, 91).
The final notable region of association was on chromosome 12q21.32, where rs7486184 was associated with lung cancer in females. The intergenic variant rs7486184 is located approximately 40 kb downstream of the KITLG gene and is in strong LD (r2 = 0.97) in Europeans (CEU) with reported variant rs995030 (46), which has been previously associated with testicular germ cell cancer in European-descent populations (93–95). Interestingly, rs7486184 and rs995030 may represent independent signals in African Americans since these SNPs are in weak LD in African-descent populations (r2 = 0.15 for YRI and r2 = 0.42 for ASW; ref. 46).
Of the SNPs previously reported to have a pleiotropic association with lung cancer (4, 5), two (rs62560775 and rs2853676) were not present in our current analysis. The remaining four SNPs (rs3817198, rs1057941, rs4072037, and rs4977756) were not significantly associated with lung cancer risk in our African American population, with ORs close to 1.0 and wide CIs. However, ORs for two of the four SNPs (rs4977756 and rs1057941) were in the same direction as previously reported (observed vs. reported ORs: 1.01 vs. 1.13 and 1.03 vs. 1.04, respectively), whereas the observed OR for rs3817198 was in the opposite direction from that previously reported (0.94 vs. 1.10). The OR for rs4072037 was not previously reported. Both previous lung cancer pleiotropy studies (4, 5) utilized predominantly European-descent populations; thus, failure to replicate could represent population-specific effects. However, the minor allele frequencies of rs3817198 and rs1057941 are higher in European than in African populations (rs3817198: 30% vs. 12%, respectively; rs1057941: 44% vs. 12%, respectively), which could result in reduced statistical power to detect these associations in our African American study population. Furthermore, our failure to replicate prior pleiotropic associations could be the result of differences in study population characteristics. Namely, our study population was somewhat younger (mean age at diagnosis = 58), had a higher percentage of women (55%), and included a sizable percentage of never smokers (31%).
Previous pleiotropy studies have failed to consider differences in LD structure between racial/ethnic groups. Such considerations are important given recent publications noting the nontransferability of genetic risk predictions across diverse populations (19, 20). A notable strength of the current analysis is our effort to expand the list of reported SNPs by considering the LD structure of the racial/ethnic group in which each SNP was discovered, thus removing the assumption that the reported SNP has the same correlation structure, and therefore the same tagging ability, with the causal SNP in all racial/ethnic groups. Importantly, it was through consideration of LD structure that this study was able to identify novel lung cancer risk associations, as only two of the most significant SNPs (rs2853672 on 5p15.33 and rs1051730 on 15q25.1) were among the list of reported SNPs extracted from the GWAS catalog. The current analysis examined cancer-associated SNPs in the GWAS catalog as of April 2016. Cancer GWAS published after this date may reveal additional pleiotropic associations.
It is important to note that this study is not independent of the study by Zanetti and colleagues (22), as our cases and controls are a subset of the individuals in that study. By restricting the analysis to SNPs with a priori evidence to examine cross-cancer pleiotropic associations, our study was able to identify novel lung cancer risk loci that may have been missed because of the stringent multiple-testing corrections required in genome-wide association studies. Stratification by sex and smoking status revealed strong associations among women and ever smokers, suggesting the observed associations among all lung cancer cases and controls may be driven by these two subgroups. It remains to be determined whether the observed associations among women but not men represent a true biological phenomenon or are simply an artifact of reduced statistical power among men.
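The difference in multiple-testing burden between a genome-wide scan and a restricted candidate set can be sketched as follows. The FDR threshold of 0.10 and the 36,958 variants are taken from the abstract, and 5e-8 is the conventional genome-wide significance threshold. The Benjamini-Hochberg step-up procedure shown here is an assumption about how an FDR correction might be applied, not a statement of the exact method used. The function name and the P values are hypothetical.

```python
def benjamini_hochberg(pvalues, q=0.10):
    """Return indices of tests rejected by the Benjamini-Hochberg step-up rule."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    # Find the largest rank k such that p_(k) <= k * q / m.
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k_max = rank
    # All tests with rank <= k_max are rejected.
    return sorted(order[:k_max])


# Per-test thresholds for the smallest P value under different schemes.
n_variants = 36958
print("genome-wide threshold:      ", 5e-8)
print("Bonferroni on candidate set:", 0.05 / n_variants)
print("BH rank-1 threshold (q=0.1):", 0.10 / n_variants)

# Made-up P values: BH rejects more tests than Bonferroni at the same level.
toy_p = [0.001, 0.02, 0.03, 0.5]
print(benjamini_hochberg(toy_p, q=0.10))
print([i for i, p in enumerate(toy_p) if p <= 0.10 / len(toy_p)])
```

Because the BH threshold grows with rank, a set of moderately small P values in a restricted candidate list can pass an FDR criterion even when none would reach genome-wide significance.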
With our large sample of African Americans and consistent results across the pooled and meta-analyses, we have identified several novel regions associated with lung cancer risk. Our findings highlight the need for a national effort to prioritize research in diverse populations. Future replication and fine mapping in African American lung cancer cases and controls are needed to better understand the mechanisms underlying these pleiotropic signals. Functional studies should be performed to elucidate the pleiotropic effect of these associations across cancer types and to identify common biological pathways across phenotypes that may lead to therapeutic targets for lung cancer.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Disclaimer
The funding source had no role in study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the article for publication.
Authors' Contributions
Conception and design: C.C. Jones, M.C. Aldrich
Development of methodology: C.C. Jones
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): C.I. Amos, W.J. Blot, S.J. Chanock, C.C. Harris, A.G. Schwartz, M.R. Spitz, J.K. Wiencke, M.R. Wrensch, X. Wu, M.C. Aldrich
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): C.C. Jones, Y. Bradford, C.I. Amos, S.J. Chanock, C.C. Harris, X. Wu, M.C. Aldrich
Writing, review, and/or revision of the manuscript: C.C. Jones, C.I. Amos, W.J. Blot, S.J. Chanock, A.G. Schwartz, M.R. Spitz, X. Wu, M.C. Aldrich
Study supervision: X. Wu, M.C. Aldrich
Acknowledgments
Data on SCCS cancer cases used in this publication were provided by the Alabama Statewide Cancer Registry; Kentucky Cancer Registry, Lexington, KY; Tennessee Department of Health, Office of Cancer Surveillance; Florida Cancer Data System; North Carolina Central Cancer Registry, North Carolina Division of Public Health; Georgia Comprehensive Cancer Registry; Louisiana Tumor Registry; Mississippi Cancer Registry; South Carolina Central Cancer Registry; Virginia Department of Health, Virginia Cancer Registry; and Arkansas Department of Health, Cancer Registry, Little Rock, AR. The Arkansas Central Cancer Registry is fully funded by a grant from the National Program of Cancer Registries, Centers for Disease Control and Prevention (CDC). Data on SCCS cancer cases from Mississippi were collected by the Mississippi Cancer Registry, which participates in the National Program of Cancer Registries (NPCR) of the Centers for Disease Control and Prevention (CDC). The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the CDC or the Mississippi Cancer Registry. The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the NIH, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The GTEx data used for the analyses described in this manuscript were obtained from the GTEx Portal on May 14, 2018. This work was supported by the NIH/NCI (grant no. K07 CA172294, to M.C. Aldrich). C.C. Jones was supported by NIH training grants awarded to Vanderbilt University (4T32GM080178-10, 2013-2017, to principal investigator N.J. Cox) and Vanderbilt University Medical Center (T32 CA160056, 2017-2018, to principal investigator X. Shu). Studies at the Karmanos Cancer Institute at Wayne State University were supported by NIH grants/contracts R01CA060691, R01CA87895, and P30CA22453, and a Department of Health and Human Services contract (HHSN261201000028C, to A.G. Schwartz).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.