Background:

Despite well-established relationships between sun exposure and skin cancer pathogenesis/progression, specific gene–environment interactions in at-risk individuals remain poorly understood.

Methods:

We leveraged a UK Biobank cohort of basal cell carcinoma (BCC, n = 17,221), cutaneous squamous cell carcinoma (cSCC, n = 2,331), melanoma in situ (M-is, n = 1,158), invasive melanoma (M-inv, n = 3,798), and healthy controls (n = 448,164) to quantify the synergistic involvement of genetic and environmental factors influencing disease risk. We surveyed 8,798 SNPs from 190 DNA repair genes, and 11 demographic/behavioral risk factors.

Results:

Clinical analysis identified darker skin (RR = 0.01–0.65) and hair (RR = 0.27–0.63) colors as protective factors. Eleven SNPs were significantly associated with BCC, three of which were also associated with M-inv. Gene–environment analysis yielded 201 SNP–environment interactions across 90 genes (FDR-adjusted q < 0.05). SNPs from the FANCA gene showed interactions with at least one clinical factor in all cancer groups, of which three (rs9926296, rs3743860, rs2376883) showed interaction with nearly every factor in BCC and M-inv.

Conclusions:

We identified novel risk factors for keratinocyte carcinomas and melanoma, highlighted the prognostic value of several FANCA alleles among individuals with a history of sunlamp use and childhood sunburns, and demonstrated the importance of combining genetic and clinical data in disease risk stratification.

Impact:

This study revealed genome-wide associations with important implications for understanding skin cancer risk in the context of the rapidly evolving field of precision medicine. Major individual factors (including sex, hair and skin color, and sun protection use) were significant mediators for all skin cancers, interacting with >200 SNPs across four skin cancer types.

This article is featured in Selected Articles from This Issue, p. 1475

Melanoma and keratinocyte carcinomas [KC; basal cell carcinoma (BCC) and cutaneous squamous cell carcinoma (cSCC)] constitute the majority of skin cancers, and collectively affect more than 3 million individuals per year in the United States alone (1). The incidence of these malignancies has been increasing over the past decades (2, 3).

Skin cancers arise from the interactions of numerous risk factors, including lifetime UV exposure, history of sunburns, fair skin, advancing age, and genetic predisposition (4–7). Despite the well-established relationship between sun exposure and skin cancer pathogenesis, the interactions between specific genetic variants and sun exposure in predisposed individuals remain largely unknown. A genetically determined pathway of particular interest involves UV-induced DNA damage, which drives cells into a hyperproliferative (precancerous) state as a result of aberrant DNA repair mechanisms (8–12). The impact of this interaction on the development of skin cancer is thought to be mediated by inherited genetic factors that affect the efficiency of such mechanisms. Taken together, these findings necessitate precise characterization of the synergistic contributions of both DNA repair-associated changes and environmental/behavioral factors to the predisposition toward these cancers.

Recent studies have demonstrated the importance of combining genetic and clinical data, often from large biomedical databases, to predict disease risk and facilitate drug design (13–15). Notably, 23andMe, AncestryDNA, and other direct-to-consumer platforms have already undertaken limited disease risk assessments based on participants’ genetic and phenotypic information and licensed these data for the development of a drug used to treat inflammatory skin conditions (16). According to 23andMe reports, more than 12 million kits were sold to consumers, whereas AncestryDNA sold over 10 million kits. These commercialized genome-wide association study (GWAS)-based analyses sold directly to consumers have provided prediction of health risk for diabetes, Parkinson's, celiac disease, and many other diseases and conditions. Given the demonstrated limitations of using genetic data in isolation and the projected exponential increase in availability of diverse patient data, it is important to explore whether SNP analysis alone is sufficient for quantifying disease risk, or whether such analysis should be combined with demographic, behavioral, and clinical information, especially in dermatology (17, 18). As this technology is used by millions of patients, oncologists/dermatologists need to understand and lead the effort of using GWAS appropriately to identify valuable predictive genetic variants in the context of clinical information.

To investigate these relationships in greater depth, we leveraged a dataset of four skin cancers and matched healthy controls from the UK Biobank (UKBB) database and tested associations between disease status, demographic and behavioral factors related to the impact of UV exposure, and SNP genotypes across 190 selected DNA repair genes.

The study design and reporting followed the STrengthening the REporting of Genetic Association Studies (STREGA): An Extension of the STROBE Statement (19).

Participant demographics and genetic data

The UKBB is a large-scale biomedical database that comprises approximately 500,000 adult participants, ages 40 to 69, recruited between 2006 and 2010, and contains detailed genetic and health information collected during baseline and follow-up visits (20). We included participants who self-reported their skin color during the initial UKBB assessment questionnaire. UKBB cancer registry data were obtained through linkage to UK national cancer registries and coded using the 10th revision of the International Classification of Diseases (ICD-10). We first identified the participants with a diagnosis of either “malignant melanoma of skin” (C43 of ICD-10) or “other malignant neoplasms of skin” (C44 of ICD-10). To confirm a diagnosis of skin cancer, we required that the cancer had a histologic type of BCC, cSCC, melanoma in situ (M-is), or invasive melanoma (M-inv).

For participants meeting the above criteria, as well as healthy subjects matched for self-reported skin tone, we extracted data from the “sun exposure” category, which includes the following items: “use of sun/UV protection (do not go out in sunshine, never/rarely, sometimes, most of the time, and always),” “number of childhood sunburns,” “frequency of sunlamp use (number per year),” “time spent outdoors in summer (hours per day),” and “time spent outdoors in winter (hours per day).” We also extracted information regarding sex, age, “skin color (very fair, fair, light olive, dark olive, brown, black),” “hair color (red, blonde, light brown, dark brown, black),” “subjective age appearance (younger than expected for age, as expected for age, older than expected for age),” and “ease of skin tanning (never tan/only burn, mildly or occasionally tanned, moderately tanned, get very tanned).”

We also obtained SNP data generated by the Affymetrix UK Biobank Axiom microarray. On the basis of a recent study that identified numerous genes directly and indirectly involved in various DNA repair pathways (21), we extracted SNP data (8,798 total markers) corresponding to 190 of these genes (Supplementary Table S1), in addition to data on genetic principal components, genetic ethnic grouping, and genetic kinship to other participants, filtering exclusively for individuals identified demographically as Caucasian.

Quality control of genetic data

Extracted SNP data underwent quality control using PLINK version 2.00 (https://www.cog-genomics.org/plink/2.0/; ref. 22). We removed SNPs missing in more than 1% of individuals, SNPs with a minor allele frequency (MAF) less than 1%, SNPs deviating from Hardy–Weinberg equilibrium at a threshold of P = 0.000001, and SNPs that lie in areas of high linkage disequilibrium. We also removed individuals with an SNP call rate below 99%, duplicate individuals, and first-degree relatives. Following quality control, 1,912 SNPs remained for downstream analysis.
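The per-SNP filters described above can be illustrated with a minimal Python sketch. It is not the PLINK implementation: the function name snp_passes_qc and its genotype encoding (0/1/2 copies of one allele, None for a missing call) are hypothetical, and the Hardy–Weinberg check uses a 1-degree-of-freedom chi-square approximation as a stand-in for whatever test PLINK applies. Linkage disequilibrium pruning and the per-individual filters are not shown.

```python
import math

def snp_passes_qc(genotypes, max_missing=0.01, min_maf=0.01, hwe_p_threshold=1e-6):
    """Apply the three per-SNP filters to one SNP.

    genotypes: list with 0, 1, or 2 copies of the counted allele per individual,
    or None for a missing call (hypothetical encoding, for illustration only).
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return False
    missing_rate = 1 - n / len(genotypes)

    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    p = (2 * n_bb + n_ab) / (2 * n)   # frequency of the counted allele
    maf = min(p, 1 - p)               # minor allele frequency

    if maf == 0:
        hwe_p = 1.0                   # monomorphic SNP: no HWE test possible
    else:
        q = 1 - p
        expected = [n * q * q, 2 * n * p * q, n * p * p]
        observed = [n_aa, n_ab, n_bb]
        chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        # Survival function of a chi-square with 1 df (an approximation;
        # assumed here, not taken from the study's pipeline).
        hwe_p = math.erfc(math.sqrt(chi2 / 2))

    return missing_rate <= max_missing and maf >= min_maf and hwe_p >= hwe_p_threshold

# Example: a SNP with genotype counts exactly at HWE proportions passes.
balanced = [0] * 250 + [1] * 500 + [2] * 250
print(snp_passes_qc(balanced))
```

The thresholds default to the values stated in the text (1% missingness, 1% MAF, P = 0.000001), so a SNP is kept only when it clears all three.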

Following these selection criteria, we attained a cohort consisting of a total of 472,672 individuals clustering into the following five groups: BCC (n = 17,221), cSCC (n = 2,331), M-is (n = 1,158), M-inv (n = 3,798), and healthy control individuals (n = 448,164). Within the skin cancer cohort, 1,288 individuals were identified as having more than one disease diagnosis. Downstream analyses were conducted in a groupwise (disease vs. control) fashion, independently for each disease group.

Disease–environment association analysis

Using the RStudio (version 2022.02.1, Build 461; https://www.rstudio.com/; RRID:SCR_000432) package nnet, we performed multinomial logistic regression to explore the relationship between the above-mentioned demographic and environmental factors and the risk of BCC, cSCC, M-is, and M-inv (23). These factors were set as predictor variables of outcome (disease), using one category in each variable as reference (sex: female; skin color: very fair; hair color: red; ease of skin tanning: never tan/only burn; subjective age appearance: younger than expected for age; use of sun/UV protection: never/rarely). Factors composed of continuous variables [age, frequency of sunlamp use, number of childhood sunburns, and time spent outdoors (summer and winter)] were not transformed. The relevel function was used to select the healthy control group as the baseline outcome group, and analysis was performed using the multinom function for each predictor variable. The log(odds ratios) of each analysis were exponentiated and transformed into relative risk (RR) ratios, defined as the probability of disease for a given predictor variable compared with its reference (RR > 1 indicates increased disease risk; RR < 1 indicates decreased disease risk; RR ∼ 1 indicates no difference in risk between disease and healthy control). Significance of risk was determined using a two-tailed Wald Z test (24).
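As a worked illustration of the last two steps (exponentiating a fitted coefficient and testing it with a two-tailed Wald Z test), the following Python sketch takes a coefficient and its standard error and returns the ratio, Z statistic, P value, and an approximate 95% confidence interval. The function name wald_summary and the input values are hypothetical; the coefficient and standard error would in practice come from the fitted multinom model, which is not reproduced here.

```python
import math

def wald_summary(beta, se, z_crit=1.959964):
    """Summarize one regression coefficient.

    beta: estimated coefficient on the log scale
    se:   its standard error
    Returns (ratio, z, two-tailed p, (ci_low, ci_high)).
    """
    ratio = math.exp(beta)                      # exponentiated coefficient
    z = beta / se                               # Wald Z statistic
    p = math.erfc(abs(z) / math.sqrt(2))        # two-tailed standard normal P value
    ci = (math.exp(beta - z_crit * se), math.exp(beta + z_crit * se))
    return ratio, z, p, ci

# Hypothetical inputs: a coefficient equal to log(1.23) with an assumed SE of 0.05.
ratio, z, p, ci = wald_summary(math.log(1.23), 0.05)
print(f"ratio={ratio:.2f}, z={z:.2f}, p={p:.2e}, 95% CI=({ci[0]:.2f}, {ci[1]:.2f})")
```

A ratio above 1 with a confidence interval that excludes 1 corresponds to the "increased disease risk" case described above; a coefficient of zero gives a ratio of 1 and a P value of 1.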

SNP–disease and SNP–environment analyses

To assess how sun exposure factors interact with genetic susceptibility loci associated with skin cancers, we also performed gene-by-environment analysis using the Robust-joint-interaction package (https://epstein-software.github.io/robust-joint-interaction/), which computes P values corresponding to tests of SNP and SNP–environment interactions (25). Multiple testing correction was performed using the FDR approach (q < 0.05; ref. 26). Significant interactions for each group were visualized as waterfall plots using the GenVisR package (https://bioconductor.org/packages/release/bioc/html/GenVisR.html; ref. 27). The Molbiotools Multiple List Comparator (https://molbiotools.com/listcompare.php) was used to plot overlaps between significant SNP–environment interacting markers across disease groups (Venn diagram) and to assess pairwise overlaps between individuals with more than one cancer diagnosis (Jaccard index coefficient).
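For readers unfamiliar with these two quantities, the sketch below shows one common way to compute FDR-adjusted q values and a Jaccard index in Python. It assumes the Benjamini–Hochberg step-up procedure, which is the most widely used FDR method but is not confirmed here as the exact procedure of ref. 26; the function names and the example inputs are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted P values (q values), in the input order.

    Assumption: BH is used as a representative FDR procedure.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])   # indices, smallest P first
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues

def jaccard_index(a, b):
    """Size of the intersection divided by size of the union of two collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical P values for four tests.
q = bh_qvalues([0.01, 0.04, 0.03, 0.005])
significant = [qi < 0.05 for qi in q]
print(q, significant)

# Hypothetical gene lists for two disease groups.
print(jaccard_index({"FANCA", "RAD51B"}, {"FANCA", "MGMT"}))
```

The Jaccard index ranges from 0 (no shared members) to 1 (identical sets), which makes it a convenient single number for comparing overlap between pairs of groups of different sizes.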

Disease–environment/behavior interactions

Demographic characteristics of our participant cohorts are summarized in Table 1. Using multinomial logistic regression (23), we computed RR ratios corresponding to skin cancer risk for each of our chosen demographic and environmental variables, using our four disease groups as outcome variables. We observed significant (P < 0.05) trends corresponding to six demographic factors and environmental exposures (sex, hair color, skin color, ease of skin tanning, subjective age appearance, use of sun protection) for each disease group. BCC was positively correlated with male sex (RR = 1.23) and frequent use of sunscreen (RR = 2.40), and inversely correlated with light olive, dark olive, or brown skin (RR = 0.47–0.50), ease of tanning (RR = 0.66), dark brown hair (RR = 0.5), and older-than-expected appearance (RR = 0.79). cSCC was positively correlated with male sex (RR = 1.97), older-than-expected appearance (RR = 1.28), and frequent (“always”) use of sunscreen (RR = 2.26), and inversely correlated with light olive, dark olive, or brown skin (RR < 0.33), ease of tanning (RR = 0.58), and light brown, blonde, dark brown, and black hair (RR = 0.38–0.68). M-is was positively correlated with occasional, frequent, and very frequent use of sunscreen (RR = 1.87–3.58), and inversely correlated with light olive, dark olive, and brown skin (RR = 0.46–0.65), blonde, light brown, dark brown, and black hair (RR < 0.60), and ease of tanning (RR = 0.59). M-inv was positively correlated with occasional, frequent, and very frequent use of sunscreen (RR = 1.36–3.92), and inversely correlated with fair, light olive, dark olive, and brown skin (RR = 0.17–0.58), as well as blonde, light brown, dark brown, and black hair (RR = 0.27–0.72). These findings are summarized graphically in Fig. 1 and numerically in Supplementary Table S2.

SNP–disease association

We conducted SNP–disease association analysis across four UKBB skin cancer populations and identified 11 significant (q < 0.05) alleles both positively and negatively associated with risk for BCC and M-inv (OR, 0.81–1.20). This analysis did not identify any significant markers associated with cSCC or M-is. Of 190 surveyed genes, only 7 [Fanconi anemia complementation group A (FANCA), breast cancer 2 (BRCA2), RAD51 recombinase paralog B (RAD51B), ribonucleotide reductase regulatory TP53 inducible subunit M2B (RRM2B), general transcription factor IIH subunit 4 (GTF2H4), exonuclease 1 (EXO1), and DNA cross-link repair 1B (DCLRE1B)] were highlighted in our analysis as being significantly associated with these cancers (Table 2). Three of the SNP markers (rs9926296, rs3743860, rs2376883) were found in both the BCC and M-inv groups and map to intron regions of FANCA, and are among the SNPs most positively (rs3743860; OR, 1.20) and most negatively (rs9926296; OR, 0.81) associated with M-inv. Moreover, two markers were in exon regions and are among the top three positively associated with BCC (rs4149909, EXO1; OR, 1.13 and rs61748588, GTF2H4; OR, 1.19).

SNP–environment interaction

Because SNP analysis alone showed a paucity of associations, we performed a robust joint test to assess the interaction between patient demographic and environmental/behavioral patterns and our surveyed DNA repair-associated SNPs. By adding this critical information, we identified 201 significant (q < 0.05) SNPs localized in 104 genes that interact with demographic and environmental/behavioral factors across four skin cancer groups. In the BCC group, we identified 32 such SNPs across 20 genes, 28 of which are in intron regions and 4 in exon regions. Notably, 17 of these 32 SNPs interacted with sunlamp use and 16 interacted with history of childhood sunburns. Hours spent outdoors, age, hair color, and use of sun protection were also important categories, with 9 SNP–variable interactions in each group. Notably, markers rs12046289 (DCLRE1B), rs9926296 (FANCA), and rs3743860 (FANCA) interacted with every demographic and environmental/behavioral variable, whereas rs4942486 (BRCA2), rs2376883 (FANCA), and rs62989960 (FANCA) interacted with all but one variable.

In the cSCC group, we identified 72 significantly interacting SNPs across 54 genes, 59 of which are in introns and 13 in exons. In contrast to BCC, most of these SNPs (67 markers) were associated with sunlamp use, with 64 markers interacting exclusively with this variable. Moreover, rs17882704 (CHEK2) was the only SNP that interacted with more than two variables, including sunlamp use, sex, tanning behavior, history of childhood sunburns, and aging appearance. In the cSCC group, we did not identify any SNPs that interacted significantly with time spent outdoors, age, hair color, or use of sun protection. One SNP [rs79594681 (RAD51B)] was shared between both keratinocyte cancer groups and interacted with sunlamp use in both groups. Similarly, two genes (MNAT and RECQL), represented by distinct SNPs, were shared across both BCC and cSCC.

In the M-is group, we identified 106 significantly interacting SNPs across 69 genes, 90 of which were found in introns with the remaining 16 located in exon regions. Like cSCC, sunlamp use was the variable underpinning the majority (88 markers) of SNP interactions. History of childhood sunburns was the second most frequent variable, representing 16 SNP–variable interactions. Notably, rs5744657 (POLK) and rs4151276 (MNAT1) interacted with all 11 variables, followed by the exonic SNP rs3218786 (POLI), which interacted with sex, history of childhood sunburns, sunlamp use, time outdoors in the winter, and age.

In the M-inv group, we identified 29 significantly interacting SNPs across 23 genes. Similar to BCC, sunlamp use was the most common SNP-interacting variable, represented by 27 SNPs, of which 23 interacted exclusively with this variable and 6 represented the only exonic markers. Further, three FANCA SNPs (rs9926296, rs3743860, and rs2376883), which were found to interact with nearly all variables in the BCC group, were found to interact with every variable in the M-inv group, further underscoring the importance of FANCA in the pathogenesis of both BCC and M-inv. In the M-inv group, rs3092829 (ATM) was the only other SNP that interacted with more than one variable, including sex, skin color, tanning behavior, sunlamp use, hours spent outdoors in winter and summer, age, and use of sun protection. Among both melanoma groups, we identified five SNPs [rs111885773 (ENDOV), rs150393409 (FAN1), rs61753893 (FANCM), rs114554002 (RAD50)] that interacted exclusively with the sunlamp use variable, suggesting shared genetic and environmental risk factors for both diseases. Moreover, we identified distinct SNPs from two additional genes (ALKBH3 and FANCF), which interacted with different variables in each melanoma group. Although we did not identify any significantly interacting SNPs that were common across all four skin cancers, distinct SNPs from 4 genes (FAM193A, FANCA, MGMT, and RAD51B) were found to interact with at least one demographic or environmental/behavioral variable in each disease group. Figure 2 is an overview of SNP–environment interaction analysis in the M-inv group (overviews of this analysis for other disease groups are presented in Supplementary Figs. S1–S3). Figure 3 highlights genes harboring SNPs that overlapped across disease groups and could be of value for future individual risk assessment for two or more skin cancers.

Analysis of simultaneous skin cancer diagnoses

To further assess the shared risk between skin cancer subtypes, we performed pairwise analysis within our cohort of individuals diagnosed with two or more skin cancers (Fig. 4). Of 1,288 such individuals, representing approximately 5% of the total disease cohort, we identified the greatest overlaps between BCC and cSCC (n = 550), and BCC and M-inv (n = 487), accounting for over 80% of simultaneous diagnoses. The remaining proportion was made up of overlaps between BCC and M-is (n = 144), cSCC and M-inv (n = 45), cSCC and M-is (n = 10), as well as individuals with three simultaneous diagnoses, including BCC, cSCC, and M-inv (n = 45), and BCC, cSCC, and M-is (n = 16). We identified no individuals with four simultaneous diagnoses, nor any with simultaneous diagnoses of M-is and M-inv.
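The two proportions quoted above can be checked with a short calculation. The sketch below is an illustration only: it assumes the "total disease cohort" denominator is the sum of the four disease group sizes reported in the Methods, which may differ slightly from the denominator used by the authors, and the variable names are hypothetical.

```python
# Disease group sizes as reported in the Methods.
group_sizes = {"BCC": 17221, "cSCC": 2331, "M-is": 1158, "M-inv": 3798}

# Individuals with more than one skin cancer diagnosis.
multi_dx = 1288

# The two largest pairwise overlaps reported in the text.
top_pairs = {("BCC", "cSCC"): 550, ("BCC", "M-inv"): 487}

# Assumed denominator: sum of the four disease groups.
disease_total = sum(group_sizes.values())
multi_share = multi_dx / disease_total              # share of the disease cohort
top_share = sum(top_pairs.values()) / multi_dx      # share of simultaneous diagnoses

print(f"disease cohort (summed groups): {disease_total}")
print(f"multiple diagnoses: {multi_share:.1%}")
print(f"BCC+cSCC and BCC+M-inv overlaps: {top_share:.1%}")
```

Under this assumption, the results agree with the "approximately 5%" and "over 80%" figures given in the text.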

By leveraging robust datasets from the UKBB, we performed disease–environment, disease–gene, and gene–environment investigations, and identified key variables underlying the complex interplay between 190 DNA repair genes and 11 environmental and demographic factors in the pathophysiology of melanoma and keratinocyte carcinomas. The diagnostic accuracy of our disease cohort is supported through the selection of individuals on the basis of the validated ICD coding system and subsequently verified using histopathologic analysis. The first two of these analyses enabled us to identify eleven SNPs in seven key genes and five participant-related factors that are significantly associated with BCC and other skin cancer types, respectively. Notably, it was through joint analysis, taking together genetic and participant-related factors, that we were able to expand significant findings to cumulatively include 147 SNPs across 84 genes interacting with all 11 participant factors across every disease group. Our findings point to overlapping but distinct factors that mediate risk of each cancer and cumulatively underscore the importance of integrating environmental and demographic factors into genetic analyses to draw meaningful conclusions about skin cancer risk prognosis.

We first conducted disease–environment analyses and showed that sex, skin and hair color, skin tanning behavior, and use of sun protection showed the greatest associations with risk of all four cancers. This is consistent with previous findings as skin-associated DNA damage upon exposure to UV light is a major driver of most skin cancers, whereas hair color is a weak proxy for skin tone/genetic makeup, and is not directly related to this effect (28). Moreover, male sex was positively associated with BCC and particularly cSCC, consistent with previous findings, and negatively associated with M-inv and M-is, in contrast to the literature (29, 30). Frequent use of sunscreen was strongly and positively associated with all cancers. Although surprising at first, this paradoxical association has been reported in prior studies (31–33). We posit that it can be explained by greater exposure to UV light and/or a lack of reapplication of sunscreen throughout the day, or by increased use of sun protection following skin cancer diagnosis. Collectively, however, these findings demonstrate the importance of adequate and frequent sunscreen use and minimization of exposure to UV light, particularly in individuals with fair skin.

We subsequently investigated the association between genetic markers related to DNA repair mechanisms and skin cancer and identified 11 significant markers across BCC and M-inv. Three of these markers (rs9926296, rs3743860, rs2376883), located in the FANCA gene (collectively, introns 29 and 31), were associated with both cancers. Two of these SNPs (rs3743860 and rs9926296) have been shown to be associated with colorectal cancer and generalized vitiligo, respectively, pointing to potentially shared pathways in both cancer and skin disease (34, 35). Moreover, rs12046289 was shown to confer a two-fold increased risk of cutaneous melanoma among individuals with a strong family history, whereas rs1800347 was significantly associated with high-risk non-BRCA mutant breast cancer in a French Canadian cohort (36, 37). Nine of these SNPs (rs9926296, rs3743860, rs2376883, rs12046289, rs1800347, rs62989960, rs4942486, rs28928581, and rs4902628) are found in intron regions, whereas the other two (rs4149909 and rs61748588) are in exon regions. Notably, the identified G allele of rs4149909 is associated with a nonsynonymous, missense mutation that substitutes asparagine for serine in the EXO1 protein sequence, with potential to disrupt protein structure and confer dysregulated DNA repair function, and has been associated with keratinocyte cancers in a large European cohort study (38). Further, the A allele of rs61748588 is associated with a synonymous mutation. Despite maintaining the wild-type protein sequence, synonymous mutations have been shown to have the potential to impact protein function through changes to mRNA splicing, translation and protein folding, disruption of microRNA-mediated gene regulation, and formation of novel haplotypes with altered gene function. In a similar fashion, intronic SNPs have been shown to impact gene expression through altered gene interactions with transcription factors and long noncoding RNAs, as well as to influence epigenetic gene regulation through changes to genomic imprinting and chromatin–DNA interactions (39, 40). Collectively, these findings suggest that the identified SNPs may both play a regulatory role and have a direct impact on protein-coding sequence in mediating skin cancer risk.

Combining all available data, we then conducted gene–environment analyses, which provide the most informative conclusions about the interplay between DNA repair genes and participant-linked factors in each disease group. In BCC, the strongest genetic effects were in the FANCA and DCLRE1B genes, whereas the strongest environmental effects were linked to the number of lifetime sunburns (Supplementary Fig. S1); the latter is consistent with epidemiologic literature which highlights the effect of lifetime sunburns and genetic predisposition as major contributors to BCC development (41). Similarly, cSCC is mediated by interactions between the number of lifetime sunburns and, to a lesser degree, aging appearance as interacting variables with many loci (Supplementary Fig. S2). M-inv shows similar trends with the three FANCA markers significant for all demographic and environmental variables, with the most common environmental interacting variables being lifetime sunburns and sunlamp use; these findings further support the importance of FANCA in predisposing to these cancers (Table 2; Fig. 2; Supplementary Fig. S1). In contrast, M-is shows fewer significant interactions between a small subset of specific genes and various environmental factors, and instead appears to be to a significant degree driven by sunlamp use, an environmental factor that shows interactions with numerous genes (Supplementary Fig. S3). These findings are also largely consistent with previous literature identifying the most important risk factors for each skin cancer group, namely chronic lifetime sun exposure in cSCC, number of lifetime sunburns and genetics in BCC, and sunlamp as well as sunburns in both in situ and invasive melanoma (42–46).

For the purposes of using GWAS to predict skin cancer risk, as now performed by 23andMe and AncestryDNA for a variety of other diseases, our most consistent finding across keratinocyte carcinoma and invasive melanoma populations is the association of loci in the FANCA gene with all types of skin cancer analyzed, especially in Caucasian individuals who report sunlamp use and a history of childhood sunburns. FANCA codes for a subunit of a protein family involved in post-replication repair, and mutations in these genes have been associated with increased risk of numerous hereditary and sporadic cancers (47, 48). Although another FANCA SNP has been previously associated with overall survival of melanoma, we have not observed reports of this particular marker in the context of BCC, cSCC, and M-inv (49). Nevertheless, our findings and prior reports underscore the importance of this gene as a mediator of age, sex, and behavior in the development of both melanoma and keratinocyte carcinomas.

Our study has several limitations. Our genetic analyses were limited to markers (SNPs) associated with 190 DNA repair genes, which are known to have far-reaching interactions with other pathways that were not captured here. Also, our study design does not capture additional environmental factors that may be relevant to disease pathophysiology. In addition, UKBB data do not delineate factors associated with skin cancer and are known to suffer from some degree of selection bias (50). Moreover, given that approximately 5% of our total disease cohort is made up of individuals with more than one skin cancer diagnosis, we acknowledge the introduction of some degree of bias and potential for confounding in our analyses. Finally, our behavioral findings are self-reported by individuals, where recall bias can play a significant role, and do not account for changes in behavior before and after disease diagnosis.

Our study highlights the importance of conducting dynamic investigations of genetic variation by incorporating demographic, environmental and behavioral factors to fully appreciate the complex pathophysiology of skin cancers. Our findings highlight the prognostic value of specific SNPs in the FANCA gene in Caucasian individuals who report frequent sunlamp use and a history of childhood sunburns. These data will require validation in other populations where GWAS data are available and can be correlated with specific exposures. Additional SNPs identified in this report may also have significant association with the development of common skin cancers. Our findings have laid the foundation for a precision medicine approach to risk assessment based on molecular markers in key DNA repair genes and individual factors. Given recent increases in highly affordable risk assessments based on GWAS/SNP analyses, this work underscores that risk stratification can be greatly improved when diseases are studied holistically.

P. Lefrançois reports personal fees from Abbvie, Amgen, UCB, Sanofi, Novartis, Bausch Health, Pfizer, Arcutis, Beiersdorf, Sun Pharma, Eli Lilly, Leo, Galderma, and L'Oreal outside the submitted work. No disclosures were reported by the other authors.

R. Jeremian: Resources, data curation, software, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. P. Xie: Conceptualization, resources, data curation, software, formal analysis, validation, investigation, methodology, writing–review and editing. M. Fotovati: Data curation, software, formal analysis, validation, investigation, visualization, methodology, writing–review and editing. P. Lefrançois: Data curation, software, formal analysis, validation, investigation, visualization, methodology, writing–review and editing. I.V. Litvinov: Conceptualization, resources, data curation, software, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing.

This work was supported by the Canadian Institutes for Health Research (CIHR) Project Scheme Grant No. 426655 to Dr. I.V. Litvinov; CIHR Catalyst Grant No. 428712 to Drs. I.V. Litvinov, Ghazawi, Le, Lagacé, Mukovozov, Cyr, Mourad, Claveau, Netchiporouk, Gniadecki, Sasseville, and Rahme; Cancer Research Society (CRS)-CIHR Partnership Grant No. 25343 to Dr. I.V. Litvinov; Canadian Dermatology Foundation research grants to Dr. I.V. Litvinov and Sasseville, and by the Fonds de la recherche du Québec – Santé to Dr. I.V. Litvinov (Nos. 34753, 36769, and 296643).

The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

Note: Supplementary data for this article are available at Cancer Epidemiology, Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).

1.
Rogers
HW
,
Weinstock
MA
,
Feldman
SR
,
Coldiron
BM
.
Incidence estimate of nonmelanoma skin cancer (keratinocyte carcinomas) in the U.S. population, 2012
.
JAMA Dermatol
2015
;
151
:
1081
6
.
2.
Conte
S
,
Aldien
AS
,
Jette
S
,
LeBeau
J
,
Alli
S
,
Netchiporouk
E
, et al
.
Skin cancer prevention across the G7, Australia and New Zealand: a review of legislation and guidelines
.
Curr Oncol
2023
;
30
:
6019
40
.
3.
Conte
S
,
Ghazawi
FM
,
Le
M
,
Nedjar
H
,
Alakel
A
,
Lagace
F
, et al
.
Population-based study detailing cutaneous melanoma incidence and mortality trends in Canada
.
Front Med
2022
;
9
:
830254
.
4.
Ransohoff
KJ
,
Jaju
PD
,
Tang
JY
,
Carbone
M
,
Leachman
S
,
Sarin
KY
.
Familial skin cancer syndromes: increased melanoma risk
.
J Am Acad Dermatol
2016
;
74
:
423
34
;
quiz 35–6
.
5. Que SKT, Zwald FO, Schmults CD. Cutaneous squamous cell carcinoma: incidence, risk factors, diagnosis, and staging. J Am Acad Dermatol 2018;78:237–47.
6. Marzuka AG, Book SE. Basal cell carcinoma: pathogenesis, epidemiology, clinical features, diagnosis, histopathology, and management. Yale J Biol Med 2015;88:167–79.
7. Lagacé F, Noorah BN, Conte S, Mija LA, Chang J, Cattelan L, et al. Assessing skin cancer risk factors, sun safety behaviors and melanoma concern in Atlantic Canada: a comprehensive survey study. 2023;15:3753.
8. Cleaver JE, Crowley E. UV damage, DNA repair and skin carcinogenesis. Front Biosci 2002;7:d1024–43.
9. Lefrancois P, Xie P, Gunn S, Gantchev J, Villarreal AM, Sasseville D, et al. In silico analyses of the tumor microenvironment highlight tumoral inflammation, a Th2 cytokine shift and a mesenchymal stem cell-like phenotype in advanced in basal cell carcinomas. J Cell Commun Signal 2020;14:245–54.
10. Litvinov IV, Xie P, Gunn S, Sasseville D, Lefrancois P. The transcriptional landscape analysis of basal cell carcinomas reveals novel signalling pathways and actionable targets. Life Sci Alliance 2021;4:e202000651.
11. Xie P, Lefrancois P, Sasseville D, Parmentier L, Litvinov IV. Analysis of multiple basal cell carcinomas (BCCs) arising in one individual highlights genetic tumor heterogeneity and identifies novel driver mutations. J Cell Commun Signal 2022;16:633–5.
12. Gantchev J, Messina-Pacheco J, Martinez Villarreal A, Ramchatesingh B, Lefrancois P, Xie P, et al. Ectopically expressed meiosis-specific cancer testis antigen HORMAD1 promotes genomic instability in squamous cell carcinomas. Cells 2023;12:1627.
13. Kong D, Giovanello KS, Wang Y, Lin W, Lee E, Fan Y, et al. Predicting Alzheimer's disease using combined imaging-whole genome SNP data. J Alzheimers Dis 2015;46:695–702.
14. Diogo D, Tian C, Franklin CS, Alanne-Kinnunen M, March M, Spencer CCA, et al. Phenome-wide association studies across large population cohorts support drug target validation. Nat Commun 2018;9:4285.
15. Gruendner J, Wolf N, Tögel L, Haller F, Prokosch HU, Christoph J. Integrating genomics and clinical data for statistical analysis by using GEnome MINIng (GEMINI) and fast healthcare interoperability resources (FHIR): system design and implementation. J Med Internet Res 2020;22:e19879.
16. Abbasi J. 23andMe develops first drug compound using consumer data. JAMA 2020;323:916.
17. Crawford DC, Sedor JR. Biobanks linked to electronic health records accelerate genomic discovery. J Am Soc Nephrol 2021;32:1828–9.
18. Khan R, Mittelman D. Consumer genomics will change your life, whether you get tested or not. Genome Biol 2018;19:120.
19. Little J, Higgins JP, Ioannidis JP, Moher D, Gagnon F, von Elm E, et al. STrengthening the REporting of genetic association studies (STREGA): an extension of the STROBE statement. PLoS Med 2009;6:e22.
20. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med 2015;12:e1001779.
21. Chae YK, Anker JF, Carneiro BA, Chandra S, Kaplan J, Kalyan A, et al. Genomic landscape of DNA repair genes in cancer. Oncotarget 2016;7:23312–21.
22. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559–75.
23. Venables WN, Ripley BD. Modern applied statistics with S. 4th ed. New York: Springer; 2002.
24. Hauck WW, Donner A. Wald's test as applied to hypotheses in logit analysis. J Am Statist Assoc 1977;72:851–3.
25. Almli LM, Duncan R, Feng H, Ghosh D, Binder EB, Bradley B, et al. Correcting systematic inflation in genetic association tests that consider interaction effects: application to a genome-wide association study of posttraumatic stress disorder. JAMA Psychiatry 2014;71:1392–9.
26. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 1995;57:289–300.
27. Skidmore ZL, Wagner AH, Lesurf R, Campbell KM, Kunisaki J, Griffith OL, et al. GenVisR: genomic visualizations in R. Bioinformatics 2016;32:3012–4.
28. Narayanan DL, Saladi RN, Fox JL. Ultraviolet radiation and skin cancer. Int J Dermatol 2010;49:978–86.
29. Bassukas ID, Tatsioni A. Male sex is an inherent risk factor for basal cell carcinoma. J Skin Cancer 2019;2019:8304271.
30. Olsen CM, Thompson JF, Pandeya N, Whiteman DC. Evaluation of sex-specific incidence of melanoma. JAMA Dermatol 2020;156:553–60.
31. Wolf P, Quehenberger F, Mullegger R, Stranz B, Kerl H. Phenotypic markers, sunlight-related factors and sunscreen use in patients with cutaneous melanoma: an Austrian case-control study. Melanoma Res 1998;8:370–8.
32. Whiteman DC, Valery P, McWhirter W, Green AC. Risk factors for childhood melanoma in Queensland, Australia. Int J Cancer 1997;70:26–31.
33. Rueegg CS, Stenehjem JS, Egger M, Ghiasvand R, Cho E, Lund E, et al. Challenges in assessing the sunscreen-melanoma association. Int J Cancer 2019;144:2651–68.
34. Pardini B, Corrado A, Paolicchi E, Cugliari G, Berndt SI, Bezieau S, et al. DNA repair and cancer in colon and rectum: novel players in genetic susceptibility. Int J Cancer 2020;146:363–72.
35. Jin Y, Birlea SA, Fain PR, Ferrara TM, Ben S, Riccardi SL, et al. Genome-wide association analyses identify 13 new susceptibility loci for generalized vitiligo. Nat Genet 2012;44:676–80.
36. Liang XS, Pfeiffer RM, Wheeler W, Maeder D, Burdette L, Yeager M, et al. Genetic variants in DNA repair genes and the risk of cutaneous malignant melanoma in melanoma-prone families with/without CDKN2A mutations. Int J Cancer 2012;130:2062–6.
37. Litim N, Labrie Y, Desjardins S, Ouellette G, Plourde K, Belleau P, et al. Polymorphic variations in the FANCA gene in high-risk non-BRCA1/2 breast cancer individuals from the French Canadian population. Mol Oncol 2013;7:85–100.
38. Liyanage UE, Law MH, Han X, An J, Ong JS, Gharahkhani P, et al. Combined analysis of keratinocyte cancers identifies novel genome-wide loci. Hum Mol Genet 2019;28:3148–60.
39. Oscanoa J, Sivapalan L, Gadaleta E, Dayem Ullah AZ, Lemoine NR, Chelala C. SNPnexus: a web server for functional annotation of human genome sequence variation (2020 update). Nucleic Acids Res 2020;48:W185–W92.
40. Deng N, Zhou H, Fan H, Yuan Y. Single nucleotide polymorphisms and cancer susceptibility. Oncotarget 2017;8:110635–49.
41. Armstrong BK, Kricker A, English DR. Sun exposure and skin cancer. Australas J Dermatol 1997;38 Suppl 1:S1–6.
42. Kilgour JM, Jia JL, Sarin KY. Review of the molecular genetics of basal cell carcinoma; inherited susceptibility, somatic mutations, and targeted therapeutics. Cancers 2021;13:3870.
43. Conforti C, Zalaudek I. Epidemiology and risk factors of melanoma: a review. Dermatol Pract Concept 2021;11(Suppl 1):e2021161S.
44. Ghazawi FM, Cyr J, Darwich R, Le M, Rahme E, Moreau L, et al. Cutaneous malignant melanoma incidence and mortality trends in Canada: a comprehensive population-based study. J Am Acad Dermatol 2019;80:448–59.
45. Ghazawi FM, Le M, Lagace F, Cyr J, Alghazawi N, Zubarev A, et al. Incidence, mortality, and spatiotemporal distribution of cutaneous malignant melanoma cases across Canada. J Cutan Med Surg 2019;23:394–412.
46. Ghazawi FM, Lu J, Savin E, Zubarev A, Chauvin P, Sasseville D, et al. Epidemiology and patient distribution of oral cavity and oropharyngeal SCC in Canada. J Cutan Med Surg 2020;24:340–9.
47. Del Valle J, Rofes P, Moreno-Cabrera JM, López-Dóriga A, Belhadj S, Vargas-Parra G, et al. Exploring the role of mutations in Fanconi anemia genes in hereditary cancer patients. Cancers 2020;12:829.
48. Chen H, Zhang S, Wu Z. Fanconi anemia pathway defects in inherited and sporadic cancers. Translational Pediatrics 2014;3:300–4.
49. Yin J, Liu H, Liu Z, Wang LE, Chen WV, Zhu D, et al. Genetic variants in Fanconi anemia pathway genes BRCA2 and FANCA predict melanoma survival. J Invest Dermatol 2015;135:542–50.
50. Swanson JM. The UK Biobank and selection bias. Lancet 2012;380:110.

Supplementary data