Abstract
There is a large gap between the aspiration of considering sex as a biological variable and the execution of such studies, particularly in genomic studies of human cancer. This represents a lost opportunity to identify sex-specific molecular etiologies that may underpin the dramatic sex differences in cancer incidence and outcome. There are conceptual and practical challenges associated with considering sex as a biological variable, including the definition of sex itself and the need for novel study designs. A better understanding of cancer mechanisms, resulting in improved outcomes, will reward the effort invested in incorporating sex as a biological variable.
Introduction
Sex is the most universally recognizable phenotypic difference in humans, yet its incorporation in cancer studies varies dramatically. Molecular and cellular sex differences have been identified in metabolism, development, and in neoplastic disease pathophysiology and outcome (1). Notice NOT-OD-15-102 from the NIH, effective January 2016, states that across institutes, “NIH expects that sex as a biological variable will be factored into research designs, analyses, and reporting in vertebrate animal and human studies.” The gap between this stated aspiration and reality is particularly acute when cancer genomic studies are examined. These studies routinely exclude sex or gender from their analysis. It is unsettling that a search of The Cancer Genome Atlas (TCGA) findings available through PubMed on July 20, 2019 for “TCGA” and “Cancer” returned 5,636 publications but only 72 if we add “sex” (or 141 if we add the related, but not biologically defined, “gender”). All articles that mention sex also mention gender. Using the broader term, “gender,” only 2.5% of articles refer to both TCGA cancer analysis and sex or gender. If we limit the search to studies published from 2018 through the search date, the numbers are 2,900, 37, and 74, respectively, resulting in the same, generous, 2.5% of articles in the last year and a half that incorporate sex/gender. Even if we are off by a factor of 10, this means 75% of TCGA-associated articles are not meaningfully addressing sex differences in their study of cancer. Although TCGA does not represent all genomic cancer analyses, and may be underpowered to identify sex-biased or sex-specific variation, the routine exclusion of sex differences from these seminal articles represents a lost opportunity for fundamental understanding of cancer etiology.
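The percentages quoted in this paragraph follow directly from the search counts. As a quick check, the short Python sketch below recomputes them. Only the counts come from the text; the function and variable names are ours, and the tenfold correction is the hypothetical undercount discussed above, not a measured quantity.

```python
def percent(part, whole):
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole

# Counts quoted in the text (PubMed search, July 20, 2019).
all_years = {"total": 5636, "sex": 72, "gender": 141}
since_2018 = {"total": 2900, "sex": 37, "gender": 74}

# Share of TCGA cancer articles that also mention the broader term "gender".
pct_all = percent(all_years["gender"], all_years["total"])
pct_recent = percent(since_2018["gender"], since_2018["total"])

# Hypothetical scenario: the search undercounts by a factor of 10.
pct_if_tenfold = 10 * pct_all
pct_not_addressing = 100.0 - pct_if_tenfold

print(f"All years:  {pct_all:.2f}% mention sex/gender")
print(f"Since 2018: {pct_recent:.2f}% mention sex/gender")
print(f"With a tenfold undercount, {pct_not_addressing:.0f}% still would not")
```

Both periods give a share of roughly 2.5%, and even a tenfold undercount leaves about three quarters of articles without any mention of sex or gender.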
We posit that previous studies are not wrong, but that they are incomplete. We note that there is an overwhelming number of molecular biological similarities between the sexes, and there is certainly sex-shared variation that drives gene expression. Nevertheless, we propose that studies treating both sexes the same, or regressing out sex effects, will miss sex-mediated modification of autosomal variation, fail to identify sex-specific mechanisms, and lack the capacity to capture drivers of oncogenesis with opposite effects in the sexes. In addition to global regulatory and environmental sex differences, there are genetic differences between the sexes. However, the sex chromosomes are often not included in genetic analyses. Notably, current genome-wide association studies typically do not analyze the sexes separately, and only 33% incorporate the X or Y chromosomes (2). As a result, whole classes of novel mechanisms remain undiscovered. For example, in patients with glioblastoma, standard treatment is more effective in females than in males, and more than 30% of the variation in tumor gene expression is sex specific (3), suggesting potential mechanisms and novel targets for prevention and treatment.
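The point about opposite effects can be illustrated with a toy calculation. In the Python sketch below, all expression values are invented for illustration and do not come from GTEx, TCGA, or any cited study. A hypothetical gene is up-regulated in tumors from males and down-regulated in tumors from females, and the pooled tumor-versus-adjacent difference is compared with the sex-stratified differences.

```python
from statistics import mean

# Invented expression values (arbitrary units) for one hypothetical gene.
samples = {
    "male": {"adjacent": [5.0, 5.2, 4.8, 5.1], "tumor": [7.0, 7.1, 6.9, 7.2]},
    "female": {"adjacent": [5.1, 4.9, 5.0, 5.2], "tumor": [3.0, 3.1, 2.9, 3.2]},
}

def tumor_effect(groups):
    """Mean tumor expression minus mean adjacent expression."""
    return mean(groups["tumor"]) - mean(groups["adjacent"])

# Sex-stratified analysis: one effect estimate per sex.
stratified = {sex: tumor_effect(groups) for sex, groups in samples.items()}

# Pooled analysis: ignore sex and combine all samples.
pooled_groups = {
    tissue: samples["male"][tissue] + samples["female"][tissue]
    for tissue in ("adjacent", "tumor")
}
pooled = tumor_effect(pooled_groups)

for sex, effect in stratified.items():
    print(f"{sex:>6}: tumor - adjacent = {effect:+.2f}")
print(f"pooled: tumor - adjacent = {pooled:+.2f}")
```

In this constructed case the two sex-specific effects are each about two units in size but nearly cancel when the sexes are pooled. A real analysis would use a proper differential expression model with a sex-by-tissue interaction term; the sketch only shows why a pooled estimate can hide such effects.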
Sex Differences Are Observed in Cancer
There is consensus that sex differences exist in incidence and prevalence across multiple cancers (https://seer.cancer.gov), but the molecular etiologies underpinning these sex differences have yet to be systematically investigated. Hepatocellular carcinoma (HCC) is a quintessential cancer demonstrating dramatic differences in sex-specific incidence and an example of the insights that can be gained from inclusion of sex as a biological variable. HCC is the second leading cause of global cancer-related death and is one of a small number of cancers with an increasing worldwide incidence, which has doubled in the last three decades (4). HCC shows a dramatic difference in male-to-female incidence ratio, varying between 1.3:1 and 5.5:1 across populations (4). HCC clinical presentation also varies by sex, with cancer in males demonstrating an earlier age of onset and more and larger nodules (5).
We observed sex-specific differences in gene expression in nondiseased liver, in tumor adjacent tissue, and in tumors. We have performed sex-specific characterization of the Genotype-Tissue Expression (GTEx) nondiseased liver and TCGA's HCC dataset (5). This analysis used the self-reported sex designation provided as metadata by GTEx and TCGA. In spite of the smaller sample sizes, when tumors were contrasted with tumor adjacent tissue, more differentially expressed genes were observed in the sex-specific analysis than when the sexes were pooled. Interestingly, we observed that 24% of inherited DNA variants that modulated gene expression (expression quantitative trait loci) differed between the sexes. The DNA variants on autosomes appeared to be regulated by trans-effects (e.g., sex chromosomes regulating autosomal expression, or global hormonal effects). Gene set enrichment analysis of gene expression showed that the sexes shared the previously known activation of pathways related to p53, cell cycle, and apoptosis, but provocatively showed male-specific tumor–tumor adjacent differences in the Notch signaling pathway. As in HCC (5) and glioblastoma (3), we speculate that sex-specific analysis in other nonreproductive cancers will show differences in molecular etiology.
There Are Multiple Mechanisms Underlying Sex Differences
Sex differences in humans are modulated by multiple factors. Typical females have two X chromosomes, while typical males have a single X chromosome and a Y chromosome. Sex differences in gene expression are conserved across mammals for approximately 3,000 genes (6), which suggests conserved genetic mechanisms of sex differences. Genes that show sex differences in expression are found on both the autosomes and the sex chromosomes. The X chromosome, in particular, has at least 23 tumor suppressor genes, at least eight of which escape X-chromosome inactivation in genetic females; these genes have been suggested as a direct mechanism for sex differences in cancer incidence (ref. 7; https://bioinfo.uth.edu/TSGene/). Although sex differences begin before birth, first driven by genetic differences, they can be compounded by gonadal hormones, especially at puberty. Circulating hormones are important both in healthy tissue development and in cancer risk. These hormones are important not only for physiologic development and reproduction, but also for modulating the immune system.
One of the critical modulators of sex differences in cancer etiology could be evolved sex differences in the immune system, a hypothesis that could be tested by studying sex differences in cancer risk across species. Notably, there are sex differences in immune function across species (8). In placental mammals, an additional significant driver of sex differences is the evolution of the placenta (9). The observed differences may reflect adaptations associated with the female immune system specialization required to accommodate pregnancy, tolerating an antigenically differing fetus while maintaining immune competence with respect to infectious disease and parasites (9). The evolution of the placenta and pregnancy likely resulted in significant sex differences in selection pressure based on morphology, physiology, and immune function that could be affecting differences in cancer between the sexes. The sex differences may be facilitated by the placenta, and the associated expansion of fetal microchimerism with additional pregnancies may alter female immune tolerance (10). More specifically, the very evolution of an invasive organ could also have allowed every cell in our body to have a program that facilitates metastasis. However, while every cell has the “program” for developing the placenta (angiogenesis, extracellular matrix breakdown, immune evasion, etc.), the evolution of genetic machinery for building a placenta does not in itself explain sex differences in cancer incidence and mortality. Gene expression on the X chromosome could be a potential mechanism; in humans, the X chromosome contains at least 107 immune-related genes, of which 55 escape inactivation (9).
There Is Complexity in Incorporating Sex as a Biological Variable
As a consequence of all the factors driving cancer and sex differences, it is not trivial to incorporate sex as a biological variable into these studies. The 2016 NIH policy makes it clear that sex differences are important and must be included in NIH-funded research, but the policy provides limited guidance on the best means to incorporate sex as a biological variable, and there is no consensus on best practices. This is because there are significant conceptual and practical challenges to incorporating sex as a biological variable in cancer research. As a community, we can work together on addressing the following challenges for the field:
(i) Sex is more than a binary. Sex chromosomes are not sex specific, and gonadal hormones are not sex limited. Sex presentation is not 100% concordant with circulating hormones or chromosome complement. At a minimum, we can first take into account reported sex, and compare the germline sex chromosome complement of the patient with the reported sex/gender.
(ii) The sex chromosome complement of the patient may not match the sex chromosome complement of the tumor. For example, prostate tumors sometimes have increased copy numbers of chromosome X, while in multiple myeloma, cancers in males lose the Y chromosome and cancers in genetic females lose one of the two X chromosomes. This can be addressed by independently assessing the sex chromosome complement of the sample.
(iii) Circulating hormone levels are not sex specific. Androgens do not occur only in males, nor estrogens only in females. In addition, there are likely age-by-sex interactions, including different hormonal complements depending on whether people are undergoing menstrual cycles, whether they are pregnant, and so on. One way to account for this is to explicitly measure circulating hormone levels.
(iv) Incorporating sex as a biological variable introduces study design challenges. It may be difficult to obtain sample sizes with sufficient statistical power in each sex, and so new approaches are needed to account for the differences in sample size. Furthermore, because sex is not a strict binary, study design needs to consider which axis of variation is going to be tested (e.g., sex chromosome gene expression, sex chromosome complement, gonadal hormone levels, environment, etc.).
(v) Most genomics software does not inherently incorporate sex differences or account for the unique features of the sex chromosomes. Commonly used software for alignment does not take into account the sex chromosome complement. New tools are being developed that can be incorporated into traditional pipelines for this (11). In addition, software for variant calling is not inherently aware of whether a chromosome is haploid or diploid (e.g., the nonpseudoautosomal regions of the X and Y in males). This can be addressed by calling variants in each region in males and females separately.
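Points (ii) and (v) can be made concrete with a small sketch. The Python code below is a minimal illustration under stated assumptions, not the tool cited in (11) and not a validated method: it assumes mean read depths per chromosome are already available, that autosomes are diploid (which may not hold in aneuploid tumors), and that Y-linked depth is reliable, which in practice requires care because of repetitive and X-homologous sequence. The function names and the example depths are hypothetical.

```python
def estimate_copies(chrom_depth, autosome_depth):
    """Estimate copy number from mean depth, assuming diploid autosomes."""
    return 2.0 * chrom_depth / autosome_depth

def infer_complement(x_depth, y_depth, autosome_depth):
    """Return a sex chromosome complement such as 'XX', 'XY', or 'X'."""
    x_copies = round(estimate_copies(x_depth, autosome_depth))
    y_copies = round(estimate_copies(y_depth, autosome_depth))
    return "X" * x_copies + "Y" * y_copies

def expected_ploidy(chrom, complement, in_par=False):
    """Ploidy a variant caller should assume for a region in this sample."""
    if chrom not in ("X", "Y"):
        return 2  # autosomes are assumed diploid
    if chrom == "Y" and "Y" not in complement:
        return 0  # no Y chromosome present, nothing to call
    if in_par:
        # Pseudoautosomal regions are shared by X and Y, so every
        # sex chromosome in the sample contributes one copy.
        return len(complement)
    return complement.count(chrom)

# Hypothetical depths: germline sample and matched tumor from one patient.
germline = infer_complement(x_depth=15.5, y_depth=14.0, autosome_depth=30.0)
tumor = infer_complement(x_depth=15.0, y_depth=0.5, autosome_depth=30.0)

print("germline complement:", germline)
print("tumor complement:   ", tumor)
print("tumor matches germline:", tumor == germline)
print("ploidy for non-PAR X in germline:", expected_ploidy("X", germline))
```

In this invented example the germline sample is inferred as XY while the tumor has lost the Y chromosome, so the sample-level complement, not the reported sex, determines the ploidy passed to the variant caller. Rounding to whole copies discards mosaic loss; keeping the fractional estimate from `estimate_copies` would be one way to retain it.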
Over and above the biological and technical challenges, there is social controversy about the existence and application of findings that highlight sex-related distinctions in health and disease. Sex chromosome complement, hormone level, and environmental–social factors can have independent, confounding, or compounding effects. As such, differences in sex are multifaceted, and do not justify social or medical discrimination based on gender, which is an imperfect proxy for all of these sex differences combined. For example, it would be inappropriate to deny treatment based on gender, because gender does not capture the genetic, hormonal, and environmental factors that were studied to develop the treatment. In the same way, as a society, we do not recommend a double mastectomy for all people with breasts, but rather only for those with genetic susceptibility. In addition, intensive screening for liver cancer is not done exclusively for males, despite the 4:1 incidence, but is targeted at hepatitis virus carriers. Similarly, going forward it is critical to be nuanced in our treatment of the multiple dimensions of sex differences.
Considering Sex as a Biological Variable Is Good Science
Not all of the necessary measurements of sex differences are available within currently generated public data. Going forward, in the collection, generation, and analysis of large datasets, these broad measurements should be incorporated into study design. A recent study suggests progress is being made, but slowly, and with uncertainty regarding support (12). In fact, a recent evaluation of a tier 1 research university found that since 2016, only 2% of new proposals on topics with evidence of sex/gender effects proposed to consider sex as a biological variable (13). Given that we now have complete consensus about the existence of sex differences in cancer incidence, treatment, and survival, it is poor science not to consider sex as a biological variable in molecular cancer research going forward. As demonstrated with glioblastoma and HCC, there are consequences to not accounting for sex differences, both in missed opportunities for mechanistic discovery and potentially in the delivery of ineffective interventions. Ignorance is not bliss.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: M.A. Wilson, K.H. Buetow
Development of methodology: M.A. Wilson, K.H. Buetow
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): M.A. Wilson
Writing, review, and/or revision of the manuscript: M.A. Wilson, K.H. Buetow
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): M.A. Wilson, K.H. Buetow
Study supervision: M.A. Wilson, K.H. Buetow