Cancer-related genes are under intense evolutionary pressure. In this study, we conjecture that X-linked tumor suppressor genes (TSGs) are not protected by Knudson's two-hit mechanism and are therefore subject to negative selection. Accordingly, nearly all mammalian species exhibited lower TSG-to-noncancer gene ratios on their X chromosomes compared with nonmammalian species. Synteny analysis revealed that mammalian X-linked TSGs were depleted shortly after the emergence of the XY sex-determination system. A phylogeny-based model unveiled a higher X chromosome-to-autosome relocation flux for human TSGs. This was verified in other mammals by assessing the concordance/discordance of chromosomal locations of mammalian TSGs and their orthologs in Xenopus tropicalis. In humans, X-linked TSGs are younger or larger in size. Consistently, pan-cancer analysis revealed more frequent nonsynonymous somatic mutations of X-linked TSGs. These findings suggest that relocation of TSGs out of the X chromosome could confer a survival advantage by facilitating evasion of single-hit inactivation.

Significance:

This work unveils extensive trafficking of TSGs from the X chromosome to autosomes during evolution, thus identifying X-linked TSGs as a genetic Achilles' heel in tumor suppression.

Sex is determined by karyotype in major vertebrate groups. To convey sex-related developmental signals, sex chromosomes have undergone substantial phylogenetic and structural differentiation since emerging from the ancestral autosomes ∼300 million years ago in sauropsids and ∼165 to 150 million years ago in prototherian and therian mammals. Consequently, a prominent disparity of sex chromosomes from the autosomes was observed in many aspects of genome biology, such as organization and gene content (1). In mammals, despite the existence of two copies of X chromosomes in each female somatic cell, both males and females carry only a single active copy, as the other X chromosome in females is transcriptionally silenced – a mechanism known as X-inactivation to achieve the equalization of the X-linked gene dosage between males and females. A specific enrichment for genes related to brain function, muscle function, sex, and reproduction, was observed in the human X chromosome (2). Besides, a biased gene traffic driven by origination of new genes through duplication of existing genes occurred between sex chromosomes and autosomes during evolution (3). Nevertheless, there is still controversy over the precise driving forces for the interchromosomal gene duplication and the underlying functional significance.

Neoplastic diseases, inherent in all metazoan life, occur as a result of abnormal outgrowth of “selfish” cells in a supposedly altruistic multicellular community. Cancer that disrupts the multicellular harmony and jeopardizes the organismal survival and reproductive fitness thus constitutes a significant selective pressure on metazoan genome evolution. Aberrant activation of oncogenes and inactivation of tumor suppressor genes (TSG) caused by nonsynonymous somatic mutations, copy-number alterations or structural variation are fundamental to tumorigenesis. Multicellular organisms have thus evolved a diverse set of strategies, such as antagonism of oncogenes by TSGs (with their gene number in an ∼1-to-10 ratio), multiple circuitries of cell death, and telomere shortening at cell division, to limit cancer risk (4). Besides, natural selection has given rise to genomic configurations that favor tumor suppression. In this connection, we reported on a nonrandom genomic distribution of oncogenes and TSGs in species from Drosophila through mouse to human, in which selective pressure drove TSGs to move toward oncogenes, thereby attenuating the effects of somatic copy-number alterations (5). Moreover, we unveiled the progressive expansion in size of oncogenes for suppressing oncogenic somatic amplification (6) and the increased promoter CpG content of TSGs for achieving high expression in normal tissues and resisting downregulation in cancer (7). Nevertheless, it still remains unclear if neoplastic diseases have left other evolutionary footprints in our genome.

Knudson's two-hit mechanism stipulates that a genetic hit (e.g., a hypomorphic or amorphic mutation) to a single copy of a TSG does not lead to tumorigenesis, as the loss is compensated by the other functional copy on the paired autosome (8). The two-hit mechanism thus serves as a fail-safe that protects TSGs from inadvertent inactivation, except for haploinsufficient TSGs (e.g., CDKN1B, TP53, DMP1, NF1, and PTEN) that require both copies to execute the tumor-suppressing function (9) and TSGs that are susceptible to dominant negative mutations. Nevertheless, we conjectured that Knudson's model is also not applicable to X-linked TSGs, as single-hit inactivation would become dominant because of hemizygosity in males and random X-inactivation in females. As cancer-related genes are known to be under intense evolutionary pressure (10), we hypothesized that X-linked TSGs would have been subject to strong negative selection during evolution so as to minimize cancer risk driven by single-hit inactivation (Fig. 1A).

Figure 1.

Depletion of TSG content from mammalian X-chromosomes and TSG synteny in amniotes’ chromosomes. A, Proposed mechanism of survival advantage at species level underlying the biased duplication of TSGs from the X chromosome to the autosomes. We hypothesized that X-linked TSGs are not protected by Knudson's two-hit mechanism because of hemizygosity in males and random X-inactivation in females and are therefore prone to inactivation by a single-hit mutation, leading to cancer formation that undermines reproductive fitness. Xi, inactivated X chromosome. Natural selection is therefore expected to remove X-linked TSGs. B, The TSG-to-noncancer gene ratios of all chromosomes (red circle for X chromosomes and blue dot for autosomes) across 21 therians were calculated. For human, a total of 981 TSGs as curated by TSGene 2.0 and 18,184 well-defined noncancer genes with HGNC symbols were included in the analysis. For other species, only orthologous genes identified through BioMart with official gene names were considered. ***, P < 0.001, significantly different between groups by Wilcoxon matched-pairs signed rank test. Four autosomes of very large size from Monodelphis domestica (opossum) have been omitted from the scatter plots to enhance data visualization. C, The orthologous genes in other species corresponding to human TSGs were identified through BioMart. Genes were ranked according to the number of times the gene was located on the X or Z chromosome. The chromosomal location (i.e., X/Z chromosome or autosome) of human TSGs or their orthologs in 21 therian and 7 avian lineages (top) is depicted with those located on the X or Z chromosome highlighted (middle). The phylogenetic tree was calibrated to the species’ origin time (left). The odds ratio of TSGs or their orthologs being located on the X/Z chromosome relative to the autosomes in each species is shown (right). *, P < 0.05; **, P < 0.01.


Gene information and definition of TSGs and X-conserved region-/X-added region–linked genes

All information (including genome location, sequence, paralogs, and orthologs) on X-linked and autosomal genes from the 21 mammalian species with completely sequenced genomes was downloaded from the Ensembl database (release 96). A noncancer gene was defined as a gene that is neither curated by TSGene 2.0 (11) nor listed as an oncogene by Vogelstein and colleagues (12). Gene size was defined as the distance between the gene start site and the gene end site. In nonhuman species, a gene was regarded as a TSG if its human ortholog was a TSG; otherwise, the gene was classified as a noncancer gene. Ortholog pairs with a confidence score less than 1 were not considered. An X-linked gene conserved in eutherians whose ortholog was located on the X chromosome of at least one marsupial (opossum or Tasmanian devil) was considered an X-conserved region (XCR)-linked gene. Otherwise, the gene was deemed to be X-added region (XAR)-linked.

Inferring the history of gene duplication and calculation of relocation flux in human

The coding sequences (CDS) of TSGs and their paralogs designated as the exclusive UniProt ID and RefSeq mRNA ID were downloaded from the Ensembl database. For genes with multiple transcript variants, the largest transcript was selected. Gene age classes were assigned through the combination of homolog clustering and phylogeny inference as described by Yin and colleagues (13). The CDS of TSGs and their noncancer paralogs on the autosomes and the X chromosome were sorted to build the phylogenetic tree through the standard PhyML bootstrap method using the ETE Toolkit with default parameters (14). The ancestral gene of a TSG and its noncancer paralogs was identified by tracing the older gene sharing the most similar CDS in the phylogenetic tree. Duplication of the TSG and its noncancer paralogs to the autosome or the X chromosome from an autosomal or X-linked ancestral gene was then determined. If the ancestral gene and the progeny gene were both located on the same chromosome, the progeny gene was classified as “immobile”. If the ancestral origin could not be inferred because gene age information or a paralog was unavailable, or because there was no gene age difference, the gene was designated as a “gene with indeterminable chromosomal origin”. The relocation flux was calculated as the proportion of autosomal genes duplicated from the X chromosome divided by the proportion of X-linked genes duplicated from autosomes. The statistical significance of the difference in the relocation flux of TSGs versus their noncancer paralogs was then evaluated using a permutation test by randomly shuffling the gene classification labels (i.e., TSG and noncancer paralog) one million times. The same workflow was applied for XAR/XCR-linked TSGs versus their noncancer paralogs.
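The flux calculation and label-shuffling test described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the `'X'`/`'A'` encoding, and the toy gene list are hypothetical, the proportions are taken over genes sharing the same ancestral chromosome (our reading of the counts reported in the Results), and labels are shuffled globally rather than within trees with gene age retained.

```python
import random

def relocation_flux(records):
    """records: (ancestor_chrom, gene_chrom) pairs, each 'X' or 'A' (assumed encoding)."""
    from_x = [gene for anc, gene in records if anc == "X"]
    from_a = [gene for anc, gene in records if anc == "A"]
    if not from_x or not from_a:
        return None
    out_of_x = from_x.count("A") / len(from_x)   # X-linked ancestor -> autosomal gene
    into_x = from_a.count("X") / len(from_a)     # autosomal ancestor -> X-linked gene
    return out_of_x / into_x if into_x > 0 else None

def permutation_p(genes, n_perm=1000, seed=0):
    """genes: (label, ancestor_chrom, gene_chrom); label is 'TSG' or 'paralog'."""
    pairs = [(anc, gene) for _, anc, gene in genes]
    labels = [label for label, _, _ in genes]

    def fold_change(current):
        tsg = relocation_flux([p for lab, p in zip(current, pairs) if lab == "TSG"])
        par = relocation_flux([p for lab, p in zip(current, pairs) if lab == "paralog"])
        return None if tsg is None or par is None else tsg / par

    observed = fold_change(labels)
    if observed is None:
        raise ValueError("relocation flux is undefined for the observed labels")
    rng = random.Random(seed)
    hits = valid = 0
    for _ in range(n_perm):
        rng.shuffle(labels)                      # simplified: global label shuffle
        shuffled = fold_change(labels)
        if shuffled is None:                     # skip shuffles with an undefined flux
            continue
        valid += 1
        hits += shuffled >= observed
    return observed, (hits + 1) / (valid + 1)    # one-sided P with add-one correction

# Synthetic toy data for illustration only (not from the study).
toy = ([("TSG", "X", "A")] * 8 + [("TSG", "X", "X")] * 2
       + [("TSG", "A", "X")] * 2 + [("TSG", "A", "A")] * 38
       + [("paralog", "X", "A")] * 10 + [("paralog", "X", "X")] * 10
       + [("paralog", "A", "X")] * 8 + [("paralog", "A", "A")] * 72)
observed, p_value = permutation_p(toy, n_perm=500)
print(f"fold change = {observed:.2f}, permutation P = {p_value:.4f}")
```

A stratified shuffle (within each tree and age class) would be needed to match the procedure described for Fig. 2D; the global shuffle here only conveys the general idea.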

An orthogonal model for calculation of relocation flux of TSGs and their noncancer paralogs using decision tree

The decision tree classifier with the method set as “deviance splitting criteria” was applied to analyze the relationship between the features (i.e., chromosomal location of Xenopus tropicalis orthologs of mammalian noncancer genes) and the outcomes [i.e., chromosomal status (autosomal vs. X-linked) of the corresponding mammalian genes]. The orthologs of mammalian Y-linked and mitochondrial genes in X. tropicalis were omitted and only 1-to-1 X. tropicalis orthologs of mammalian noncancer genes (excluding the noncancer paralogs of TSGs or oncogenes) were used to train the decision tree model. The nodes of the decision tree split the X. tropicalis genome into two regions for each mammalian species – the orthologous regions of the X chromosome and autosomes. We then applied the trained decision tree to predict the expected chromosome status of mammalian TSGs versus their noncancer paralogs based on the locations of the corresponding X. tropicalis orthologs. By comparing the expected and observed chromosome status of TSGs versus their noncancer paralogs, the direction of interchromosomal gene duplication was inferred. The statistical significance of the difference in the relocation flux of TSGs versus their noncancer paralogs was then determined as mentioned above.
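To make the expected-versus-observed logic concrete, the following pure-Python sketch grows a small classification tree with a deviance splitting criterion on a single hypothetical frog chromosome and then compares predicted and actual chromosomal status. It is a simplified stand-in for the “tree” R package used in the study: the function names, the `min_size` parameter, the one-dimensional position feature, and the toy coordinates are all assumptions made for illustration.

```python
import math

def deviance(labels):
    """Multinomial deviance: -2 * sum over classes of n_k * log(n_k / n)."""
    n = len(labels)
    return -2.0 * sum(labels.count(c) * math.log(labels.count(c) / n)
                      for c in set(labels))

def grow(points, min_size=5):
    """points: (frog_position, mammalian_status) pairs on one frog chromosome."""
    points = sorted(points)
    labels = [status for _, status in points]
    parent = deviance(labels)
    best_gain, best_i = 1e-9, None
    for i in range(min_size, len(points) - min_size + 1):
        if points[i - 1][0] == points[i][0]:
            continue                              # cannot cut between equal positions
        gain = parent - deviance(labels[:i]) - deviance(labels[i:])
        if gain > best_gain:
            best_gain, best_i = gain, i
    if best_i is None:                            # leaf: return the majority status
        return max(sorted(set(labels)), key=labels.count)
    cut = (points[best_i - 1][0] + points[best_i][0]) / 2
    return (cut, grow(points[:best_i], min_size), grow(points[best_i:], min_size))

def predict(tree, position):
    """Walk the nested (cut, left, right) tuples down to a leaf label."""
    while isinstance(tree, tuple):
        cut, left, right = tree
        tree = left if position < cut else right
    return tree

# Toy training set (synthetic): noncancer genes whose frog orthologs at
# positions 10-19 correspond to X-linked mammalian genes, the rest autosomal.
training = [(pos, "X" if 10 <= pos < 20 else "A") for pos in range(30)]
tree = grow(training)

# A hypothetical TSG: frog ortholog at position 15, observed on an autosome.
expected, observed = predict(tree, 15), "A"
print(f"expected {expected}, observed {observed}: inferred move {expected}->{observed}")
```

Genes whose expected status is X but whose observed status is autosomal would be counted as X-to-autosome relocations, and vice versa.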

Calculation of the mutation frequencies and significance of TSGs

We obtained the somatic mutation data of 33 types of cancer from The Cancer Genome Atlas (TCGA), which was accessed through the Genomic Data Commons Data Portal (https://portal.gdc.cancer.gov/). Only four types of nonsynonymous somatic mutations (missense mutations, frameshift small deletions, frameshift small insertions, and nonsense mutations) were chosen to calculate the mutation frequency of TSGs per gene. We compared the mutation frequencies of TSGs located on the X chromosome with those on the 22 autosomes by Wilcoxon rank-sum test in each cancer type. We then used a multiple linear regression model to evaluate the effect of gene chromosomal status (i.e., X chromosome or autosome) on mutation frequency by adjusting for other confounding covariates, including gene expression level across 91 cell lines curated by the Cancer Cell Line Encyclopedia (CCLE), log10(length of CDS), compartment openness inferred from Hi-C sequencing, and local GC content. The data of the above confounding covariates were obtained from the supplementary data of the paper by Lawrence and colleagues (15). We also used MutSigCV (15) to quantify the tissue-specific mutation significance [–log10(P value)] of nonsilent mutations in a gene, with the background mutation rate estimated from silent mutations and the abovementioned confounding covariates taken into account. Only TSGs with the tissue-specific mutation significance < 2.321 (P value < 0.2) in a given cancer type were deemed to be tissue-selective TSGs.
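The per-gene mutation frequency could be tallied along the following lines. This is a hedged sketch: it assumes MAF-style records with the standard `Hugo_Symbol`, `Variant_Classification`, and `Tumor_Sample_Barcode` columns, and it assumes that frequency means the fraction of samples carrying at least one qualifying mutation, which the text does not state explicitly. The gene and sample names are placeholders.

```python
from collections import defaultdict

# The four nonsynonymous classes named in the text, in MAF vocabulary.
NONSYNONYMOUS = {
    "Missense_Mutation",
    "Nonsense_Mutation",
    "Frame_Shift_Del",
    "Frame_Shift_Ins",
}

def mutation_frequency(maf_rows, n_samples):
    """Return {gene: fraction of samples with >= 1 nonsynonymous mutation}."""
    mutated = defaultdict(set)
    for row in maf_rows:
        if row["Variant_Classification"] in NONSYNONYMOUS:
            mutated[row["Hugo_Symbol"]].add(row["Tumor_Sample_Barcode"])
    return {gene: len(samples) / n_samples for gene, samples in mutated.items()}

# Placeholder records for illustration only.
rows = [
    {"Hugo_Symbol": "GENE_A", "Variant_Classification": "Missense_Mutation",
     "Tumor_Sample_Barcode": "S1"},
    {"Hugo_Symbol": "GENE_A", "Variant_Classification": "Nonsense_Mutation",
     "Tumor_Sample_Barcode": "S1"},   # same sample, counted once
    {"Hugo_Symbol": "GENE_A", "Variant_Classification": "Silent",
     "Tumor_Sample_Barcode": "S2"},   # synonymous, ignored
    {"Hugo_Symbol": "GENE_B", "Variant_Classification": "Frame_Shift_Del",
     "Tumor_Sample_Barcode": "S3"},
]
freq = mutation_frequency(rows, n_samples=4)
print(freq)
```

The resulting frequencies for X-linked and autosomal TSGs would then be compared per cancer type with a rank-sum test.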

Statistics

All data were analyzed using R or Python software unless otherwise specified. The decision tree classifier was built using the “tree” R package. The odds ratio was calculated using the “oddsratio” function in the “fmsb” R package. The meta-analysis was performed using the “meta” R package. A P value less than 0.05 was considered statistically significant.

Data availability

Data supporting the findings of this study are included in the manuscript. Detailed information and unique materials of the study are available from the leading corresponding author on reasonable request. Code and datasets supporting the findings of this article are available for download at https://doi.org/10.5281/zenodo.4035954.

Depletion of TSGs from mammalian X chromosomes

To test our hypothesis, we first determined the densities of TSGs and noncancer genes in the X chromosomes and the autosomes across 21 therian mammals, including human. A total of 981 TSGs as curated by TSGene 2.0 (11) and 18,184 well-defined noncancer genes with HGNC symbols in human (Supplementary Table S1 for the gene list; Supplementary Fig. S1 for their genomic distribution) and official gene names in the other 20 species were included in the analysis. TSGene 2.0 curated TSGs from over 9,000 articles. Using pan-cancer data of TCGA, these curated genes were shown to exhibit expression and mutation patterns consistent with TSGs (11). This list of TSGs has been used in our previous studies, which demonstrated that these TSGs have been subject to stronger evolutionary pressure than noncancer genes across multiple species. In this respect, the TSGs curated by TSGene 2.0 were found to exhibit distinct genomic features, including TSG-oncogene clustering, more extensive protein–protein interaction, and higher promoter CpG content (5–7). Monotremes (prototherian mammals) were excluded because they carry multiple pairs of sex chromosomes. We observed significant decreases in gene densities for both TSGs and noncancer genes in the X chromosomes as compared with the autosomes across mammalian species (P = 9.537e–07 for TSGs; P = 1.907e–06 for noncancer genes; Wilcoxon matched-pairs signed rank test; Supplementary Fig. S2A and S2B; Supplementary Table S2). We next calculated the TSG-to-noncancer gene ratios and observed a marked reduction of the proportion of TSGs on the X chromosomes as compared with the autosomes (P = 9.537e–07; Wilcoxon matched-pairs signed rank test; Fig. 1B; Supplementary Table S2). Consistently, a one-sample proportion test revealed a decrease of TSGs relative to noncancer genes on the X chromosomes in most of the mammalian species (P < 0.05 in 18 of 21 species; Supplementary Table S3). Thus, our results showed a depletion of TSGs from the mammalian X chromosomes.
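As an aid to interpreting the reported P values, the sketch below computes an exact two-sided P for a matched-pairs signed rank test, assuming this is the paired test named in the Fig. 1B legend and that there are no ties or zero differences. With 21 species pairs that all differ in the same direction, the smallest attainable value is 2/2^21, which matches the reported 9.537e–07. The function name and the synthetic differences are illustrative only.

```python
def signed_rank_exact_p(diffs):
    """Exact two-sided P of the signed rank statistic (no ties or zeros assumed)."""
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    # W+ = sum of the ranks of the positive differences.
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)
    max_sum = n * (n + 1) // 2
    # counts[s] = number of subsets of ranks 1..n whose ranks sum to s.
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(max_sum, r - 1, -1):
            counts[s] += counts[s - r]
    total = 2 ** n
    lower = sum(counts[: w_plus + 1]) / total     # P(W+ <= observed)
    upper = sum(counts[w_plus:]) / total          # P(W+ >= observed)
    return min(1.0, 2 * min(lower, upper))

# Synthetic example: X-minus-autosome differences, all negative, in 21 species.
diffs = [-(i + 1) * 0.001 for i in range(21)]
print(f"{signed_rank_exact_p(diffs):.3e}")
```

Because this floor depends only on the number of pairs and the consistency of the sign, it says little about effect size, which is relevant to the pseudoreplication caveat raised later for Fig. 1B.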

The human X chromosome consists of two regions – XCR (shared by X chromosomes of all viviparous mammals; more than 300 million years old) and XAR (shared by X chromosomes of placental mammals but autosomal in marsupials; added to the X chromosome ∼100–148 million years ago; ref. 16). To measure the extent of TSG depletion in these two regions, the proportions of XCR/XAR-linked TSGs to XCR-/XAR-linked noncancer genes were determined. We inferred whether an X-linked gene is linked to XCR or XAR by checking the concordance of the location of its ortholog between marsupials (opossum and Tasmanian devil) and eutherians. Results showed that TSGs were significantly depleted in both XCR and XAR as compared with the autosomes, but the extent of depletion was more drastic in XAR (Supplementary Fig. S3).

Conserved synteny of TSGs within therian and avian species

To depict the depletion of X-linked TSGs over phylogeny, we built a phylogenetic tree by incorporating 21 therian (XY sex-determination system) and 7 avian [ZW sex-determination system with homogametic ZZ and heterogametic ZW corresponding to males and females, respectively; the Z chromosome is larger (with over 500 genes) than the W chromosome (probably containing tens of genes)] species, in which the tree is calibrated to species divergence time as proposed by Hedges and colleagues (Fig. 1C, left; ref. 17). Then, chromosomal locations of all orthologs of human TSGs in these species were curated and ranked (Fig. 1C, top; Supplementary Table S4). We observed a largely conserved but independent chromosomal distribution of X-linked TSGs within therians and Z-linked TSGs within avians (Fig. 1C, middle). One possibility is that TSGs started to deplete in the newly emerged X chromosome in the common ancestor of therian mammals and such depletion was fixed to a great extent within a relatively short period of time before the ancestor branched out to different therians. Consistent with this interpretation, Monodelphis domestica, a marsupial species that diverged from eutherians ∼159 million years ago (18), displayed the most divergent pattern of TSGs’ chromosomal locations within therians. Notably, in contrast to the reduced likelihood of TSGs being on the X chromosomes in most of the therian mammals, avians do not show such a trend (Fig. 1C, right). Alternatively, there could be naturally emerging variation in the density of TSGs across the genome at the time the XY sex-determination system evolved in the common ancestor, and the chromosome with the lowest TSG density conferred the best selective advantage when chosen as the X chromosome.

Inferring chromosomal direction of duplication of TSGs from ancestral genes in human

Because synteny on the mammalian X chromosome is extremely well conserved, the results in Fig. 1B do not represent independent data from many mammalian species, but rather the same pattern pseudoreplicated, which artificially lowers the P values. We therefore devised a model to infer the history of gene duplication and assess whether the depletion of X-linked TSGs was caused by an imbalanced gene duplication to the X chromosome and autosomes from ancestral genes in human. This would test whether TSG depletion occurred after the emergence of the X chromosome. Because no existing algorithm can quantitatively describe such interchromosomal gene duplication, we built a model based on the premise that a new TSG could have originated from the duplication of an ancestral gene located on the same or a different chromosome. Phylogenetic trees using the CDS homology of TSGs and their corresponding paralogs retrieved from Ensembl were constructed, followed by assigning an age class as proposed by Yin and colleagues (13) to each gene. We were able to validate the phylogenetic clustering of two sets of Drosophila genes whose ancestral relationship had been established (nsr and Scid1/2/3/4 derived from qkr58E-3 and sw, respectively; Supplementary Fig. S4; ref. 19). The chromosomal location of the immediate ancestral gene was then compared with that of the TSG to infer interchromosomal duplication (Fig. 2A). This analysis was intended to quantitate how likely human TSGs, as compared with their corresponding noncancer paralogs, were to be duplicated to an autosome or the X chromosome from an autosomal or X-linked ancestral gene.

Figure 2.

Predicting the chromosomal direction of duplication of TSGs or their noncancer paralogs from ancestral genes in human. A, Proposed scheme of inferring the chromosomal direction of gene duplication. Paralogs of TSGs were first retrieved from Ensembl followed by phylogenetic tree clustering and gene age assignment. The chromosomal status of the immediate older paralog in the phylogenetic tree of a given TSG or its paralog was then used to infer the history of gene duplication. This analysis quantitates how often TSGs, as compared with their corresponding noncancer paralog(s), were duplicated to an autosome or X chromosome from an ancestral gene originally located on an autosome or X chromosome. B, Duplication status of TSGs and their noncancer paralogs was inferred on the basis of the concordance/discordance of the chromosomal location of their immediate ancestral genes. “Immobile genes” refer to TSGs or their noncancer paralogs having the same chromosomal origin (autosomes or X chromosome) as their ancestral genes. C, The “outflow” of TSGs and their noncancer paralogs was calculated, with a value > 1 indicating genes were more likely to be duplicated to autosomes from X-linked ancestral genes and less likely to be duplicated to the X chromosome from autosomal ancestral genes. D, Permutation test by shuffling gene labels (i.e., TSG vs. noncancer paralog) a million times without disrupting the tree structure by retaining the gene age information at each position confirmed the significant difference in the relocation flux between TSGs and their noncancer paralogs (P = 0.0264). E, Age classes, ranging from archaea/bacteria (age class 26) to human (age class 1) spanning ∼4,000 million years, were applied to categorize X-linked and autosomal TSGs. F, Physical sizes of young (mammal-specific; age class < 14) and ancient (age class ≥ 14) TSGs on the autosomes and the X chromosome were compared. *, P < 0.05; **, P < 0.01; significantly different between groups as analyzed by Mann-Whitney test or Wilcoxon matched-pairs signed rank test where appropriate. N.S., nonsignificant.


On the basis of our model, we inferred that 11 of 14 (78.6%) human TSGs and 33 of 63 (52.4%) noncancer paralogs that originated from ancestral genes located on the X chromosome had duplicated to the autosomes, whereas only 9 of 329 (2.7%) human TSGs and 46 of 1,089 (4.2%) noncancer paralogs that arose from ancestral genes located on the autosomes had duplicated to the X chromosome (Fig. 2B). After discarding genes with indeterminable origin, the duplication ratios of TSGs and their noncancer paralogs between the X chromosome and the autosomes were calculated (Fig. 2C), with a “net outflow” > 1 indicating a preferential X-to-autosome relocation flux. While both human TSGs and their noncancer paralogs showed excessive duplication from ancestral genes out of the X chromosome, TSGs showed a 2.32-fold higher relocation flux than their noncancer paralogs. This difference was confirmed by a permutation test in which gene classifications (i.e., TSG vs. noncancer paralog) were shuffled within trees across multiple trees a million times, without disrupting the tree structure, by retaining gene age information at each position (P = 0.0264; Fig. 2D). We applied the same procedures to analyze human oncogenes and found that they did not exhibit a significant difference in the relocation flux as compared with their noncancer paralogs (P = 0.4891 by permutation test). The X-linked oncogenes versus their autosomal counterparts also did not show any difference in their propensity for amplification (Supplementary Fig. S5). The results indicated that human TSGs, as compared with their noncancer paralogs, were more likely to be duplicated to an autosome from ancestral genes originally located on the X chromosome and less likely to be duplicated to the X chromosome from ancestral genes originally located on autosomes. We also inferred interchromosomal duplication of TSGs versus their noncancer paralogs between XCR/XAR and autosomes. Concordant with the depletion of TSGs in both regions, preferential XCR-/XAR-to-autosome relocation (log2 ratio = 1.2268 and 1.0424 for XCR and XAR, respectively) was observed for TSGs as compared with their noncancer paralogs, but only the TSG flow between XCR and autosomes showed statistical significance in the permutation test (P = 0.0498 and 0.1780 for XCR and XAR, respectively; Supplementary Fig. S6).
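The 2.32-fold figure can be reproduced from the counts given in this paragraph, assuming the relocation flux is the X-to-autosome proportion divided by the autosome-to-X proportion as defined in the Methods. The variable names below are ours.

```python
from fractions import Fraction

# Counts quoted in the text (Fig. 2B).
tsg_out = Fraction(11, 14)      # TSGs with X-linked ancestors now on autosomes
tsg_in = Fraction(9, 329)       # TSGs with autosomal ancestors now on the X
par_out = Fraction(33, 63)      # noncancer paralogs, X -> autosome
par_in = Fraction(46, 1089)     # noncancer paralogs, autosome -> X

tsg_flux = tsg_out / tsg_in     # relocation flux of TSGs
par_flux = par_out / par_in     # relocation flux of noncancer paralogs
fold = tsg_flux / par_flux

print(f"TSG flux {float(tsg_flux):.2f}, "
      f"paralog flux {float(par_flux):.2f}, "
      f"fold {float(fold):.2f}")
# prints: TSG flux 28.72, paralog flux 12.40, fold 2.32
```

Both fluxes exceed 1, in line with the statement that TSGs and their noncancer paralogs alike show net duplication out of the X chromosome.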

TSGs on the human X chromosome are in the pseudoautosomal regions, younger or larger in size

On the basis of our conjecture, the preferential duplication of TSGs out of the X chromosome from an X-linked ancestral gene should confer a survival advantage. It is therefore unclear why some TSGs are still on the X chromosome. To assess whether TSGs remaining on the X chromosome play a role in sex determination, we determined whether these TSGs showed sex-biased expression as curated by the SAGD database (20). A gene was defined as a sex-associated gene if it showed sex-biased expression in two or more tissues from the database. We found that none of the X-linked TSGs was a sex-associated gene, suggesting that TSGs remaining on the X chromosome do not play a key role in sex determination.

Genes in the pseudoautosomal regions (PAR1 and PAR2) of the X chromosome are known to have homologs on the Y chromosome. These PAR-linked genes behave like genes on an autosome and recombine during meiosis (21). In this respect, we found that the proportion of X-linked genes with a homolog on the Y chromosome was higher for TSGs (4 of 27; 14.8%) than noncancer genes (26 of 799; 3.3%; P = 0.0016; χ2 test). This finding indicates that four TSGs on the X chromosome might behave like autosomal genes.
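The reported P value is consistent with an uncorrected Pearson χ2 test on the 2×2 table built from these counts. The sketch below assumes no continuity correction was applied, which the text does not specify; the function name is ours.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Rows: TSGs, noncancer genes; columns: with / without a Y-linked homolog.
chi2, p = chi2_2x2(4, 27 - 4, 26, 799 - 26)
print(f"chi2 = {chi2:.3f}, P = {p:.4f}")
```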

Our previous studies demonstrated that gene origin time and gene size are two important factors restricting gene movement (5, 6). Here, we found that human X-linked TSGs are much younger than their autosomal counterparts (Fig. 2E). Age-matched gene size analysis also revealed that ancient TSGs (≥14 gene age classes) on the human X chromosome are significantly larger in size than those on the autosomes, whereas such discrepancy was not observed for young, mammal-specific TSGs (<14 gene age classes) (Fig. 2F). These findings suggest that TSGs remaining on the X chromosome might not have had sufficient time to relocate or were too large to do so.

Relocation flux of TSGs across mammalian species

The genome of X. tropicalis (Western clawed frog) exhibits long-range synteny with the human genome and has the advantage of fewer chromosomes (22). To test whether the unbalanced interchromosomal duplication of TSGs occurred in other mammals, we built another model to infer the “relocation flux” by applying a decision tree classifier to determine whether the orthologs of mammalian TSGs in X. tropicalis are more or less likely than expected to be located in the regions of the X. tropicalis genome that are orthologous to the mammalian X chromosome (Fig. 3A). To achieve this, we first prepared a gene table with a 1-to-1 matching of protein-coding noncancer genes (excluding oncogenes, TSGs, and their noncancer paralogs as well as Y-linked and mitochondrial genes) in each mammalian species and their orthologs in X. tropicalis, followed by application of a decision tree to reconstruct the orthologous regions of the X chromosome of each species in the X. tropicalis genome. We then measured the discrepancies between the expected and the actual chromosomal locations of TSGs to infer the direction of interchromosomal movement. Afterwards, the relocation flux of TSGs was calculated as illustrated in Fig. 2C and compared with that of their noncancer paralogs in each species using a permutation test. Using this approach, we observed that the relocation flux of TSGs out of the X chromosome was significantly higher than that of their noncancer paralogs (P < 0.05) in 11 of 21 mammalian species (a total of 18 of 21 species, including human, reached P < 0.1; Fig. 3B). This finding supports the occurrence of biased duplication of TSGs out of the X chromosome across mammals.

Figure 3.

Higher relocation flux of TSGs out of the X chromosome in mammals. A, Proposed scheme for inferring the direction of interchromosomal gene movement in 21 mammalian species, with human as an example. The 1-to-1 human-frog (Xenopus tropicalis) orthologous relationship of protein-coding noncancer genes (excluding oncogenes, TSGs, and their noncancer paralogs as well as Y-linked and mitochondrial genes) was retrieved from Ensembl v.103, followed by construction of a decision tree classifier, where the chromosomal status (i.e., autosome vs. X chromosome) is the binary target variable and frog gene position is the input variable. The X. tropicalis genome was then divided into two regions orthologous to the human X chromosome and autosomes, respectively, based on the nodes of the decision tree. The expected location of human TSGs and their noncancer paralogs, deduced from the chromosomal location of their orthologous genes in X. tropicalis, was then compared with their actual (observed) location to infer the direction of interchromosomal gene movement. This process was repeated in the other mammalian species. B, The “outflow” of TSGs and their noncancer paralogs was calculated based on the formula shown in Fig. 2C. The log2-fold change between the relative risk of migration out of the X chromosome of TSGs and their noncancer paralogs was calculated across primates and other mammals. The distribution of all possible fold changes is shown on the left, with the red area under the curves representing P values of the permutation test by shuffling gene labels (i.e., TSG vs. noncancer paralog) 100,000 times per species. The –log10P values are visualized as the bar chart on the right. *, P < 0.05; **, P < 0.01; significantly different between TSGs and their noncancer paralogs by permutation test. FC, fold-change; ncTSG, noncancer paralog of TSG.
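The scheme in panel A can be illustrated with a toy example. The sketch below is not the authors' implementation: it fits a small threshold-based decision tree (Gini impurity) on a single hypothetical frog chromosome, whereas the real analysis spans the whole X. tropicalis genome. All positions, labels, gene names, and function names are invented for illustration.

```python
def gini(labels):
    """Gini impurity of 0/1 labels (1 = mammalian ortholog lies on the X chromosome)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def build_tree(points, max_depth):
    """points: list of (frog_position, label).
    Returns a leaf (0 or 1) or a node (threshold, left_subtree, right_subtree)."""
    labels = [lab for _, lab in points]
    majority = 1 if 2 * sum(labels) > len(labels) else 0
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    pts = sorted(points)
    best = None
    for i in range(1, len(pts)):
        if pts[i][0] == pts[i - 1][0]:
            continue  # no threshold fits between identical positions
        left = [lab for _, lab in pts[:i]]
        right = [lab for _, lab in pts[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pts)
        if best is None or score < best[0]:
            best = (score, (pts[i - 1][0] + pts[i][0]) / 2.0, pts[:i], pts[i:])
    if best is None:
        return majority
    _, threshold, left_pts, right_pts = best
    return (threshold,
            build_tree(left_pts, max_depth - 1),
            build_tree(right_pts, max_depth - 1))

def predict(tree, position):
    """Expected chromosomal status (1 = X, 0 = autosome) for a frog gene position."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if position < threshold else right
    return tree

# Hypothetical noncancer genes: (position on one frog chromosome, ortholog on X?).
training = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0), (5.0, 1),
            (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 0), (10.0, 0)]
tree = build_tree(training, max_depth=2)

# Hypothetical TSGs: (name, frog ortholog position, observed mammalian location).
tsgs = [("TSG_a", 6.2, "autosome"), ("TSG_b", 7.1, "X"),
        ("TSG_c", 2.5, "autosome"), ("TSG_d", 9.4, "X")]
moved_out = [n for n, pos, obs in tsgs if predict(tree, pos) == 1 and obs == "autosome"]
moved_in = [n for n, pos, obs in tsgs if predict(tree, pos) == 0 and obs == "X"]
print(tree, moved_out, moved_in)
```

A gene whose frog ortholog falls in an X-orthologous region but which is observed on an autosome is counted as an inferred move out of the X chromosome, and the reverse case as a move into it.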


A higher mutation frequency of X-linked TSGs across human cancer types

To determine whether X-linked TSGs could be vulnerable points in tumor suppression because they are not protected by Knudson's two-hit mechanism, we extracted the somatic mutation status of X-linked and autosomal TSGs from 33 human cancer types included in TCGA studies. While X-linked and autosomal TSGs showed similar levels of mRNA expression across major normal human tissues (Supplementary Fig. S7), 9 of 33 cancer types showed a significantly higher rate of nonsynonymous somatic mutations (i.e., missense mutations, frameshift small deletions, frameshift small insertions, nonsense mutations) in X-linked TSGs than in their autosomal counterparts (Fig. 4A; Wilcoxon rank test). Paired analysis of X-linked and autosomal TSGs also revealed a concordant overall difference in mutation rate (Fig. 4B; P < 0.001, paired Wilcoxon rank test). We also used multivariate linear regression to estimate how different genetic features independently contribute to mutation frequency. The features considered were (1) chromosomal location (i.e., X chromosome or autosome); (2) log10(average expression level of the gene across 91 cell lines curated by the CCLE); (3) log10(length of CDS); (4) compartment openness inferred from Hi-C sequencing; and (5) local GC content (Supplementary Table S5). We confirmed that being on the X chromosome is an independent factor significantly associated with increased mutation frequency of a TSG (P = 0.0026; Table 1). The somatic mutation rate of X-linked TSGs was also confirmed to be significantly higher than that of X-linked noncancer genes (Supplementary Fig. S8A and S8B). The mutation rates of individual X-linked TSGs in different cancer types are available in Supplementary Table S6.

Figure 4.

Mutation frequencies per gene and MutSigCV-derived mutation significance of X-linked versus autosomal TSGs across TCGA-sequenced human cancer types. A, The means of nonsynonymous somatic mutation frequencies per gene of X-linked (red dots) versus autosomal (blue dots) TSGs are shown for each cancer type. Only four types of nonsynonymous somatic mutations, namely frameshift small deletions, frameshift small insertions, missense mutations, and nonsense mutations in TSGs were counted. B, Tissue-matched paired analysis of nonsynonymous somatic mutation frequencies per gene was conducted for X-linked versus autosomal TSGs. Only four types of nonsynonymous somatic mutations as described above were counted. C, Tissue-selective TSGs were defined as TSGene 2.0-curated TSGs with MutSigCV-derived P values < 0.2 in a given cancer type. Mutation significance [–log10(P value)] was then derived from MutSigCV analysis of X-linked versus autosomal tissue-selective TSGs. D, Proportion of X-linked versus autosomal TSGene 2.0-curated TSGs with MutSigCV-derived P values < 0.2 in different cancer types is shown. E, Meta-analysis with the fixed-effect model was used to compare the nonsynonymous somatic mutation frequency of X-linked TSGs versus autosomal TSGs in male cancer samples without adjusting for the hemizygosity. *, P < 0.05; **, P < 0.01; ***, P < 0.001; significantly different between groups as analyzed by Mann-Whitney U test or Wilcoxon matched-pairs signed rank test where appropriate. SMD, standardized mean difference.

Table 1.

Multivariate linear regression of nonsynonymous mutation frequencies of X-linked versus autosomal TSGs.

Variable                    Estimate    Lower limit (2.5%)   Upper limit (97.5%)   SE         t value      Pr(>|t|)
(Intercept)                 −0.041009   −0.045468            −0.036551             0.002275   −18.029008   0.000000
Chromosome X                0.002342    0.000815             0.003869              0.000779   3.006749     0.002643
HiC compartment             −0.019784   −0.033239            −0.006329             0.006865   −2.881991    0.003954
Local GC content            −0.000446   −0.006250            0.005358              0.002961   −0.150672    0.880235
log10(Expression of CCLE)   −0.000993   −0.001311            −0.000674             0.000162   −6.113895    0.000000
log10(Length of CDS)        0.008641    0.008306             0.008975              0.000171   50.638994    0.000000
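As a reading aid for Table 1, the sketch below shows how the columns relate to one another: the t value is the estimate divided by its standard error, and the 95% limits are approximately the estimate plus or minus 1.96 standard errors. The 1.96 multiplier is a normal approximation assumed here, not a value stated in the source, and small differences from the table are expected because the published estimates and standard errors are rounded.

```python
# (estimate, standard error) pairs copied from Table 1.
TABLE_1 = {
    "Chromosome X": (0.002342, 0.000779),
    "HiC compartment": (-0.019784, 0.006865),
    "Local GC content": (-0.000446, 0.002961),
}

Z_95 = 1.96  # assumed normal-approximation multiplier for a 95% interval

def summarize(estimate, se, z=Z_95):
    """Return (t value, lower limit, upper limit) recomputed from estimate and SE."""
    return estimate / se, estimate - z * se, estimate + z * se

summary = {name: summarize(est, se) for name, (est, se) in TABLE_1.items()}

for name, (t_value, lower, upper) in summary.items():
    print(f"{name}: t = {t_value:.3f}, 95% CI = ({lower:.6f}, {upper:.6f})")
```

For chromosomal location, the recomputed interval excludes zero, in line with the reported P value of 0.0026.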

As TSGs might be mutated in a tissue type–selective manner, we conducted a similar analysis restricted to TSGs that are important in each specific tissue type using MutSigCV (15). We first shortlisted tissue-selective TSGs (defined as TSGs curated by TSGene 2.0 with P values <0.2 by MutSigCV in a given cancer type) and then quantified the mutation significance [i.e., –log10(P value)] of X-linked versus autosomal TSGs. All the abovementioned genetic features that covary with the background mutation rate were also fed into MutSigCV when calculating the mutation significance. With this analysis, we found that X-linked tissue-selective TSGs were more “significantly mutated” (Fig. 4C) than their autosomal counterparts. The proportion of TSGs with MutSigCV-derived P values <0.2 was also higher on the X chromosome than on the autosomes (Fig. 4D). To further corroborate this finding, we restricted our analysis to male patients, as somatic hypermutation of the inactive X chromosome in female cancer genomes has been reported by Jäger and colleagues (23). In this subset analysis, even without adjusting for the hemizygosity of X-linked TSGs in male subjects, a meta-analysis of mutation data from 29 nongynecologic cancer types revealed an overall significantly higher mutation rate for X-linked TSGs (P < 0.01; Fig. 4E; see Supplementary Fig. S9 for meta-analyses of female samples). In contrast, X-linked noncancer genes have a much lower mutation rate than those on the autosomes in male cancer samples (Supplementary Fig. S10), presumably owing to the hemizygosity of X-linked genes (as mutation rate is expressed as nonsynonymous somatic mutations per gene instead of per gene copy).
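For readers unfamiliar with fixed-effect meta-analysis, the sketch below shows generic inverse-variance pooling of per-cancer-type standardized mean differences (SMD). It illustrates the general technique only; the three effect sizes and variances are hypothetical and are not taken from Fig. 4E, and the function name is an assumption.

```python
import math

def fixed_effect_pool(effects, variances):
    """Inverse-variance fixed-effect pooling.
    Returns (pooled effect, standard error, z score, two-sided P value)."""
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = pooled / se
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, se, z, p

# Hypothetical per-cancer-type SMDs (X-linked minus autosomal TSGs) and their variances.
smd = [0.30, 0.10, 0.25]
var = [0.04, 0.02, 0.05]
pooled, se, z, p = fixed_effect_pool(smd, var)
print(f"pooled SMD = {pooled:.4f}, SE = {se:.4f}, z = {z:.3f}, P = {p:.4f}")
```

Under the fixed-effect model, every cancer type is assumed to share one true effect, so more precise estimates (smaller variances) receive proportionally more weight.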

Aside from tissue specificity, another factor that potentially affects the mutation frequency of X-linked TSGs is whether these genes escape X-inactivation. Using the dataset from Zhang and colleagues (24), 4 and 9 X-linked TSGs were determined to be “escaped TSGs” and “X-inactivated TSGs”, respectively, whereas the remaining X-linked TSGs showed discordant status. The proportion of “escaped” genes was higher for X-linked TSGs (4 of 27) than for X-linked noncancer paralogs (6 of 224; P = 0.0023; χ2 test). However, “escaped TSGs” and “X-inactivated TSGs” did not show a significant difference in the rate of nonsynonymous somatic mutations in female cancer samples (Supplementary Fig. S11).
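The comparison of escape proportions (4 of 27 versus 6 of 224) can be checked with a Pearson χ2 test on the corresponding 2 × 2 table. The sketch below assumes no continuity correction, which the source does not specify; under that assumption the result comes out close to the reported P value. The function name is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction)
    for the 2 x 2 table [[a, b], [c, d]]. Returns (statistic, P value)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Rows: X-linked TSGs, X-linked noncancer paralogs; columns: escaped, not escaped.
chi2, p = chi_square_2x2(4, 27 - 4, 6, 224 - 6)
print(f"chi-square = {chi2:.3f}, P = {p:.4f}")
```

Because one expected cell count is small (about one escaped TSG expected under independence), an exact test could give a somewhat different P value.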

The X chromosome has experienced a distinct evolutionary pressure. An excess of duplicated genes translocating from the X chromosome to the autosomes has been identified in various species, including Drosophila, mouse, and human, leading to an unequal chromosomal distribution of overall and sex-related gene content. Several theories, including sexual antagonism, meiotic X-inactivation during spermatogenesis, and dosage compensation, have been introduced to explain this phenomenon. However, to date, most research on interchromosomal gene transfer has been concerned with sex-biased genes (1). Here, an excessive trafficking of TSGs from the X chromosome to the autosomes was discerned for the first time, which can be explained by the lack of protection of X-linked TSGs by Knudson's two-hit mechanism. Natural selection would thus have favored the relocation of TSGs to more favorable genomic locations (i.e., the autosomes) to ensure their normal function. It is worth noting that, in contrast to our analysis, Davoli and colleagues reported that the X chromosome was enriched for TSGs relative to the autosomes (25). This discrepancy might arise because the authors’ algorithm for defining TSGs used the mutation features of genes without accounting for the effect of hypermutation of the inactive X chromosome in female samples.

Spermatogenesis is a source of mutations in evolution. Sperm-producing cells of a 30-year-old male could have passed through ∼400 rounds of cell division since fertilization, thereby giving more chances for mutations to occur and accumulate before being passed on to the next generation (26). Because the X chromosome has spent only one-third of its evolutionary history in males, versus one-half for the autosomes (27), the X chromosome mutation rate is expected to be much lower than that of the autosomes because of its reduced gene content and its partial shielding from the high mutation rate in males (28). This postulation has been experimentally verified by McVean and Hurst (29), supporting the view that the X chromosome has a high genetic stability. However, in contrast with its privileged position throughout the evolutionary history of mammalian species, our analysis revealed that X-linked TSGs exhibited a much higher somatic mutation rate across human cancers. Our findings support the notion that the X chromosome is vulnerable when facing intense selective pressure during tumorigenesis, an evolutionary process within an individual in which sexual reproduction is not in operation.
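The one-third versus one-half argument can be made concrete with a short calculation. The sketch below assumes an equal sex ratio and uses the standard population-genetic approximation in which the expected mutation rate of a chromosome is the time-weighted average of the female and male germline rates. The male-to-female mutation rate ratio `alpha` is a hypothetical parameter, not a value taken from the source.

```python
from fractions import Fraction

def time_in_males(copies_in_females, copies_in_males):
    """Fraction of a chromosome's evolutionary time spent in males,
    assuming equal numbers of females and males in the population."""
    return Fraction(copies_in_males, copies_in_females + copies_in_males)

# Females carry two X chromosomes and males carry one; both carry two of each autosome.
x_in_males = time_in_males(2, 1)      # 1/3
auto_in_males = time_in_males(2, 2)   # 1/2

def x_to_autosome_rate_ratio(alpha):
    """Expected X-to-autosome mutation rate ratio when the male germline
    mutation rate is `alpha` times the female rate (female rate set to 1)."""
    alpha = Fraction(alpha)
    x_rate = (1 - x_in_males) + x_in_males * alpha
    auto_rate = (1 - auto_in_males) + auto_in_males * alpha
    return x_rate / auto_rate

for alpha in (1, 2, 4):  # hypothetical male-to-female mutation rate ratios
    print(f"alpha = {alpha}: X/autosome = {x_to_autosome_rate_ratio(alpha)}")
```

The ratio equals 1 when the sexes mutate at the same rate and falls below 1 as the male bias grows, which is the shielding effect described above.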

As the X chromosome has spent twice as much time in females as in males (26), the increased susceptibility to cancer formation due to the vulnerability of X-linked TSGs in females should constitute the strongest selective pressure for TSG translocation. It should be recognized that our conjecture assumes complete X-inactivation in females. Nevertheless, it has been reported that sporadic reactivation of genes on the inactive X chromosome can occur during ageing (30). The disappearance of the inactive X chromosome and transcriptional reactivation of the linked cancer-promoting genes have also been described in breast and ovarian cancers (31, 32). TSGs that escape from X-inactivation were also found to contribute to sex disparity in cancer incidence (33). Therefore, the selective pressure on the depletion of X-linked TSGs might not be as strong as that depicted by our model. Consistent with this postulation, in this study, avian species, which in general lack Z-inactivation and the resulting global gene dosage compensation between the sexes, do not show TSG depletion in their Z chromosomes (34, 35). It is noteworthy that, although Z-linked genes are more highly expressed in homogametic (ZZ) males than in heterogametic (ZW) females in all tissues, compensation could occur in a gene-specific manner. It has been postulated that gene dosage compensation is not necessary in avian species because Z-linked genes evolved male-biased functions (35).

Taken together, our present study delineated a biased relocation of TSGs from the X chromosome to the autosomes in humans and other mammals, which might confer a survival advantage by limiting the cancer risk caused by single-hit inactivation of X-linked TSGs. Our findings provide the first evidence of, and an explanation for, the extensive trafficking of cancer-related genes between the X chromosome and the autosomes, shedding new light on how cancers have shaped our genome into its present configuration. They also identify X-linked TSGs as a genetic Achilles’ heel in tumor suppression.

M.H. Wang reports other support from Beth Bioinformatics outside the submitted work. D. Plewczynski reports grants from Polish National Science Center, Foundation for Polish Science, Warsaw University of Technology within the Excellence Initiative: Research University (IDUB) program; and grants from Foundation for Polish Science during the conduct of the study; and grants from Marie Sklodowska-Curie action (MSCA) Innovative Training Network (ITN) named Enhpathy outside the submitted work. No disclosures were reported by the other authors.

X. Wang: Data curation, formal analysis, investigation, visualization, methodology, writing–original draft. W. Hu: Writing–original draft. X. Li: Formal analysis, investigation. D. Huang: Formal analysis, investigation. Q. Li: Investigation. H. Chan: Investigation. J. Zeng: Investigation. C. Xie: Investigation. H. Chen: Investigation. X. Liu: Investigation. T. Gin: Investigation. M.H. Wang: Investigation. A.S.L. Cheng: Investigation. W. Kang: Investigation. K.F. To: Investigation. D. Plewczynski: Investigation. Q. Zhang: Investigation. X. Chen: Investigation. D.C.W. Chan: Investigation. H. Ko: Investigation. S.H. Wong: Supervision, investigation. J. Yu: Supervision, investigation. M.T.V. Chan: Supervision, funding acquisition, investigation, writing–review and editing. L. Zhang: Supervision, investigation, writing–review and editing. W.K.K. Wu: Conceptualization, formal analysis, supervision, funding acquisition, investigation, writing–review and editing.

The results shown here are in part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga. This work was supported by the Shenzhen Science and Technology Program (JCYJ20180508161604382) and Shenzhen Science and Technology Innovation Commission. D. Plewczynski was supported by Polish National Science Center (2014/15/B/ST6/05082) and Foundation for Polish Science (TEAM to D. Plewczynski) cofinanced by the European Union under the European Regional Development Fund.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Vicoso B, Charlesworth B. Evolution on the X chromosome: unusual patterns and processes. Nat Rev Genet 2006;7:645–53.
2. Gurbich TA, Bachtrog D. Gene content evolution on the X chromosome. Curr Opin Genet Dev 2008;18:493–8.
3. Emerson JJ, Kaessmann H, Betrán E, Long M. Extensive gene traffic on the mammalian X chromosome. Science 2004;303:537–40.
4. Casás-Selves M, Degregori J. How cancer shapes evolution, and how evolution shapes cancer. Evolution 2011;4:624–34.
5. Wu WK, Li X, Wang X, Dai RZ, Cheng AS, Wang MH, et al. Oncogenes without a neighboring tumor suppressor gene are more prone to amplification. Mol Biol Evol 2017;34:903–7.
6. Wang X, Li X, Zhang L, Wong SH, Wang MHT, Tse G, et al. Oncogenes expand during evolution to withstand somatic amplification. Ann Oncol 2018;29:2254–60.
7. Huang D, Wang X, Liu Y, Huang Z, Hu X, Hu W, et al. Multi-omic analysis suggests tumor suppressor genes evolved specific promoter features to optimize cancer resistance. Brief Bioinform 2021;22:bbab040.
8. Knudson AG. Two genetic hits (more or less) to cancer. Nat Rev Cancer 2001;1:157–62.
9. Inoue K, Fry EA. Haploinsufficient tumor suppressor genes. Adv Med Biol 2017;118:83–122.
10. Thomas MA, Weston B, Joseph M, Wu W, Nekrutenko A, Tonellato PJ. Evolutionary dynamics of oncogenes and tumor suppressor genes: higher intensities of purifying selection than other genes. Mol Biol Evol 2003;20:964–8.
11. Zhao M, Kim P, Mitra R, Zhao J, Zhao Z. TSGene 2.0: an updated literature-based knowledgebase for tumor suppressor genes. Nucleic Acids Res 2016;44:D1023–31.
12. Vogelstein B, Papadopoulos N, Velculescu VE, Zhou S, Diaz LA Jr, Kinzler KW. Cancer genome landscapes. Science 2013;339:1546–58.
13. Yin H, Wang G, Ma L, Yi SV, Zhang Z. What signatures dominantly associate with gene age? Genome Biol Evol 2016;8:3083–9.
14. Huerta-Cepas J, Serra F, Bork P. ETE 3: Reconstruction, analysis, and visualization of phylogenomic data. Mol Biol Evol 2016;33:1635–8.
15. Lawrence MS, Stojanov P, Polak P, Kryukov GV, Cibulskis K, Sivachenko A, et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature 2013;499:214–8.
16. Mácha J, Teichmanová R, Sater AK, Wells DE, Tlapáková T, Zimmerman LB, et al. Deep ancestry of mammalian X chromosome revealed by comparison with the basal tetrapod Xenopus tropicalis. BMC Genomics 2012;13:315.
17. Hedges SB, Dudley J, Kumar S. TimeTree: a public knowledge-base of divergence times among organisms. Bioinformatics 2006;22:2971–2.
18. Goodstadt L, Heger A, Webber C, Ponting CP. An analysis of the gene complement of a marsupial, Monodelphis domestica: evolution of lineage-specific genes and giant chromosomes. Genome Res 2007;17:969–81.
19. Chen S, Krinsky BH, Long M. New genes as drivers of phenotypic evolution. Nat Rev Genet 2013;14:645–60.
20. Shi MW, Zhang NA, Shi CP, Liu CJ, Luo ZH, Wang DY, et al. SAGD: a comprehensive sex-associated gene database from transcriptomes. Nucleic Acids Res 2019;47:D835–40.
21. Helena Mangs A, Morris BJ. The human pseudoautosomal region (PAR): origin, function, and future. Curr Genomics 2007;8:129–36.
22. Mitros T, Lyons JB, Session AM, Jenkins J, Shu S, Kwon T, et al. A chromosome-scale genome assembly and dense genetic map for Xenopus tropicalis. Dev Biol 2019;452:8–20.
23. Jäger N, Schlesner M, Jones DT, Raffel S, Mallm JP, Junge KM, et al. Hypermutation of the inactive X chromosome is a frequent event in cancer. Cell 2013;155:567–81.
24. Zhang Y, Castillo-Morales A, Jiang M, Zhu Y, Hu L, Urrutia AO, et al. Genes that escape X-inactivation in humans have high intraspecific variability in expression, are associated with mental impairment but are not slow evolving. Mol Biol Evol 2013;30:2588–601.
25. Davoli T, Xu AW, Mengwasser KE, Sack LM, Yoon JC, Park PJ, et al. Cumulative haploinsufficiency and triplosensitivity drive aneuploidy patterns and shape the cancer genome. Cell 2013;155:948–62.
26. Scally A. Mutation rates and the evolution of germline structure. Philos Trans R Soc Lond B Biol Sci 2016;371:20150137.
27. Johnson NA, Lachance J. The genetics of sex chromosomes: evolution and implications for hybrid incompatibility. Ann N Y Acad Sci 2012;1256:E1–22.
28. Giannelli F, Green PM. The X chromosome and the rate of deleterious mutations in humans. Am J Hum Genet 2000;67:515–7.
29. McVean GT, Hurst LD. Evidence for a selectively favorable reduction in the mutation rate of the X chromosome. Nature 1997;386:388–92.
30. Chaligné R, Heard E. X-chromosome inactivation in development and cancer. FEBS Lett 2014;588:2514–22.
31. Pageau GJ, Hall LL, Ganesan S, Livingston DM, Lawrence JB. The disappearing Barr body in breast and ovarian cancers. Nat Rev Cancer 2007;7:628–33.
32. Chaligné R, Popova T, Mendoza-Parra MA, Saleem MA, Gentien D, Ban K, et al. The inactive X chromosome is epigenetically unstable and transcriptionally labile in breast cancer. Genome Res 2015;25:488–503.
33. Dunford A, Weinstock DM, Savova V, Schumacher SE, Cleary JP, Yoda A, et al. Tumor suppressor genes that escape from X-inactivation contribute to cancer sex bias. Nat Genet 2017;49:10–6.
34. Mank JE, Ellegren H. All dosage compensation is local: gene-by-gene regulation of sex-biased expression on the chicken Z chromosome. Heredity 2009;102:312–20.
35. Naurin S, Hasselquist D, Bensch S, Hansson B. Sex-biased gene expression on the avian Z chromosome: highly expressed genes show higher male-biased expression. PLoS One 2012;7:e46854.
