Li-Fraumeni syndrome (LFS) is a rare hereditary cancer syndrome associated with an autosomal-dominant mutation inheritance in the TP53 tumor suppressor gene and a wide spectrum of cancer diagnoses. The previously developed R package, LFSPRO, is capable of estimating the risk of an individual being a TP53 mutation carrier. However, an accurate estimation of the penetrance of different cancer types in LFS is crucial to improve the clinical characterization and management of high-risk individuals. Here, we developed a competing risk-based statistical model that incorporates the pedigree structure efficiently into the penetrance estimation and corrects for ascertainment bias while also increasing the effective sample size of this rare population. This enabled successful estimation of TP53 penetrance for three LFS cancer types: breast (BR), sarcoma (SA), and others (OT), from 186 pediatric sarcoma families collected at MD Anderson Cancer Center (Houston, TX). Penetrance validation was performed on a combined dataset of two clinically ascertained family cohorts with cancer to overcome internal bias in each (total number of families = 668). The age-dependent onset probability distributions of specific cancer types were different. For breast cancer, the TP53 penetrance went up at an earlier age than the reported BRCA1/2 penetrance. The prediction performance of the penetrance estimates was validated by the combined independent cohorts (BR = 85, SA = 540, and OT = 158). Area under the ROC curves (AUC) were 0.92 (BR), 0.75 (SA), and 0.81 (OT). The new penetrance estimates have been incorporated into the current LFSPRO R package to provide risk estimates for the diagnosis of breast cancer, sarcoma, or other cancers.

Significance:

These findings provide specific penetrance estimates for LFS-associated cancers, which will likely impact the management of families at high risk of LFS.

See related article by Shin et al., p. 347

Li-Fraumeni syndrome (LFS) is a rare familial cancer syndrome that is characterized by early cancer onset and a wide diversity of tumor types in contrast to other cancer syndromes (1, 2). The initial identification of the syndrome was through aggregated family cancer history with the major hallmarks including bone and soft tissue sarcoma, breast cancer, adrenal cortical cancers, brain tumors, and leukemia (3–7), but recent studies have shown that its spectrum gets more diverse and includes lung cancer (8) and prostate cancer (7).

LFS is known to be associated with deleterious germline mutations in the TP53 tumor suppressor gene. It is crucial to accurately estimate the penetrance of TP53 mutations to provide better clinical management to the individuals at high risk of LFS. Because of the wide spectrum of LFS cancers, it is likely that specific TP53 mutations have different effects for different cancer types. However, in recent literature (9), the cancer type information is collapsed and the disease status is simply dichotomized (cancer or not). This type of penetrance, which we call overall cancer penetrance, ignores the cancer type information and fails to quantify the cancer-specific effect of a TP53 germline mutation. It is necessary to estimate a cancer-specific penetrance to not only reveal the cancer-specific effect of the TP53 mutations but also improve the risk prediction performance by accommodating more detailed disease histories. However, it is a complex task to take into account multiple types of cancers simultaneously because onsets of different cancers are competing with others.

Recently, a competing risk model was developed that captures the nature of the multiple cancers associated with LFS (10). In this article, we provide a set of cancer-specific age-at-onset penetrance estimates of TP53 under that model (10). First, we provide a more realistic penetrance by taking into account deaths unrelated to LFS as a baseline competing risk. Second, we incorporate pedigree information into the estimation, which can substantially improve estimation accuracy and efficiency by including data from more patients whose genotype is unknown (9). Cancer-specific observations of LFS have recently been reported (11); however, those cumulative incidences for different cancer types were calculated from individuals with a deleterious TP53 mutation only. Third, we carefully handle the ascertainment bias that is inevitable in studies of rare diseases like LFS and provide an unbiased penetrance estimate that can be applied to the general population.

As mentioned above, an accurate cancer-specific characterization of LFS is an extremely difficult task because the syndrome involves such a wide spectrum of cancer diagnoses. We hereby provide a new set of cancer-specific penetrance estimates for individuals with different TP53 mutation status, sex, and age. Cancer-specific penetrance estimates could be used by genetic counselors and physicians to provide more detailed counseling to families that have newly been found to harbor a TP53 germline mutation.

### Study population (training set)

To train the model, we used a cohort of 186 families identified via patients (probands) who were diagnosed with childhood soft-tissue sarcoma or osteosarcoma before the age of 16 years, survived 3 years or longer, and were treated at The University of Texas MD Anderson Cancer Center (MDACC, Houston, TX) from 1944 to 1983. Study researchers contacted the parents of eligible study participants, or participants over 18 years, via a letter inviting them to participate in a phone interview to collect information on the current status of the probands and to collect their family history, including vital status and dates of birth, death, and tumor diagnosis for the proband and his/her parents, siblings, aunts and uncles, grandparents, and offspring (12–14). An average of three relatives were contacted for each kindred to complete data collection (12). Death certificates and medical records were requested for all reported deaths and cancers (12–15). Only invasive cancers confirmed either by records or through multiple family members were included in the MDACC pediatric sarcoma cohort (12–15). A summary of the MDACC pediatric sarcoma cohort is presented in Table 1. Participants were invited by letter to provide follow-up information by phone once a year. Follow-up information was only taken when the participants contacted the researchers. Medical records and death certificates were used to confirm cancer histories, where possible. We define subjects as affected only if they were diagnosed with malignant tumors. Cancers were classified into three categories: breast cancer, sarcoma (osteosarcomas and soft-tissue sarcomas), and other cancer (brain, adrenal, lung, leukemia, and all other cancers). Both linkage and segregation analyses were carried out to select high-risk families prior to TP53 testing.
The high-risk families in the training cohort were those pedigrees that differed the most from the general population in linkage and segregation analysis; these families were chosen to be tested for a TP53 mutation. TP53 mutation status was then determined by PCR sequencing of exons 2–11. If a TP53 mutation was identified, all first-degree relatives of the proband and any other family member with an increased risk of carrying the mutation were offered TP53 testing. Extending germline testing to additional family members based on mutation status instead of phenotypes should not introduce ascertainment bias during analysis (7, 9). Individuals unavailable for testing who are linked to, or between, confirmed mutation carriers were considered inferred mutation carriers. Immediate family members were not tested if the proband did not have a TP53 mutation. Of the 186 families, 17 were referred to as (TP53 mutation) positive families, which have at least one mutation carrier identified, and the remaining 169 families, in which no carrier was identified, were considered wild-type.

Table 1.

MDACC cohort summary: average ages of diagnosis or at death are given in parentheses.

| Sex | Genotype | Breast cancer | Sarcoma | Other cancers | Death | Censored | Total |
|---|---|---|---|---|---|---|---|
| Male | Unknown | 0 (NA) | 11 (21.3) | 136 (55.3) | 244 (55.2) | 1,047 (39.7) | 1,416 (43.7) |
| Male | Wild-type | 0 (NA) | 76 (9.6) | 32 (57.1) | 38 (63.6) | 257 (40.0) | 403 (37.9) |
| Male | Mutation | 0 (NA) | 12 (13.2) | 27 (45.3) | 1 (47.0) | 8 (32.5) | 48 (35.2) |
| Male | Subtotal | 0 (NA) | 99 (11.3) | 195 (54.2) | 283 (56.3) | 1,312 (39.7) | 1,889 (42.3) |
| Female | Unknown | 40 (55.6) | 4 (8.2) | 68 (46.5) | 172 (59.7) | 1,035 (41.5) | 1,319 (44.5) |
| Female | Wild-type | 8 (53.2) | 96 (11.1) | 20 (54.5) | 45 (64.4) | 296 (41.2) | 465 (38.0) |
| Female | Mutation | 19 (33.2) | 12 (9.2) | 9 (23.1) | 1 (22.0) | 7 (37.9) | 48 (25.8) |
| Female | Subtotal | 67 (49.0) | 112 (10.8) | 97 (46.0) | 218 (60.5) | 1,338 (41.4) | 1,832 (42.3) |
| Total | | 67 (49.0) | 211 (11.1) | 273 (51.5) | 501 (58.1) | 2,650 (40.6) | 3,721 (42.3) |

For the MDACC pediatric sarcoma cohort, which began collecting data in the 1980s, blood samples tested for a germline TP53 mutation were compared with the available literature for a suggestion of pathogenicity. If the mutation had not been reported then available family data were used to observe whether direct transmission of the same TP53 mutation was present and whether cancers segregated as established LFS phenotypes. This method was used by the study investigator to annotate the mutations due to lack of functional studies at the time. After the International Agency for Research on Cancer database was established, the study investigator used the database as a resource to review prior pathogenicity determinations and update the patients if necessary.

### Study population (validation set)

We validated the penetrance estimates by comparing disease statuses and cancer-specific risk of genotype-observed subjects in the MDACC data. The cancer-specific risk for a genotype-observed individual is readily obtained by differentiating the estimated cancer-specific penetrance estimates. As large proportions of genotype-observed individuals are probands, who have excessively high risk for a TP53 mutation, we excluded probands in the validation study to mitigate this bias. Using the MDACC data for both estimating and validating penetrance would consequently make validation results seem too optimistic. We hence conducted validation analysis using two distinct prospective cohorts. One cohort is recruited to the International Sarcoma Kindred Study (ISKS) through patients with adult onset sarcoma from six clinics in Australia. Once pedigree information and samples were collected, cancers were verified by Australian and New Zealand cancer registries and death certificates (16). The proband's TP53 mutation status was determined by PCR sequencing, high-resolution melt analysis, and multiplex ligation-dependent probe amplification to detect large deletions or genomic rearrangements (16). The ISKS cohort consists of 582 families, of which, 19 are TP53 mutation–positive families (Table 2). The second independent cohort was recruited to the NCI. The NCI LFS study is a long-term prospective, natural history cohort study that started in August 2011 and follows up participants each time they participate in the biannual LFS screening protocol. The study includes individuals meeting the classic LFS (17) or Birch Li-Fraumeni-like (LFL; ref. 4) criteria, having a pathogenic germline TP53 mutation or a first- or second-degree relative with a mutation, or having a personal history of choroid plexus carcinoma, adrenocortical carcinoma, or at least three primary cancers (NCT01443468). 
A detailed family history questionnaire was obtained, including information on birth date, vital status, date or age at death if deceased, history of cancer, and if positive, type, and year of diagnosis or age at diagnosis, for all first-, second-, and third-degree relatives and any other extended family members for whom the information was available. There were 2,676 individuals from 102 families included in this NCI LFS cohort (Table 3; ref. 18). All studies were approved by their respective institutional review boards and were conducted in accordance with ethical guidelines laid out by the Belmont Report. All the participants signed written informed consent forms before recruitment.

Table 2.

ISKS cohort summary.

| Sex | Genotype | Breast cancer | Sarcoma | Other cancers | Death | Censored | Total |
|---|---|---|---|---|---|---|---|
| Male | Unknown | 2 (53.5) | 12 (29.2) | 696 (60.6) | 2,282 (64.5) | 5,586 (42.4) | 8,578 (49.7) |
| Male | Wild-type | 0 (NA) | 265 (45.6) | 47 (50.3) | 0 (NA) | 6 (54.2) | 318 (46.5) |
| Male | Mutation | 0 (NA) | 9 (28.3) | 4 (33.2) | 0 (NA) | 1 (22.0) | 14 (29.3) |
| Male | Subtotal | 2 (53.5) | 286 (44.4) | 747 (59.8) | 2,282 (64.5) | 5,593 (42.4) | 8,910 (49.6) |
| Female | Unknown | 243 (55.2) | 18 (38.3) | 544 (56.6) | 1,857 (68.4) | 5,360 (43.8) | 8,022 (50.7) |
| Female | Wild-type | 15 (52.9) | 212 (45.9) | 38 (41.8) | 0 (NA) | 11 (36.5) | 276 (45.3) |
| Female | Mutation | 3 (32.0) | 6 (43.3) | 1 (10.0) | 0 (NA) | 3 (31.0) | 13 (35.3) |
| Female | Subtotal | 261 (54.8) | 236 (45.2) | 583 (55.5) | 1,857 (68.4) | 5,374 (43.8) | 8,311 (50.5) |
| Total | | 263 (54.8) | 522 (44.8) | 1,330 (57.9) | 4,139 (66.2) | 10,967 (43.1) | 17,221 (50.0) |

Table 3.

NCI cohort summary.

| Sex | Genotype | Breast cancer | Sarcoma | Other cancers | Death | Lost to follow-up | Total |
|---|---|---|---|---|---|---|---|
| Male | Unknown | 1 (60.0) | 27 (25) | 238 (46.8) | 0 (NA) | 876 (25.2) | 1,142 (29.7) |
| Male | Wild-type | 0 (NA) | 3 (56.7) | 4 (39.5) | 0 (NA) | 107 (22.4) | 114 (23.9) |
| Male | Mutation | 0 (NA) | 18 (38.7) | 29 (43) | 0 (NA) | 31 (27.5) | 78 (35.8) |
| Male | Subtotal | 1 (60) | 48 (32.1) | 271 (46.3) | 0 (NA) | 1,014 (24.9) | 1,334 (29.6) |
| Female | Unknown | 143 (50.7) | 22 (30.1) | 151 (42.4) | 0 (NA) | 756 (25.6) | 1,072 (31.4) |
| Female | Wild-type | 16 (54.3) | 3 (41.3) | 8 (49.8) | 0 (NA) | 108 (23.4) | 135 (29) |
| Female | Mutation | 51 (43.1) | 24 (35.6) | 27 (35.9) | 0 (NA) | 33 (20) | 135 (34.7) |
| Female | Subtotal | 210 (49.1) | 49 (33.5) | 186 (41.8) | 0 (NA) | 897 (25.1) | 1,342 (31.5) |
| Total | | 211 (49.2) | 97 (32.8) | 457 (44.5) | 0 (NA) | 1,911 (25) | 2,676 (30.5) |

Supplementary Table S1 lists all deleterious TP53 mutations that were collected from the MDACC cohort and the validation cohorts.

### Cancer-specific penetrance estimation

Because of the low prevalence of most cancer types in LFS, we grouped all cancers related to LFS into three types: breast cancer (BR), sarcoma (SA), and all others (OT). “Other” cancers include additional LFS spectrum cancer types, such as adrenocortical carcinomas, brain tumors, leukemia, and lung cancer, as well as all additional non-LFS–related malignant cancer diagnoses in the families, such as pancreatic cancer, colorectal cancer, ovarian cancer, and prostate cancer. Breast cancer and the combination of osteosarcomas and soft-tissue sarcomas were used as specific cancer groups because of the available numbers of each cancer in the data (Supplementary Table S2). In our competing risk model, each cancer type competes with all others to be the first event. As a baseline competing event, we consider deaths unrelated to LFS. For example, breast cancer is observed only if it appears before other cancers or death. Therefore, we have four competing events: (i) BR, (ii) SA, (iii) OT, and (iv) dying without having been diagnosed with cancer.

The age-at-onset cancer-specific penetrance is defined as the cumulative probability (up to a certain age) of experiencing a particular competing event prior to all others, otherwise referred to as cumulative incidence in the competing risk literature. This is different from the net probability of being diagnosed with a particular cancer type when all possible competing risks are removed, which consequently overestimates the actual risk. Considering sex (male and female) as an additional covariate, we estimate a complete set of penetrance estimates for the four competing events at four different configurations of covariates, that is, {male, female} × {wild-type, mutated}. By introducing death events as a baseline competing risk, we can estimate the penetrance for death unrelated to LFS, which is a natural occurrence and a crucial factor for assessing real cancer risk. The overall cancer penetrance, which does not consider cancer type, can be obtained by simply summing the cancer-specific penetrances. Similarly, the TP53 mutation penetrance for either experiencing any cancer diagnosis or dying without cancer is obtained by adding the death penetrance to the overall cancer penetrance. The penetrance estimates for the wild-type population are directly comparable with the corresponding Surveillance Epidemiology and End Results (SEER) estimates without any further modification.
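The distinction between cumulative incidence and net probability can be illustrated numerically. The following is a minimal sketch, not the estimation procedure used in the study: it assumes piecewise-constant cause-specific hazards on yearly intervals, and the hazard values and the function names `cumulative_incidence` and `net_probability` are hypothetical.

```python
import math

def cumulative_incidence(hazards, dt=1.0):
    """Cumulative incidence of each competing event as the first event.

    hazards: dict mapping an event label to a list of hazard rates,
             one per interval of length dt (piecewise constant).
    Returns a dict mapping each label to its cumulative incidence
    at the end of every interval.
    """
    names = list(hazards)
    n_intervals = len(hazards[names[0]])
    event_free = 1.0  # probability that no event has occurred yet
    running = {k: 0.0 for k in names}
    cif = {k: [] for k in names}
    for i in range(n_intervals):
        total = sum(hazards[k][i] for k in names)
        # Probability that some first event occurs within this interval.
        p_any = event_free * (1.0 - math.exp(-total * dt))
        for k in names:
            share = hazards[k][i] / total if total > 0 else 0.0
            running[k] += p_any * share
            cif[k].append(running[k])
        event_free *= math.exp(-total * dt)
    return cif

def net_probability(hazard, dt=1.0):
    """Probability of the event by each interval end if competing risks were removed."""
    cumulative_hazard = 0.0
    out = []
    for h in hazard:
        cumulative_hazard += h * dt
        out.append(1.0 - math.exp(-cumulative_hazard))
    return out

# Hypothetical constant hazards over 50 years (illustrative values only).
example = {"BR": [0.01] * 50, "death": [0.02] * 50}
crude_br = cumulative_incidence(example)["BR"][-1]
net_br = net_probability(example["BR"])[-1]
print(f"cumulative incidence: {crude_br:.3f}, net probability: {net_br:.3f}")
```

Because part of the population experiences a competing event first, the cumulative incidence is always at most the net probability, which is why the two quantities should not be compared directly.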

### Model and estimation

Let $T_k$ denote the time at which the $k$th type of event occurs ($k = 1, 2, 3$ represent diagnosis of the three cancer types BR, SA, and OT, respectively, and $k = 4$ represents a death unrelated to cancer). Define $T = \min_k T_k$, and let $D = k$ if the $k$th event is the one observed. The cancer-specific penetrance $q_k(t)$ is then

$$q_k(t \mid G, X) = \Pr(T \le t,\ D = k \mid G, X),$$

where $G$ denotes the TP53 mutation status, coded as 1 for mutation and 0 for wild-type, and $X$ is sex, coded as 1 for male and 2 for female.

The cancer-specific penetrance can be directly calculated from the cancer-specific hazard $\lambda_k(t \mid G, X)$, which is modeled with a baseline hazard $\lambda_{0,k}(t)$ and a family-specific random frailty $\xi$ that handles the underlying dependency within the family. To incorporate pedigree structure and family cancer history into the estimation, we introduce the familywise likelihood (9), which is a marginal likelihood of each family over missing genotypes whose joint distribution is available from the Mendelian laws of inheritance.
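For orientation, the link between cause-specific hazards and penetrance can be written out explicitly. The first identity below is the standard competing-risk relation between cumulative incidence and hazards; with a frailty it holds conditionally on $\xi$, and the population penetrance averages over $\xi$. The second line is only an assumed illustrative specification (proportional hazards with a multiplicative frailty): the coefficient vector $\beta_k$ and the covariate vector $Z(G, X)$ are hypothetical symbols that do not appear in the text, and the exact model is given in the methods article (10).

```latex
% Standard relation: cumulative incidence from cause-specific hazards
q_k(t \mid G, X) = \int_0^t \lambda_k(u \mid G, X)\, S(u \mid G, X)\, du,
\qquad
S(u \mid G, X) = \exp\Bigl\{ -\sum_{j=1}^{4} \int_0^u \lambda_j(v \mid G, X)\, dv \Bigr\}

% Assumed illustrative form (not stated in the text):
% proportional hazards with a multiplicative family frailty
\lambda_k(t \mid G, X, \xi) = \xi\, \lambda_{0,k}(t)\, \exp\bigl\{ \beta_k^{\top} Z(G, X) \bigr\}
```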

We also adjusted for the ascertainment bias by exploiting the ascertainment-corrected joint likelihood (19) that essentially reweights the familywise likelihood by the ascertainment probability of each family. The ascertainment probability is estimable because the recruiting rule of families in the LFS data is available, that is, the proband developed sarcoma during childhood. Finally, we refer to our methods article (10) for complete details about the model.
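The two likelihood ideas described above can be sketched in generic notation. This is a schematic under stated assumptions rather than the authors' exact formulation: $H_i$ (the cancer histories in family $i$), $g_{\mathrm{obs}}$ and $g_{\mathrm{mis}}$ (observed and missing genotypes), $A_i$ (the event that family $i$ is ascertained), and $\theta$ (the model parameters) are symbols introduced here for illustration, and the integration over the frailty is omitted.

```latex
% Familywise likelihood: marginalize over the unobserved genotypes,
% whose joint distribution follows from Mendelian inheritance
L_i(\theta) = \sum_{g_{\mathrm{mis}}}
  \Pr\bigl(H_i \mid g_{\mathrm{obs}}, g_{\mathrm{mis}}; \theta\bigr)\,
  \Pr\bigl(g_{\mathrm{obs}}, g_{\mathrm{mis}}\bigr)

% Ascertainment correction: condition on the family having been recruited
L_i^{\mathrm{ac}}(\theta) = \frac{L_i(\theta)}{\Pr(A_i \mid \theta)}
```

In this setting $A_i$ would correspond to the stated recruiting rule, namely that the proband developed sarcoma during childhood.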

### Prevalence and de novo mutation rate

We assumed that TP53 mutations follow Hardy–Weinberg equilibrium, but this could be changed when homozygous genotype information is published. The mutation prevalence for pathogenic TP53 mutations was specified as 0.0006 for LFSPRO, which was derived in our previous study (18). This estimate was validated using clinical cohorts, and it falls within the range of the high and low estimates of TP53 mutations from population studies (18). The assumed frequencies for the three genotypes (homozygous reference, heterozygous, and homozygous variant) were 0.9988, 0.0012, and 3.6e-07, respectively. We used 0.00012 as the default value of the de novo mutation rate when evaluating the familywise likelihood (18, 20).
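The genotype frequencies implied by Hardy–Weinberg equilibrium follow directly from the allele frequency. The short sketch below assumes that 0.0006 is used as the mutant allele frequency; the function name `hardy_weinberg` is hypothetical and is not part of LFSPRO.

```python
def hardy_weinberg(q):
    """Return (homozygous reference, heterozygous, homozygous variant)
    genotype frequencies for a mutant allele frequency q."""
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q

hom_ref, het, hom_var = hardy_weinberg(0.0006)
print(f"{hom_ref:.4f} {het:.7f} {hom_var:.2e}")
```

The three frequencies sum to one by construction, which gives a quick consistency check on any set of assumed genotype frequencies.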

### Validation study design

We evaluated the model prediction performance on cancer-specific risk using the average annual risk computed with our cancer-specific TP53 penetrance estimates. The cancer-specific risk was calculated as the cumulative probability of developing one type of cancer divided by the follow-up time. The ROC curve was used to evaluate the sensitivity and specificity of predicting incidence of a specific cancer type, using the estimated risk probability at various cutoffs. We also provide 95% bootstrap confidence intervals for AUC estimates (21, 22).
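The validation metrics described above can be sketched with the standard library alone. This is a minimal illustration, not the code used in the study: the AUC is computed as the Mann–Whitney probability that a case receives a higher risk score than a non-case, the confidence interval is a simple percentile bootstrap (the cited references may use a different bootstrap variant), and all function names and data are hypothetical.

```python
import random

def average_annual_risk(cumulative_probability, follow_up_years):
    """Cumulative probability of a cancer type divided by follow-up time."""
    return cumulative_probability / follow_up_years

def auc(scores, labels):
    """AUC as P(score of a case > score of a non-case), ties counted as 1/2."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))

def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    if not 0 < sum(labels) < len(labels):
        raise ValueError("need at least one case and one non-case")
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only cases or only non-cases.
        if 0 < sum(ys) < n:
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[round(alpha / 2 * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical risk scores and observed cancer status (1 = diagnosed).
risk = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]
status = [1, 1, 0, 1, 0, 0, 1, 0]
print(auc(risk, status), bootstrap_auc_ci(risk, status, n_boot=200))
```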

### Cancer-specific penetrance estimates

As shown in Table 1, the pediatric sarcoma cohort at MDACC provided clinical outcomes of 3,683 individuals from 168 families. Among them, there are 66 breast cancer cases, 211 sarcoma cases, 273 other cancers, 485 deaths, and 2,648 lost to follow-up. Within each of the four competing events, 27 breast cancer cases (41%), 196 sarcoma cases (93%, explained by ascertainment as most of them are probands), 57 other cancers (21%), and 85 deaths from causes other than cancer (18%) had known TP53 genotype status. Our competing risk and pedigree-based hazard model accounts for the outcomes of individuals without genotype status by using Mendelian inheritance, which assigns these individuals weights for being a carrier or a noncarrier. This approach substantially increased the effective sample size and the accuracy of penetrance estimation.
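The idea of weighting untested relatives can be illustrated with a single application of Bayes' rule. This sketch is a deliberate simplification: the actual model marginalizes jointly over all missing genotypes in a pedigree, whereas here one untested individual is treated in isolation. The function name `carrier_posterior` and all numeric values are hypothetical.

```python
def carrier_posterior(prior_carrier, lik_carrier, lik_noncarrier):
    """Posterior probability of being a carrier given the observed phenotype.

    prior_carrier:  Mendelian prior, e.g. 0.5 for the child of one
                    heterozygous carrier and one wild-type parent
                    (de novo mutations ignored in this sketch).
    lik_carrier:    probability of the observed cancer history if a carrier.
    lik_noncarrier: probability of the same history if a noncarrier.
    """
    numerator = prior_carrier * lik_carrier
    denominator = numerator + (1.0 - prior_carrier) * lik_noncarrier
    return numerator / denominator

# Hypothetical example: an early cancer that is far more likely in carriers.
weight = carrier_posterior(prior_carrier=0.5, lik_carrier=0.30, lik_noncarrier=0.01)
print(f"carrier weight: {weight:.3f}, noncarrier weight: {1 - weight:.3f}")
```

An uninformative phenotype leaves the Mendelian prior unchanged, while a phenotype typical of carriers shifts most of the weight toward the carrier genotype.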

The set of cancer-specific penetrance estimates obtained from the pediatric sarcoma cohort is depicted in Fig. 1. Figure 1 shows the five sex-specific, cancer-specific penetrance estimates for breast cancer, sarcoma, and other cancers combined, respectively, for TP53 mutation carriers. Note that the cancer-specific penetrance estimates display different patterns for different cancers, which could be identified only through the proposed cancer-specific approach. Penetrance estimates for noncarriers can be found in Supplementary Fig. S1: in each panel, we present four penetrance estimates, each of which corresponds to one configuration of sex and genotype. Supplementary Figure S1D depicts the penetrance for death before cancer diagnosis. Supplementary Figures S1E and S1F show the combined penetrance of overall cancer and of cancer or death, respectively.

Figure 1.

Cancer-specific penetrance estimates for TP53 mutation carriers from the pediatric sarcoma cohort.


The risk of breast cancer greatly increases between 20 and 40 years of age for female TP53 mutation carriers (Fig. 1). We further compare it with the female penetrance estimates for BRCA1/2, two well-known breast cancer susceptibility genes (Fig. 2). Because the BRCA1/2 penetrance estimates ignore possible competing events such as death and other cancers, they are not directly comparable with our TP53 breast cancer penetrance. However, our estimates can be easily converted to the corresponding net counterpart that ignores competing risks (16). Through the head-to-head comparison, we observed that female TP53 mutation carriers have an excessively high probability of developing breast cancer before age 25, which is not the case for BRCA1/2 (Fig. 2B). Therefore, we provide quantitative evidence that very early-onset breast cancer can be regarded as clinical evidence of mutations in TP53. In addition, the noncarrier penetrance estimates for TP53 and BRCA1/2 coincide, which supports the validity of our estimates (Fig. 2B).

Figure 2.

A, Comparison of TP53 net penetrance to the BRCA1/2 net penetrance in breast cancer. B, A female TP53 mutation carrier has an excessively high probability of breast cancer at an early age, which is not the case for BRCA1/2. Very early-onset breast cancer can be regarded as clinical evidence of mutations in TP53. The noncarrier penetrance estimates for TP53 and BRCA1/2 coincide. MT, mutation; WT, wild-type.


The sarcoma-specific TP53 penetrance increases by about 10% relative to that of wild-type individuals from birth to 45 years of age. Although the MDACC pediatric sarcoma cohort is not a random sample and contains an increased number of sarcoma cases compared with the general population, the estimated sarcoma penetrance for wild-type individuals is very low (Supplementary Fig. S1). This demonstrates that our estimates successfully corrected for the ascertainment. Recalling that the penetrance curves are estimated from MDACC data collected through patients with pediatric sarcoma, it is not surprising that the sarcoma penetrance for carriers rapidly increases at an early age and then plateaus.

The other cancer–combined penetrance for TP53 carriers is noticeably different from that of wild-type individuals. For females, the risk of other cancers increases from 5% to 22% between 5 and 55 years of age and then begins to plateau. For males, there is a significant increase in penetrance between 35 and 65 years of age.

Interestingly, the penetrance of death before cancer for TP53 carriers was different from that of noncarriers and from the general population SEER estimates, which were used for validation purposes. Missing information on cancer outcomes in obligate carriers might contribute to this observation. We also observed a higher death rate among the noncarriers (Supplementary Fig. S1D). The MDACC pediatric sarcoma cohort family data date back to the 1940s (Supplementary Fig. S2), while the most recent SEER estimates are from 2008 to 2010. As shown in Supplementary Fig. S3, the mortality rate in the U.S. population has decreased over time, which explains the observed discrepancy.

Next, Supplementary Fig. S1E depicts the overall cancer penetrance and Supplementary Fig. S1F shows the probabilities of either having any cancer or dying without cancer. Again, we observed high concordance between the noncarrier penetrance estimates and the corresponding SEER estimates for any cancer, and for any cancer or dying without cancer.

### Validation results

Tables 2 and 3 provide summaries of the validation datasets, the ISKS and NCI cohorts. Figure 3 depicts ROC curves comparing the cancer-specific risk estimates with the actual cancer status of genotype-observed subjects. Figure 3A shows the results for the MDACC data. A total of 772 tested individuals were used, excluding probands. It is not surprising that the cancer-specific risk estimates obtained directly from the cancer-specific penetrances estimated from the MDACC data performed very well, with AUCs of 0.855 [95% confidence interval (CI), 0.781–0.929] for breast cancer, 0.76 (95% CI, 0.589–0.93) for sarcoma, and 0.749 (95% CI, 0.689–0.808) for other cancers. Figure 3B shows the validation result for the independent cohorts, ISKS and NCI, which were not used for penetrance estimation. A total of 402 tested individuals (38 from ISKS, see Table 2; 364 from NCI, see Table 3) were used, excluding probands. We observed that the penetrance estimates accurately quantify the associated cancer risks even for these two independent datasets. The AUCs are 0.914 (95% CI, 0.872–0.957) for breast cancer, 0.748 (95% CI, 0.678–0.818) for sarcoma, and 0.805 (95% CI, 0.745–0.865) for other cancers. These results strongly support the validity of the cancer-specific estimates.

Figure 3.

Validation results using the training dataset (MDACC) and the testing dataset (ISKS+NCI). ROC curves that compare cancer-specific risks obtained from the estimated penetrances with the disease status of genotype-observed subjects only are depicted for the MDACC data (A) and the ISKS and NCI data (B). Probands from all cohorts are excluded to mitigate the bias induced by ascertainment.


We provide the first set of cancer-specific penetrance estimates of TP53 germline mutations. These cancer-specific penetrance estimates are modeled under a competing risk framework because all the cancers compete with one another to be the first event. A baseline competing event of death unrelated to the cancers is also incorporated in the estimation. This allows us to precisely estimate the crude risk of cancer without additional calibration based on an external data source. Our final penetrance estimates for specific cancer types are age-of-diagnosis–dependent and vary with cancer type, sex, and TP53 mutation status. On the basis of the new penetrance estimates, we observed that the risk for breast cancer before age 25 is higher than that from BRCA1/2 mutations. Our penetrance estimates are validated using independent cohorts from NCI and ISKS. We have integrated the new penetrance estimates into our risk prediction software LFSPRO, now upgraded to version 2.0.0, which is freely available at https://bioinformatics.mdanderson.org/public-software/lfspro/.

Like other cancer syndromes, LFS is extremely rare and it is challenging to collect sufficient amounts of data to be analyzed. However, the pediatric sarcoma cohort from MDACC is a large and comprehensive cohort that only looked at children with a sarcoma diagnosis, excluding any bias of only obtaining families based on LFS clinical criteria (17, 23, 24). For validation purposes, we combined the NCI and ISKS cohort for a larger data set and to combine different ascertainment strategies.

With cancer-specific penetrance estimates available, we developed an associated risk assessment extension to the LFS mutation carrier risk prediction tool, the LFSPRO R package, which is based on the BayesMendel model (18, 25). The lfspro.mode function, when set to “1st.cs” in the updated R package, uses an individual's family history to estimate the risk of a breast cancer diagnosis, a sarcoma diagnosis, another cancer diagnosis, or death over the next 5, 10, 15, and 20 years. Current LFS standard screening protocols (26) have been instituted in clinics throughout the world (27–31). At MDACC, the Li-Fraumeni Education and Early Detection Program consists of “a centralized approach to patient management, screening exams performed at the same institution, including whole-body MRI and brain MRI, using same technique and same machines enabling more consistent comparison of findings between and across interval exams in patients and a multidisciplinary team that reviews all findings and patient issues to develop follow-up recommendations” (29). Implementation of cancer-specific penetrance estimation in LFS screening programs could give patients a more complete picture of predicted risk. With a 90% lifetime risk of cancer in TP53 mutation carriers (7, 14), and a near 100% risk of cancer for female TP53 mutation carriers by age 70 (9), patients who have tested positive for a germline mutation are constantly on alert for when a cancer diagnosis will occur. Our goal is for LFSPRO to be used as a tool by genetic counselors and clinicians who work with TP53 mutation carriers, so that they can provide more information to their patients during screening visits. However, this estimation covers only the first primary cancer diagnosis. Unless a mutation has already been identified within a family, most individuals do not find out that they are TP53 germline mutation carriers until after their first or second primary cancer. Our penetrance estimates will therefore be most useful to genetic counselors when disclosing mutation testing results before the first cancer diagnosis.
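
As a rough illustration of how risks over the next 5, 10, 15, and 20 years can arise when breast cancer, sarcoma, other cancers, and death compete to be the first event, the following Python sketch accumulates a discrete-time cumulative incidence for each outcome. It is not the LFSPRO implementation or its API: the function name cumulative_risks, the yearly discretization, and the hazard values in TOY_HAZARDS are all assumptions made up for illustration and are not estimates from this study.

```python
from typing import Dict, Sequence


def cumulative_risks(
    yearly_hazards: Dict[str, Sequence[float]],
    current_age: int,
    horizons: Sequence[int] = (5, 10, 15, 20),
) -> Dict[int, Dict[str, float]]:
    """Discrete-time cumulative incidence for competing first events.

    yearly_hazards[event][age] is a hypothetical probability that `event`
    is the first event during the year starting at `age`, given that no
    event has occurred before that age.
    """
    event_free = 1.0  # probability of still being event-free
    cumulative = {event: 0.0 for event in yearly_hazards}
    results: Dict[int, Dict[str, float]] = {}

    for step in range(1, max(horizons) + 1):
        age = current_age + step - 1
        total = sum(hazard[age] for hazard in yearly_hazards.values())
        if not 0.0 <= total <= 1.0:
            raise ValueError(f"hazards at age {age} must sum to a value in [0, 1]")
        # Each event can only be "first" if nothing has happened so far.
        for event, hazard in yearly_hazards.items():
            cumulative[event] += event_free * hazard[age]
        event_free *= 1.0 - total
        if step in horizons:
            results[step] = dict(cumulative)
    return results


# Made-up constant hazards for ages 0-99; illustrative only, not study estimates.
TOY_HAZARDS = {
    "BR": [0.02] * 100,
    "SA": [0.01] * 100,
    "OT": [0.01] * 100,
    "death": [0.01] * 100,
}

if __name__ == "__main__":
    for horizon, risks in cumulative_risks(TOY_HAZARDS, current_age=30).items():
        print(horizon, {event: round(risk, 3) for event, risk in risks.items()})
```

Because the outcomes compete, the risks at each horizon sum to one minus the probability of remaining event-free, so raising the hazard for one outcome lowers the cumulative risk of the others.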

At this time, we were not able to account for the effects of germline TP53 allelic heterogeneity. It is hypothesized that different point mutations in TP53 lead to different effects in LFS (32). However, because of the limited sample size for each mutation within the training dataset (Supplementary Table S1) and the lack of a standard for grouping the different mutations, we did not include this information in our cancer-specific modeling. The grouping of cancer types in this study was determined by sample size requirements. In future analyses, it can be altered to reflect specific biological and clinical needs, after which our Bayesian semiparametric model would be rerun for parameter estimation.

In summary, we provide cancer-specific penetrance estimates for TP53 mutation carriers with improved resolution that makes use of cancer type information. Looking deeper into the cancer-specific penetrance of LFS yields more information for patients and their clinicians. Although treatment targets are being researched, patients long for more information on how LFS affects them now. Adding to the current understanding of cancer-specific penetrance leads to a better understanding of the disease and improves risk management of healthy individuals from families with LFS.

M.N. Frone has ownership interest (including patents) in CancerGene Connect. No potential conflicts of interest were disclosed by the other authors.

Conception and design: S.J. Shin, Y. Yuan, L.C. Strong, W. Wang

Development of methodology: S.J. Shin, G. Peng, C.I. Amos, Y. Yuan, W. Wang

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): J. Bojadzieva, P.P. Khincha, P.L. Mai, S.A. Savage, M.L. Ballinger, D.M. Thomas, L.C. Strong

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): S.J. Shin, G. Peng, J. Bojadzieva, J. Chen, C.I. Amos, S.A. Savage, Y. Yuan, L.C. Strong, W. Wang

Writing, review, and/or revision of the manuscript: S.J. Shin, E.B. Dodd-Eaton, J. Bojadzieva, C.I. Amos, P.P. Khincha, P.L. Mai, S.A. Savage, M.L. Ballinger, D.M. Thomas, Y. Yuan, L.C. Strong, W. Wang

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): E.B. Dodd-Eaton, M.N. Frone, P.P. Khincha, S.A. Savage, L.C. Strong, W. Wang

Study supervision: Y. Yuan, L.C. Strong, W. Wang

Other (directly involved in all of the data collection and documentation): L.C. Strong

S.J. Shin, E.B. Dodd-Eaton, and W. Wang were supported in part by the Cancer Prevention Research Institute of Texas through grant number RP130090. S.J. Shin was supported in part by the National Research Foundation of Korea funded by the Korea government (MSIT) through grant number 2019R1A4A1028134. E.B. Dodd-Eaton and W. Wang were supported in part by the U.S. NCI through grant numbers 1R01 CA183793, 1R01 CA174206, and 2R01 CA158113. W. Wang was supported in part by U.S. NCI grant number P30CA016672. G. Peng was supported in part by U.S. NCI grant numbers 3P50CA196530 and 3P30CA016359. J. Chen was supported by U.S. NCI grant number 1R01 CA183793. The work of S.A. Savage, P.L. Mai, M.N. Frone, and P.P. Khincha was supported by the intramural research program of the Division of Cancer Epidemiology and Genetics, NCI. J. Bojadzieva and L.C. Strong were supported in part by the U.S. NIH through grant number P01CA34936. M.L. Ballinger was supported by the Cancer Institute NSW Career Development Fellowship 2018/CDF004. D.M. Thomas was supported in part by the Australian Government NHMRC Principal Research Fellowship 1104364, NHMRC project grant GNT1125042, and the Rainbows for Kate Foundation. C.I. Amos was supported by U.S. NCI grants 1U01CA196386, CA196386S1, and 1R01CA186566. C.I. Amos would also like to thank the CPRIT RR170048 grant and the U.S. NCI U19CA203654 and U19CA203654S1 grants for their additional funding. Y. Yuan was supported by the University of Texas MD Anderson grant CA016672. We thank Dr. Banu Arun and Jessica Ross for their helpful comments.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Li FP, Fraumeni JF. Rhabdomyosarcoma in children: epidemiologic study and identification of a familial cancer syndrome. J Natl Cancer Inst 1969;43:1365–73.
2. Li FP, Fraumeni JF. Soft-tissue sarcomas, breast cancer, and other neoplasms. A familial syndrome? Ann Intern Med 1969;71:747–52.
3. Strong LC, Stine M, Norsted TL. Cancer in survivors of childhood soft tissue sarcoma and their relatives. J Natl Cancer Inst 1987;79:1213–20.
4. Birch JM, Hartley AL, Tricker KJ, Presser J, Condie A, Kelsey AM, et al. Prevalence and diversity of constitutional mutations in the p53 gene among 21 Li-Fraumeni families. Cancer Res 1994;54:1298–304.
5. Birch JM, Heighway J, Teare MD, Kelsey AM, Hartley AL, Tricker KJ, et al. Linkage studies in a Li-Fraumeni family with increased expression of p53 protein but no germline mutation in p53. Br J Cancer 1994;70:1176–81.
6. Nichols KE, Malkin D, Garber JE, Fraumeni JF, Li FP. Germ-line p53 mutations predispose to a wide spectrum of early-onset cancers. Cancer Epidemiol Biomarkers Prev 2001;10:83–7.
7. Hwang S-J, Lozano G, Amos CI, Strong LC. Germline p53 mutations in a cohort with childhood sarcoma: sex differences in cancer risk. Am J Hum Genet 2003;72:975–83.
8. Hwang S-J, Cheng LS-C, Lozano G, Amos CI, Gu X, Strong LC. Lung cancer risk in germline p53 mutation carriers: association between an inherited cancer predisposition, cigarette smoking, and cancer risk. Hum Genet 2003;113:238–43.
9. Wu C-C, Strong LC, Shete S. Effects of measured susceptibility genes on cancer risk in family studies. Hum Genet 2010;127:349–57.
10. Shin SJ, Yuan Y, Strong LC, Bojadzieva J, Wang W. Bayesian semiparametric estimation of cancer-specific age-at-onset penetrance with application to Li-Fraumeni syndrome. J Am Stat Assoc 2019;114:541–52.
11. Mai PL, Best AF, Peters JA, DeCastro RM, Khincha PP, Loud JT, et al. Risks of first and subsequent cancers among TP53 mutation carriers in the National Cancer Institute Li-Fraumeni syndrome cohort. Cancer 2016;122:3673–81.
12. Strong LC, Williams WR. The genetic implications of long-term survival of childhood cancer. A conceptual framework. Am J Pediatr Hematol Oncol 1987;9:99–103.
13. Lustbader ED, Williams WR, Bondy ML, Strom S, Strong LC. Segregation analysis of cancer in families of childhood soft-tissue-sarcoma patients. Am J Hum Genet 1992;51:344–56.
14. Wu C-C, Shete S, Amos CI, Strong LC. Joint effects of germ-line p53 mutation and sex on cancer risk in Li-Fraumeni syndrome. Cancer Res 2006;66:8287–92.
15. Bondy ML, Lustbader ED, Strom SS, Strong LC. Segregation analysis of 159 soft tissue sarcoma kindreds: comparison of fixed and sequential sampling schemes. Genet Epidemiol 1992;9:291–304.
16. Mitchell G, Ballinger ML, Wong S, Hewitt C, James P, Young MA, et al. High frequency of germline TP53 mutations in a prospective adult-onset sarcoma cohort. PLoS One 2013;8:e69026.
17. Li FP, Fraumeni JFJ, Mulvihill JJ, Blattner WA, Dreyfus MG, Tucker MA, et al. A cancer family syndrome in 24 kindreds. Cancer Res 1988;48:5358–62.
18. Peng G, Bojadzieva J, Ballinger ML, Li J, Blackford AL, Mai PL, et al. Estimating TP53 mutation carrier probability in families with Li–Fraumeni syndrome using LFSPRO. Cancer Epidemiol Biomarkers Prev 2017;26:837–44.
19. Iversen ES Jr, Chen S. Population-calibrated gene characterization: estimating age at onset distributions associated with cancer genes. J Am Stat Assoc 2005;100:399–409.
20. Gonzalez KD, Buzin CH, Noltner KA, Gu D, Li W, Malkin D, et al. High frequency of de novo mutations in Li-Fraumeni syndrome. J Med Genet 2009;46:689–93.
21. Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez J-C, et al. Package “pROC”; 2019. Available from: https://cran.r-project.org/web/packages/pROC/pROC.pdf.
22. Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez J-C, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 2011;12:77.
23. Chompret A, Abel A, Stoppa-Lyonnet D, Brugieres L, Pages S, Feunteuns J, et al. Sensitivity and predictive value of criteria for p53 germline mutation screening. J Med Genet 2001;38:43–7.
24. Bougeard G, Renaux-Petel M, Flaman J-M, Charbonnier C, Fermey P, Belotti M, et al. Revisiting Li-Fraumeni syndrome from TP53 mutation carriers. J Clin Oncol 2015;33:2345–52.
25. Chen S, Wang W, Broman KW, Katki HA, Parmigiani G. Bayesmendel: an R environment for Mendelian risk prediction. Stat Appl Genet Mol Biol 2004;3:1–19.
26. Villani A, Shore A, Wasserman JD, Stephens D, Kim RH, Druker H, et al. Biochemical and imaging surveillance in germline TP53 mutation carriers with Li-Fraumeni syndrome: 11 year follow-up of a prospective observational study. Lancet Oncol 2016;17:1295–305.
27. Ballinger ML, Best A, Mai PL, Khincha PP, Loud JT, Peters JA, et al. Baseline surveillance in Li-Fraumeni syndrome using whole-body magnetic resonance imaging: a meta-analysis. JAMA Oncol 2017;3:1634–9.
28. Kratz CP, Achatz MI, Brugières L, Frebourg T, Garber JE, Greer M-LC, et al. Cancer screening recommendations for individuals with Li-Fraumeni syndrome. Clin Cancer Res 2017;23:e38.
29. Bojadzieva J, Amini B, Day SF, Jackson TL, Thomas PS, Willis BJ, et al. Whole body magnetic resonance imaging (WB-MRI) and brain MRI baseline surveillance in TP53 germline mutation carriers: experience from the Li-Fraumeni Syndrome Education and Early Detection (LEAD) clinic. Fam Cancer 2018;17:287–94.
30. O'Neill AF, Voss SD, Jagannathan JP, Kamihara J, Nibecker C, Itriago-Araujo E, et al. Screening with whole-body magnetic resonance imaging in pediatric subjects with Li–Fraumeni syndrome: a single institution pilot study. Pediatr Blood Cancer 2018;65:e26822.
31. Paixão D, Guimarães MD, de Andrade KC, Nóbrega AF, Chojniak R, Achatz MI. Whole-body magnetic resonance imaging of Li-Fraumeni syndrome patients: observations from a two rounds screening of Brazilian patients. Cancer Imaging 2018;18:27.
32. Walerych D, Napoli M, Collavin L, Del Sal G. The rebel angel: mutant p53 as the driving oncogene in breast cancer. Carcinogenesis 2012;33:2007–17.