Abstract
Normal-tissue adverse effects following radiotherapy are common and significantly affect quality of life. These effects cannot be accounted for by dosimetric, treatment, or demographic factors alone, and evidence suggests that common genetic variants are associated with radiotherapy adverse effects. The field of radiogenomics has evolved to identify such genetic risk factors. Radiogenomics has two goals: (i) to develop an assay to predict which patients with cancer are most likely to develop radiation injuries resulting from radiotherapy, and (ii) to obtain information about the molecular pathways responsible for radiation-induced normal-tissue toxicities. This review summarizes the history of the field and current research.
Significance: A single-nucleotide polymorphism–based predictive assay could be used, along with clinical and treatment factors, to estimate the risk that a patient with cancer will develop adverse effects from radiotherapy. Such an assay could be used to personalize therapy and improve quality of life for patients with cancer. Cancer Discov; 4(2); 155–65. ©2014 AACR.
Late Radiotherapy Adverse Effects in Cancer Treatment
Radiotherapy can be an excellent treatment option and is currently included, either as a primary therapy or as part of combination therapy, for approximately half of cancer treatment regimens worldwide (1). However, as with all cancer treatment options, some patients experience adverse treatment effects following radiotherapy that can last for years or even be permanent and can have a negative effect on quality of life. On the basis of the 2008 5-year prevalence estimate of 28.8 million cancers worldwide (2), if approximately half receive radiotherapy, this means nearly 15 million cancer survivors are at risk for late effects. The National Cancer Institute has recognized adverse treatment effects as an important survivorship issue that warrants increased research aimed at reducing burden of illness borne by cancer survivors and associated costs to the health care system (3).
Radiotherapy adverse effects can be early (occurring during or within weeks of treatment) or late (occurring 3 months to years following completion of radiotherapy). Although early effects tend to resolve within a few months of treatment, late effects can persist over months or years and in some cases remain a chronic problem for the remainder of the patient's life. Depending on the tumor site, severe adverse effects occur in 5% to 10% of individuals treated (e.g., refs. 4–6), whereas up to 50% of individuals experience less severe, but still bothersome, effects (e.g., refs. 7, 8).
Risk of late adverse effects is the dose-limiting factor for most radiotherapy protocols. Even though only a subset of the patient population will develop radiation injuries, because there is little information available to identify such individuals, standard protocols are designed using doses that minimize incidence of severe adverse effects based on all patients (Fig. 1). Because tolerance doses commonly used in radiotherapy are based primarily on the radiosensitive portion of the population, the majority of patients who will not go on to develop adverse effects may be undertreated. A predictive tool to identify radiosensitive patients, based on patient-specific factors such as genetics, would allow more personalized cancer treatment. More aggressive treatments could then be used in low-risk patients and possibly result in an increased rate of cure. Conversely, it may be beneficial for patients at high risk to receive a nonradiation treatment (if available) or a modified radiotherapy protocol that results in a lower dose to normal tissues that could improve the therapeutic outcome. Another consideration is that the cancers possessed by radiosensitive individuals may prove radiosensitive and potentially could be eradicated through use of a lower radiotherapy dose. A predictive assay to stratify patients in this way would improve the therapeutic index of radiotherapy.
Late adverse effects occur following treatment of many cancer types, and they have a range of effects on quality of life (9). Among women treated with radiotherapy for breast cancer, late effects include telangiectasia, edema, shrinkage, pigmentation changes, pain, and oversensitivity. Adverse effects experienced by individuals treated with radiotherapy for pelvic cancers (cervical, prostate, and colorectal) include genitourinary effects, gastrointestinal and rectal effects, as well as effects on sexual function. Individuals treated with radiotherapy for head and neck cancers often experience effects on swallowing, dry and/or sore mouth, changes in taste sensation, and tooth decay. Lung radiotherapy can result in development of lung pneumonitis/fibrosis or cardiac effects, which can be life-threatening. Although current radiogenomics studies focus mainly on prostate and breast cancers, as reviewed below, the field as a whole is interested in a variety of radiotherapy-induced normal-tissue toxicities, and studies are ongoing to investigate effects seen following treatment of head and neck, cervical, lung, and other cancer types.
For a given tumor site, multiple surrounding tissues can be affected by radiotherapy, and in some cases there may be multiple types of endpoints within a normal tissue or organ adjacent to the tumor and within the radiation field. For example, common late effects of radiotherapy for prostate cancer involve several tissue types leading to urinary symptoms, rectal symptoms, and sexual dysfunction. Within a given tissue, such as the rectum, a range of endpoints is seen including bleeding, incontinence, and pain. The pathogenesis underlying these tissue effects includes fibrosis, atrophy, neural and vascular damage, and endocrine disruption. The biologic basis for radiation damage has been reviewed previously (10–12).
Because the risk of late effects limits treatment efficacy, there is much interest in better understanding factors that cause some individuals to develop adverse effects following radiotherapy. There is a body of literature investigating correlation of dosimetric, clinical, and demographic factors with adverse treatment effects for cancers commonly treated with radiotherapy (13–19). Attempts have been made, with some success, to combine dose and volume parameters into normal-tissue complication probability (NTCP) models (20, 21). A recent series of publications on “Quantitative Analyses of Normal Tissue Effects in the Clinic” (QUANTEC) summarizes the current state of knowledge on available radiotherapy outcome data and reviews studies reporting predictors of normal-tissue adverse effects (22). The QUANTEC initiative also aimed to identify future avenues for research that would improve risk prediction, recognizing that there remains much patient-to-patient variability in the risk of developing adverse treatment effects, and predictive models have limited sensitivity and specificity in clinical practice.
Evidence of a Genetic Basis for Radiotherapy Adverse Effects
Even after accounting for dosimetric, treatment, clinical, and demographic factors, late-radiotherapy adverse effects show a large degree of interpatient variability in incidence and severity, suggesting that genetics plays a role. There are known genetic syndromes that predispose to radiation sensitivity. For example, mutations in the ATM gene result in ataxia-telangiectasia, a syndrome characterized by extreme radiosensitivity and increased risk for developing cancer (23). Other clinically relevant radiosensitivity syndromes result from rare mutations in genes that play central roles in DNA repair, such as NBS1 (Nijmegen breakage syndrome) and LIG4 (DNA ligase IV deficiency; for review, see refs. 24, 25). However, high-penetrance rare mutations do not explain the incidence of commonly seen adverse effects, and it has long been hypothesized that low-penetrance common genetic variants largely determine individual response to radiation. Taken together, the likely tens or hundreds of such variants could explain a large proportion of the interpatient variability in radiosensitivity.
Initial evidence in support of common genetic factors being responsible for interpatient variation in radiosensitivity was obtained through an examination of radiation-induced telangiectasia in patients with breast cancer (26). This study revealed substantial variation in development of telangiectasia for the same radiation treatment. A determination was reached that 80% to 90% of the variation was due to deterministic effects related to the existence of possibly genetic differences between individuals, whereas only 10% to 20% of the variation could be explained through stochastic events arising from the random nature of radiation-induced cell killing and random variations in dosimetry and dose delivery. Further evidence supporting a genetic basis for individual radiosensitivity is provided by studies showing that the rate of apoptosis in CD4 and CD8 T lymphocytes collected from patients undergoing radiotherapy can, to some extent, predict radiation-induced late toxicity seen in those same patients (27–29).
The Candidate Gene Approach to Identifying Genetic Predictors
Work toward identifying common genetic risk factors for radiotherapy adverse effects has been ongoing for more than 10 years with more than 60 publications to date. The main approach taken in these early studies was to select candidate genetic variants, mainly single-nucleotide polymorphisms (SNP), located within genes shown in cell culture and animal experiments to play a role in processes underlying the pathologic basis for radiation response. Such processes include DNA damage repair, inflammation, apoptosis, and growth signaling. SNPs within these genes were tested in germline DNA from radiotherapy patients for association with incidence of radiotherapy adverse effects. These studies have been recently reviewed (30–32).
To date, no genetic variants examined in candidate gene SNP studies have been definitively linked with radiotherapy adverse response. Of the significant SNP–phenotype associations reported, follow-up studies showed conflicting results, with some confirming association and others detecting no association. Some SNP–phenotype associations have not yet been followed up in validation cohorts to confirm the initial findings. Often, when the same SNP was studied in different patient cohorts, there were differences in treatment and clinical factors that were not adjusted for, and in some cases different adverse effect endpoints were analyzed, making it difficult to draw comparisons or conclusions between studies. Ethnicity is rarely reported in candidate gene studies, and genetic ancestry is unaccounted for, leading to the possibility that conflicting results across studies could be due to confounding by population stratification. Furthermore, despite the fact that most studies tested multiple SNPs, few reported corrected P values to account for multiple comparisons. Only a small number of published studies provided power calculations to describe the effect sizes they were capable of detecting given the population prevalence of the SNPs studied. This has led to a high likelihood of identification of false-positive associations. Furthermore, given the relatively narrow scope of genes and SNPs selected for study, there is also a high probability of false negatives—SNPs that are truly associated with radiotherapy adverse effects but were missed by the candidate gene approach.
Radiogenomics: Using Genome-Wide Association Studies to Identify Genetic Predictors of Clinical Radiosensitivity
Recognizing the limitations of the candidate gene approach, and coincident with advancements in genotyping technology, the field has shifted toward a broader, genome-wide approach to identify genetic predictors of radiotherapy adverse effects. This field of research, termed “radiogenomics,” parallels pharmacogenomics, whose goal is to identify genetic predictors of drug response (33). The shared goals of the radiogenomics research community, outlined in concurrent publications in the two leading radiation oncology journals (34, 35), are (i) to develop an assay capable of predicting which patients with cancer are most likely to develop radiation injuries resulting from treatment with a standard radiotherapy protocol, and (ii) to obtain information to assist with the elucidation of the molecular pathways responsible for radiation-induced normal-tissue toxicities.
The main approach used in radiogenomics is the genome-wide association study (GWAS). GWASs are studies of the association between SNPs (the independent variable) and a phenotype of interest (the dependent variable), which is adverse effects of radiotherapy in the case of radiogenomics. Study designs used in GWASs are generally the same as those used in traditional epidemiology, such as case–control, cohort, and nested case–control. However, there are some study design and analytic issues specific to GWASs that are not typically considered in nongenetic studies. These include the potential for confounding by population structure, genetic linkage between groups of SNPs tested independently, and the need to correct association test results for multiple comparisons, as hundreds of thousands up to several million SNPs are being tested for association with a single phenotype in a single study. GWASs harness the block-like structure of the human genome to survey nearly all common genetic variants in a cost-effective manner to test association with the phenotype of interest. In the context of GWASs, the term “common variants” generally refers to SNPs present in the population with a prevalence of at least 1%. Because the ultimate goal in radiogenomics is to develop a clinically useful screening assay, it is important that GWASs identify relatively common genetic risk factors, rather than rare mutations such as those found in ATM or other radiosensitivity syndrome genes. GWAS as a study design has been reviewed extensively elsewhere (36). Here, we discuss some aspects with particular relevance to radiogenomics.
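The multiple-comparisons correction mentioned above can be illustrated with a Bonferroni adjustment. This is one common and conservative choice, used here as an assumption; the text does not say which correction any given study applies. The function name and the figure of one million tests are illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P value threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Illustrative assumption: about one million effectively independent tests.
genome_wide = bonferroni_threshold(0.05, 1_000_000)
print(f"{genome_wide:.1e}")  # prints 5.0e-08
```

Under these assumptions the result coincides with the conventional genome-wide significance level of 5 × 10⁻⁸.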
Genotyping microarrays used in GWASs are designed to take advantage of linkage disequilibrium blocks, so that by genotyping a few hundred thousand SNPs, one can indirectly survey nearly all genetic variation in the genome. This makes GWASs quite cost-effective. However, it also means that the immediate results of a GWAS do not necessarily translate to easily understandable functional effects on a gene or the protein that gene encodes. Rather, SNP(s) identified through a GWAS likely tag upstream or downstream un-genotyped functional variant(s). In fact, most tag SNPs lie in noncoding parts of genes, intergenic regions, or even so-called gene deserts, which are large chunks of the genome that are distant from the next nearest gene. An example of this is the locus on chr8q24 that was initially found to be associated with risk of developing prostate cancer (37–42). The first sets of SNPs identified were several hundred thousand base pairs away from the nearest gene, the MYC oncogene, and none of them correlated with variants within MYC. It was not until subsequent fine-mapping studies that it became clear that this locus was linked to functional SNPs in multiple genes as well as upstream regulatory regions affecting MYC (43).
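How well a genotyped SNP tags an un-genotyped variant is usually summarized by the linkage disequilibrium statistic r². As a minimal sketch, assuming unphased genotypes coded as minor-allele counts (0, 1, or 2), r² can be approximated by the squared Pearson correlation of the two genotype vectors. The function name and the genotype data below are hypothetical.

```python
def genotype_r2(snp_a, snp_b):
    """Approximate LD r^2 as the squared Pearson correlation of allele counts."""
    n = len(snp_a)
    if n != len(snp_b) or n < 2:
        raise ValueError("need two genotype vectors of equal length (>= 2)")
    mean_a = sum(snp_a) / n
    mean_b = sum(snp_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(snp_a, snp_b))
    var_a = sum((a - mean_a) ** 2 for a in snp_a)
    var_b = sum((b - mean_b) ** 2 for b in snp_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


# Hypothetical allele counts for eight patients.
tag_snp = [0, 1, 2, 1, 0, 2, 1, 0]
causal_variant = [0, 1, 2, 1, 0, 2, 0, 0]
print(round(genotype_r2(tag_snp, causal_variant), 3))
```

A value near 1 means the tag SNP is a good proxy for the other variant; a value near 0 means an association at the tag SNP says little about it.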
Because GWASs survey the entire genome, it is often the case that identified SNPs tag functional variants in genes not previously known to affect the phenotype of interest. This may actually be advantageous for radiogenomics studies, as much of the biology underlying normal tissue and organ injuries following irradiation is poorly understood. Although the general cellular and molecular response to ionizing radiation is well characterized, tissue- and organ-specific effects are less well understood and likely represent more broad biology. However, it should be noted that for the purpose of meeting the first goal of radiogenomics, it may not be essential to identify the gene being tagged or to understand what functional effect the causal variant is exerting on the protein. Although identification of the genes involved and elucidation of the molecular pathways that result in normal-tissue toxicities would be very useful to gain insight into the biology behind tissue-specific radiosensitivity for the purpose of developing mitigating agents, identification of SNPs that are strongly associated with an adverse effect is sufficient to distinguish patients at risk.
Study Designs in Radiogenomics
Two main study designs have been used by radiogenomics GWASs: a two-stage approach and meta-analysis (Fig. 2). In the two-stage approach, a single cohort is split randomly into a stage I (“discovery”) group and a stage II (“replication”) group (Fig. 2A). The discovery group is genotyped for a set of hundreds of thousands to millions of SNPs spread throughout the genome using commercially available genome-wide arrays. These data are then analyzed for association with the radiotherapy adverse effect endpoint of interest, and top SNPs are selected for analysis in the replication group using either a custom SNP array or individual genotyping assays. The main advantage of the two-stage approach is that it is cost-effective, as only a small number of SNPs are genotyped in the replication cohort. Another advantage of the two-stage approach lies in a reduced multiple-comparisons penalty applied to the results of the replication phase. If only 1,000 SNPs are selected for follow-up in the replication cohort, a less stringent P value threshold can be used to distinguish true-positive associations from false-positive associations. The main disadvantage of the two-stage approach is that secondary analysis of the data is limited, because the only SNP data available for the replication group are for those SNPs that were selected specifically on the basis of their association with the primary endpoint in the discovery group. If one wanted to review the data to examine an additional radiotherapy adverse effect endpoint, or the same endpoint at a later follow-up period or assessed using a different case/control definition, these secondary analyses would be limited to the discovery group, for whom genome-wide SNP data are available.
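The two-stage logic described above can be sketched as follows. The SNP identifiers and P values are hypothetical, and a Bonferroni correction over the followed-up SNPs is assumed for the replication threshold; actual studies may use other selection rules or corrections.

```python
def select_top_snps(discovery_p, n_follow_up):
    """Stage I: keep the n_follow_up SNPs with the smallest discovery P values."""
    return sorted(discovery_p, key=discovery_p.get)[:n_follow_up]


def replicated_snps(replication_p, alpha=0.05):
    """Stage II: Bonferroni threshold based only on the SNPs followed up."""
    if not replication_p:
        return []
    threshold = alpha / len(replication_p)
    return sorted(snp for snp, p in replication_p.items() if p < threshold)


# Hypothetical discovery-stage P values.
discovery_p = {"snp_a": 1e-6, "snp_b": 0.2, "snp_c": 3e-5, "snp_d": 0.6}
top = select_top_snps(discovery_p, n_follow_up=2)

# Hypothetical replication-stage P values, available only for the selected SNPs.
replication_p = {"snp_a": 0.01, "snp_c": 0.4}
print(top, replicated_snps(replication_p))
```

With 1,000 follow-up SNPs, the same rule gives a replication threshold of 0.05/1,000 = 5 × 10⁻⁵, far less stringent than a genome-wide threshold.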
In the meta-analysis approach, two or more individual GWASs are conducted, often using SNP imputation to obtain results for a platform-independent set of SNPs, and the results of these separate studies are meta-analyzed (Fig. 2B). This approach takes advantage of existing datasets, and so this type of study can often be carried out with no additional genotyping costs after the initial studies are completed. Thus, this type of study design tends to be possible only after the primary GWAS results have been published. The main advantages of meta-analysis are that it allows for increased sample size and, therefore, increased statistical power, and that it is less sensitive to interstudy variability in treatment protocol, especially if a random-effects model is used. The main disadvantage of meta-analysis is that it requires the extra step of harmonizing adverse effect endpoints across studies that often use differing measurement tools. This issue is described in more detail below in the section “Challenges in Radiogenomics Studies.”
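To make the pooling step concrete, the sketch below combines per-study log odds ratios by inverse-variance weighting and also computes a random-effects estimate. The DerSimonian-Laird estimator of between-study variance is assumed here as one widely used option; the text does not name a specific estimator. Function and variable names, and the study values, are hypothetical.

```python
import math


def inverse_variance_meta(betas, ses):
    """Pool per-study effect estimates (log ORs) given their standard errors."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    k = len(betas)
    w = [1.0 / se ** 2 for se in ses]
    fixed = sum(wi * b for wi, b in zip(w, betas)) / sum(w)
    # Cochran's Q measures how far the studies scatter around the pooled value.
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # DerSimonian-Laird between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if (k > 1 and c > 0) else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    random = sum(wi * b for wi, b in zip(w_re, betas)) / sum(w_re)
    return {
        "fixed_beta": fixed,
        "fixed_se": math.sqrt(1.0 / sum(w)),
        "random_beta": random,
        "random_se": math.sqrt(1.0 / sum(w_re)),
        "tau2": tau2,
    }


# Hypothetical per-allele odds ratios and standard errors from three studies.
result = inverse_variance_meta(
    [math.log(1.4), math.log(1.2), math.log(1.6)], [0.15, 0.20, 0.25]
)
print(round(math.exp(result["fixed_beta"]), 2), round(math.exp(result["random_beta"]), 2))
```

When studies disagree more than chance would predict, the random-effects weights become more equal and the pooled standard error widens, which is why that model tolerates interstudy variability better.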
Statistical models used in radiogenomics are diverse and include linear and logistic regression as well as time-to-event analysis. This is because adverse radiotherapy effects may be characterized as binary, continuous, or ordinal outcomes, and are often assessed at multiple time points over a course of several years following treatment. One approach is to define patients as cases and controls by setting a cutoff point in the toxicity grading scale or symptom score. The cutoff point can allow for dichotomization of all patients, or it can be set such that only individuals at the extremes of the distribution are defined as cases or controls, and the intermediate group is excluded. For example, in assessment of radiotherapy adverse effects using the Common Terminology Criteria for Adverse Events (CTCAE) grading scale, individuals may be considered cases if they have grade 2 or worse, and those with grade 0 or 1 would be considered controls. Alternatively, one could treat the adverse effect grade as an ordered categorical, or ordinal, variable. This method allows all available information to be used and is often statistically more powerful than collapsing adverse effect grades into case/control groups, though it may be more sensitive to misclassification bias in cases where it is difficult to distinguish between severity grades. A third approach is to treat the adverse effect as a quantitative trait, leaving the outcome as a continuous measure. This can be done when the outcome is measured using multi-item symptom questionnaires such as the American Urological Association Symptom Score, which is a seven-item, 35-point questionnaire related to urinary symptoms commonly experienced following radiotherapy for prostate cancer (44). With all of these approaches, different time-frame constraints can be placed on follow-up, or a time-to-event analysis could be used.
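The two case/control definitions described above can be sketched as simple coding rules. The grade 2 cutoff follows the CTCAE example in the text; the cutoffs used for the extremes design (grade 0 as controls, grade 3 or worse as cases) are illustrative assumptions, as are the function names.

```python
def dichotomize(grade, case_min=2):
    """Every patient is classified: 1 = case (grade >= case_min), 0 = control."""
    return 1 if grade >= case_min else 0


def extremes_only(grade, control_max=0, case_min=3):
    """Only the tails are classified; intermediate grades return None (excluded)."""
    if grade >= case_min:
        return 1
    if grade <= control_max:
        return 0
    return None


# Hypothetical worst toxicity grades for six patients.
grades = [0, 1, 2, 3, 1, 4]
print([dichotomize(g) for g in grades])    # [0, 0, 1, 1, 0, 1]
print([extremes_only(g) for g in grades])  # [0, None, None, 1, None, 1]
```

The extremes design trades sample size for a cleaner contrast between groups, whereas keeping the raw grade as an ordinal outcome uses all patients and all grade information.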
Challenges in Radiogenomics Studies
Because of the complex nature of the adverse effects studied, and the fact that these effects occur specifically in response to an environmental exposure, radiogenomic studies are subject to a unique set of challenges (Box 1). These challenges are outlined below, with examples from published and ongoing studies.
Box 1. Challenges in radiogenomics studies: (i) need to account for baseline symptoms; (ii) effect modification by dosimetric variables; (iii) confounding by genetic ancestry and the “center-effect”; (iv) harmonization of adverse effect endpoint measurements; (v) long-term follow-up needed to capture late effects.
First, for some tumor sites, the commonly observed adverse effects overlap with symptoms sometimes seen in the given population that are not due to radiation exposure and thus are not specifically pathognomonic for radiation injury. Because of this, radiogenomics studies must often account for baseline symptoms. For example, in prostate cancer, patients often present with some level of baseline urinary symptoms or erectile dysfunction due to the impact of the tumor on surrounding normal tissues, benign prostatic hyperplasia, or other processes associated with aging. As the goal is to identify SNPs that are associated with radiation-induced damage to these tissue sites, it is important that baseline symptoms are accounted for in SNP association tests. Investigators account for baseline symptoms either by subtracting pretreatment symptom scores from posttreatment scores, excluding patients with poor baseline function, and/or adjusting for baseline function in multivariable analyses.
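Two of the baseline-handling strategies listed above, subtracting pretreatment scores and excluding patients with poor baseline function, can be sketched in a few lines. The exclusion cutoff and the symptom scores are hypothetical; the third strategy (adjusting for baseline in a multivariable model) is not shown.

```python
def baseline_adjusted_change(pre, post, max_baseline=None):
    """Return post - pre, or None if the patient is excluded for poor baseline.

    max_baseline is a hypothetical exclusion cutoff on the pretreatment score;
    leave it as None to keep every patient.
    """
    if max_baseline is not None and pre > max_baseline:
        return None
    return post - pre


# Hypothetical (pretreatment, posttreatment) urinary symptom scores on a 0-35 scale.
patients = [(3, 12), (20, 24), (8, 7)]
changes = [baseline_adjusted_change(pre, post, max_baseline=15) for pre, post in patients]
print(changes)  # [9, None, -1]
```

The second patient has the highest posttreatment score but is excluded here because most of that burden was already present before treatment.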
A second challenge in radiogenomics is variability in radiotherapy protocols, which itself leads to variability in incidence and severity of adverse effects. Dose, volume, radiation type, and delivery method are likely to be important effect modifiers in SNP association with adverse effects following radiotherapy. It is important that detailed treatment and dosimetric data are available for patients included in radiogenomics studies so that these factors can be investigated and, if necessary, accounted for in SNP association tests. This information is critical for the determination of whether SNPs significantly associated with adverse effect endpoints are associated independently of treatment factors. Adjusting for, or stratifying by, such treatment factors allows for a more accurate estimate of SNP effect sizes. Fortunately, in GWAS, the effects of dose and treatment protocol are limited to effect modification and are not confounding. By definition, a confounder must be associated with both the exposure (i.e., SNP) and outcome (i.e., adverse effect). Although dosimetric factors clearly affect incidence of adverse effects, they cannot affect SNP genotype, and their impact on SNP–outcome association is thus limited to modification.
A third challenge, and a potentially significant source of confounding in radiogenomics studies, comes from the so-called “center-effect,” where differences in genetic ancestry and differences in treatment protocol, covariates, or outcome measure cosegregate by study site (45). Because radiogenomics will rely increasingly on collaborative studies and pooled datasets, confounding by genetic ancestry is a real issue. Many of the previously published candidate gene SNP studies did not account for genetic ancestry differences across sites when attempting to replicate previous findings, and this likely contributed to some of the conflicting results in the literature. It will be important, in the GWAS era, to ensure that possible center effects are explored and dealt with by stratifying, or adjusting for, study site, or by conducting a meta-analysis with checks for between-study heterogeneity (46).
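A small numeric sketch can show how a center effect creates a spurious association and how stratifying by study site removes it. The counts below are invented so that carrier frequency and toxicity incidence both differ by site while the SNP has no effect within either site. The Mantel-Haenszel estimator is used as one standard way to pool across strata; it is an assumption here, not a method named in the text.

```python
def odds_ratio(a, b, c, d):
    """OR from a 2x2 table: a, b = carrier cases, controls; c, d = noncarrier cases, controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Pooled OR across strata (sites), each given as an (a, b, c, d) tuple."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


# Hypothetical sites: site A has more carriers and more toxicity than site B.
site_a = (40, 60, 20, 30)
site_b = (5, 45, 20, 180)

pooled_counts = tuple(x + y for x, y in zip(site_a, site_b))
crude = odds_ratio(*pooled_counts)
stratified = mantel_haenszel_or([site_a, site_b])
print(f"crude {crude:.2f}, stratified {stratified:.2f}")  # crude 2.25, stratified 1.00
```

Naively pooling the two sites suggests that carriers have more than twice the odds of toxicity, although the within-site odds ratio is exactly 1 at both centers.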
A fourth challenge lies in handling the various measurement systems and follow-up schedules used to assess adverse effects. There are several commonly used adverse effect measurement systems, including CTCAE, LENT-SOMA, etc. There are also institution-specific questionnaires used only by single study sites. Each of these tools has a different scale. Some tools are patient-reported, whereas others are physician-assigned. Some tools measure a single endpoint, such as telangiectasia, whereas others measure adverse effects on a whole-tissue basis, such as skin toxicity. Each separate measurement tool lends itself to a different type of statistical analysis. Some studies use a set time point, for example 2 years, to assess toxicity, whereas others take the maximum/worst score out of a block of time, conduct time-to-first-event analysis, or test multiple time points. The lack of uniform measurement and reporting of radiotherapy adverse effects makes it difficult to draw comparisons across studies, and, going forward, presents a challenge to investigators attempting to combine cohorts from different institutions or make generalizations for single-institution studies.
Finally, a fifth challenge lies in the fact that, because radiogenomics aims primarily to identify predictors of late adverse effects, long-term follow-up is needed. By definition, late effects occur after a minimum of 3 to 6 months postradiotherapy, but in practice, many effects do not manifest until several years after treatment. Ideally, radiogenomics cohorts require follow-up for 5 years or longer to ensure that patients are adequately assessed for incidence of adverse effects and to minimize misclassification bias introduced by including patients as “controls” who have not been followed for an adequate amount of time needed for adverse effects to manifest.
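The follow-up requirement can be expressed as a classification rule: a patient without an observed late effect counts as a control only after sufficient follow-up. The sketch below assumes the 5-year minimum suggested in the text; the function name and patient records are hypothetical.

```python
def classify_for_late_effects(had_event, follow_up_years, min_follow_up_years=5.0):
    """Return 'case', 'control', or None when follow-up is too short to call a control."""
    if had_event:
        return "case"
    if follow_up_years >= min_follow_up_years:
        return "control"
    return None


# Hypothetical records: (late effect observed, years of follow-up).
records = [(True, 2.5), (False, 6.0), (False, 1.5)]
print([classify_for_late_effects(e, y) for e, y in records])  # ['case', 'control', None]
```

Treating the third patient as a control would risk misclassification, because a late effect may still appear after 1.5 years.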
The International Radiogenomics Consortium
Despite the challenges faced in designing radiogenomics studies, it will be necessary for the field to move forward through a collaborative effort. This is the only way to build large cohorts of patients, pooled from multiple institutions, to conduct well-powered GWASs. For example, to study radiation proctitis following treatment of prostate cancer, assuming an incidence of 10% for a particular adverse effect, approximately 500 case subjects would be required (out of 5,000 patients total) to achieve 80% power to identify an SNP with a minor allele frequency (MAF) of 20% and a per-allele effect size of 1.5 based on a genome-wide significance level of P = 5.0 × 10⁻⁸ (ref. 47; Fig. 3A, pink line). If some SNPs have larger effect sizes, as is the case for many SNPs associated with drug response, smaller sample sizes would be sufficient, at least for that subset of SNPs (Fig. 3A, red line). Also, if one assumes that not just one but several, possibly hundreds of such SNPs exist, the power to detect any one out of many SNPs is higher. Still, sample sizes in the thousands will be needed to carry out high-quality, comprehensive radiogenomics studies, and these numbers cannot be obtained by single treatment centers. This means that radiogenomics must rely heavily on collaborative work with cohorts consisting of samples pooled from more than one institution or study site. Meta-analysis of individual GWASs can also substantially boost power to identify genome-wide significant loci (Fig. 3B).
In 2009, leaders in the field formed the international Radiogenomics Consortium (RGC) to foster collaboration and encourage investigators to pool resources for increased statistical power (34, 35). The RGC is a National Cancer Institute–supported Cancer Epidemiology Consortium (48). To date, the RGC is represented by more than 150 investigators at 80 institutions in 19 countries, and it is open to any investigator interested in radiogenomics research. Ongoing work includes studies aimed at identifying genetic predictors of radiotherapy adverse effects for nearly every cancer type, including breast, prostate, lung, gynecologic, and head and neck cancers. The pooled resources of RGC members have been used to conduct GWASs (49–52), to validate candidate gene SNP associations (53–56), and to develop new analytic methods (57). The following section reviews these published radiogenomics studies.
Published Radiogenomics Studies
Validation of Candidate Gene SNPs
Two articles were recently published that represent a collaborative effort by RGC members to definitively test whether previously reported candidate SNPs are in fact significantly associated with radiotherapy adverse effects. The first study aimed to determine definitively whether the commonly studied SNP rs1800469 in the TGF-β gene (TGFB1), which encodes a profibrotic cytokine, is associated with overall late toxicity following radiotherapy for breast cancer (54). DNA from 2,782 participants from 11 cohorts was tested for association between rs1800469 and overall toxicity as well as breast fibrosis specifically. This study obtained an OR of 0.98 with a 95% confidence interval of 0.85–1.11, which the authors concluded was sufficiently narrow to rule out any clinically relevant effect on toxicity risk associated with this SNP. Importantly, because a meta-analysis approach was used, this study was less prone to bias introduced by between-study variability in adverse effect grading scales or radiotherapy treatment protocols.
The second study aimed to validate 92 previously studied SNPs in 46 candidate genes in a large, independent cohort of patients enrolled in the RAPPER trial (53, 58). The study included patients with both breast and prostate cancer, and the endpoints investigated were both tissue-specific (such as breast fibrosis; urinary frequency) and overall toxicity. Where appropriate, baseline symptoms were accounted for upon assigning an adverse effect score. None of the previously studied SNPs was found to be significantly associated with any endpoints after correction for multiple comparisons. This is an exemplary study due, in part, to the high statistical power it possessed to detect clinically meaningful effects. The study included 1,613 patients (treated for breast or prostate cancer) yielding 99% power to detect an SNP with MAF of 0.35 associated with a per-allele OR of 2.2 for overall toxicity. It is possible that some of these SNPs may have a smaller effect size, which would still be of interest if included in polygenic models, but this large study ruled out, with very high probability, the possibility that any one of these SNPs alone confers high risk of developing adverse effects.
Although these publications report well-designed validation studies, they do not eliminate the possibility that some radiation response pathways play a role in clinical radiosensitivity. Previously studied candidate genes may eventually prove to contain SNPs predictive of radiotherapy adverse effects, perhaps with smaller effect sizes than the current studies have been powered to detect or possibly via different genetic variants that have not yet been captured. Indeed, a major advantage of GWASs is that large amounts of data are generated that can be reanalyzed for a subset of genes in such candidate pathways.
Development of Radiogenomics Analytic Methods
Methods articles are beginning to emerge that aim to develop analytic approaches and tools that address the challenges faced by radiogenomics studies, as described above. In a recent publication, RGC investigators collaborated to develop a scale- and grade-independent measure of overall toxicity (termed STAT, for Standardized Total Average Toxicity; ref. 57). The authors explain that the purpose of the STAT score is fourfold: (i) to obtain a measure of overall toxicity in instances where multiple adverse effects are experienced in one tissue site, (ii) to create a scale-independent measure of toxicity for the purpose of pooling samples from different study sites that use different scoring systems, (iii) to deal with missing data in patient datasets, and (iv) to aid in controlling for confounding factors that are not uniformly present across all datasets included in the analysis. To address these four issues, the STAT score is computed by first calculating a standardized Z score for each adverse effect for each patient. The multiple standardized Z scores are then averaged to obtain a standardized score representing all endpoints of interest. Because each endpoint is standardized before averaging, STAT eliminates the problem of, for example, urinary morbidity being graded on a 0-to-4–point scale but erectile dysfunction graded on a 25-point scale. This would also address the issue of the same endpoint being graded differently between studies.
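The two-step calculation described above (standardize each endpoint, then average per patient) can be illustrated with a minimal Python sketch. This is not the published STAT implementation: the function name `stat_scores`, the use of the population standard deviation, and the choice to average over whichever endpoints a patient has available are all assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional


def stat_scores(grades: Dict[str, List[Optional[float]]]) -> List[Optional[float]]:
    """Return one STAT-like score per patient.

    `grades` maps an endpoint name to a list of per-patient grades,
    with None marking a missing value (hypothetical input layout).
    """
    n_patients = len(next(iter(grades.values())))
    z_by_endpoint = {}
    for endpoint, values in grades.items():
        observed = [v for v in values if v is not None]
        # Assumption: standardize against patients with observed data only.
        sd = pstdev(observed) if observed else 0.0
        if sd == 0:
            # An endpoint with no variation carries no information here.
            z_by_endpoint[endpoint] = [None] * len(values)
            continue
        mu = mean(observed)
        z_by_endpoint[endpoint] = [
            None if v is None else (v - mu) / sd for v in values
        ]

    scores: List[Optional[float]] = []
    for i in range(n_patients):
        # Assumption: average over the endpoints available for this patient.
        zs = [z[i] for z in z_by_endpoint.values() if z[i] is not None]
        scores.append(mean(zs) if zs else None)
    return scores


# Hypothetical grades: one endpoint on a 0-to-4 scale, one on a 25-point scale.
example = stat_scores({
    "urinary_morbidity": [0, 2, 4],
    "erectile_dysfunction": [0, None, 25],
})
```

Because each endpoint is converted to Z scores before averaging, multiplying any one endpoint's grades by a constant leaves the resulting scores unchanged in this sketch, which is the scale-independence property the authors describe.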
When tested in a cohort of breast cancer patients who participated in the Cambridge IMRT trial (59, 60), the STAT score correlated well with factors known to be associated with one or more adverse treatment effects, including breast volume, smoking status, acute toxicity, and volume of irradiated tissue (57). The authors also used a “leave-one-out” analysis to show that residuals analysis from the STAT score calculated using all individual adverse effect endpoints was highly correlated with STAT scores calculated after omitting each endpoint one at a time. This lends support to the idea that the STAT score can be used as a measure of toxicity in multiple studies that do not each have data on the exact same endpoints. The authors then showed that modification of the scales for each individual endpoint had minimal effect on the association between STAT score and known predictors of toxicity, supporting the claim that STAT can be used to harmonize endpoints across studies that used differing grading scales. Finally, they confirmed that association between STAT and known predictors of toxicity is similar when all patients are included in the analysis and when patients with missing data are excluded, supporting the claim that the STAT score is able to properly address the problem of missing data.
Genome-Wide Association Studies
Publications are beginning to emerge from radiogenomics GWASs, though many of the large, collaborative efforts are still under way. A PubMed search for (“radiotherapy” OR “radiation”) AND (“genome-wide association study” OR “gwas”) AND “humans” produced 75 publications, of which just nine were found to be primary reports of GWASs of adverse effects of radiotherapy. Excluded studies involved GWASs of survival or other treatment endpoints among radiotherapy cohorts, gene expression studies, candidate SNP studies, studies of environmental UV radiation exposure, and review articles. Among the nine radiogenomics GWAS publications, three report on studies of second malignancies following exposure to radiation (61–63), one reports on a GWAS of acute toxicity (64), and one reports on cellular death in response to radiation (65). The four published GWASs of late effects are reviewed here. All four studies focus on late effects in patients with prostate cancer, due in part to availability of relatively large cohorts. GWASs of adverse effects of radiotherapy for breast cancer, head and neck cancer, and lung cancer are currently in progress (RGC Investigators; personal communication).
The first radiogenomics GWAS was published in 2010 and aimed to identify SNPs associated with development of erectile dysfunction among a small cohort (N = 79) of African American men treated with external beam radiation therapy (EBRT) for prostate cancer (49). One SNP was identified with an association P value that reached genome-wide significance (P = 5.5 × 10⁻⁸), and several others were identified that were suggestive of significance (P < 10⁻⁶). Although this study was only a discovery GWAS, and the SNPs identified must be replicated in additional cohorts, it is important for several reasons. First, the top SNP is interesting because it tags a locus within the FSHR gene, which encodes the follicle-stimulating hormone receptor involved in gonad development and function (66, 67). Rather than identifying genes involved in the pathways that affect cellular radiosensitivity, this study identified a gene that is involved in the normal function of the tissue affected. Although this does not mean that the general radiation response pathways are not important in radiotherapy adverse effects, it suggests that other, tissue-specific pathways may also be important. This finding highlights the benefits of using a GWAS approach, which does not rely on a priori assumptions about the underlying biology of the phenotype of interest.
A second GWAS examining erectile dysfunction following radiotherapy for prostate cancer (brachytherapy and/or EBRT) was recently published (50). This study included a larger sample size (N = 593) and used a two-stage, nested case–control design with erectile dysfunction case/control status as the phenotype. The blood samples from patients included in this study were part of the Gene-PARE biorepository (68), and this study was a collaborative effort across multiple institutions. A total of 25 SNPs were identified that had low P values, with effects of similar magnitude and in the same direction in both discovery and replication cohorts, though none of these SNPs reached genome-wide significance. A logistic regression model including the set of 12 most robustly associated SNPs produced a receiver operating characteristic curve with an area-under-the-curve (AUC) of approximately 0.8 in two independent test cohorts, though these cohorts were too small to serve as independent replication studies. An interesting aspect of this GWAS is that, similar to the previous GWAS of erectile dysfunction, one of the most strongly associated SNPs lies within a gene involved in sexual function, rather than one of the known radiation response pathways. The SNP rs11648233 resides in the 17-beta-hydroxysteroid dehydrogenase II gene (HSD17B2), which functions in the pathway that produces and regulates testosterone level (69). Other SNPs in HSD17B2 have been found to be associated with testosterone level in men with localized prostate cancer (70). As in the previous GWAS of erectile dysfunction following radiotherapy, this SNP would have been missed if a candidate gene approach had been used.
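The AUC quoted above has a simple interpretation: it is the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. The Python sketch below computes it that way for a hypothetical additive SNP risk score. The weights and genotypes are invented for illustration and are not taken from the study, and a weighted allele count stands in for the fitted logistic regression model.

```python
from typing import Sequence


def risk_score(genotype: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele counts (0, 1, or 2 per SNP)."""
    return sum(w * g for w, g in zip(weights, genotype))


def roc_auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as the fraction of case/control pairs ranked correctly."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical per-SNP weights (e.g., log odds ratios) and genotypes.
weights = [0.5, 0.3, 0.2]
cases = [risk_score(g, weights) for g in ([2, 1, 1], [1, 2, 0], [2, 0, 2])]
controls = [risk_score(g, weights) for g in ([0, 1, 0], [1, 0, 0], [0, 0, 1])]
auc = roc_auc(cases, controls)
```

An AUC of 0.5 corresponds to a score that ranks cases and controls no better than chance, and 1.0 to perfect separation.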
A third publication reports on a GWAS to identify SNPs associated with the development of urinary symptoms following radiotherapy for prostate cancer, and was carried out in the same cohort included in the erectile dysfunction GWAS (51). Similar to the erectile dysfunction GWAS, this study used a two-stage design. Rather than using a nested case–control design, however, it treated urinary symptoms as a continuous variable and included the full patient cohort with complete data available (N = 723). In this study, urinary symptoms were measured using the American Urological Association Symptom Score, with baseline score subtracted from the posttreatment score. This GWAS identified a set of eight SNPs tagging a single haplotype block on chromosome 9p21.2. The most strongly associated SNP in this block, rs17779457, had a combined P value of 6.5 × 10⁻⁷ and lies just upstream of the IFN-κ (IFNK) gene. This gene is involved in inflammation response to radiation exposure (71–73), though it has not been previously investigated in candidate gene SNP studies. Interestingly, another SNP, rs13035033, which was only marginally associated with overall urinary symptoms (P = 1.2 × 10⁻⁵), was associated with urinary straining at genome-wide significance (P = 5.0 × 10⁻⁹). This finding lends support to the hypothesis of multiple genetic risk factors for different types of adverse effects, even in the same tissue type.
The fourth article reports on a radiogenomics GWAS of rectal bleeding following radiotherapy for prostate cancer (52). This study also includes the set of patients involved in the GWAS of erectile dysfunction and urinary symptoms, but because rectal bleeding is a rarer outcome, all of these patients were included in the discovery stage, and an independent cohort pooled from several study sites was used as a replication group. To control for the center effect, SNP association tests conducted in the replication stage were adjusted for study site. This study identified one locus on chromosome 11q14.3 containing two SNPs in strong linkage disequilibrium. The most strongly associated SNP, rs7120482, had a combined P value of 5.4 × 10⁻⁸. This SNP lies in a so-called “gene desert” in between MTNR1B and SLC36A4. Another SNP identified in this study, rs4904509, is located just upstream of FOXN3, which encodes a DNA-damage checkpoint suppressor protein (74). Although this was not the strongest candidate from this GWAS, if replicated in additional studies, it would support the idea that there exist genetic factors associated with radiation-induced injuries that are related to general radiation response, as was originally hypothesized in the candidate gene studies, as well as tissue-specific genetic factors associated with radiosensitivity.
The most recent radiogenomics GWAS represents a multi-institutional effort to identify SNPs associated with adverse effects following radiotherapy for either prostate or breast cancer (personal communication). This GWAS, from the UK RAPPER study, examined a variety of individual toxicity endpoints as well as overall toxicity at 2 years following radiotherapy. This study is the largest GWAS to date, with 1,217 patients who received adjuvant breast radiotherapy and 633 patients who received radical prostate radiotherapy (EBRT). Top SNPs from this discovery study were tested in three independent cohorts of patients (N = 1,378 prostate; N = 355 breast). The results of this study will be important, as this is the first radiogenomics GWAS to focus on breast cancer patients. The RAPPER GWAS and the GWAS of rectal bleeding are important in that they include independent patient cohorts to test the SNPs initially selected from a discovery GWAS. Although it is challenging to obtain independent test cohorts in radiogenomics due to the requirements of detailed radiotherapy treatment and follow-up data, doing so is nevertheless important to reduce the chance of identifying false-positive SNP associations.
A Clinical Assay to Identify Patients at Risk of Developing Adverse Effects from Radiotherapy
Radiogenomics, like pharmacogenomics, is a promising area of research in the broader field of precision medicine because the results of radiogenomics studies are potentially actionable. Risk of adverse effects on surrounding normal tissues is dose limiting, and identification of high-risk individuals could allow increases in dose for the remainder of the patient population, thereby improving the therapeutic index. Therefore, the ultimate goal of radiogenomics is to affect the step in cancer care where treatment decisions are being formulated. A clinical assay to classify a patient's risk of adverse effects based on genetic information could guide the decision-making process at this point. Although radiogenomics studies are clearly still in the early stages, preliminary results suggest that, as in pharmacogenomics, effect sizes may be larger than those typically seen in GWASs of complex traits such as type 2 diabetes or cardiovascular disease risk. Large effect sizes for risk SNPs identified in pharmacogenomics studies have helped to quickly advance the transition from bench to bedside (75), and the hope is that the same course can be followed in radiogenomics.
Radiogenomics investigators envision an SNP-based assay to be used as a complementary tool that could be incorporated into existing risk prediction models already used in radiation oncology. Some work toward designing such a model has already begun. An early article by Cesaretti and colleagues (76) reported that combining information on ATM sequence alterations (SNPs and rare variants) with rectal dose resulted in improved stratification of patients based on incidence of rectal bleeding. In a more recent article, Tucker and colleagues (77) incorporated SNP information from the candidate genes TGFβ, VEGF, TNFα, XRCC1, and APEX1 into the dosimetry-based Lyman NTCP model, and showed an improved predictive ability of the new model. These studies were both carried out using relatively small numbers of subjects treated at a single institution, but nevertheless provide important examples of how genetic information can be combined with existing risk factors. Going forward, it will be important to identify SNPs that have been replicated and validated in large, diverse cohorts before predictive models can be developed that are generalizable to the broader patient population.
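To make the idea of combining genotype with a dosimetric model concrete, the following Python sketch evaluates the standard Lyman normal-tissue complication probability, NTCP = Φ((D_eff − TD50)/(m × TD50)), where Φ is the standard normal cumulative distribution function. The genetic term is an assumption made purely for illustration: each risk allele is treated as a multiplicative dose-modifying factor on the effective dose. The parameter names and values are hypothetical and are not taken from references 76 or 77.

```python
import math


def normal_cdf(t: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def lyman_ntcp(
    d_eff: float,
    td50: float,
    m: float,
    risk_alleles: int = 0,
    dmf_per_allele: float = 1.0,
) -> float:
    """Lyman NTCP with an illustrative genetic dose-modifying factor.

    d_eff: effective (uniform-equivalent) dose to the organ, in Gy
    td50: dose giving a 50% complication probability, in Gy
    m: dimensionless slope parameter of the dose-response curve
    risk_alleles, dmf_per_allele: hypothetical genetic modifier; each
        risk allele scales the effective dose by dmf_per_allele.
    """
    modified_dose = d_eff * dmf_per_allele ** risk_alleles
    t = (modified_dose - td50) / (m * td50)
    return normal_cdf(t)


# Same dose plan, evaluated for a non-carrier and a hypothetical carrier.
baseline = lyman_ntcp(d_eff=20.0, td50=30.0, m=0.4)
carrier = lyman_ntcp(d_eff=20.0, td50=30.0, m=0.4,
                     risk_alleles=1, dmf_per_allele=1.2)
```

With a dose-modifying factor greater than 1, a carrier is assigned a higher predicted complication probability than a non-carrier at the same physical dose, which is one way genotype could shift a patient between risk strata in a model of this kind.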
The ultimate goal of radiogenomics is to aid clinicians and patients in personalizing and optimizing therapy. On the basis of the predicted probability that a given patient would develop adverse effects from radiotherapy, balanced against the disease prognosis, a decision could be made to opt for surgery, if possible, or to modify the radiotherapy protocol to use a lower dose, different fractionation schedule, more conformal therapy, or a different type of radiation source such as protons (Fig. 4). For some early-stage or low-risk cancers, such as early-stage prostate cancer, the adverse treatment effect profile may outweigh the predicted benefit of treatment and a patient may choose active surveillance. It should be noted that the risk profile for a given patient may be complex. The initial results of the first few GWASs suggest that different sets of SNPs may predict different adverse effects. If this proves correct, then undoubtedly some patients will be at high risk for one complication, for example urinary morbidity, but at low risk for another complication, such as rectal bleeding. The patient would then need to consider this complex risk profile to arrive at the best decision with his or her doctor. The goal becomes achieving the greatest possible efficacy balanced against minimizing toxicity.
Future Directions in Radiogenomics
Success in reaching the goals of radiogenomics will require large-scale, collaborative GWASs, development of robust, accurate predictive models, and cooperation with clinicians who will be the end-users of SNP-based predictive assays. Following the success of GWASs of other complex phenotypes, radiogenomics GWASs should be designed with adequate sample sizes, well-defined and harmonized phenotypes, and rigorous statistical methods. It will be important to follow up on SNP associations from GWASs in independent test cohorts. This will prevent the field from succumbing to the well-known “winner's curse” of reporting many false-positive results with no follow-up studies to distinguish the true-positive SNPs that can be used in predictive models. It will also be important to include ethnically diverse cohorts so that clinical assays can include SNPs relevant to the full spectrum of patient populations in need. Finally, as the field begins to develop predictive risk models, it will be critical to bridge the gap between research and clinical practice, and include all stakeholders (researchers, clinicians, and patients) in studies of acceptability of genetic testing and clinical decision making. These steps should point the field toward positively affecting quality of life for millions of cancer survivors.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: S.L. Kerns, H. Ostrer, B.S. Rosenstein
Development of methodology: H. Ostrer, B.S. Rosenstein
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): H. Ostrer, B.S. Rosenstein
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): H. Ostrer, B.S. Rosenstein
Writing, review, and/or revision of the manuscript: S.L. Kerns, H. Ostrer, B.S. Rosenstein
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): S.L. Kerns, H. Ostrer, B.S. Rosenstein
Study supervision: H. Ostrer, B.S. Rosenstein
Grant Support
This research was supported by grants RSGT-05-200-01-CCE from the American Cancer Society (to B.S. Rosenstein), PC074201 from the Department of Defense, and 1R01CA134444 from the NIH (to B.S. Rosenstein and H. Ostrer).