Traditional endpoints such as progression-free survival and overall survival do not fully capture the pharmacologic and pharmacodynamic effects of a therapeutic intervention. Incorporating mechanism-driven biomarkers and validated surrogate proximal endpoints can provide orthogonal readouts of anti-tumor activity and delineate the relative contribution of treatment components on an individual level, highlighting the limitation of solely relying on aggregated readouts from clinical trials to facilitate go/no-go decisions for precision therapies.
INTRODUCTION
Randomized phase III trials are the current gold-standard instruments used to establish whether a novel therapeutic approach is superior to standard of care as a control arm. To demonstrate improvement, differences in outcomes should be clinically meaningful and statistically significant, and are often captured in imaging-based endpoints such as objective response rate (ORR) and progression-free survival (PFS), as well as in overall survival (OS). Clinical trial endpoints provide objective measures that reflect the safety and efficacy of therapeutic interventions at each phase of drug development in all disease areas. Importantly, they also represent checkpoints for go/no-go decisions, ultimately with the aim of identifying interventions that prolong OS and/or improve quality of life (QOL) across two aggregated cohorts. Once a predefined threshold for a clinically meaningful magnitude of benefit of a specific intervention is demonstrated, regulatory approvals and payer reimbursement can follow, thereafter enabling widespread implementation in the broader community.
Yet the traditional phase III paradigm presents challenges as we attempt to leverage recent advances in precision oncology. Routine application of next-generation sequencing (NGS) in clinical practice, advances in multiomics tumor profiling, and high-throughput functional screens now depict comprehensive molecular portraits of individual cancers and their major hallmarks, and enable us to forecast evolutionary trajectories. Coupled with enhanced drug-discovery platforms and the emerging role of synthetic biology, we now have a rich therapeutic toolbox of next-generation targeted therapies, engineered cell therapies, immunomodulatory approaches, and antibody–drug conjugates with the potential to address key therapeutic vulnerabilities in individual cancers.
With the number of new drug approvals by the FDA at its highest in recent years, the burgeoning number of effective therapeutic options has widened the gap between the initial effects of a specific intervention at one point in the patient's treatment continuum and traditional endpoints such as OS (1). Here, we discuss the limitations of current traditional endpoints and phase III trial designs, and how novel methods and technologies, e.g., circulating tumor DNA (ctDNA) evaluated longitudinally in real time, can provide intermediate endpoints that complement radiologic imaging, allowing for more direct inference of treatment effect and clonal evolution (2). We further highlight emerging solutions to facilitate drug development, including the use of high-resolution technologies to derive trial endpoints and the use of knowledge banks and real-world data to generate synthetic controls and digital twins.
LIMITATIONS WITH THE CURRENT RELIANCE ON RANDOMIZED PHASE III TRIALS AND OS AS AN ENDPOINT
There remains equipoise with regard to accelerated drug approvals and reimbursement in the absence of mature randomized phase III data. Although anticipated clinical benefits from single-arm phase II trials may not always be recapitulated in extended cohorts and with longer follow-up, there have also been concerns over delayed access to novel therapeutics, especially those addressing an area of high unmet medical need. Furthermore, interpreting OS as a direct result of a specific therapy is fraught with challenges. First, given the growing number of new modalities and therapies, OS as an endpoint is increasingly confounded by cross-over or subsequent treatments, making it exceedingly difficult to attribute clinical benefit, or the lack thereof, to the effect of a specific intervention (Fig. 1). As precision therapies are advanced to the adjuvant and neoadjuvant settings, interpreting OS will require due consideration of the impact of prior and subsequent therapies (including advances in treatment of metastatic disease).
Figure 1. Precision endpoints for contemporary clinical trials. Phase III clinical trial designs utilize traditional clinical endpoints in trial participants selected based on broad eligibility criteria. Despite randomization, differences in underlying biology may still lead to biases and reduced effect size. Contemporary precision oncology trials can incorporate deep molecular profiling from tissue and liquid biopsies that are prognostic and predictive for improved patient selection as well as progressive intermediate endpoints with early pharmacodynamic readouts to complement imaging for improved evaluation of drug effects. Synthesizing data sets across real-world registries and trials along with emerging solutions such as artificial intelligence, machine learning, generating synthetic controls, and digital twins can help facilitate trial conduct and accelerate drug development. R, randomized; PD, progressive disease; PFS, progression-free survival; OS, overall survival; ctDNA, circulating tumor DNA; AI, artificial intelligence; ML, machine learning; RWD, real-world data; RCT, randomized controlled trial; ORR, objective response rate; PFS1, progression-free survival on therapy B; PFS2, progression-free survival on therapy B1.
Second, in defined molecular subsets with a precise biomarker-matched therapy, where the magnitude of effect size accorded by optimal patient selection through robust multiomics profiling approaches significantly exceeds the interindividual variation and bias addressed through randomization and stratification, the role of conducting a phase III trial randomized against a control arm of limited efficacy can be less clear (Fig. 1). Although it is important to generate extended safety data and valid comparator arms, the trade-off is delayed access to potentially life-saving therapies. Additionally, if the prevalence of the molecular subset is low, high screen failure rates mean that many recruitment centers will have to participate and accrual will take time, leading to escalating infrastructure, administrative, and trial costs.
Third, with the increasing use of combination therapies, delineating the contribution of individual treatment components can be challenging, especially when considering the heterogeneity of disease biology and treatment effects. For example, increased access to radiation techniques that deliver highly effective local therapy can significantly affect life-limiting complications such as central nervous system disease. Although these advancements may not always prolong OS in the presence of extracranial disease, they play a crucial role in preserving the patient's function and QOL.
Finally, as the pace of implementation of efficacious novel therapies accelerates, the time required to evaluate OS exceeds the length of innovation cycles for that specific cancer type. By the time mature trial results are reported, the control arm may already be supplanted, reducing the applicability of the therapeutic intervention in real-world clinical practice. In the perioperative setting for early-stage cancers, there remains ongoing debate over the optimal primary endpoint. Disease-free survival has been increasingly utilized in many studies but yields limited information about how the natural history of the disease is altered by early use of an investigational drug, whereas OS can be confounded by the effects of prior and subsequent therapies.
THE NEED FOR TARGETED AND MECHANISM-DRIVEN SURROGATE ENDPOINTS
As a result, single-arm cohorts selecting patients based on commonalities in tumor genomic alterations (basket trials) and tissue-of-origin (umbrella trials) are common, and surrogate endpoints such as PFS and ORR have become important readouts in early-phase clinical trials and for cancer drugs considered for accelerated approval. Still, this is largely reliant on the sensitivity of the imaging assessment tools and on patients having target lesions that are discrete and reproducibly measurable. With the increasing use of multimodality combinations, including immunotherapy and radiotherapy, imaging response assessment is increasingly confounded by post-therapy reactive changes, calling for more accurate and sensitive methods that can complement imaging and enable better response evaluation. Moreover, imaging response criteria require at least a 30% reduction in the sum of the diameters of target lesions relative to baseline. Although this criterion represents an objective measure of significant response (as per RECIST), there are instances when this may not be feasible. For instance, leptomeningeal disease in advanced cancers may not always be measurable on imaging, yet clinical improvement can lead to fulfilling and functional time gained by a patient. When evaluating a novel drug that has a favorable administration route or side effect profile compared with standard of care, measuring QOL and rate of adverse events may reflect improved tolerability. For therapeutics targeting specific cancer processes such as metastatic invasion, measuring time to new metastases rather than a composite endpoint such as ORR or PFS would better reflect the drug effect. With the emerging array of precision therapeutics, selecting appropriate mechanism-driven endpoints, in addition to traditional endpoints, is crucial in the comprehensive assessment of therapeutic effect.
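To make the fixed 30% imaging threshold concrete, the following is a minimal Python sketch of the arithmetic only. The function names and lesion measurements are hypothetical and purely illustrative. It deliberately ignores everything else a full RECIST assessment involves (new lesions, non-target lesions, nodal measurement rules, confirmation scans), so it should be read as an illustration of why a fixed cutoff can classify two similar patients differently, not as a response-assessment tool.

```python
def sum_of_diameters(diameters_mm):
    """Sum of lesion diameters (mm) at a single timepoint."""
    return float(sum(diameters_mm))


def percent_change_from_baseline(baseline_mm, followup_mm):
    """Percent change in the sum of diameters relative to baseline.

    Negative values indicate shrinkage.
    """
    base = sum_of_diameters(baseline_mm)
    if base <= 0:
        raise ValueError("baseline sum of diameters must be positive")
    return 100.0 * (sum_of_diameters(followup_mm) - base) / base


def meets_response_threshold(baseline_mm, followup_mm, threshold_pct=30.0):
    """True if shrinkage reaches the fixed threshold (30% by default)."""
    change = percent_change_from_baseline(baseline_mm, followup_mm)
    return change <= -threshold_pct


# Hypothetical measurements (mm) for the same three lesions.
baseline = [40, 30, 30]      # sum = 100 mm
patient_x = [30, 20, 22]     # sum = 72 mm, a 28% reduction
patient_y = [28, 20, 20]     # sum = 68 mm, a 32% reduction

change_x = percent_change_from_baseline(baseline, patient_x)
change_y = percent_change_from_baseline(baseline, patient_y)
responder_x = meets_response_threshold(baseline, patient_x)
responder_y = meets_response_threshold(baseline, patient_y)
```

In this illustrative case the two hypothetical patients differ by only 4 mm in total tumor diameter, yet only one is counted as a responder, which is one reason complementary readouts are of interest.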
ADOPTION OF HIGH-RESOLUTION BIOMARKER TOOLS AND INTERMEDIATE ENDPOINTS IN MEASURING CONSEQUENCES OF THERAPEUTIC EFFECT
With increasingly high-resolution biomarker tools for improved patient selection as well as lower limits of detection for assessing response to therapy, our ability to detect earlier signals of efficacy will continue to shape response criteria and the evolution of clinical trial endpoints (Fig. 1). The expansion of NGS technologies and computational toolboxes for cancer genome and transcriptome profiling has now enabled deep molecular profiling to unravel the complexity of tumor heterogeneity at unparalleled resolution. Genetic tumor profiling of tissue and liquid biopsies either with NGS panels or comprehensive unbiased sequencing has contributed to clinical decision-making in the era of precision oncology and, yet, mutations captured from the cancer genome are only the tip of the iceberg. From charting mutational signatures to multiomics profiling with orthogonal methods, including epigenetic, transcriptomic, and proteomic approaches on biopsy samples at baseline and throughout therapy, this compendium of data allows for a holistic view of tumor characteristics at a molecular level at baseline and in response to therapy (3, 4). In practice, we may eventually be able to move beyond single-gene genomic classifications to multiomics taxonomy dissecting key druggable traits, delineate therapeutic opportunities for each trait, and identify sources of interpatient variation.
In assessing the pharmacodynamic effects of therapeutic intervention, evaluating proximal to distal consequences of therapy can help determine reliable intermediate endpoints of treatment effect. For example, early target and pathway modulation can eventually lead to downstream mechanisms of apoptosis that consequently translate to dynamic markers of tumor response such as longitudinal changes in ctDNA levels or pathologic response in resection specimens. For drugs that modulate the tumor microenvironment (TME), evaluating changes in TME state and/or immune monitoring methods, including multiparameter flow cytometry and multiplex cytokine arrays from tissue or plasma, can also provide information regarding shifts in the immune and biomarker contexture, providing yet another intermediate endpoint of drug effect.
The routine use of radiologic imaging has led to the widespread adoption of the RECIST criteria for response assessment in clinical trials, but this method offers a single-dimension snapshot of tumor response to therapy, and radiomic features as well as functional imaging can add another layer of information to provide greater insights (5). ctDNA in liquid biopsy can also be deployed in tandem with imaging as a parallel, minimally invasive, diagnostic tool for monitoring dynamic, more granular changes of disease burden during treatment, providing both qualitative and quantitative information that can predict treatment response or resistance (2).
By unifying molecular data from multiple longitudinally measured variables extracted from both tissue and liquid biopsies, together with clinical variables such as tumor size, histologic subtypes, digital pathology, and radiology imaging, we can now derive biological insights and identify novel biomarkers or intermediate endpoints that reflect benefit from therapy, glean predictive biomarkers to forecast response, as well as define drug-tolerant and persister states.
CHALLENGES IN DEPLOYING INTERMEDIATE ENDPOINTS
For dynamic intermediate endpoints such as ctDNA response criteria, several steps will be needed: harmonization of assay technology, standardization of reporting, establishment of qualitative and quantitative criteria for classifying progressive changes, rationalization of ctDNA changes against imaging, and identification of appropriate sampling timepoints where ctDNA technology can best complement imaging (2). Response rates and durability of responses are often hampered by tumor adaptation through mechanisms such as compensatory pathways, subversion of apoptosis, and ongoing tumor evolution. There is an urgent need to incorporate analytically validated pharmacodynamic biomarkers prospectively in early-phase clinical trials for biomarker credentialing.
It is likely that the correlation between these intermediate endpoints and clinical benefit may vary with tumor type and stage and whether the therapy is applied in the adjuvant curative or metastatic phase of the disease. For drugs that target specific cancer hallmarks such as inhibiting metastatic processes, e.g., epithelial–mesenchymal transition, even if the drug works as intended and fulfills all intermediate endpoints, patients could still suffer from cancer cachexia or local complications of an uncontrolled primary tumor. For each specific class of therapeutics, it would therefore be important to define not only the series of progressive endpoints that will best correlate with drug activity and survival but also relevance to the therapeutic niche and clinical context.
PRAGMATIC DESIGNS IN PRECISION ONCOLOGY
Advanced multiomics profiling has revealed that even tumors sharing common genomic drivers can demonstrate significant molecular and phenotypic heterogeneity within the tumors and between individuals. As we interrogate therapeutic effects in smaller subsets of patients, conducting a randomized study becomes increasingly infeasible. Conversely, in N-of-1 trials, matching an individual's molecular profile to the appropriate treatment or early-phase clinical trials becomes the key challenge.
In N-of-1 trials, patient outcomes are heavily dependent on physicians having access to the full armamentarium of treatment modalities and novel therapeutics as well as the ability to adequately address the molecular vulnerabilities of each individual cancer. In the WINTHER study, patients who had a greater degree of matching between administered treatment and molecular alterations detected had better survival outcomes (6). Therefore, it would be important to develop a standardized approach to integrate and interpret orthogonal omics data into coherent mechanisms driving tumor biology, performed in a timely manner and at scale. Multidisciplinary molecular tumor boards (MTB) consisting of pathologists, medical, radiation, and surgical oncologists (both treating and early-phase physicians), scientists, including bioinformaticians and laboratory scientists, and clinical coordinators will be crucial in driving the process from sample collection and processing, standardizing and selecting the right assays for each individual patient to derive the most mechanistic insights, to making recommendations and allocating patients to the appropriate treatments (Fig. 1). Besides recommending suitable drug trials, a greater challenge for the MTB would be incorporating different treatment modalities, e.g., surgery, radiotherapy, chemotherapy, tumor treating fields, immunotherapy, or novel targeted therapies, from the expanding therapeutics toolbox in a rational manner, while providing clinical insights from well-annotated cohort-level data.
LEVERAGING KNOWLEDGE BANKS AND REAL-WORLD DATA FOR CLINICAL DECISION SUPPORT
Well-curated real-world data sets (RWD) and knowledge banks are important as they leverage existing data infrastructure and include information on the safety and efficacy of therapies in patients from relevant demographics and patients not normally included in randomized clinical trials. These include patients who are older, have poor performance status, with multiple comorbidities, or are from underrepresented communities. In fact, data from real-world studies have supported regulatory approval for new indications and new routes of administration as well as post-approval study requirements (7). To facilitate the meaningful integration of RWD and multiomics data, there is a need for standardization of data formats and harmonization across multiple registries and study sites. Knowledge repositories from historical or ongoing studies should be well curated with “minimal viable clinical datasets” defined to improve efficiency and consistency of data extraction and uniform reporting of responses to assess ground truths. These repositories should also be housed in data-centric computing systems that will facilitate scalability and efficiency. Incorporating temporally informed multistage statistical models can provide a rigorous and data-informed clinical decision-making support framework guiding personally tailored therapeutic interventions (8). This can provide novel insights into the dynamics and relationship between molecular changes and disease progression. By utilizing emerging algorithms and machine learning (ML) frameworks to synthesize data from preclinical studies, trials, and real-world registries, these knowledge banks can further inform biomarkers, propose novel intermediate endpoints as therapy benefit surrogates, and address shortcomings in traditional clinical trial designs.
SOME EMERGING SOLUTIONS
Tumor Growth Kinetics
As patients are unique and treatments personalized, novel endpoints are required to demonstrate clinical benefit. Von Hoff and colleagues recognized this need and developed the growth modulation index, or PFS ratio (PFSr), as a surrogate measure of treatment benefit for precision oncology trials. Through paired analysis of individual patients by assessing PFS during current targeted therapy (PFS2) in relation to PFS during the prior systemic therapy (PFS1), interindividual heterogeneity of tumors, patients, and responses to respective targeted therapies is accounted for with each patient serving as his/her own control, obviating the need for a control arm (9). PFSr has been utilized as the primary endpoint in N-of-1 studies and clinical benefit has been defined with fixed thresholds of more than 1.5 (6). However, PFSr can be influenced by extremes in PFS1. A patient with a very short PFS1 may have an exceedingly large PFSr, whereas a patient who demonstrated 12 months for both PFS1 and PFS2 and benefited from therapy would have a PFSr of 1. This results in significant discordance between fixed PFSr thresholds and physician-perceived clinical benefit. Therefore, a modified PFSr incorporating absolute PFS1 intervals was proposed, which demonstrated significantly improved concordance with perceived clinical benefit, and this will require further validation (10).
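The PFSr arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the class and function names and the three example patients are hypothetical, both PFS intervals are assumed to be fully observed (censoring is not handled), and the 1.5 cutoff is simply the fixed threshold quoted in the text. The modified PFSr is not implemented here because its formula is not given in this article.

```python
from dataclasses import dataclass


@dataclass
class PatientPFS:
    patient_id: str
    pfs1_months: float  # PFS on the prior systemic therapy
    pfs2_months: float  # PFS on the current matched therapy


def pfs_ratio(patient):
    """Growth modulation index: PFS2 divided by PFS1 for one patient."""
    if patient.pfs1_months <= 0:
        raise ValueError("PFS1 must be positive to form a ratio")
    return patient.pfs2_months / patient.pfs1_months


def meets_fixed_threshold(patient, threshold=1.5):
    """True if PFSr exceeds the fixed threshold (assumed here to be 1.5)."""
    return pfs_ratio(patient) > threshold


# Hypothetical patients illustrating the sensitivity to extremes in PFS1.
cohort = [
    PatientPFS("short_pfs1", pfs1_months=1.0, pfs2_months=3.0),
    PatientPFS("long_both", pfs1_months=12.0, pfs2_months=12.0),
    PatientPFS("moderate", pfs1_months=4.0, pfs2_months=7.0),
]

ratios = {p.patient_id: pfs_ratio(p) for p in cohort}
benefit = {p.patient_id: meets_fixed_threshold(p) for p in cohort}
```

With these illustrative numbers, the patient with 1 month on prior therapy and 3 months on matched therapy clears the threshold, whereas the patient with 12 months on both does not, which mirrors the discordance with perceived clinical benefit noted above.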
Synthetic Controls
To address the practical challenges of running randomized trials in rare molecularly defined subsets of patients with cancer, one can apply ML and statistical methods to historical clinical trials or RWD to create a synthetic control arm that mirrors the characteristics of a particular treatment group. Synthetic controls serve as counterfactual comparison groups against which outcomes in the treatment group can be measured, and can thereby aid in identifying potential surrogate endpoints. Ways to ensure the validity of these synthetic controls include deploying methods to minimize sources of heterogeneity in data sets and selecting external controls who have similar baseline characteristics and were evaluated with response criteria similar to those applied in the interventional arm of the study. External control patients should also have undergone treatments that meet contemporary standards so that comparisons can be clinically generalizable. Statistical methods, including propensity score adjustment, Bayesian analysis, and microsimulation, can be used to match covariates for external controls with interventional study patients, and the results can be assessed for robustness through sensitivity analysis (11). Nevertheless, there may always be sources of hidden bias, or the lack of high-quality external data could render synthetic controls infeasible. The use of synthetic controls, powered by artificial intelligence (AI) and ML algorithms, represents an innovative solution that is likely to play a bigger role in precision oncology moving forward.
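As one illustration of covariate matching for external controls, the sketch below performs greedy 1:1 nearest-neighbor matching on propensity scores, without replacement and with a caliper. It is a simplified example rather than the method of any cited study. The function name, patient identifiers, scores, and caliper value are all hypothetical, and the propensity scores are assumed to have been estimated beforehand (for example, by a logistic regression on baseline covariates, which is not shown).

```python
def match_external_controls(trial_scores, external_scores, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    trial_scores / external_scores: dict mapping patient id -> score in [0, 1].
    Each external control is used at most once (matching without replacement).
    Trial patients with no control within the caliper remain unmatched.
    """
    available = dict(external_scores)  # controls not yet used
    matches = {}
    for trial_id, score in sorted(trial_scores.items()):
        if not available:
            break
        # Closest remaining control; ties broken by id for determinism.
        best_id = min(
            available,
            key=lambda cid: (abs(available[cid] - score), cid),
        )
        if abs(available[best_id] - score) <= caliper:
            matches[trial_id] = best_id
            del available[best_id]
    return matches


# Hypothetical, precomputed propensity scores.
trial = {"T1": 0.62, "T2": 0.35, "T3": 0.90}
external = {"E1": 0.60, "E2": 0.33, "E3": 0.70, "E4": 0.10}

matched = match_external_controls(trial, external, caliper=0.05)
unmatched = sorted(set(trial) - set(matched))
```

In this toy example one trial patient has no external control within the caliper and is left unmatched, which hints at how sparse or poorly overlapping external data can limit the feasibility of a synthetic control arm. Greedy matching is order dependent; optimal matching or weighting approaches are common alternatives.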
Digital Twins
Digital twins, an emerging frontier in AI science, are poised to play a prominent role in innovative trial designs. These virtual replicas are created with extensive real-world multiomics data and can represent molecules, cells, tissues, organs, systems, patients, or entire study populations. Digital twins rely heavily on mathematical modeling; thus, interdisciplinary training and collaboration between computational scientists, mathematicians, and clinicians are crucial to ensure downstream interpretability. Potential applications include simulating control arms in comparative studies and predicting the effects of manipulating specific pathways for drug candidate selection in N-of-1 studies, contingent on sufficient high-quality training data. Digital twins aim to expedite drug discovery by shortening study timelines, and they necessitate the integration of diverse data sources (tumor, TME, immune system) across scales and time. While assessing modeling efficacy is crucial, running parallel studies and cross-referencing with preclinical and clinical trials can refine and enhance digital twin models (12).
CONCLUSION
As we continue to unravel complex cancer ecosystems and expand the breadth of clinically deployable biomarkers, we anticipate new opportunities for high-precision patient stratification as well as novel and bespoke therapeutic strategies, which will continue to drive evolution in clinical trial design. Current clinical trials remain largely reliant on imaging readouts based on tumor dimensions and traditional endpoints that do not fully capture the pharmacologic and pharmacodynamic effects of a therapeutic intervention. Thus, incorporating robust and reliable fit-for-purpose mechanism-driven biomarkers can provide earlier orthogonal readouts of antitumor activity and delineate the relative contribution of treatment components on an individual level. The use of validated surrogate endpoints can further facilitate early go/no-go decisions. Clinical decision-support tools can be further aided by insights obtained through AI- and ML-enabled analysis of clinical, molecular, and epidemiologic data in well-curated real-world knowledge banks and trial data sets.
Nevertheless, extensive validation to demonstrate correlation with clinical benefit is imperative to ensure the robustness and interpretability of ML predictions. To circumvent challenges of variability in data sources, a framework to standardize data collection, preprocessing, code sharing, and algorithms is needed to mitigate potential bias in ML modeling. Container and federated data-sharing platforms can ensure seamless transferability of algorithms and software dependencies between different stakeholders. To harness the full potential of novel technologies and emerging paradigms in clinical trials, regular dialogue between payers, regulators, industry, academics, and patients is mission critical to contextualize the implementation of appropriate endpoints and new therapies within local health systems. Such engagements should also take into account local demographics, resources, and constraints and define the minimal burden of proof needed for access and reimbursement of innovative therapies relevant to specific cancer types, patient subsets, and therapeutic niches.
Acknowledgments
R. Hoo is supported by the IASLC Adi F. Gazdar Fellowship Grant (IASLC24EGFR/RH). K.L.M. Chua is supported by the National Medical Research Council Singapore Clinician Scientist–Individual Research Grant–New Investigator Grant (NMRC/CS-IRG-NIG/CNIG20nov-0029). A.J. Skanderup is supported by the National Medical Research Council Singapore Open-Fund Large Collaborative Grant (NMRC/OFLCG/002-2018) and Open Fund-Individual Research Grant (OFIRG21nov-0083). D.S.W. Tan is supported by the National Medical Research Council Singapore Open-Fund Large Collaborative Grant (NMRC/OFLCG/002-2018) and Clinician Scientist Award (NMRC/CSA/010/2019).