Abstract
Efficient capture of routine clinical care and patient outcomes is needed at the population level, as is evidence on important treatment-related side effects and their effects on well-being and clinical outcomes. The increasing availability of electronic health records (EHR) offers new opportunities to generate population-level, patient-centered evidence on oncologic care that can better guide treatment decisions and patient-valued care.
This study includes patients seeking care at an academic medical center from 2008 to 2018. Digital data sources were combined to address the missingness, inaccuracy, and noise common to EHR data. Clinical concepts were identified and extracted from unstructured EHR data using natural language processing (NLP) and machine/deep learning techniques. All models were trained, tested, and validated on independent data samples using standard metrics.
We provide use cases of EHR data to assess guideline adherence and quality measurements among patients with cancer. Pretreatment assessment was evaluated by guideline adherence and quality metrics for cancer staging. Our studies of perioperative quality focused on medications administered and guideline adherence. Patient outcomes included treatment-related side effects and patient-reported outcomes.
Advanced technologies applied to EHRs present opportunities to advance population-level quality assessment, to learn from routinely collected clinical data for personalized treatment guidelines, and to augment epidemiologic and population health studies. The effective use of digital data can inform patient-valued care, quality initiatives, and policy guidelines.
A comprehensive set of health data analyzed with advanced technologies results in a unique resource that facilitates wide-ranging, innovative, and impactful research on prostate cancer. This work demonstrates new ways to use EHRs and technology to advance epidemiologic studies and benefit oncologic care.
See all articles in this CEBP Focus section, “Modernizing Population Science.”
Introduction
Digital technology and a focus on the quality and value of care have promise to transform the health care delivery system. Over 20 years ago, two startling reports from the Institute of Medicine spotlighted patient safety and quality of care delivery (1, 2). Recommendations included the creation of evidence-based guidelines, quality measures, and electronic data collection systems. Federal agencies and oncology organizations responded, and these efforts spurred health care system improvement through the development and implementation of quality measurement tools and systems (3–7). However, the tools and systems developed reflected the data available for “high-volume” clinical analytics at that time—mainly insurance claims data. Claims data generally include encounter-level information on diagnoses, treatments, and billing, yet are limited in their capture of patient outcomes [particularly patient-reported outcomes (PRO)], clinical decisions, patient values, and secondary diagnoses/problem lists (8, 9). These data limitations resulted in an abundance of quality measures focused on processes of care (10), which have subsequently been linked to clinician burnout (11), and too few measures that are actually linked to improvements in patient outcomes, which matter to patients and their caregivers as well as to health care purchasers (12).
Advances in informatics and the digitization of information further promised a positive transformation of health care. Initial adoption of informatics for patient care was slow (13), but the rapid uptake of tools and software for clinical care followed passage of the HITECH Act of 2009, which provided monetary incentives to implement electronic health record (EHR) systems (14). As a result, over 90% of US hospitals had a functioning EHR system by 2017 (15). The EHR captures “real world” care: care that is longitudinal, multidisciplinary, occurs across settings, and is delivered to “all comers”—including patients who may not have been eligible for or included in randomized clinical trials (RCT; ref. 16). EHRs provide a plethora of data to analyze, explore, and learn from to evaluate health care delivery from a different angle—an angle that includes patient values, shared decision-making, and patient-reported outcomes. However, because most EHR data exist as unstructured text, advanced biomedical informatics methodologies are needed to extract and organize this wealth of data so that it can be used to investigate health care delivery (8, 17).
Our laboratory has focused on using guidelines and evidence to measure the quality of care delivered, based on information extracted from electronic data systems, as suggested by the IOM reports. We have developed tools and methods to analyze the granular, longitudinal data available in EHRs. Our work takes advantage of new opportunities to improve the quality of health care delivery, to learn from routinely collected clinical data for personalized treatment guidelines, and to augment epidemiologic and population health studies. We recognize that EHR data present several notable advantages over previously available data types (e.g., insurance claims data), including large population sizes, low research costs, and opportunities to access medical histories and track disease onset and progression. This opens possibilities for conducting relatively low-cost and time-efficient studies in routine clinical settings in which large, population-based research would otherwise be difficult or impossible to conduct. Here, we highlight opportunities for using EHR data to evaluate the quality of health care delivery among patients with cancer. First, we describe an infrastructure for capturing and linking EHR data from a large population of patients in an academic setting. Second, we describe the types of tools and software needed to leverage a main component of EHRs—unstructured clinical narrative text. Third, we provide examples of using EHRs to assess pretreatment, treatment, and posttreatment aspects of care, based on endorsed or recommended quality measures that are rarely described in the literature. Last, we discuss how biomedical informatics can be used for population and epidemiologic studies in cancer research.
Materials and Methods
Construction of a data warehouse for patient-centered outcomes
Data sources
Patients were identified in a clinical data warehouse (18, 19). In brief, data were collected from Stanford Health Care (SHC), a tertiary-care academic medical center using the Epic EHR system (Epic Systems), and managed in an EHR-based relational database, the clinical data warehouse. The clinical data warehouse contains structured data, including diagnosis and procedure codes, drug exposures, and laboratory results, as well as unstructured data, including discharge summaries, progress notes, and pathology and radiology reports. Structured data elements are mapped to standardized terminologies, including RxNorm, SNOMED, International Classification of Diseases (ICD) 9 or 10 codes, and Current Procedural Terminology (CPT). The cohort included all patients seeking treatment at SHC between 2005 and 2018. Patients with cancer were linked to the SHC cancer registry and also to the California Cancer Registry (CCR) to gather additional information on tumors, treatments not administered at SHC, cancer recurrence, and survival. The CCR contains structured data about diagnosis, histology, cancer stage, treatment, and outcomes across multiple tumor types, incorporating data from health care organizations across California. We matched patients to CCR records using the name and demographic details of the EHR cohort (first name, last name, middle name, date of birth, and social security number when available). Patients were excluded if they had fewer than two clinical visits, as these patients were likely seeking second opinions rather than receiving treatment at our site. All studies received approval from the institution's Institutional Review Board (IRB) and were conducted in accordance with recognized ethical guidelines.
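To make the linkage step concrete, the sketch below shows a two-pass deterministic match on the identifiers listed above (social security number when present, otherwise normalized name plus date of birth). The field names and matching rules are illustrative assumptions only, not the study's actual linkage procedure.

```python
# Illustrative two-pass deterministic linkage of EHR patients to CCR records.
# Field names (patient_id, ccr_record_id, ssn, etc.) are hypothetical.
import pandas as pd


def normalize_name(names: pd.Series) -> pd.Series:
    """Lowercase and strip punctuation so that 'SMITH, Jr.' matches 'smith jr'."""
    return (
        names.str.lower()
        .str.replace(r"[^a-z ]", "", regex=True)
        .str.strip()
    )


def link_ehr_to_ccr(ehr: pd.DataFrame, ccr: pd.DataFrame) -> pd.DataFrame:
    """Link patients by SSN when available, otherwise by name and date of birth.

    Assumes `ehr` carries `patient_id` and `ccr` carries `ccr_record_id`, and that
    both have `first_name`, `last_name`, `date_of_birth`, and `ssn` (possibly null).
    """
    for df in (ehr, ccr):
        df["first_norm"] = normalize_name(df["first_name"])
        df["last_norm"] = normalize_name(df["last_name"])

    # Pass 1: exact SSN match where both sides have an SSN on file.
    ssn_links = ehr.dropna(subset=["ssn"]).merge(
        ccr.dropna(subset=["ssn"]), on="ssn", suffixes=("_ehr", "_ccr")
    )[["patient_id", "ccr_record_id"]]

    # Pass 2: normalized name plus date of birth for patients not linked in pass 1.
    remaining = ehr[~ehr["patient_id"].isin(ssn_links["patient_id"])]
    demo_links = remaining.merge(
        ccr, on=["first_norm", "last_norm", "date_of_birth"], suffixes=("_ehr", "_ccr")
    )[["patient_id", "ccr_record_id"]]

    return pd.concat([ssn_links, demo_links], ignore_index=True)
```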
Certain patient cohorts were also analyzed in the Veterans Health Administration (VHA). In the VHA cohort, data were obtained from the VA Corporate Data Warehouse (CDW), a national data repository from several VA clinical and administrative systems between 2009 and 2015. In the VHA, medication information was obtained using both the Bar Code Medication Administration data and the Decision Support System National Data Extract pharmacy dataset (20, 21).
Data mining
To fully leverage the vast amounts of data in the EHRs, we developed the infrastructure to capture and merge large heterogeneous datasets, developed methods to transform these disparate data into knowledge, and then used this knowledge to improve individual health and well-being. We do so by working with policy makers and stakeholders to inform guidelines, and by working with clinicians to gain insights into pressing questions in clinical care and to understand how best to bring discoveries to the point of care. We have summarized our approach as the CAPTIVE infrastructure, which comprises three processes: Capture, Transform, Improve (Fig. 1).
Capture:
Patient cohorts are first identified in the EHR, which provides granular information on individuals' health care encounters. In our database, EHRs are merged with other data sources, including RCTs (22, 23), patient surveys (24), and data registries (18). This comprehensive set of knowledge resources can exponentially amplify the value of each linked semantic layer and addresses the missingness and noise often found in EHR data.
Transform:
The merged data are then transformed into knowledge through different algorithms, mappings, and validation series. We have demonstrated the feasibility of our data-mining workflow to extract accurate, clinically meaningful information from EHRs (25–28). The key to data extraction is to transform patient encounters into a retrospective longitudinal record for each patient and to identify cohorts of interest, known as clinical phenotyping (17), using structured and unstructured data. The custom extractors we develop range in complexity based on the types of data and analytic methods required to identify and pull each variable at high fidelity. For example, we have developed several natural language processing (NLP) pipelines to populate our database with patient outcomes from unstructured EHR data using traditional rule-based approaches (27–29) and machine learning or deep learning–based approaches such as weighted neural word embeddings (26, 27), which compute weights from TF-IDF scores for term/document pairs and generate sentence-level vector representations of clinical notes. These algorithms accurately identify clinician documentation of patient outcomes, often focusing on patient-centered outcomes, and have high performance (F1 scores between 0.87 and 0.94).
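As a minimal sketch of the embedding approach described above (not the published pipeline in refs. 26 and 27), note-level vectors can be built by weighting pretrained word vectors with TF-IDF scores; the word-vector lookup here is assumed for illustration.

```python
# Minimal sketch of TF-IDF-weighted word embeddings for note-level vectors.
# Assumes `word_vectors` is a pretrained {term: np.ndarray} lookup (e.g., from
# word2vec trained on clinical text); the published pipeline may differ.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer


def note_vectors(notes: list[str], word_vectors: dict, dim: int = 300) -> np.ndarray:
    """Represent each note as the TF-IDF-weighted average of its word vectors."""
    tfidf = TfidfVectorizer(lowercase=True)
    weights = tfidf.fit_transform(notes)          # sparse (n_notes x n_terms)
    vocab = tfidf.get_feature_names_out()
    vectors = np.zeros((len(notes), dim))
    for i in range(len(notes)):
        row = weights.getrow(i).tocoo()
        total_weight = 0.0
        for term_idx, weight in zip(row.col, row.data):
            vec = word_vectors.get(vocab[term_idx])
            if vec is not None:
                vectors[i] += weight * vec
                total_weight += weight
        if total_weight > 0:
            vectors[i] /= total_weight
    return vectors


# The resulting vectors feed a downstream classifier (e.g., logistic regression
# or a neural network) trained to flag documentation of a patient outcome.
```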
Improve:
The ultimate goal of our system is to learn from the data routinely collected in the EHRs and to bring that evidence to the point of care. We focus on questions related to guideline adherence (30), patient-centered care (8), comparative-effectiveness analysis (31), and decision support (32). The application of our research at the point of care provides opportunities to improve patient care and patient outcomes.
Results
Applications of CAPTIVE in assessing quality
We use our CAPTIVE system to assess the quality of care delivery using quality measures that have either been endorsed by federal agencies or been proposed by clinical societies. Here, we report on our efforts focused on quality measures that are unavailable in claims data yet have been identified as important to both the patient and the clinician.
Pretreatment assessments
Receipt of radionuclide bone scan for staging:
The National Comprehensive Cancer Network (NCCN) and American Urological Association (AUA) have set guidelines for obtaining radionuclide bone scans for clinical staging to better inform treatment decisions. These guidelines recommend that patients with advanced-stage and local/regional high-risk cancer receive a bone scan for staging purposes and that low-risk patients not receive a bone scan before treatment (33, 34). The clinical features needed to appropriately classify patients into low- and high-risk categories are embedded in multiple data sources and scattered throughout EHRs. High-risk patients were defined by a combination of overall clinical stage, grade group (Gleason score), and pretreatment PSA values. Overall clinical stage was identified in two separate structured fields, the CCR and the EHRs, and PSA values were identified from the laboratory values in the EHR and the CCR. Next, we developed algorithms to assess adherence to guideline recommendations using several data sources in the EHR, including unstructured text (Fig. 2). One particular challenge was that bone scans were often obtained at outside facilities and therefore were not identifiable as structured data in the EHRs, such as the presence of a radiology report. The presence of outside-facility bone scans was captured using NLP algorithms that scan physicians' notes for documentation of scans. This work demonstrated the utility of gathering multiple data sources, captured in diverse formats and settings, to assess both overuse and underuse of bone scans for cancer staging among patients with prostate cancer. This pipeline can be implemented at the point of care, such as by providing reminders to providers to perform a test like a bone scan. In addition, by gathering and presenting all information necessary to guide bone scan decisions, these methods can be used to assess the need for and performance of guidelines (such as those based solely on expert opinion) in special populations and clinical decision gray zones where evidence of guideline effectiveness is absent.
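The sketch below illustrates the kind of rule logic involved, combining structured stage, grade group, and PSA with an NLP-derived bone scan flag. The thresholds paraphrase NCCN/AUA-style high-risk criteria and are simplified for illustration; they are not the exact published algorithm.

```python
# Simplified illustration of risk-based bone scan guideline logic; thresholds
# paraphrase NCCN/AUA-style criteria and are not the exact published algorithm.
def high_risk(clinical_stage: str, grade_group: int, psa: float) -> bool:
    """Flag patients for whom a staging bone scan is guideline-recommended."""
    advanced_stage = clinical_stage in {"T3", "T4", "N1", "M1"}
    return advanced_stage or grade_group >= 4 or psa > 20.0


def bone_scan_adherence(patient: dict) -> str:
    """Classify use of a staging bone scan as concordant, overuse, or underuse.

    `patient` combines structured data (stage, grade group, PSA) with an NLP
    flag indicating a bone scan documented in notes or radiology reports.
    """
    recommended = high_risk(patient["stage"], patient["grade_group"], patient["psa"])
    scanned = patient["bone_scan_documented"]
    if recommended and not scanned:
        return "underuse"
    if not recommended and scanned:
        return "overuse"
    return "guideline-concordant"
```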
Digital rectal examination for prostate cancer clinical staging:
The majority of prostate cancer cases are localized at diagnosis. A digital rectal examination (DRE) is used for clinical staging and pretreatment assessment and can suggest additional diagnostic imaging in patients with locally advanced disease (35). DRE performance is identified as an important quality metric in prostate cancer care (10, 36), and most clinical guidelines include DRE as part of a comprehensive pretreatment assessment (37, 38). However, DRE results are often not systematically recorded nor included in claims datasets. Therefore, we developed a pipeline to use routinely collected electronic clinical text data to automatically assess pretreatment DRE documentation using a rule-based NLP framework (27). This NLP pipeline can accurately identify DRE documentation in the EHRs (95% precision and 90% recall; ref. 27). In our system, 72% of patients with prostate cancer had documentation of a DRE before initiation of therapy, and rates of documentation improved from 67% in 2005 to 87% in 2017. Of those with a DRE, over 70% were performed within 6 months before treatment, as required for quality metric adherence. This pipeline can open new opportunities for scalable and automated information extraction of other quality measures from unstructured clinical data (39).
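A minimal sketch of rule-based DRE detection with naive negation handling is shown below; the published pipeline (ref. 27) uses a more extensive term and context rule set than this illustration.

```python
# Minimal sketch of a rule-based search for DRE documentation with simple
# negation handling; the published pipeline (ref. 27) is more extensive.
import re

DRE_TERMS = re.compile(r"\b(digital rectal exam(ination)?|DRE)\b", re.IGNORECASE)
NEGATIONS = re.compile(r"\b(no|not|without|declined|deferred|refused)\b", re.IGNORECASE)


def dre_documented(note: str) -> bool:
    """Return True if any sentence mentions a DRE without a nearby negation cue."""
    for sentence in re.split(r"[.\n]", note):
        if DRE_TERMS.search(sentence) and not NEGATIONS.search(sentence):
            return True
    return False


# Example: dre_documented("DRE deferred at this visit.") -> False
# Example: dre_documented("Digital rectal examination: prostate smooth.") -> True
```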
Treatment assessments
Anesthesia type:
The type of anesthesia administered during operative procedures can influence postoperative outcomes, particularly pain (40). However, this information is not available for most clinicians and researchers because these data are captured and stored as unstructured data in the EHR. Despite evidence that type of anesthetic influences postoperative pain, there are limited quality metrics supporting best practices. Using a rule-based NLP pipeline we have developed, we can accurately classify different types of anesthesia (general, local, and regional) based on features within the free text of operative notes (precision 0.88 and recall 0.77; ref. 41). Using our algorithms, we found that regional anesthesia was associated with better pain scores compared with general and local anesthesia (42). This work provides evidence across populations on the use of different anesthesia types and differences in pain associations, information that can guide clinical guidelines and quality metric development.
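As a rough illustration of this classification task, the sketch below labels an operative note by simple keyword matching; the keyword lists are assumptions, and the published classifier (ref. 41) relies on richer features within the note text.

```python
# Hypothetical keyword-based sketch for labeling anesthesia type in operative
# notes; the published classifier (ref. 41) uses richer features than shown here.
ANESTHESIA_KEYWORDS = {
    "regional": ["spinal", "epidural", "nerve block", "regional anesthesia"],
    "general": ["general anesthesia", "endotracheal", "laryngeal mask"],
    "local": ["local anesthesia", "local infiltration", "lidocaine infiltration"],
}


def classify_anesthesia(operative_note: str) -> list[str]:
    """Return every anesthesia type whose keywords appear in the note text."""
    text = operative_note.lower()
    return [
        label
        for label, keywords in ANESTHESIA_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


# Example: classify_anesthesia("Spinal anesthesia with local infiltration of
# lidocaine at the incision.") -> ["regional", "local"]
```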
Multimodal analgesia:
Regimens using multiple agents that target different pain-relieving mechanisms, “multimodal analgesia,” are associated with improved pain control and reduced opioid consumption postoperatively (43, 44). Current pain management guidelines recommend multimodal analgesia for postoperative pain (45, 46). Using our EHR pipeline, we evaluated patients undergoing common surgeries associated with high pain, including thoracotomy and mastectomy, at Stanford University and the VHA from 2008 to 2015 (20). Prescription and medication details are captured well in EHRs as structured data, and both EHR systems link prescription medications to RxNorm. RxNorm provides normalized names for drugs and links names, including both generic and brand names (47). The models were developed and validated independently at Stanford University and then applied to the VHA dataset for external validation. Although a majority of patients received a multimodal pain approach at discharge, 20% were discharged with opioids alone (Fig. 3). Moreover, the multimodal regimen at discharge was associated with lower pain levels at follow-up and lower all-cause readmissions compared with the opioid-only regimen, substantiating guideline recommendations for postoperative pain management in a real-world setting (20, 21).
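The sketch below illustrates how a discharge medication list, once normalized to RxNorm ingredients, might be labeled as multimodal versus opioid-only; the ingredient classes shown are illustrative and do not reproduce the study's definitions.

```python
# Sketch of labeling a discharge medication list as multimodal vs. opioid-only,
# assuming medications have been normalized to RxNorm ingredient names; the
# class lists below are illustrative, not the study's definitions.
OPIOIDS = {"oxycodone", "hydrocodone", "morphine", "hydromorphone", "tramadol"}
NONOPIOID_ANALGESICS = {"acetaminophen", "ibuprofen", "ketorolac", "celecoxib",
                        "gabapentin", "pregabalin"}


def discharge_regimen(ingredients: set) -> str:
    """Classify the analgesic regimen on the discharge medication list."""
    has_opioid = bool(ingredients & OPIOIDS)
    has_nonopioid = bool(ingredients & NONOPIOID_ANALGESICS)
    if has_opioid and has_nonopioid:
        return "multimodal"
    if has_opioid:
        return "opioid-only"
    if has_nonopioid:
        return "non-opioid"
    return "no analgesic"


# Example: discharge_regimen({"oxycodone", "acetaminophen"}) -> "multimodal"
```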
Posttreatment assessments
Global mental and physical health:
Posttreatment assessments, particularly PROs, are difficult to capture and often missing from clinical research (10). Using our CAPTIVE system, the assessment of PROs is possible through the systematic collection of the PROMIS Global survey at the Stanford Cancer Institute (48). The surveys were deployed into routine clinical workflows for oncology outpatients as follows: at the time of clinic appointments, patients were given a paper survey that was transcribed directly into the EHR by the medical assistant. In May 2013, this process was supplemented by an electronic one, in which patients could access the survey through the EHR patient portal before an appointment. Approximately 75% of patients at the academic cancer center were enrolled in the EHR patient portal and could receive electronic reminders to complete a survey. If no survey was completed electronically, paper surveys were available at the time of the visit. We assessed 11,657 PROMIS surveys from patients with breast (4,199) and prostate (2,118) cancer. Survey collection varied across important demographic and clinical subgroups; older patients and those with advanced disease had disproportionately lower numbers of completed surveys. Similarly, global mental and physical health varied by patient race and stage at diagnosis, with nonwhite patients and those with advanced disease scoring significantly lower in both global physical and mental health compared with their counterparts (48). We are now correlating these direct measures of PROs with other clinical outcomes, including disease status and treatment complications. However, our results also highlight shortcomings of collecting survey-based assessments of PROs, because important populations can be missed. These findings identify areas for improvement within our center where additional resources might be needed to improve survey capture.
Treatment-related side effects:
Similar to PROs, treatment-related side effects are difficult to capture, and studies on these outcomes are often limited to costly, prospective, survey-based ascertainment. The CAPTIVE system allows automatic surveillance of clinical narrative text for treatment-related side effects. We demonstrate the opportunities our system facilitates through the investigation of urinary incontinence (UI), erectile dysfunction (ED), and bowel dysfunction (BD) following treatment for localized prostate cancer. First, we developed a rule-based NLP system to identify clinical documentation of UI following prostatectomy, where we identified improvements in the prevalence of UI and ED posttreatment (8, 28). Building on this system, we applied machine-learning methods to the clinical narrative text, which improved the accuracy of our algorithms in identifying positive and negated mentions of UI and BD, as well as mentions of discussed risk of UI and BD (F1 score of 0.86; ref. 26), all of which are recommended quality metrics for prostate cancer care. To assess the concordance between clinician and patient reporting of UI, we next used our CAPTIVE system to compare UI documentation in clinical narrative text with patients' reporting of UI via a patient survey—the EPIC-26 (Expanded Prostate Cancer Index Composite; ref. 49)—collected in a subset of our patients (50). The Cohen's kappa agreement between the EPIC-26 and the EHR was moderate across all time points (P < 0.001). This agreement between the patient surveys and provider notes suggests that our methods facilitate unbiased measurement of important PROs from routinely collected information in EHR clinician notes (50).
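For reference, the agreement statistic reported above can be computed as in the sketch below; the example arrays are hypothetical and are not study data.

```python
# Sketch of the EHR-vs-survey agreement check using Cohen's kappa; the example
# arrays are hypothetical, not study data.
from sklearn.metrics import cohen_kappa_score

# 1 = incontinence present, 0 = absent, per patient at a given time point.
ehr_nlp_flags = [1, 0, 1, 1, 0, 0, 1, 0]      # from clinician notes via NLP
epic26_flags  = [1, 0, 1, 0, 0, 0, 1, 0]      # from patient-reported EPIC-26

kappa = cohen_kappa_score(ehr_nlp_flags, epic26_flags)
print(f"Cohen's kappa: {kappa:.2f}")          # 0.75 for this toy example
```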
Discussion
A biomedical informatics approach to population epidemiology fills an unmet need in cancer research to improve upon quality measurement and begin to capture and report the features of delivered care that matter most to both the clinician and patient. We developed our CAPTIVE system based on routinely collected clinical information to assess important quality aspects of oncology care. Our system fuses the EHR with other digital data streams, transforms the raw data into knowledge, and uses this knowledge to improve and guide clinical care. Such an approach can enable a learning health care system, where information gathered from previous patients and encounters can be used to guide and improve clinical care. The extensive availability of EHR data offers unique and promising opportunities for the application of advanced biomedical informatics techniques to improve and guide clinical oncology.
The ability to capture patient-centered outcomes and patient symptoms or treatment-related side effects at a population level opens new paradigms for patient-valued care.
For each patient, health information, such as the number and type of comorbid conditions, as well as socioeconomic, geographical, and other features that affect health care interactions, can be used to contextualize their health care trajectory. Novel methods that gather patient-centered outcomes outside of traditional surveys, such as direct capture of patient–clinician conversation documentation, provide opportunities to overcome important biases. The capture of these outcomes through convenience surveys may be biased, as we have previously shown in PROMIS survey completion rates (48). High survey completion rates can be associated with a high number of appointments (suggesting that patients were given more opportunities to complete at least one survey), which may bias responses toward sicker and higher acuity patients (51). Furthermore, racial bias in survey completion rates has been well documented (52, 53). This may be an effect of patient and/or staff behaviors and emphasizes the importance of developing efforts to target minority groups in PRO initiatives.
The unbiased capture of patient-centered outcomes across populations is essential to provide precision care to oncology patients. Ultimately, the interface of the disease and treatment features and patient-specific context can be used to personalize treatment decisions that can incorporate patient values and aspects of care that are often difficult to capture from structured data, such as in insurance claims data. Such an approach is particularly important for patients where multiple treatment options are available—such as in localized prostate and breast cancers where patients must balance the risks and benefits of different treatments.
To successfully leverage the abundance of data held within an EHR system for epidemiologic studies, advanced biomedical informatics tools are needed. Fortunately, computer science and engineering methodologies applied to medical data have progressed rapidly, and new technologies are emerging at a rapid pace. In our work, we apply a broad range of techniques and methodologies to fully use information stored in the EHRs (8, 26–28, 54). Our algorithms are developed by a multidisciplinary team, where clinicians, epidemiologists, quality experts, and informaticians work closely together to identify clinical terms commonly used to describe a concept in the medical record and to identify where and how these terms are stored in the EHR. When clinical concepts are more subjective (e.g., patient-reported outcomes such as pain, fatigue, and nausea) or when severity of illness is of interest, machine learning and deep learning tools may be required (55, 56). Although big data and EHRs provide new avenues of clinical data for research and population-based studies, these data are unserviceable if the appropriate tools and software are not developed within a multidisciplinary team.
As machine learning algorithms are adopted into routine clinical care, scientific rigor and generalizability are of the highest importance. Concerns are arising regarding bias and equity of data-driven algorithms in health care (57), particularly because they are often trained on historical data that may be biased or include only nonrepresentative populations, similar to what has been found of older clinical trials (16, 58, 59). Algorithms developed to predict patient outcomes from a predominantly white, affluent population may not produce accurate results in a different patient population. Furthermore, bias in training data could further accentuate health disparities (60). Although the development of site-specific tools unlocks data within the EHR, these tools lack rigor and reproducibility if they are not extensively validated both internally and externally. Validation goes well beyond evaluating for problems of overfitting. Rather, validation requires assessment of performance in a completely separate health care system, where terminologies used by providers in clinical documentation might confound NLP algorithms developed in another system (61). Our use of the VHA data to validate our SHC-derived multimodal analgesia findings is a useful example. External validation ensures that the prediction models developed are applicable to diverse populations and represent general populations rather than the select patients who might be seen only at an academic facility. However, external validation is difficult due to resource, technology, privacy, and incentive limitations. To manage these shortcomings, transparency is needed on the data used to train and validate the models and on how representative the training data are of the broader population. This information can help prioritize research agendas, highlight populations underrepresented in this wave of medical informatics, and inform policy around equitable health care development and practice.
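Conceptually, external validation amounts to freezing an internally trained model and scoring it, without refitting, on labeled data from another system, as in this minimal sketch (function and variable names are hypothetical):

```python
# Minimal sketch of external validation: score a frozen, internally trained
# model on another health system's labeled data without refitting.
from sklearn.metrics import precision_recall_fscore_support


def external_validation(model, external_features, external_labels):
    """Return precision, recall, and F1 of `model` on an external cohort."""
    predictions = model.predict(external_features)
    precision, recall, f1, _ = precision_recall_fscore_support(
        external_labels, predictions, average="binary"
    )
    return precision, recall, f1
```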
A core aspect of a biomedical informatics approach to population-based cancer research is the ability to fuse diverse data to create a “tapestry” of information that can be used to learn from and predict patient trajectories (62). Registry data, such as the CCR and its associated Surveillance Epidemiology and End Results (SEER) data (63), provide the backbone of population epidemiology studies. However, these data provide only a skeleton of patient outcomes and have limited clinical information and knowledge about treatment decisions. By linking registry data with EHRs, the gaps in information can be filled and new knowledge generated. In our work, we have the ability to analyze population-level oncology data with the added information on biomarkers, social determinants of health, quality of care, and other PROs. By linking these data to patient surveys, patient-generated data, and other environmental and social factors, an unprecedented tapestry of the patient journey through their cancer care can be obtained, where precision oncology can flourish and shared decision-making is facilitated.
In conclusion, advanced technologies applied to routinely collected EHR clinical data present an opportunity to advance population-level quality assessment. We have developed our CAPTIVE system to efficiently and accurately assess the quality of health care delivery for oncology patients. We focus on clinical guidelines and endorsed quality metrics that in the past have been underreported due to limited data availability. Although we provide examples of potential use cases for our system, many more opportunities exist for knowledge discovery, improved patient outcomes, and the development of a learning health care system. The pipeline we have developed can be shared across systems and provides the groundwork for novel informatics-based epidemiologic studies.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Disclaimer
The content of this work is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or of the Agency for Healthcare Research and Quality.
Authors' Contributions
Conception and design: T. Hernandez-Boussard, D.W. Blayney, J.D. Brooks
Development of methodology: T. Hernandez-Boussard, J.D. Brooks
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): T. Hernandez-Boussard
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): T. Hernandez-Boussard, D.W. Blayney, J.D. Brooks
Writing, review, and/or revision of the manuscript: T. Hernandez-Boussard, D.W. Blayney, J.D. Brooks
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): T. Hernandez-Boussard, J.D. Brooks
Study supervision: T. Hernandez-Boussard, J.D. Brooks
Acknowledgments
T. Hernandez-Boussard was awarded grant number R01HS024096 from the Agency for Healthcare Research and Quality and was awarded grant number R01CA183962 from the National Cancer Institute of the National Institutes of Health.