Background:

Efficient capture of routine clinical care and patient outcomes is needed at a population-level, as is evidence on important treatment-related side effects and their effect on well-being and clinical outcomes. The increasing availability of electronic health records (EHR) offers new opportunities to generate population-level patient-centered evidence on oncologic care that can better guide treatment decisions and patient-valued care.

Methods:

This study includes patients seeking care at an academic medical center, 2008 to 2018. Digital data sources are combined to address missingness, inaccuracy, and noise common to EHR data. Clinical concepts were identified and extracted from EHR unstructured data using natural language processing (NLP) and machine/deep learning techniques. All models are trained, tested, and validated on independent data samples using standard metrics.

Results:

We provide use cases for using EHR data to assess guideline adherence and quality measurements among patients with cancer. Pretreatment assessment was evaluated using guideline adherence and quality metrics for cancer staging. Our studies in perioperative quality focused on medications administered and guideline adherence. Patient outcomes included treatment-related side effects and patient-reported outcomes.

Conclusions:

Advanced technologies applied to EHRs present opportunities to advance population-level quality assessment, to learn from routinely collected clinical data for personalized treatment guidelines, and to augment epidemiologic and population health studies. The effective use of digital data can inform patient-valued care, quality initiatives, and policy guidelines.

Impact:

A comprehensive set of health data analyzed with advanced technologies results in a unique resource that facilitates wide-ranging, innovative, and impactful research on prostate cancer. This work demonstrates new ways to use EHRs and technology to advance epidemiologic studies and benefit oncologic care.

See all articles in this CEBP Focus section, “Modernizing Population Science.”

Digital technology and a focus on the quality and value of care have promise to transform the health care delivery system. Over 20 years ago, two startling reports from the Institute of Medicine (IOM) spotlighted patient safety and quality of care delivery (1, 2). Recommendations included the creation of evidence-based guidelines, quality measures, and electronic data collection systems. Federal agencies and oncology organizations responded, and these efforts spurred health care system improvement through the development and implementation of quality measurement tools and systems (3–7). However, the tools and systems developed reflected the data available for “high-volume” clinical analytics at that time—mainly insurance claims data. Claims data generally include encounter-level information on diagnoses, treatments, and billing, yet are limited in their capture of patient outcomes [particularly patient-reported outcomes (PRO)], clinical decisions, patient values, and secondary diagnoses/problem lists (8, 9). These data limitations resulted in an abundance of quality measures focused on processes of care (10), measures that have subsequently been linked to clinician burnout (11), and too few measures linked to improvements in patient outcomes, which matter to patients, their caregivers, and health care purchasers (12).

Advances in informatics and digitalization of information further promised a positive transformation of health care. Initial adoption of informatics for patient care was slow (13), but the rapid uptake of tools and software for clinical care followed passage of the HITECH Act of 2009, which provided monetary incentives to implement electronic health record (EHR) systems (14). As a result, over 90% of US hospitals had a functioning EHR system by 2017 (15). The EHR captures “real world” care: care that is longitudinal, multidisciplinary, occurs across settings, and is delivered to “all comers”—including patients who may not have been eligible or included in randomized clinical trials (RCT; ref. 16). EHRs provide a plethora of data to analyze, explore, and learn from to evaluate health care delivery from a different angle—an angle that includes patient values, shared decision-making, and patient-reported outcomes. However, because most EHR data exist as unstructured text, advanced biomedical informatics methodologies are needed to extract and organize this wealth of data so that it can be used to investigate health care delivery (8, 17).

Our laboratory has focused on using guidelines and evidence to measure the quality of care delivered, based on information extracted from electronic data systems, as suggested by the IOM reports. We have developed tools and methods to analyze the granular, longitudinal data available in EHRs. Our work takes advantage of new opportunities to improve the quality of health care delivery, to learn from routinely collected clinical data for personalized treatment guidelines, and to augment epidemiologic and population health studies. We recognize that EHR data present several notable advantages over previously available data types (e.g., insurance claims data), including large population sizes, low research costs, and opportunities to access medical histories and track disease onset and progression. This opens possibilities for conducting relatively low-cost and time-efficient studies in routine clinical settings in which large, population-based research would otherwise be difficult or impossible to conduct. Here, we highlight opportunities for using EHR data to evaluate the quality of health care delivery among patients with cancer. First, we describe an infrastructure for capturing and linking EHR data from a large population of patients in an academic setting. Second, we describe the types of tools and software needed to leverage a main component of EHRs—unstructured clinical narrative text. Third, we provide examples of using EHRs to assess pretreatment, treatment, and posttreatment aspects of care, based on endorsed or recommended quality measures that are rarely described in the literature. Last, we discuss how biomedical informatics can be used for population and epidemiologic studies in cancer research.

Construction of a data warehouse for patient-centered outcomes

Data sources

Patients were identified in a clinical data warehouse (18, 19). In brief, data were collected from Stanford Health Care (SHC), a tertiary-care academic medical center using the Epic EHR system (Epic Systems), and managed in an EHR-based relational database, the clinical data warehouse. The clinical data warehouse contains structured data, including diagnosis and procedure codes, drug exposures, and laboratory results, as well as unstructured data, including discharge summaries, progress notes, and pathology and radiology reports. Structured data elements are mapped to standardized terminologies, including RxNorm, SNOMED, International Classification of Diseases (ICD)-9 or -10 codes, and Current Procedural Terminology (CPT). The cohort included all patients seeking treatment at SHC between 2005 and 2018. Patients with cancer were linked to the SHC cancer registry and also to the California Cancer Registry (CCR) to gather additional information on tumors, treatments not administered at SHC, cancer recurrence, and survival. The CCR contains structured data about diagnosis, histology, cancer stage, treatment, and outcomes across multiple tumor types, incorporating data from health care organizations across California. We matched patients to CCR records using the name and demographic details of the EHR cohort (first name, last name, middle name, date of birth, and social security number when available). Patients were excluded if they had fewer than two clinical visits, as these patients were likely seeking second opinions and not receiving treatment at our site. All studies received approval from the institution's Institutional Review Board (IRB) and were conducted in accordance with recognized ethical guidelines.
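As a purely illustrative sketch (not the actual CCR linkage procedure or field set), the following Python fragment shows deterministic record linkage of the kind described above, matching on social security number when available and otherwise on name and date of birth; all field names and record values are hypothetical.

# Illustrative deterministic EHR-to-registry record linkage (hypothetical fields/rules).
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class PersonRecord:
    first_name: str
    last_name: str
    middle_name: Optional[str]
    date_of_birth: str          # ISO format, e.g., "1948-03-17"
    ssn: Optional[str] = None   # used only when available

def normalize(name: Optional[str]) -> str:
    return (name or "").strip().lower()

def is_match(ehr: PersonRecord, registry: PersonRecord) -> bool:
    """Match on SSN when both records have one; otherwise require
    exact last name, first name, and date of birth."""
    if ehr.ssn and registry.ssn:
        return ehr.ssn == registry.ssn
    return (
        normalize(ehr.last_name) == normalize(registry.last_name)
        and normalize(ehr.first_name) == normalize(registry.first_name)
        and ehr.date_of_birth == registry.date_of_birth
    )

# Example: link one EHR patient against a small registry extract.
ehr_patient = PersonRecord("Jane", "Doe", "A", "1948-03-17", ssn=None)
registry_rows = [
    PersonRecord("Jane", "Doe", None, "1948-03-17"),
    PersonRecord("John", "Roe", None, "1952-11-02"),
]
links = [r for r in registry_rows if is_match(ehr_patient, r)]
print(f"matched {len(links)} registry record(s)")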

Certain patient cohorts were also analyzed in the Veterans Health Administration (VHA). In the VHA cohort, data were obtained from the VA Corporate Data Warehouse (CDW), a national repository of data from several VA clinical and administrative systems, for the years 2009 to 2015. In the VHA, medication information was obtained using both the Bar Code Medication Administration data and the Decision Support System National Data Extract pharmacy dataset (20, 21).

Data mining

To fully leverage the vast amounts of data in EHRs, we developed the infrastructure to capture and merge large, heterogeneous datasets and the methods to transform these disparate data into knowledge. We then use this knowledge to improve the health and well-being of individuals by working with policy makers and stakeholders to inform guidelines, and by working with clinicians to gain insights into pressing questions in clinical care and to understand how best to bring discoveries to the point of care. We have summarized our approach as the CAPTIVE infrastructure, which comprises three processes: Capture, Transform, Improve (Fig. 1).

Figure 1.

Our CAPTIVE infrastructure combines heterogeneous data sources and uses state-of-the-art technology to transform data to knowledge for care improvement.

Capture:

Patient cohorts are first identified in the EHR, which provides granular information on individuals' health care encounters. In our database, EHRs are merged with other data sources, including RCTs (22, 23), patient surveys (24), and data registries (18). This comprehensive set of knowledge resources can greatly amplify the value of each linked semantic layer and addresses the missingness and noise often found in EHR data.

Transform:

The merged data are then transformed to knowledge through different algorithms, mappings, and validation series. We have demonstrated the feasibility of our data-mining workflow to extract accurate, clinically meaningful information from the EHR (25–28). The key to data extraction is to transform patient encounters into a retrospective longitudinal record for each patient and to identify cohorts of interest, known as clinical phenotyping (17), using structured and unstructured data. The custom extractors we develop range in complexity based on the types of data and the analytic methods required to identify and extract each variable at high fidelity. For example, we have developed several natural language processing (NLP) pipelines to populate our database with patient outcomes from unstructured EHR data, using traditional rule-based approaches (27–29) and machine learning or deep learning–based approaches such as weighted neural word embeddings (26, 27), which compute weights based on TF-IDF scores for term/document pairs and generate sentence-level vector representations of clinical notes. These algorithms accurately identify clinician documentation of patient outcomes, often focusing on patient-centered outcomes, and have high performance (F1-scores between 0.87 and 0.94).
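For illustration, the following minimal Python sketch shows a TF-IDF-weighted word-embedding representation of clinical notes in the spirit of the weighted neural embeddings described above; the example notes are toy text and the embedding matrix is random rather than a pretrained clinical model.

# Minimal sketch of TF-IDF-weighted word-embedding note vectors (toy data, random embeddings).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

notes = [
    "patient reports urinary incontinence after prostatectomy",
    "no incontinence reported at follow up visit",
    "bone scan performed at outside facility negative for metastasis",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(notes)              # documents x vocabulary
vocab = vectorizer.get_feature_names_out()

rng = np.random.default_rng(0)
embedding_dim = 50
embeddings = rng.normal(size=(len(vocab), embedding_dim))  # stand-in word vectors

def note_vector(doc_index: int) -> np.ndarray:
    """TF-IDF-weighted average of the word vectors appearing in one note."""
    row = tfidf.getrow(doc_index).toarray().ravel()
    weights = row[row > 0]
    word_ids = row.nonzero()[0]
    return (weights[:, None] * embeddings[word_ids]).sum(axis=0) / weights.sum()

note_vectors = np.vstack([note_vector(i) for i in range(len(notes))])
print(note_vectors.shape)  # (3, 50): one dense vector per clinical note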

Improve:

The ultimate goal of our system is to learn from the data routinely collected in EHRs and bring that evidence to the point of care. We focus on questions related to guideline adherence (30), patient-centered care (8), comparative-effectiveness analysis (31), and decision support (32). The application of our research at the point of care provides opportunities to improve patient care and patient outcomes.

Applications of CAPTIVE in assessing quality

We use our CAPTIVE system to assess the quality of care delivery using quality measures that have either been endorsed by federal agencies or have been proposed by clinical societies. Here, we report on our efforts that focus on quality measures that are unavailable in claims data yet are identified as important to both the patient and clinician.

Pretreatment assessments

Receipt of radionuclide bone scan for staging:

The National Comprehensive Cancer Network (NCCN) and American Urological Association (AUA) have set guidelines for obtaining radionuclide bone scans for clinical staging to better inform treatment decisions. These guidelines recommend that patients with advanced-stage and local/regional high-risk cancer receive a bone scan for staging purposes and that low-risk patients not receive a bone scan before treatment (33, 34). The clinical features needed to appropriately classify patients into low- and high-risk categories are embedded in multiple data sources and scattered throughout EHRs. High-risk patients were defined by a combination of overall clinical stage, grade group (Gleason score), and pretreatment PSA values. Overall clinical stage was identified in two separate structured fields, the CCR and the EHRs, and PSA values were identified from laboratory values in the EHR and from the CCR. Next, we developed algorithms to assess adherence to guideline recommendations using several data sources in the EHR, including unstructured text (Fig. 2). One particular challenge was that bone scans were often obtained at outside facilities and therefore not identifiable in structured EHR data, such as a radiology report. The presence of outside-facility bone scans was captured using NLP algorithms that scan physicians' notes for documentation of scans. This work demonstrated the utility of gathering multiple data sources, captured in diverse formats, to assess both overuse and underuse of bone scans for cancer staging among patients with prostate cancer. This pipeline can be implemented at the point of care, for example by reminding providers to perform a test such as a bone scan. In addition, by gathering and presenting all of the information necessary to guide bone scan decisions, these methods can be used to assess the need for, and performance of, guidelines (such as those based solely on expert opinion) in special populations and in clinical decision gray zones where evidence of guideline effectiveness is lacking.
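A simplified sketch of this adherence logic is shown below; the risk-grouping thresholds and field names are illustrative stand-ins for the NCCN/AUA criteria and our extracted variables, not the full rule set.

# Hedged sketch of bone scan guideline-adherence logic: coarse risk grouping plus
# any evidence of a scan (billing code, radiology report, or NLP-detected mention).
from dataclasses import dataclass

@dataclass
class ProstateCase:
    clinical_t_stage: str        # e.g., "T1c", "T2a", "T3a"
    grade_group: int             # 1-5
    psa: float                   # ng/mL
    has_cpt_bone_scan: bool      # structured billing/radiology evidence
    nlp_bone_scan_mention: bool  # outside-facility scan documented in notes

def risk_group(case: ProstateCase) -> str:
    """Coarse low/high split for illustration only (not the full NCCN/AUA criteria)."""
    if case.clinical_t_stage.startswith(("T3", "T4")) or case.grade_group >= 4 or case.psa > 20:
        return "high"
    return "low"

def bone_scan_adherence(case: ProstateCase) -> str:
    had_scan = case.has_cpt_bone_scan or case.nlp_bone_scan_mention
    risk = risk_group(case)
    if risk == "high" and not had_scan:
        return "underuse"      # guideline-recommended scan not documented
    if risk == "low" and had_scan:
        return "overuse"       # scan not indicated for low-risk disease
    return "adherent"

case = ProstateCase("T1c", grade_group=1, psa=5.2,
                    has_cpt_bone_scan=False, nlp_bone_scan_mention=True)
print(bone_scan_adherence(case))  # -> "overuse"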

Figure 2.

Guideline adherence based on data extracted from billing codes and radiology reports in the health care system (CPT + radiology), augmented by reports of bone scans within providers' unstructured text (NLP). Percentage of patients undergoing a bone scan stratified by risk group according to the NCCN and AUA guidelines (54).

Digital rectal examination for prostate cancer clinical staging:

The majority of prostate cancer cases are localized at diagnosis. A digital rectal examination (DRE) is used for clinical staging and pretreatment assessment and can prompt additional diagnostic imaging in patients with locally advanced disease (35). DRE performance is identified as an important quality metric in prostate cancer care (10, 36), and most clinical guidelines include DRE as part of a comprehensive pretreatment assessment (37, 38). However, DRE results are often not systematically recorded and are not included in claims datasets. Therefore, we developed a pipeline that uses routinely collected electronic clinical text to automatically assess pretreatment DRE documentation using a rule-based NLP framework (27). This NLP pipeline can accurately identify DRE documentation in the EHRs (95% precision and 90% recall; ref. 27). In our system, 72% of patients with prostate cancer had documentation of a DRE before initiation of therapy, and rates of documentation improved from 67% in 2005 to 87% in 2017. Of the documented DREs, over 70% were performed within 6 months before treatment, as required for quality metric adherence. This pipeline can open new opportunities for scalable and automated extraction of other quality measures from unstructured clinical data (39).
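The following minimal, regex-based sketch illustrates the general idea of rule-based DRE detection with simple negation handling; the patterns and the negation window are hypothetical simplifications of the published framework (27).

# Simplified rule-based DRE detection with basic negation (illustrative patterns only).
import re

DRE_PATTERN = re.compile(r"\b(digital rectal exam(ination)?|dre)\b", re.IGNORECASE)
NEGATION_PATTERN = re.compile(r"\b(no|not|without|declined|deferred|refused)\b", re.IGNORECASE)

def classify_dre(note: str) -> str:
    """Return 'documented', 'negated', or 'absent' for one clinical note."""
    for sentence in re.split(r"[.\n]", note):
        match = DRE_PATTERN.search(sentence)
        if not match:
            continue
        # Look for a negation cue in the few words preceding the DRE mention.
        window = sentence[:match.start()].split()[-4:]
        if NEGATION_PATTERN.search(" ".join(window)):
            return "negated"
        return "documented"
    return "absent"

print(classify_dre("Digital rectal examination: prostate firm, no nodules."))  # documented
print(classify_dre("Patient declined DRE at this visit."))                     # negated
print(classify_dre("Reviewed PSA history and imaging."))                       # absent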

Treatment assessments

Anesthesia type:

The type of anesthesia administered during operative procedures can influence postoperative outcomes, particularly pain (40). However, this information is not readily available to most clinicians and researchers because these data are captured and stored as unstructured text in the EHR. Despite evidence that the type of anesthetic influences postoperative pain, there are few quality metrics supporting best practices. Using a rule-based NLP pipeline we developed, we can accurately classify different types of anesthesia (general, local, and regional) based on features within the free text of operative notes (precision 0.88 and recall 0.77; ref. 41). Using our algorithms, we found that regional anesthesia was associated with better pain scores compared with general and local anesthesia (42). This work provides evidence across populations on the use of different anesthesia types and their associations with pain, information that can guide clinical guidelines and quality metric development.
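As an illustration only, a keyword-priority classifier of the kind described above might be sketched as follows; the patterns are hypothetical and far less complete than the published pipeline (41), and notes mentioning multiple anesthesia types would need richer rules.

# Illustrative keyword-priority anesthesia classification from operative note text.
import re

PATTERNS = {
    "regional": r"\b(spinal|epidural|nerve block|regional anesthesia)\b",
    "general":  r"\b(general anesthesia|endotracheal|intubat\w+|geta)\b",
    "local":    r"\b(local anesthesia|local infiltration|lidocaine infiltration)\b",
}

def classify_anesthesia(op_note: str) -> str:
    """Return the first anesthesia class whose keywords appear, else 'unknown'."""
    for label, pattern in PATTERNS.items():
        if re.search(pattern, op_note, re.IGNORECASE):
            return label
    return "unknown"

print(classify_anesthesia("GETA induced; patient intubated without difficulty."))  # general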

Multimodal analgesia:

Regimens using multiple agents that target different pain-relieving mechanisms, "multimodal analgesia," are associated with improved pain control and reduced opioid consumption postoperatively (43, 44). Current pain management guidelines recommend multimodal analgesia for postoperative pain (45, 46). Using our EHR pipeline, we evaluated patients undergoing common surgeries associated with high pain, including thoracotomy and mastectomy, at Stanford University and the VHA, 2008 to 2015 (20). Prescription and medication details are captured well in EHRs as structured data, and both EHR systems link prescription medications to RxNorm. RxNorm provides normalized names for drugs and links both generic and brand names (47). The models were developed and validated independently at Stanford University and then applied to the VHA dataset for external validation. Although a majority of patients received a multimodal pain approach at discharge, 20% were discharged with opioids alone (Fig. 3). Moreover, the multimodal regimen at discharge was associated with lower pain levels at follow-up and lower all-cause readmissions compared with the opioid-only regimen, substantiating guideline recommendations for postoperative pain management in a real-world setting (20, 21).
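The following sketch illustrates how a discharge medication list, once normalized to ingredients (e.g., via RxNorm), might be classified into opioid-only versus multimodal regimens; the ingredient-to-class map is a small, illustrative subset rather than the mapping used in our studies.

# Illustrative classification of discharge analgesic regimens (Python 3.9+).
ANALGESIC_CLASS = {
    "oxycodone": "opioid", "hydrocodone": "opioid", "morphine": "opioid",
    "tramadol": "opioid",
    "acetaminophen": "non_opioid", "ibuprofen": "non_opioid",
    "ketorolac": "non_opioid", "gabapentin": "non_opioid",
}

def discharge_modality(ingredients: list[str]) -> str:
    """Map a list of discharge medication ingredients to a regimen category."""
    classes = {ANALGESIC_CLASS[i.lower()] for i in ingredients if i.lower() in ANALGESIC_CLASS}
    if classes == {"opioid"}:
        return "opioid_only"
    if "opioid" in classes and "non_opioid" in classes:
        return "multimodal"
    if classes == {"non_opioid"}:
        return "non_opioid_only"
    return "no_analgesic"

print(discharge_modality(["Oxycodone", "Acetaminophen"]))  # multimodal
print(discharge_modality(["Hydrocodone"]))                 # opioid_only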

Figure 3.

Distribution of discharge drug modality in two diverse health care settings, 2008–2015. Aceta., acetaminophen.


Posttreatment assessments

Global mental and physical health:

Posttreatment assessments, particularly PROs, are difficult to capture and often missing from clinical research (10). Using our CAPTIVE system, the assessment of PROs is possible through the systematic collection of the PROMIS Global survey at the Stanford Cancer Institute (48). The surveys were deployed into routine clinical workflows for oncology outpatients as follows: At the time of clinic appointments, patients were given a paper survey that was transcribed directly into the EHR by the medical assistant. In May 2013, this process was supplemented by an electronic one, in which patients could access the survey through the EHR patient portal before an appointment. Approximately 75% of patients at the academic cancer center were enrolled in the EHR patient portal and could receive electronic reminders to complete a survey. If no survey was completed electronically, paper surveys were available at the time of the visit. We assessed 11,657 PROMIS surveys from patients with breast (4,199) and prostate (2,118) cancer. Survey collection varied by important demographic and clinical subgroups; older patients and those with advanced disease had disproportionately lower numbers of completed surveys. Similarly, global mental and physical health varied by patient race and stage at diagnosis, with nonwhite patients and those with advanced disease scoring significantly lower in both global physical and mental health compared with their counterparts (48). We are now correlating these direct measures of PROs with other clinical outcomes, including disease status and treatment complications. However, our results also highlight shortcomings of collecting survey-based assessments of PROs, because important populations can be missed. These findings identify areas for improvement within our center where additional resources might be needed to improve survey capture.

Treatment-related side effects:

Similar to PROs, treatment-related side effects are difficult to capture, and studies on these outcomes are often limited to costly, prospective, survey-based ascertainment. The CAPTIVE system allows the automatic surveillance of clinical narrative text for treatment-related side effects. We demonstrate the opportunities our system facilitates through the investigation of urinary incontinence (UI), erectile dysfunction (ED), and bowel dysfunction (BD) following treatment for localized prostate cancer. First, we developed a rule-based NLP system to identify clinical documentation of UI following prostatectomy, with which we identified improvements in the prevalence of UI and ED posttreatment (8, 28). Building on this system, we applied machine-learning methods to the clinical narrative text, which improved the accuracy of our algorithms for identifying positive and negated mentions of UI and BD, as well as mentions of discussed risk of UI and BD (F1 score of 0.86; ref. 26), all of which are recommended quality metrics for prostate cancer care. To assess the concordance between clinician and patient reporting of UI, we next used our CAPTIVE system to compare UI documentation in clinical narrative text with patients' reporting of UI via a patient survey—the EPIC-26 (Expanded Prostate Cancer Index Composite; ref. 49)—collected in a subset of our patients (50). For all time points, the Cohen's kappa agreement between EPIC-26 and the EHR was moderate (agreement across all time points, P < 0.001). This agreement between the patient surveys and provider notes suggests that our methods facilitate unbiased measurement of important PROs from routinely collected information in EHR clinician notes (50).
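For illustration, agreement between survey-derived and note-derived labels can be quantified with Cohen's kappa as sketched below; the labels shown are toy values, not study data.

# Toy example of survey-vs-note agreement using Cohen's kappa.
from sklearn.metrics import cohen_kappa_score

# 1 = urinary incontinence reported/documented, 0 = not reported, per patient-timepoint.
survey_ui = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]   # EPIC-26 patient responses (toy)
ehr_ui    = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1]   # NLP labels from clinician notes (toy)

kappa = cohen_kappa_score(survey_ui, ehr_ui)
print(f"Cohen's kappa = {kappa:.2f}")  # ~0.60, i.e., moderate agreement on this toy example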

A biomedical informatics approach to population epidemiology fills an unmet need in cancer research to improve upon quality measurement and begin to capture and report the features of delivered care that matter most to both the clinician and patient. We developed our CAPTIVE system based on routinely collected clinical information to assess important quality aspects of oncology care. Our system fuses the EHR with other digital data streams, transforms the raw data into knowledge, and uses this knowledge to improve and guide clinical care. Such an approach can enable a learning health care system, where information gathered from previous patients and encounters can be used to guide and improve clinical care. The extensive availability of EHR data offers unique and promising opportunities for the application of advanced biomedical informatics techniques to improve and guide clinical oncology.

The ability to capture patient-centered outcomes and patient symptoms or treatment-related side effects at a population level opens new paradigms for patient-valued care.

For each patient, health information, such as the number and type of comorbid conditions, as well as socioeconomic, geographic, and other features that affect health care interactions, can be used to contextualize their health care trajectory. Novel methods that gather patient-centered outcomes outside of traditional surveys, such as direct capture of patient–clinician conversation documentation, provide opportunities to overcome important biases. The capture of these outcomes through convenience surveys may be biased, as we have previously shown in PROMIS survey completion rates (48). High survey completion rates can be associated with a high number of appointments (suggesting that patients were given more opportunities to complete at least one survey), which may bias responses toward sicker and higher-acuity patients (51). Furthermore, racial bias in survey completion rates has been well documented (52, 53). This may be an effect of patient and/or staff behaviors and emphasizes the importance of developing efforts to reach minority groups in PRO initiatives.

The unbiased capture of patient-centered outcomes across populations is essential to provide precision care to oncology patients. Ultimately, the interface of the disease and treatment features and patient-specific context can be used to personalize treatment decisions that can incorporate patient values and aspects of care that are often difficult to capture from structured data, such as in insurance claims data. Such an approach is particularly important for patients where multiple treatment options are available—such as in localized prostate and breast cancers where patients must balance the risks and benefits of different treatments.

To successfully leverage the abundance of data held within an EHR system for epidemiologic studies, advanced biomedical informatics tools are needed. Fortunately, computer science and engineering methodologies applied to medical data have progressed rapidly, and new technologies are emerging at a rapid pace. In our work, we apply a broad range of techniques and methodologies to fully use the information stored in EHRs (8, 26–28, 54). Our algorithms are developed by a multidisciplinary team, in which clinicians, epidemiologists, quality experts, and informaticians work closely together to identify the clinical terms commonly used to describe a concept in the medical record and to identify where and how these terms are stored in the EHR. When clinical concepts are more subjective (e.g., patient-reported outcomes, such as pain, fatigue, and nausea) or severity of illness is of interest, machine learning and deep learning tools may be required (55, 56). Although big data and EHRs provide new avenues of clinical data for research and population-based studies, these data are unserviceable if the appropriate tools and software are not developed within a multidisciplinary team.

As machine learning algorithms are adopted into routine clinical care, scientific rigor and generalizability are of the highest importance. Concerns are arising regarding the bias and equity of data-driven algorithms in health care (57), particularly because they are often trained on historical data that may be biased or include only nonrepresentative populations, similar to what has been found in older clinical trials (16, 58, 59). Algorithms developed to predict patient outcomes from a predominantly white, affluent population may not produce accurate results in a different patient population. Furthermore, bias in training data could further accentuate health disparities (60). Although the development of site-specific tools unlocks data within the EHR, these tools lack rigor and reproducibility if they are not extensively validated both internally and externally. Validation goes well beyond evaluating for problems of overfitting. Rather, validation requires assessment of performance in a completely separate health care system, where the terminologies used by providers in clinical documentation might confound NLP algorithms developed in another system (61). Our use of VHA data to validate our SHC-derived multimodal pain findings is a useful example. External validation helps ensure that the prediction models developed are applicable to diverse populations and represent general populations rather than the select patients who might be seen only at an academic facility. However, external validation is difficult due to resource, technology, privacy, and incentive limitations. To manage these shortcomings, transparency is needed regarding the data used to train and validate the models and how representative the training data are of the broader population. This information can help prioritize research agendas, highlight populations underrepresented in this wave of medical informatics, and inform policy around equitable health care development and practice.

A core aspect of a biomedical informatics approach to population-based cancer research is the ability to fuse diverse data to create a "tapestry" of information that can be used to learn from and predict patient trajectories (62). Registry data, such as the CCR and its associated Surveillance, Epidemiology, and End Results (SEER) data (63), provide the backbone of population epidemiology studies. However, these data provide only a skeleton of patient outcomes and contain limited clinical information and knowledge about treatment decisions. By linking registry data with EHRs, the gaps in information can be filled and new knowledge generated. In our work, we can analyze population-level oncology data with added information on biomarkers, social determinants of health, quality of care, and other PROs. By linking these data to patient surveys, patient-generated data, and other environmental and social factors, an unprecedented tapestry of the patient journey through cancer care can be obtained, in which precision oncology can flourish and shared decision making is facilitated.

In conclusion, advanced technologies applied to routinely collected EHR clinical data present an opportunity to advance population-level quality assessment. We have developed our CAPTIVE system to efficiently and accurately assess the quality of health care delivery for oncology patients. We focus on clinical guidelines and endorsed quality metrics that in the past have been underreported due to limited data availability. Although we provide examples of potential use cases for our system, many more opportunities exist for knowledge discovery, improved patient outcomes, and the development of a learning health care system. The pipeline we have developed can be shared across systems and provides the groundwork for novel informatics-based epidemiologic studies.

No potential conflicts of interest were disclosed.

The content of this work is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or of the Agency for Healthcare Research and Quality.

Conception and design: T. Hernandez-Boussard, D.W. Blayney, J.D. Brooks

Development of methodology: T. Hernandez-Boussard, J.D. Brooks

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): T. Hernandez-Boussard

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): T. Hernandez-Boussard, D.W. Blayney, J.D. Brooks

Writing, review, and/or revision of the manuscript: T. Hernandez-Boussard, D.W. Blayney, J.D. Brooks

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): T. Hernandez-Boussard, J.D. Brooks

Study supervision: T. Hernandez-Boussard, J.D. Brooks

T. Hernandez-Boussard was awarded grant number R01HS024096 from the Agency for Healthcare Research and Quality and was awarded grant number R01CA183962 from the National Cancer Institute of the National Institutes of Health.

References

1. Kohn LT, Corrigan JM, Donaldson MS, editors. To err is human: building a safer health system. Washington (DC): National Academy Press; 2000.
2. Simone JV, Hewitt M, editors. Ensuring quality cancer care. Washington (DC): National Academies Press; 1999.
3. Jacobson JO, Neuss MN, McNiff KK, Kadlubek P, Thacker LR II, Song F, et al. Improvement in oncology practice performance through voluntary participation in the Quality Oncology Practice Initiative. J Clin Oncol 2008;26:1893–8.
4. Neuss MN, Desch CE, McNiff KK, Eisenberg PD, Gesme DH, Jacobson JO, et al. A process for measuring the quality of cancer care: the Quality Oncology Practice Initiative. J Clin Oncol 2005;23:6233–9.
5. Agency for Healthcare Research and Quality. Patient Safety Indicators, PSI. Version 4.1b ed. Rockville (MD): Agency for Healthcare Research and Quality; 2011.
6. National Quality Forum. NQF-endorsed standards. Washington (DC): NQF; c2000 [cited 2016 Dec 10]. Available from: http://www.qualityforum.org/Measures_List.aspx.
7. Weeks J. Outcomes assessment in the NCCN: 1998 update. National Comprehensive Cancer Network. Oncology 1999;13:69–71.
8. Hernandez-Boussard T, Tamang S, Blayney D, Brooks J, Shah N. New paradigms for patient-centered outcomes research in electronic medical records: an example of detecting urinary incontinence following prostatectomy. EGEMS (Wash DC) 2016;4:1231.
9. Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Med Care 1998;36:8–27.
10. Gori D, Dulal R, Blayney DW, Brooks JD, Fantini MP, McDonald KM, et al. Utilization of prostate cancer quality metrics for research and quality improvement: a structured review. Jt Comm J Qual Patient Saf 2019;45:217–26.
11. Shanafelt TD, Dyrbye LN, West CP. Addressing physician burnout: the way forward. JAMA 2017;317:901–2.
12. Rubin HR, Pronovost P, Diette GB. The advantages and disadvantages of process-based measures of health care quality. Int J Qual Health Care 2001;13:469–74.
13. Shortliffe EH, Tang PC, Detmer DE. Patient records and computers. Ann Intern Med 1991;115:979–81.
14. Blumenthal D. Launching HITECH. N Engl J Med 2010;362:382–5.
15. Adler-Milstein J, Jha AK. HITECH Act drove large gains in hospital electronic health record adoption. Health Aff 2017;36:1416–22.
16. Kennedy-Martin T, Curtis S, Faries D, Robinson S, Johnston J. A literature review on the representativeness of randomized controlled trial samples and implications for the external validity of trial results. Trials 2015;16:495.
17. Banda JM, Seneviratne M, Hernandez-Boussard T, Shah NH. Advances in electronic phenotyping: from rule-based definitions to machine learning models. Annu Rev Biomed Data Sci 2018;1:53–68.
18. Seneviratne MG, Seto T, Blayney DW, Brooks JD, Hernandez-Boussard T. Architecture and implementation of a clinical research data warehouse for prostate cancer. EGEMS (Wash DC) 2018;6:13.
19. Thompson CA, Kurian AW, Luft HS. Linking electronic health records to better understand breast cancer patient pathways within and between two health systems. EGEMS (Wash DC) 2015;3:1127.
20. Desai K, Carroll I, Asch SM, Seto T, McDonald KM, Curtin C, et al. Utilization and effectiveness of multimodal discharge analgesia for postoperative pain management. J Surg Res 2018;228:160–9.
21. Hernandez-Boussard T, Graham LA, Carroll I, Dasinger EA, Titan AL, Morris MS, et al. Perioperative opioid use and pain-related outcomes in the Veterans Health Administration. Am J Surg 2019 Jun 28 [Epub ahead of print].
22. Hah J, Mackey SC, Schmidt P, McCue R, Humphreys K, Trafton J, et al. Effect of perioperative gabapentin on postoperative pain resolution and opioid cessation in a mixed surgical cohort: a randomized clinical trial. JAMA Surg 2018;153:303–11.
23. Hah JM, Sharifzadeh Y, Wang BM, Gillespie MJ, Goodman SB, Mackey SC, et al. Factors associated with opioid use in a cohort of patients presenting for surgery. Pain Res Treat 2015;2015:829696.
24. Sturgeon JA, Darnall BD, Kao MC, Mackey SC. Physical and psychological correlates of fatigue and physical function: a Collaborative Health Outcomes Information Registry (CHOIR) study. J Pain 2015;16:291–8.
25. Hernandez-Boussard T, Kourdis P, Dulal R, Ferrari M, Henry S, Seto T, et al. A natural language processing algorithm to measure quality prostate cancer care. J Clin Oncol 35, 2017 (suppl 8S; abstr 232).
26. Banerjee I, Li K, Seneviratne M, Ferrari M, Seto T, Brooks JD, et al. Weakly supervised natural language processing for assessing patient-centered outcome following prostate cancer treatment. JAMIA Open 2019;2:150–9.
27. Bozkurt S, Park JI, Kan KM, Ferrari M, Rubin DL, Brooks JD, et al. An automated feature engineering for digital rectal examination documentation using natural language processing. AMIA Annu Symp Proc 2018;2018:288–94.
28. Hernandez-Boussard T, Kourdis PD, Seto T, Ferrari M, Blayney DW, Rubin D, et al. Mining electronic health records to extract patient-centered outcomes following prostate cancer treatment. AMIA Annu Symp Proc 2017;2017:876–82.
29. Tamang SR, Hernandez-Boussard T, Ross EG, Gaskin G, Patel MI, Shah NH. Enhanced quality measurement event detection: an application to physician reporting. EGEMS (Wash DC) 2017;5:5.
30. Magnani CJ, Li K, Seto T, McDonald KM, Blayney DW, Brooks JD, et al. PSA testing use and prostate cancer diagnostic stage after the 2012 U.S. Preventive Services Task Force guideline changes. J Natl Compr Canc Netw 2019;17:795–803.
31. Vorhies JS, Hernandez-Boussard T, Alamin T. Treatment of degenerative lumbar spondylolisthesis with fusion or decompression alone results in similar rates of reoperation at 5 years. Clin Spine Surg 2018;31:E74–9.
32. Goodnough LT, Maggio P, Hadhazy E, Shieh L, Hernandez-Boussard T, Khari P, et al. Restrictive blood transfusion practices are associated with improved patient outcomes. Transfusion 2014;54:2753–9.
33. Holmes JA, Bensen JT, Mohler JL, Song L, Mishel MH, Chen RC. Quality of care received and patient-reported regret in prostate cancer: analysis of a population-based prospective cohort. Cancer 2017;123:138–43.
34. Carlson RW, Allred DC, Anderson BO, Burstein HJ, Carter WB, Edge SB, et al. Breast cancer. Clinical practice guidelines in oncology. J Natl Compr Canc Netw 2009;7:122–92.
35. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2019. CA Cancer J Clin 2019;69:7–34.
36. Litwin MS, Steinberg M, Malin J, Naitoh J, Mcguigan K, Steinfeld R, et al. Prostate cancer patient outcomes and choice of providers: development of an infrastructure for quality assessment. Santa Monica (CA): RAND Corporation; 2000.
37. Mohler JL, Armstrong AJ, Bahnson RR, D'Amico AV, Davis BJ, Eastham JA, et al. Prostate cancer, version 1.2016. J Natl Compr Canc Netw 2016;14:19–30.
38. Thompson I, Thrasher JB, Aus G, Burnett AL, Canby-Hagino ED, Cookson MS, et al. Guideline for the management of clinically localized prostate cancer: 2007 update. J Urol 2007;177:2106–31.
39. Bozkurt S, Kan KM, Ferrari MK, Rubin DL, Blayney DW, Hernandez-Boussard T, et al. Is it possible to automatically assess pretreatment digital rectal examination documentation using natural language processing? A single-centre retrospective study. BMJ Open 2019;9:e027182.
40. Ruppert V, Leurs LJ, Rieger J, Steckmeier B, Buth J, Umscheid T, et al. Risk-adapted outcome after endovascular aortic aneurysm repair: analysis of anesthesia types based on EUROSTAR data. J Endovasc Ther 2007;14:12–22.
41. Nastasi AJ, Bozkurt S, Manjrekar M, Curtin C, Hernandez-Boussard T. A rule-based natural language processing pipeline for anesthesia classification from EHR notes [abstract]. In: Proceedings of the 13th Annual Academic Surgical Congress; 2018 Sep; Houston, TX. Los Angeles (CA): Association for Academic Surgery; 2018. Abstr 11.15.
42. Chin KK, Carroll I, Desai K, Asch S, Seto T, McDonald KM, et al. Integrating adjuvant analgesics into perioperative pain practice: results from an academic medical center. Pain Med 2020;21:161–70.
43. Maund E, McDaid C, Rice S, Wright K, Jenkins B, Woolacott N. Paracetamol and selective and non-selective non-steroidal anti-inflammatory drugs for the reduction in morphine-related side-effects after major surgery: a systematic review. Br J Anaesth 2011;106:292–7.
44. Ong CK, Seymour RA, Lirk P, Merry AF. Combining paracetamol (acetaminophen) with nonsteroidal antiinflammatory drugs: a qualitative systematic review of analgesic efficacy for acute postoperative pain. Anesth Analg 2010;110:1170–9.
45. Dowell D, Haegerich TM, Chou R. CDC Guideline for prescribing opioids for chronic pain—United States, 2016. JAMA 2016;315:1624–45.
46. American Society of Anesthesiologists Task Force on Acute Pain Management. Practice guidelines for acute pain management in the perioperative setting: an updated report by the American Society of Anesthesiologists Task Force on Acute Pain Management. Anesthesiology 2012;116:248–73.
47. Hernandez P, Podchiyska T, Weber S, Ferris T, Lowe H. Automated mapping of pharmacy orders from two electronic health record systems to RxNorm within the STRIDE clinical data warehouse. AMIA Annu Symp Proc 2009;2009:244–8.
48. Seneviratne MG, Bozkurt S, Patel MI, Seto T, Brooks JD, Blayney DW, et al. Distribution of global health measures from routinely collected PROMIS surveys in patients with breast cancer or prostate cancer. Cancer 2019;125:943–51.
49. Wei JT, Dunn RL, Litwin MS, Sandler HM, Sanda MG. Development and validation of the expanded prostate cancer index composite (EPIC) for comprehensive assessment of health-related quality of life in men with prostate cancer. Urology 2000;56:899–905.
50. Gori D, Banerjee I, Chung BI, Ferrari M, Rucci P, Blayney DW, et al. Extracting patient-centered outcomes from clinical notes in electronic health records: assessment of urinary incontinence after radical prostatectomy. EGEMS (Wash DC) 2019;7:43.
51. Weiskopf NG, Rusanov A, Weng C. Sick patients have more data: the non-random completeness of electronic health records. AMIA Annu Symp Proc 2013;2013:1472–7.
52. Jackson R, Chambless LE, Yang K, Byrne T, Watson R, Folsom A, et al. Differences between respondents and nonrespondents in a multicenter community-based study vary by gender ethnicity. The Atherosclerosis Risk in Communities (ARIC) Study Investigators. J Clin Epidemiol 1996;49:1441–6.
53. Richiardi L, Boffetta P, Merletti F. Analysis of nonresponse bias in a population-based case-control study on lung cancer. J Clin Epidemiol 2002;55:1033–40.
54. Coquet J, Bozkurt S, Kan KM, Ferrari MK, Blayney DW, Brooks JD, et al. Comparison of orthogonal NLP methods for clinical phenotyping and assessment of bone scan utilization among prostate cancer patients. J Biomed Inform 2019;94:103184.
55. Purushotham S, Meng C, Che Z, Liu Y. Benchmarking deep learning models on large healthcare datasets. J Biomed Inform 2018;83:112–34.
56. Gensheimer MF, Henry AS, Wood DJ, Hastie TJ, Aggarwal S, Dudley SA, et al. Automated survival prediction in metastatic cancer patients using high-dimensional electronic medical record data. J Natl Cancer Inst 2019;111:568–74.
57. Gianfrancesco MA, Tamang S, Yazdany J, Schmajuk G. Potential biases in machine learning algorithms using electronic health record data. JAMA Intern Med 2018;178:1544–7.
58. Gijsberts CM, Groenewegen KA, Hoefer IE, Eijkemans MJC, Asselbergs FW, Anderson TJ, et al. Race/ethnic differences in the associations of the Framingham risk factors with carotid IMT and cardiovascular events. PLoS One 2015;10:e0132321.
59. Ferryman K, Pitcan M. Fairness in precision medicine. New York: Data & Society; 2018. Available from: https://datasociety.net/library/fairness-in-precision-medicine/.
60. Char DS, Shah NH, Magnus D. Implementing machine learning in health care—addressing ethical challenges. N Engl J Med 2018;378:981–3.
61. Hernandez-Boussard T, Monda KL, Crespo BC, Riskin D. Real world evidence in cardiovascular medicine: assuring data validity in electronic health record-based studies. J Am Med Inform Assoc 2019;26:1189–94.
62. Weber GM, Mandl KD, Kohane IS. Finding the missing link for big biomedical data. JAMA 2014;311:2479–80.
63. Howlader N, Noone A, Krapcho M. SEER stat fact sheets: prostate cancer. Bethesda (MD): National Cancer Institute; 2017. Available from: https://seer.cancer.gov/statfacts/html/prost.html.