Background: There is no agreement on the best data source for measuring colorectal cancer (CRC) screening. Medicare claims have been used to measure CRC testing but the validity of using claims to measure fecal occult blood tests (FOBT) has not been established.
Methods: We compared ascertainment of FOBT among three data sources: self-reports, Medicare claims, and medical records. Data were collected on FOBT use during the study window (1/1/1998 – 12/31/2002). Our study was conducted with North Carolina Medicare enrollees (N = 561) who had previously responded to a telephone survey on CRC tests. FOBT information was abstracted from respondents' physician office medical records and compared with self-reported FOBT use and Medicare claims for FOBT. Data sources were assessed for accuracy and completeness of FOBT reporting using sensitivity, specificity, positive predictive value, negative predictive value, and agreement.
Results: Reporting of FOBT use in the prior year in medical records and Medicare claims agreed 82% of the time [95% confidence interval (95% CI), 79-85%]. FOBT 1-year use rates from self-report agreed with test use found in medical records 70% of the time (95% CI, 66-74%). The lowest agreement was between self-reported 1-year FOBT use and Medicare claims, which agreed 67% of the time (95% CI, 63-71%).
Conclusions: No data source could be established as providing complete and valid information about FOBT use among Medicare enrollees, showing the difficulty of ascertaining test use rates for noninvasive, low-cost procedures conducted in multiple settings. Caution should be used when attempting to measure FOBT use with self-report, Medicare claims, or medical records. (Cancer Epidemiol Biomarkers Prev 2008;17(4):799–804)
Colorectal cancer (CRC) screening is widely recommended for persons ages 50 and older (1-3). Although colonoscopy has received more attention recently than other screening modalities, flexible sigmoidoscopy, barium enema, and annual fecal occult blood tests (FOBT) are recognized as equally effective in most screening guidelines. Despite the multiple options for CRC screening, CRC screening is underused (4). Since the introduction of Medicare coverage for CRC screening tests, test use assessed by Medicare claims has remained virtually stable: ∼20% of Medicare enrollees receive a CRC test each year and just over half (52%) had any CRC test from 1998 to 2004 (5). To monitor progress toward public health screening goals, it is important to develop methods to evaluate population-based use of CRC screening in the United States.
Measuring the level of CRC test use in the population is difficult for two primary reasons. First, the recommended time between tests for average-risk patients varies by test: 10 years between each colonoscopy, 5 years between each sigmoidoscopy or barium enema, and 1 year between each FOBT (1, 2). The optimal mix of how many persons should receive each test in any given year is unknown. Second, there is no uniform source of data to measure CRC test use and no agreement on the best data source to measure CRC test use. Complete ascertainment of CRC test use will vary by the type of procedure and data source. Studies often rely on self-report to assess CRC test use. However, prior studies have found poor agreement between rates for FOBT based on self-reported test use and claims or medical records (6-10). An alternate source of data to assess CRC testing is Medicare claims. An earlier study on the measurement of colorectal endoscopy (e.g., sigmoidoscopy and colonoscopy) found Medicare claims to be an accurate source for measurement of endoscopy use (11). It is not clear whether FOBT is reported as completely in the Medicare claims as endoscopy. Several characteristics of FOBT distinguish it from endoscopy procedures, leading to the concern that Medicare claims may underrepresent FOBT testing in the Medicare population. Medicare reimbursement for FOBT is low, which may discourage physician billing for FOBT testing, and FOBT test kits are available for purchase and use by consumers without a physician visit.
In this article, FOBT use was determined from three data sources: Medicare claims, self-report, and medical records. Although we evaluated all data sources, the primary focus of our analysis was on whether Medicare claims could completely and accurately capture FOBT use. This study was a collaborative effort among the Centers for Medicare & Medicaid Services, the National Cancer Institute, and The Carolinas Center for Medical Excellence, the Quality Improvement Organization for the Medicare program in North and South Carolina.
Materials and Methods
Details of the study cohort, data, and methods have been previously described (11). Briefly, the study population included African-American and White Medicare fee-for-service enrollees between the ages of 50 and 80 years who resided in one of 10 urban counties in North Carolina and had responded to a telephone survey in 2002 on their prior CRC screening (n = 1,001). Primary care providers were identified for survey respondents, making them eligible for medical record abstraction (n = 877). Internal medicine, general practice, family practice, preventive medicine, gerontology, obstetrics/gynecology, nurse practitioners, and physician assistants were included in the definition of primary care providers.
The identification of the primary care provider was a two-step process. Respondents were asked the name and address of their primary care provider on the telephone survey. If the respondent did not name a physician, or the physician named by the respondent could not be located, a primary care provider was identified using an algorithm developed by the Medicare Healthcare Quality Improvement Program (12). We examined 3 years of Medicare claims (2000-2002) to identify a primary care provider for each study participant, using the unique physician identifier on the claim. The algorithm ranked physicians based on the number of visits during the 3-year window and selected the primary care provider with the most visits as the likely provider of “routine” care. If no primary care provider was identified through the survey or algorithm, the physician with the most office visits based on the algorithm, regardless of specialty, was identified as the physician for the medical record accession.
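The claims-based ranking step can be illustrated with a minimal Python sketch. It assumes each office visit has been reduced to a (physician identifier, specialty) pair; the specialty labels, the function name, and the handling of ties are hypothetical choices made for illustration and are not taken from the Medicare Healthcare Quality Improvement Program algorithm (12).

```python
from collections import Counter

# Specialties counted as primary care in the study definition
# (the exact label spellings are illustrative assumptions).
PRIMARY_CARE = {
    "internal medicine", "general practice", "family practice",
    "preventive medicine", "gerontology", "obstetrics/gynecology",
    "nurse practitioner", "physician assistant",
}

def select_provider(visits):
    """Pick the likely provider of routine care from office-visit claims.

    visits: list of (physician_id, specialty) tuples, one per office visit
    in the 3-year window. Returns a physician_id, or None if no visits.
    """
    if not visits:
        return None
    primary = [pid for pid, spec in visits if spec in PRIMARY_CARE]
    # Prefer primary care providers; otherwise fall back to any specialty.
    pool = primary if primary else [pid for pid, _ in visits]
    # most_common(1) returns the id with the highest visit count
    # (ties resolve to the id seen first, an arbitrary choice here).
    return Counter(pool).most_common(1)[0][0]
```

In this sketch, a specialist with more visits than any primary care provider is selected only when no primary care provider appears in the claims at all, mirroring the fallback described above.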
A subset of the eligible respondents (n = 694) was targeted for medical record abstraction. This subset included all respondents with a claim or self-report of an endoscopy procedure in the past 2 years (n = 574) and a random sample of respondents without evidence of a prior test (n = 120; ref. 11). We obtained medical records from 1998 to 2002 for 88% of the targeted respondents, yielding a study group of 609. Medical records were abstracted between June 2004 and May 2005. Analyses were limited to survey respondents with continuous enrollment in Medicare Part A and B and no health maintenance organization enrollment from entry into Medicare until death or the end of the study window (defined as January 1, 1998 to December 31, 2002), yielding a final study group of 561 Medicare enrollees.
We merged data from three different sources (Medicare data, telephone survey, and physician office medical records) using a unique Medicare identifier to create a single patient level record for each study participant. Participants' records included demographic information and details about their CRC test use. A brief description of each data source follows below. Additional details of the study methods can be found in our earlier report (11).
Medicare claims for study participants were obtained from all settings (inpatient, physician, and hospital outpatient claims) for a 5-year period (January 1, 1998 to December 31, 2002). Claims were analyzed to capture any billing for FOBT tests using both diagnostic and screening codes for FOBT [Current Procedural Terminology (CPT) codes 82270 and 82274 and Health Care Financing Administration Common Procedure Coding System code G0107; refs. 13, 14]. We obtained demographic and coverage information for study participants from the Medicare enrollment database.
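As an illustration of the claims search, the following Python sketch selects claims billed with any of the three FOBT codes named above. The record layout (dictionaries with "code" and "service_date" keys) and the function name are assumptions made for this example; actual Medicare claim files are structured differently.

```python
# FOBT billing codes named in the text: CPT 82270 and 82274, and code G0107.
FOBT_CODES = {"82270", "82274", "G0107"}

def fobt_claim_dates(claims):
    """Return sorted service dates of claims billed with an FOBT code.

    claims: iterable of dicts with hypothetical keys "code" and
    "service_date" (ISO date string); all care settings pooled together.
    """
    return sorted(c["service_date"] for c in claims if c["code"] in FOBT_CODES)
```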
Questions on the 2002 survey were designed to distinguish the recommended method of FOBT, home test kits, from a FOBT undertaken in the physician's office through a digital rectal exam. Respondents were read the following description of FOBT: “A FOBT or stool blood test is a test to check for colon or rectal cancer. It is done at home using a set of three cards. You smear a sample of your fecal matter or stool on a card from three separate bowel movements and return the cards to be tested for blood.” After the test description was read, respondents were asked if they had ever completed a stool blood test at home and, if so, the timing of their most recent stool blood test using a home test kit. Respondents were also asked whether the test had been conducted as part of a checkup or because of a problem. The telephone survey also included questions on educational level, marital status, and names and addresses of primary care providers and physicians who conducted respondents' most recent CRC tests, if any. A copy of the survey instrument is available from the corresponding author (A.P.S.).
A comprehensive record abstraction tool captured CRC test use between January 1, 1998 and December 31, 2002 from medical records. Nurse abstractors visited physicians' offices across the state to abstract medical record information directly into an electronic abstraction tool contained on a laptop computer. The following information about the four most recent FOBT tests was abstracted from the medical record: date of test, the type of documentation found about the test (office visit notes, letter, lab result, or documentation on a flow sheet), type of test (three cards sent to lab, digital rectal exam done in office, immunochemical FOBT, or not specified), the reason for the test (screening, symptoms, surveillance, or not specified), and the results of the test (positive, negative, or not specified). The medical record abstraction also collected the date of the last visit and most recent well checkup, patient medical conditions, and information about the medical practice (type of practice, number of physicians in the practice, and whether the practice used electronic medical records).
Definitions of FOBT Use
The recommended interval for CRC testing by FOBT is 1 year (1-3). The national goal for CRC screening by FOBT, as published in Healthy People 2010, sets a target of 50% of adults ages 50 years or older having a FOBT within the past 2 years (15). For this study, we measured the frequency of FOBT use in two ways. First, consistent with the recommended test use interval, we classified people as having had a FOBT if there was evidence they had it in the preceding 1 year. To allow for comparisons with the Healthy People 2010 target, and because analyses of self-reports have shown telescoping, where procedures are recalled to have occurred more recently than their actual date, we expanded our test use window to analyze FOBT use during the preceding 2 years. Both measures of FOBT use were calculated separately for each data source.
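The two look-back measures can be expressed as a single window check, sketched below in Python. The reference date from which the window is counted back (for example, the survey date) and the approximation of a year as 365 days are assumptions of this sketch; the function name is hypothetical.

```python
from datetime import date

def used_fobt_within(test_dates, reference_date, years):
    """True if any test falls in the `years` preceding reference_date.

    The window is approximated as 365 days per year; the choice of
    reference date (e.g., the survey date) is an assumption of this sketch.
    """
    window_days = 365 * years
    return any(0 <= (reference_date - d).days <= window_days for d in test_dates)
```

Applying the function to each data source separately, once with `years=1` and once with `years=2`, yields the two measures described above.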
We captured any FOBT mentioned in the medical records occurring during the study window. However, in the assessment of the data sources, FOBTs that were identified in the medical record as having been conducted by digital rectal exam were not included. Tests found in the medical record for which the abstractor was unable to tell the type of FOBT test were included, although we evaluated the effect of excluding those tests. As classification of the FOBT tests of undetermined type in the medical record made very little difference in our findings, the results we report include these tests in the definition of FOBTs found in the medical record.
Analysis of data sources on FOBT testing included three different comparisons: claim with medical record, survey with medical record, and survey with claim. In comparisons using the medical record, the medical record was designated as the criterion standard. For the survey-with-claim comparison, the claim was designated as the criterion standard. The completeness and accuracy of FOBT reporting were assessed with five statistics: sensitivity, specificity, positive predictive value, negative predictive value, and agreement (see Table 3 for formulas; refs. 16, 17). We calculated each statistic twice (for 1-year use and for 2-year use). Ninety-five percent confidence intervals (95% CI) were calculated around the statistics using standard methods (17).
Although the statistics are interrelated, each provides a slightly different measure of validity of the data sources. Using the claim-with-medical record comparisons as an example, sensitivity provides a measure of the extent to which claims for FOBT can completely capture the universe of people who had the test (in this case, measured by medical record). Specificity provides a measure of the extent to which claims for FOBT can be used to identify accurately persons who did not undergo the test. Positive predictive value indicates the proportion of people who actually had a FOBT, given that there is a claim. Negative predictive value indicates the probability that a person has not had a FOBT, given that there is not a claim. Agreement indicates the percentage of people for whom the claims and the medical record agree on FOBT use.
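The five statistics follow directly from a 2 × 2 cross-classification of the comparison source against the criterion standard, as in the Python sketch below. The function names are hypothetical. The confidence interval shown uses the normal approximation, which is an assumption: the text specifies only “standard methods” (17). That approximation is nonetheless consistent with the reported interval for agreement between claims and medical records, because an agreement of 0.82 among 561 people gives roughly 0.79 to 0.85.

```python
import math

def validity_stats(tp, fp, fn, tn):
    """Five validity statistics from a 2x2 table.

    tp: test in both the comparison source and the criterion standard
    fp: test in the comparison source only
    fn: test in the criterion standard only
    tn: test in neither
    """
    n = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "agreement": (tp + tn) / n,
    }

def wald_ci(p, n, z=1.96):
    """Normal-approximation 95% CI for a proportion p observed on n people."""
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)
```

Each statistic has its own denominator when an interval is computed: for example, sensitivity is a proportion of the people with a test in the criterion standard (tp + fn), whereas agreement is a proportion of the whole group.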
Vernon et al. (18) have recommended the use of the report-to-records ratio to quantify the bias in cancer screening rates based on self-report. Report-to-records ratios greater than 1 indicate over-reporting and values less than 1 indicate underreporting. We calculated the report-to-records ratio for the two comparisons: self-report compared with medical records and self-report compared with claims. The report-to-records ratio was calculated by dividing the number of persons who reported having a FOBT by the number of persons who had a FOBT in the data source used as the “gold” standard (either the medical record or the claim).
Results
Records were abstracted from 269 physician offices. The majority of offices were single-specialty primary care providers (68%), 14% were multispecialty offices, 10% were gastroenterology offices, and 7% were other specialties. Although many FOBTs were identified in the medical records during the 5-year abstraction window (n = 953), documentation in the medical records of the type of FOBT was poor. Examining the most recent FOBT found in the medical records, 31% of study participants had a test with insufficient documentation to determine the type of FOBT (Table 1). One hundred seventy-six (31%) of the study group had a FOBT based on a digital rectal exam as their most recent FOBT, which is not recommended for CRC screening.
Estimates of FOBT use in the previous 1- and 2-year windows varied by data source and by demographic characteristic of the study participant (Table 2). The highest 1-year rates observed were based on self-report (28.7%). Claims-based 1-year test rates were 21.1%. The lowest rates were from medical records (19.4%). Across all demographic groups, 1-year rates based on self-report were higher than those from claims or medical records. Test use rates captured by claims were higher than those ascertained from the medical record except for those ages 50 to 64 years. However, similar patterns were observed, with the highest rates based on self-report and lowest rates from medical records.
Table 3 shows the results from the comparison of different data sources for reporting of FOBT. For the 1-year period, the sensitivity of the data sources ranged from 40% (self-report with claims) to 58% (claims with medical record). The highest positive predictive value was 53% (claims with medical records for both time windows). Agreement on FOBT use among the data sources was moderate (Table 3). The highest agreement was observed in the comparison of medical records and Medicare claims for FOBT use in the previous 1 year (agreement, 82%; 95% CI, 79-85%). The lowest agreement for FOBT in the previous 1 year was between self-report and Medicare claims (agreement, 67%; 95% CI, 63-71%). Across all three comparisons, the 1-year window had better agreement than the 2-year window.
The report-to-records ratio for FOBT use in the past year was 1.5 for the comparison of survey with medical record and 1.4 for the comparison of survey with Medicare claims. Both estimates indicate over-reporting.
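These ratios can be checked against the overall 1-year rates reported from Table 2. The Python sketch below assumes the three overall rates share the same denominator (the full study group), so that percentages can stand in for counts; the function name is hypothetical.

```python
def report_to_records_ratio(n_reported, n_standard):
    """Number self-reporting a FOBT divided by the number with a FOBT
    in the source used as the standard (medical record or claim)."""
    if n_standard == 0:
        raise ValueError("no positives in the standard source")
    return n_reported / n_standard

# Overall 1-year rates (%) from the text; with a common denominator,
# the ratio of rates equals the ratio of counts.
self_report, medical_record, claims = 28.7, 19.4, 21.1

print(round(report_to_records_ratio(self_report, medical_record), 1))  # 1.5
print(round(report_to_records_ratio(self_report, claims), 1))          # 1.4
```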
Given the limited agreement between the data sources on FOBT testing, we conducted several sensitivity analyses. First, we attempted to identify, in the Medicare claims, those tests that were conducted during office visits, assuming such tests would be more likely to be based on “single” cards rather than the three cards done as part of a home test kit. Using the CPT codes to designate office visits, we created a flag to indicate when FOBT occurred in the claims files on the same date as an office visit claim, labeling those FOBTs as presumptively single-card tests. We then excluded those claims and compared FOBTs not linked to office visits to the FOBTs identified in the medical record as having been done using the three-card approach. The concordance measures were virtually unchanged (data not shown).
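The same-date flag used in this first sensitivity analysis can be sketched in Python as follows. The claim layout (pairs of service date and billing code) and the function name are assumptions, and the set of office-visit codes is passed in as a parameter because the text does not list the specific CPT codes used.

```python
def split_fobt_by_office_visit(claims, fobt_codes, office_visit_codes):
    """Split FOBT claim dates by whether an office visit was billed that day.

    claims: iterable of (service_date, code) pairs for one person.
    Returns (same_day_dates, other_dates) as sorted lists.
    """
    claims = list(claims)
    visit_dates = {d for d, code in claims if code in office_visit_codes}
    fobt_dates = {d for d, code in claims if code in fobt_codes}
    same_day = sorted(fobt_dates & visit_dates)  # presumptive single-card tests
    other = sorted(fobt_dates - visit_dates)     # retained for the comparison
    return same_day, other
```

Only the second list would then be compared with the three-card tests identified in the medical record.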
Second, we included all FOBTs found in the medical record, thereby including digital rectal exams done in the office. We examined the concordance of this revised definition of medical record FOBT with FOBTs identified using all Medicare claims, with similar results (data not shown).
Finally, we examined the potential of using more than one data source as the gold standard. We created variables to characterize FOBT testing in either of two data sources (claim or medical record, medical record or survey, and claim or survey) and used the two-source test use measure as the gold standard against which to compare the third data source. We created two versions of the two-source test use measures, accommodating definitions for claims and medical records to include or exclude office-based tests. None of the comparisons, with any of the modifications or combinations attempted, showed sufficient concordance to instill confidence in the use of any of the data sources (data not shown).
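The two-source standard amounts to a per-person logical OR of two indicators, against which the third source is compared. A minimal Python sketch, with hypothetical function names and each source represented as a list of true/false indicators in the same person order:

```python
def two_source_standard(source_a, source_b):
    """Per-person composite: True if either source shows a FOBT."""
    return [a or b for a, b in zip(source_a, source_b)]

def agreement(test, standard):
    """Share of people on whom the two indicators agree."""
    pairs = list(zip(test, standard))
    return sum(t == s for t, s in pairs) / len(pairs)
```

For example, `agreement(survey, two_source_standard(claim, record))` would give the agreement of self-report with the combined claim-or-medical-record measure.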
Discussion
We found that no data source could be established as providing complete and valid information about FOBT use among fee-for-service Medicare enrollees. Our primary purpose for conducting these analyses was to determine whether Medicare claims could be used to accurately measure FOBT. Other investigators have used Medicare claims to assess use of FOBT (19-24). Our results provide strong evidence that these claims are not a reliable source for measuring FOBT. However, the limitations of the data are not restricted to Medicare claims; all three data sources examined in this study were imperfect sources of information about FOBT use.
Our study results are in contrast to those of Baier et al. (25). In a study of managed care enrollees, these investigators compared self-reported FOBT use with test use based on laboratory evidence of FOBT cards and found high sensitivity and specificity (96% and 86%, respectively). One probable reason for the disparate results is that our study was conducted in a community setting where patients can, and do, visit multiple providers and thus have multiple medical records. In contrast, the study by Baier et al. was conducted in a health care system that used a single laboratory; this minimized the potential for missing a FOBT test by looking in the “wrong” medical record.
Although most studies, including ours, have found FOBT self-reported use to be higher than claims or medical record documented levels, one study found the opposite pattern (26). In an intervention study, Lipkus et al. compared self-reported recall of FOBT use with laboratory records indicating that a FOBT kit had been returned to the study group. The finding of lower self-report in that study is likely due to two factors: more complete ascertainment of laboratory information and better ascertainment of self-report. In the study by Lipkus et al., FOBT kits were mailed to a central location, creating a more accurate gold standard. In addition, in their telephone survey, Lipkus et al. asked about completion of a FOBT in the previous 10 months, a shorter window than used in our study, thereby minimizing problems with recall.
Our study underscores the difficulty of ascertaining test use rates for noninvasive, low-cost procedures conducted in multiple settings. Each data source has different reasons for its possibly inaccurate ascertainment of FOBT. Self-reported FOBT use may be higher than that found in claims or the medical record due to inaccurate or telescoped recall. On the other hand, self-reported FOBT use may be more accurate than claims because it can capture tests where no insurance was billed. Because Medicare is the primary payer for Medicare enrollees, Medicare claims can, at least in theory, capture procedures done by multiple providers in multiple settings. However, lack of agreement between claims and medical records for procedures conducted in the physician office has been observed previously (27, 28). Adding to the confusion is the fact that, historically, the Medicare CPT-4 code for FOBT has provided reimbursement for one to three cards, making it impossible to distinguish claims for a FOBT based on a single card obtained by a digital rectal exam done by a physician from claims for a FOBT based on three cards from a home test kit. Medical records can capture tests done regardless of payment source, but many people have more than one physician, especially over time. Thus, outside of an integrated health care system, a medical record from a single office can be incomplete. Even when a short time window was of interest, we found a substantial number of FOBT tests in the medical record with insufficient documentation to determine the type of test that had been conducted (i.e., digital rectal exam versus home test kit).
Further complicating the measurement of FOBT, we found substantial documentation in medical records of FOBT having been done by digital rectal exam, which is not a recommended approach for CRC screening. It is unknown whether physicians discuss the office-based digital rectal exam with patients as a cancer screening approach, but it is possible that patients think they have been screened when they receive a digital rectal exam, thus adding to problems with self-report. The gap between the recommended use of a three-card home FOBT and the prevalence of physician use of office-based FOBT underscores the need for better understanding of how to incorporate screening guidelines into standard care.
These findings have implications for health care providers and researchers interested in CRC screening. The poor concordance across FOBT data sources observed in this study shows that there is no single data source capable of accurately measuring FOBT use. An increasing interest in electronic medical records has fostered the expectation that their application will improve our ability to track preventive services use. However, for FOBT, documentation of the type of test (digital rectal exam versus home test kit) must be included before electronic medical records will improve our ability to monitor FOBT use through medical records. Effective January 1, 2007, Centers for Medicare & Medicaid Services implemented a new CPT code to help distinguish tests based on a single card obtained in the physician's office and tests based on three cards completed at home (29). This new code may improve the future ability of Medicare claims to capture CRC screening with FOBT. Increasing use of immunochemical FOBT, which is reimbursed at a higher rate than guaiac FOBT, may also result in improved documentation of FOBT in claims and medical records. Future validation studies of FOBT should focus on assessing the effect of these recent changes.
Grant support: The analyses on which this publication is based were done under Contract No. 500-02-NC03 (“Utilization and Quality Control Peer Review Organization for the State of North Carolina”), sponsored by the Centers for Medicare & Medicaid Services and the National Cancer Institute under interagency agreement #Y1-PC-1007.
Note: The content of this publication does not necessarily reflect the views or policies of Centers for Medicare & Medicaid Services, National Cancer Institute, or The Carolinas Center for Medical Excellence. The opinions expressed in this article are those of the authors.
We thank Jim Coan (Centers for Medicare & Medicaid Services) for project management support and Karen Bell (The Carolinas Center for Medical Excellence) for her assistance with coding questions.