Valid and reliable self-report measures of cancer screening behaviors are important for evaluating efforts to improve adherence to guidelines. We evaluated test-retest reliability and validity of self-report of the fecal occult blood test (FOBT), sigmoidoscopy (SIG), colonoscopy (COL), and barium enema (BE) using the National Cancer Institute colorectal cancer screening (CRCS) questionnaire. A secondary objective was to evaluate reliability and validity by mail, telephone, and face-to-face survey administration modes. Consenting men and women, 51 to 74 years old, receiving care at a multispecialty clinic for at least 5 years who had not been diagnosed with colorectal cancer were stratified by prior CRCS status and randomized to survey mode (n = 857). Within survey mode, respondents were randomized to complete a second survey at 2 weeks, 3 months, or 6 months. Comparing self-report with administrative and medical records, concordance estimates were 0.91 for COL, 0.85 for FOBT, 0.85 for SIG, and 0.92 for BE. Overall sensitivity estimates were 0.91 for COL, 0.82 for FOBT, 0.76 for SIG, and 0.56 for BE. Specificity estimates were 0.91 for COL, 0.86 for FOBT, 0.89 for SIG, and 0.97 for BE. Sensitivity and specificity varied little by survey mode for any test. Report-to-records ratio showed overreporting for SIG (1.1), COL (1.15), and FOBT (1.57), and underreporting for BE (0.82). Reliability at all time intervals was highest for COL; there was no consistent pattern according to survey mode. This study provides evidence to support the use of the National Cancer Institute CRCS questionnaire to assess self-report with any of the three survey modes. (Cancer Epidemiol Biomarkers Prev 2008;17(4):758–67)

Valid and reliable self-report measures of cancer screening behaviors are important for identifying correlates and predictors of behavior, evaluating the effectiveness of behavioral interventions, and monitoring progress and trends in adherence to cancer screening guidelines (1, 2). The implementation of the Health Insurance Portability and Accountability Act of 1996 limited access to medical records and is likely to increase the need to use self-reported data in epidemiologic, health services, and behavioral research (3-5). For colorectal cancer screening (CRCS), assessing the accuracy of self-reports is especially difficult because there are multiple types of acceptable screening tests [i.e., fecal occult blood test (FOBT), sigmoidoscopy (SIG), colonoscopy (COL), and barium enema (BE)]; the recommended time interval for test completion differs between tests, and the guidelines have changed over time (6). Adding to this complexity is the number of measures of utilization, e.g., initial, recent, up-to-date, ever.

In 1999, the National Cancer Institute (NCI) convened a group of experts to develop uniform descriptions of the tests and questions to measure CRCS behaviors (hereafter called the NCI CRCS questionnaire; ref. 6). The initial draft of the questionnaire underwent cognitive testing as part of the development of NCI's Health Information National Trends Survey in 2003. Qualitative data from focus groups (7-12) showed that many adults do not know or recognize the names of CRCS tests and that many were unable to distinguish between SIG and COL or between home-based and office-based FOBT (13, 14); therefore, the cognitive interviews focused on issues related to comprehension and interpretation of the questions and, to a lesser extent, on strategies respondents used to recall information. Findings from the cognitive interviews were consistent with previous findings and led to a revised questionnaire and a recommendation that CRCS test descriptions be provided prior to asking questions about test use (6).

The purpose of this report was to evaluate the reliability and validity of self-report for the four tests included in the NCI CRCS questionnaire (FOBT, SIG, COL, and BE). A secondary objective was to determine if validity estimates were equivalent across three modes of administration (mail, telephone, and face-to-face). We focused on survey mode because such differences may affect comparisons of estimates from national surveys (e.g., the National Health Interview Survey is administered face-to-face whereas the Behavioral Risk Factor Surveillance System is administered by telephone) and because these survey modes are commonly used in surveys and intervention studies. Although one study evaluated the accuracy of self-reports of mammography when ascertained by mail compared with telephone (15), to date, no one has compared self-report of any cancer screening behaviors when assessed by telephone versus face-to-face modes. Holbrook et al. (16) suggest that response quality may differ between telephone and face-to-face interviews due to the opportunity to establish greater trust and rapport between the respondent and the interviewer. We also evaluated the test-retest reliability of the NCI CRCS questionnaire over 2-week, 3-month, and 6-month intervals.

### Study Setting, Population, and Recruitment

The study was approved by the University of Texas Health Science Center at Houston, Committee for the Protection of Human Subjects. The study setting was the Kelsey-Seybold Clinic (KSC), the largest multispecialty medical organization in Houston, Texas with a main campus and 17 satellite clinics. KSC delivers primary and specialty care to >400,000 patients and has a staff of 119 primary care physicians (family and internal medicine) and seven gastroenterologists. The prevalence of any CRCS among KSC patients, defined as home-based FOBT within the past year, SIG within the past 5 years, or COL within the past 10 years, was 54% in 2000 (17).

The study population was English-speaking men and women between 51 and 74 years of age, receiving primary care at KSC for at least 5 years. Although guidelines for COL recommend one every 10 years, we decided against requiring KSC enrollment for 10 years because of the limited number of patients meeting this more restrictive eligibility criterion. Patients with a personal history of colorectal cancer were excluded.

Eligibility status was determined from the KSC electronic administrative database. Potentially eligible participants were randomly selected every 2 weeks. Patients were mailed an invitation letter describing the study along with contact information for the Kelsey Research Foundation; staff at the Foundation facilitate research collaborations between KSC physicians and university researchers. The letter stated that patients could decline participation by calling the Foundation. A Foundation research assistant called patients who did not actively decline in order to confirm willingness to participate, ascertain eligibility, and enroll them in the study by obtaining verbal informed consent and permission to review the medical record (as required by the Health Insurance Portability and Accountability Act). Patients were called at least six times before being classified as nonrespondents. Patients who refused were asked to give a reason for declining. Weekly, the Foundation staff sent enrollee names to University of Texas-Houston School of Public Health project staff, who made all additional contacts. Recruitment began in September 2005 and follow-up ended in August 2007.

### Study Design

We used a randomized design to assess equivalence of reliability and validity estimates for mail, telephone, and face-to-face modes of survey administration. We stratified randomization to survey mode by CRCS test status [single test within guidelines (FOBT, SIG, COL, or BE), multiple tests within guidelines, or no test within guidelines] as recorded in the KSC administrative database. Random numbers were generated using Stata version 10.0 and were used to assign patients with these test profiles to survey mode.

We defined adherence to guidelines as FOBT within the past year or SIG, COL, or BE within the past 5 years. We used a 5-year interval for COL, rather than 10 years as recommended by the American Cancer Society (18), to be consistent with our eligibility criteria (i.e., KSC patient for at least 5 years).
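The adherence definition above reduces to a per-test look-back window. A minimal sketch in Python (the function and dictionary names are ours for illustration, not from the study's codebase):

```python
from datetime import date, timedelta

# Look-back windows used in this study (years): FOBT within the past year;
# SIG, COL, or BE within the past 5 years.
WINDOWS = {"FOBT": 1, "SIG": 5, "COL": 5, "BE": 5}

def adherent(test, last_done, survey_date):
    """True if the most recent test of this type falls within its window."""
    if last_done is None:  # no documented test of this type
        return False
    window = timedelta(days=round(365.25 * WINDOWS[test]))
    return survey_date - last_done <= window
```

For example, a colonoscopy in January 2001 still counts as within guidelines at a September 2005 survey, whereas an FOBT from January 2004 does not.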

### Self-report Questionnaire and Study Procedures

Vernon et al. (6) have described the development of the NCI CRCS questionnaire. We formatted two versions for this study. The mail version had explicit written instructions and detailed skip patterns. The telephone and face-to-face version contained interviewer scripts and prompts. Pilot testing resulted in reordering a few questions to help respondents follow the skip patterns appropriate for their CRCS test history. Both versions were professionally printed as eight-page color bubble-formatted booklets that could be scanned for data entry.

We also included questions about sociodemographics (age, sex, race/ethnicity, education, and marital status), family history of colorectal cancer, use of the healthcare system (recency of last KSC visit and number of KSC and non-KSC visits during the past 5 years), and social desirability. Because prior studies consistently show that respondents tend to overreport cancer screening behaviors, we included a measure of social desirability, a construct defined as the tendency to respond to questions in socially or culturally sanctioned ways (19, 20). We used the 10-item version (21) of the original 33-item Marlowe-Crowne Social Desirability Scale (19). The response format was true/false and scores ranged from 0 (low social desirability) to 10 (high social desirability).

The first survey (hereafter called the validation survey) was used to compare patients' self-report against information in the medical record, electronic administrative database, and reports from non–KSC physicians. Upon completion of the validation survey, participants were asked whether they would be willing to complete a second survey at a future date (hereafter called the reliability survey). Consenting respondents were randomly assigned within the same survey mode to be re-interviewed at one of three time intervals—2 weeks, 3 months, or 6 months. The same protocols were followed in the reliability survey.

### Statistical Analysis

We used χ2 statistics to assess whether sociodemographics and healthcare use differed according to randomized group; ANOVA was used to assess mean differences in social desirability scores. To evaluate the reliability and validity estimates, we used a criterion approach described below.

Reliability analyses. To assess test-retest reliability of the four CRCS tests, we compared a participant's responses to the validation survey with responses on the counterpart 2-week, 3-month, or 6-month reliability survey. We coded a participant's responses as consistent if the time interval between the survey date and the self-reported month and year was within guidelines on both the validation and reliability surveys or if no test within guidelines was reported on both surveys. If month and year were not reported, we used data from the time interval question. Missing values on either survey were counted as no test within guidelines. We excluded CRCS tests from these calculations if they were documented in the combined medical record as having been completed between surveys. In addition to the four CRCS test types, we also examined reliability for a combined measure of SIG and COL (i.e., endoscopy), defined as either test within the past 5 years.

To correct for chance agreement, we calculated κ coefficients (25). We used criteria recommended by Landis and Koch (26) to assess the adequacy of κ coefficients: values >0.80 indicate excellent agreement; between 0.61 and 0.80, substantial agreement; between 0.41 and 0.60, moderate agreement; between 0.21 and 0.40, fair agreement; and <0.21, slight agreement.
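Cohen's κ corrects the observed agreement between the two surveys for the agreement expected by chance from each survey's marginal distribution. A minimal sketch (not the study's actual code):

```python
from collections import Counter

def cohen_kappa(ratings_1, ratings_2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(ratings_1)
    assert n == len(ratings_2) and n > 0
    # Observed proportion of agreement between the paired responses
    p_o = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    # Chance agreement, from each rating's marginal distribution
    c1, c2 = Counter(ratings_1), Counter(ratings_2)
    p_e = sum((c1[c] / n) * (c2[c] / n) for c in set(c1) | set(c2))
    return (p_o - p_e) / (1 - p_e)
```

With identical responses on both surveys the function returns 1.0; κ falls toward 0 as agreement approaches what the marginals alone would produce.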

Validity analyses. Validity was evaluated with four measures: concordance (raw percentage agreement), sensitivity, specificity, and the report-to-records ratio. The report-to-records ratio is the percentage of participants reporting a test (true positives plus false positives) divided by the percentage of tests documented in the record (true positives plus false negatives). It is a measure of net bias in test reporting, with values >1.0 indicating overreporting and values <1.0 indicating underreporting (27). All measures were calculated for each CRCS test type and for endoscopy, as well as by mode of survey administration.
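Given a 2×2 cross-classification of self-report against the combined medical record, the four validity measures can be computed directly. A sketch with hypothetical cell counts (not the study's data):

```python
def validity_measures(tp, fp, fn, tn):
    """Self-report vs. medical-record cross-classification.
    tp = reported and documented, fp = reported but not documented,
    fn = not reported but documented, tn = neither reported nor documented."""
    n = tp + fp + fn + tn
    return {
        "concordance": (tp + tn) / n,               # raw percentage agreement
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "report_to_records": (tp + fp) / (tp + fn), # >1 over-, <1 underreporting
    }

# Hypothetical example: 80 true positives, 20 false positives,
# 20 false negatives, 80 true negatives.
m = validity_measures(80, 20, 20, 80)
```

In this balanced example all three agreement measures equal 0.80 and the report-to-records ratio equals 1.0 (no net bias).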

We used test dates in the combined medical record to determine adherence to guidelines for each type (yes/no). Based on the response to the question about the month and year (or if missing, data from the interval question), respondents were classified as adherent or not for each test (yes/no). We then compared data from both sources to calculate the four validity measures. Month and year were reported for 61% of FOBTs, 60% of SIGs, 79% of COLs, and 39% of BEs.

Respondents with missing values for both questions and those who reported an unverified test from a non–KSC provider were classified as nonadherent. Missing values were 2% for FOBT, 3% for SIG, 4% for COL, and 5% for BE. Most missing data were from the mail survey.

In the primary analysis, persons with multiple tests within the guidelines documented in the combined medical record were included in the analyses for each test they had. A person without a given test within the guideline-specific time period was included in the analyses as a nontester (e.g., a person with an up-to-date FOBT but no COL was analyzed as a nontester in the COL analysis).

We used Tisnado et al.'s (28) criteria for evaluating the sensitivity and specificity of ambulatory care services: ≥0.9 indicates excellent agreement, ≥0.8 to <0.9 indicates good agreement, ≥0.7 to <0.8 indicates fair agreement, and <0.7 indicates poor agreement. We judged a measure to have adequate precision if the lower confidence limit was >0.70. For concordance, sensitivity, specificity, and report-to-records ratio, we calculated two-sided 95% confidence intervals.
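The precision criterion can be checked with a normal-approximation (Wald) interval for a proportion; the paper does not state which interval method was used, so this sketch is illustrative only:

```python
import math

def wald_ci(successes, n, z=1.96):
    """Two-sided 95% normal-approximation CI for a proportion."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# e.g., a sensitivity of 91 correct recalls out of 100 documented tests;
# precision is adequate under the study's rule if the lower limit exceeds 0.70
p, lo, hi = wald_ci(91, 100)
```

Wald intervals behave poorly for proportions near 0 or 1 or for small n; a Wilson or exact interval would be a more robust choice in those cases.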

We conducted several ancillary analyses to assess effects on sensitivity or specificity estimates. First, we excluded respondents with missing data on when they had their most recent test, rather than assume that they were nonadherent (65 exclusions). A second analysis excluded respondents with gastrointestinal conditions: polyps, diverticulitis, or Crohn's disease (67 exclusions). In a third analysis, we examined whether the number of test types received within guidelines affected recall for each test by stratifying sensitivity estimates on single versus multiple-test status as recorded in the combined medical record. We hypothesized that patients with multiple CRCS tests would be more knowledgeable about screening and therefore would be more accurate (i.e., sensitivity would be higher) in their reporting than patients having only a single test. Likewise, we hypothesized that patients with no documented test in the combined medical record during the 5-year time period would be more accurate in reporting not having a particular test compared with patients who received other types of CRCS tests, but not the one being assessed. We examined this issue by stratifying specificity estimates for each test type according to nontester status. Finally, we measured adherence to CRCS guidelines using the interval question as the primary source of self-reported information and supplementing missing information with data on month and year. Almost all respondents provided a time interval: 98% of FOBTs, 95% of SIGs, 92% of COLs, and 92% of BEs.

From September 2005 to December 2006, we invited 3,519 potential study candidates and were able to contact 3,028 (86%; Fig. 1). Of the 3,028 contacted, 2,167 (72%) were eligible. Of these, 1,163 (54%) refused prior to randomization. Among the refusals, 49% did not give a reason. Among the rest, reasons included lack of interest (27%), no time (15%), only being willing to do a specific survey mode (4%), and miscellaneous other reasons (5%). Of the 1,004 randomized, our overall response proportion was 40% (857 of 2,167 eligible candidates).

Figure 1.

Sampling and recruitment of the study population from Kelsey-Seybold Clinic, Houston, Texas; 2005-2006.


Postrandomization validation survey completion by mode was 80% for face-to-face, 88% for mail, and 89% for telephone (P = 0.36). Although not statistically significant, there were more refusals to the face-to-face survey compared with the other two survey modes, whereas there were more nonrespondents to mail and telephone surveys compared with face-to-face surveys (Fig. 1).

There were few statistically significant differences in characteristics by survey mode (Table 1). Overall, most participants were in the younger age group, female, white, married, had at least some college education, reported no family history of colorectal cancer, and had few visits to non–KSC providers. Respondents to the mail survey were somewhat less likely than respondents in the other survey modes to report visiting a KSC provider in the past year; however, >88% in all groups had done so. Mail respondents also were more likely than the other groups to report fewer than six visits to a KSC provider within the past 5 years. Mean scores on the social desirability scale were higher for telephone respondents than for face-to-face respondents. As expected, because we stratified randomization according to prior CRCS status, the groups did not differ on test status.

Table 1.

Characteristics of survey respondents by mode of survey administration, Kelsey-Seybold Clinic (KSC), Houston, Texas, 2005-2006

| Characteristic | Face-to-face, n = 280, n (%) | Mail, n = 291, n (%) | Telephone, n = 286, n (%) | P* |
|---|---|---|---|---|
| **Age (years)** | | | | |
| Mean | 58.8 | 59.4 | 59.4 | 0.37 |
| 51-64 | 232 (82.9) | 231 (79.4) | 232 (81.1) | |
| ≥65 | 48 (17.1) | 60 (20.6) | 54 (18.9) | 0.57 |
| **Gender** | | | | |
| Female | 186 (66.4) | 188 (64.6) | 190 (66.4) | |
| Male | 94 (33.6) | 103 (35.4) | 96 (33.6) | 0.87 |
| **Race/ethnicity** | | | | |
| Non-Hispanic white | 178 (63.6) | 158 (54.3) | 173 (60.5) | |
| African American | 69 (24.6) | 84 (28.9) | 73 (25.5) | |
| Hispanic | 28 (10.0) | 31 (10.7) | 25 (8.7) | |
| Unreported | 5 (1.8) | 18 (6.2) | 15 (5.2) | 0.48 |
| **Marital status** | | | | |
| Not married | 75 (26.8) | 67 (23.0) | 73 (25.5) | |
| Married/living with a partner | 205 (73.2) | 218 (74.9) | 213 (74.5) | |
| Unreported | 0 (0.0) | 6 (2.1) | 0 (0.0) | 0.66 |
| **Education** | | | | |
| <High school | 4 (1.4) | 7 (2.4) | 12 (4.2) | |
| High school/GED | 28 (10.0) | 36 (12.4) | 28 (9.8) | |
| Some college | 96 (34.3) | 88 (30.2) | 83 (29.0) | |
| ≥College | 150 (53.6) | 155 (53.3) | 163 (57.0) | |
| Unreported | 2 (0.7) | 5 (1.7) | 0 (0.0) | 0.32 |
| **Family history of colorectal cancer** | | | | |
| No | 244 (87.1) | 243 (83.5) | 249 (87.1) | |
| Yes | 31 (11.1) | 37 (12.7) | 29 (10.1) | |
| Unreported | 5 (1.8) | 11 (3.8) | 8 (2.8) | 0.58 |
| **Last healthcare visit to KSC** | | | | |
| Within the past year | 270 (96.4) | 258 (88.7) | 272 (95.1) | |
| >1 y | 10 (3.6) | 26 (8.9) | 14 (4.9) | |
| Unreported | 0 (0.0) | 7 (2.4) | 0 (0.0) | 0.01 |
| **No. visits to KSC in past 5 y** | | | | |
| 0-5 | 24 (8.6) | 61 (21.0) | 28 (9.8) | |
| >5 | 256 (91.4) | 223 (76.6) | 258 (90.2) | |
| Unreported | 0 (0.0) | 7 (2.4) | 0 (0.0) | 0.00 |
| **No. visits outside KSC in past 5 y** | | | | |
| None | 143 (51.1) | 156 (53.6) | 152 (53.1) | |
| 1-2 | 61 (21.8) | 46 (15.8) | 65 (22.7) | |
| 3-5 | 32 (11.4) | 42 (14.4) | 29 (10.1) | |
| >5 | 44 (15.7) | 40 (13.7) | 39 (13.6) | |
| Unreported | 0 (0.0) | 7 (2.4) | 1 (0.3) | 0.32 |
| **Test status** | | | | |
| No tests | 114 (40.7) | 126 (43.3) | 113 (39.5) | |
| Single test | 107 (38.2) | 111 (38.1) | 126 (44.1) | |
| Multiple tests | 59 (21.1) | 54 (18.6) | 47 (16.4) | 0.43 |
| **Social desirability§** | | | | |
| Mean score (SD) | 6.64 (1.95) | 6.69 (2.11) | 7.08 (1.96) | 0.02 |
\* Using χ² P values and degrees of freedom (df). Results for mean age are from ANOVA (Prob > F).

The category "unreported" was not included when calculating the χ² statistic.

Test status is based on the combined medical record.

§ Results from ANOVA. In pairwise comparisons, the only significant difference was between face-to-face and telephone respondents. Omits 3 observations with a social desirability score less than 2.

### Test-Retest Reliability Assessment

At the end of the validation survey, 99% of face-to-face, 98% of mail, and 99% of telephone survey participants agreed to do a reliability survey. Completion rates for the 661 reliability surveys we requested were 85% for face-to-face, 91% for mail, and 89% for telephone surveys (P = 0.11). Reasons for withdrawal between the surveys were nonresponse (n = 75), refusal (n = 9), lost to follow-up (n = 6), and illness or death (n = 4). We completed 567 reliability surveys, 185 at 2 weeks, 184 at 3 months, and 198 at 6 months.

The percentage of agreement between responses to the validation survey and the 2-week and 3-month reliability surveys was 90% or greater for all tests in all survey modes with only minor exceptions (Table 2). Except for COL and BE, agreement at 6 months was lower than the other time intervals but was never <80% for any of the tests or endoscopy.

Table 2.

Reliability of self-report by mode of survey administration at 2 wk, 3 mo, or 6 mo of follow-up survey (Kelsey-Seybold Clinic, Houston, TX; 2005-2006)

| Test / Mode | 2 wk: n | Concordance | κ* | 3 mo: n | Concordance | κ* | 6 mo: n | Concordance | κ* |
|---|---|---|---|---|---|---|---|---|---|
| **Fecal occult blood test** | | | | | | | | | |
| Overall | 165 | 89.7 | 0.74 | 160 | 90.9 | 0.74 | 150 | 85.2 | 0.58 |
| Face-to-face | 48 | 85.7 | 0.64 | 54 | 90.0 | 0.74 | 39 | 84.8 | 0.58 |
| Mail | 64 | 92.8 | 0.82 | 58 | 89.2 | 0.65 | 60 | 84.5 | 0.59 |
| Telephone | 53 | 89.8 | 0.76 | 48 | 94.1 | 0.82 | 51 | 86.4 | 0.56 |
| **Sigmoidoscopy (SIG)** | | | | | | | | | |
| Overall | 168 | 90.8 | 0.77 | 170 | 92.4 | 0.81 | 170 | 86.7 | 0.67 |
| Face-to-face | 50 | 89.3 | 0.78 | 57 | 93.4 | 0.85 | 46 | 85.2 | 0.60 |
| Mail | 64 | 91.4 | 0.70 | 62 | 92.5 | 0.80 | 68 | 89.5 | 0.75 |
| Telephone | 54 | 91.5 | 0.79 | 51 | 91.1 | 0.76 | 56 | 84.9 | 0.62 |
| **Colonoscopy (COL)** | | | | | | | | | |
| Overall | 177 | 95.7 | 0.90 | 162 | 93.6 | 0.85 | 171 | 92.4 | 0.83 |
| Face-to-face | 52 | 92.9 | 0.83 | 51 | 92.7 | 0.83 | 48 | 95.2 | 0.91 |
| Mail | 69 | 98.6 | 0.97 | 59 | 92.2 | 0.82 | 69 | 93.2 | 0.84 |
| Telephone | 56 | 94.9 | 0.89 | 52 | 96.3 | 0.90 | 56 | 88.9 | 0.74 |
| **Endoscopy (COL or SIG)** | | | | | | | | | |
| Overall | 170 | 91.9 | 0.84 | 175 | 95.1 | 0.90 | 174 | 87.9 | 0.78 |
| Face-to-face | 49 | 87.5 | 0.73 | 58 | 95.1 | 0.90 | 48 | 88.9 | 0.78 |
| Mail | 66 | 94.3 | 0.88 | 63 | 94.0 | 0.88 | 68 | 89.5 | 0.79 |
| Telephone | 55 | 93.2 | 0.86 | 54 | 96.4 | 0.93 | 58 | 85.3 | 0.71 |
| **Barium enema** | | | | | | | | | |
| Overall | 172 | 93.5 | 0.66 | 176 | 96.7 | 0.78 | 186 | 95.4 | 0.72 |
| Face-to-face | 50 | 90.9 | 0.68 | 57 | 95.0 | 0.77 | 49 | 92.5 | 0.46 |
| Mail | 67 | 95.7 | 0.64 | 66 | 100.0 | 1.00 | 72 | 97.3 | 0.87 |
| Telephone | 55 | 93.2 | 0.63 | 53 | 94.6 | 0.55 | 65 | 95.6 | 0.64 |

NOTE: Tests completed after the validation survey were excluded.

\* Landis and Koch (26) define quality of interrater agreement (κ) as excellent (κ > 0.80); substantial (0.61 ≤ κ ≤ 0.80); moderate (0.41 ≤ κ ≤ 0.60); fair (0.21 ≤ κ ≤ 0.40); and slight (κ < 0.21).

At 2 weeks and 3 months, all κ coefficients met Landis and Koch's criteria (26) for excellent (>0.80) or substantial agreement (0.61-0.80; Table 2). At 6 months, except for telephone, κ coefficients for COL remained in the excellent range. For FOBT and SIG, κ coefficients decreased to the moderate (0.41-0.60) to substantial range. Within test type, except for BE, there was little variation in κ coefficients according to survey mode.

### Validity Assessment

We had combined medical record data for 857 patients: 353 patients with no current CRCS test of any type, 344 with only one CRCS test within guidelines (47 FOBTs, 119 SIGs, 144 COLs, and 34 BEs), and 160 patients with more than one test within guidelines (133 with two tests, 26 with three tests, and 1 with four tests). In all, we had test data for 138 FOBTs, 219 SIGs, 232 COLs, and 103 BEs. Patterns for multiple testers showed that 91 persons had FOBT and COL, SIG, or BE, 48 had SIG and COL or BE, and 21 had COL and BE.

Concordance. Overall concordance estimates for all tests and endoscopy met the criteria for good agreement; estimates were >0.80 and the lower confidence limit exceeded 0.70 (Table 3). Although the differences were small, estimates for COL and BE were consistently higher than for FOBT or SIG. Estimates of concordance showed no substantial differences by survey mode for any CRCS test or endoscopy.

Table 3.

Concordance, sensitivity, specificity, and report-to-records ratio comparing self-report of adherence to colorectal cancer–screening guidelines with the combined medical record (“gold standard”) by colorectal cancer–screening test type and mode of survey administration (Kelsey-Seybold Clinic, Houston, TX; 2005-2006; n = 857)

**Concordance (95% CI)*** (cells show n; estimate)

| Mode | FOBT | SIG | COL | Endoscopy (COL or SIG) | BE |
|---|---|---|---|---|---|
| Overall | 857; 0.85 (0.82-0.88) | 857; 0.85 (0.83-0.88) | 857; 0.91 (0.89-0.93) | 857; 0.85 (0.83-0.88) | 857; 0.92 (0.90-0.94) |
| Face-to-face | 280; 0.84 (0.79-0.88) | 280; 0.84 (0.79-0.88) | 280; 0.91 (0.88-0.95) | 280; 0.85 (0.80-0.89) | 280; 0.89 (0.85-0.93) |
| Mail | 291; 0.85 (0.81-0.90) | 291; 0.87 (0.82-0.91) | 291; 0.92 (0.89-0.95) | 291; 0.86 (0.81-0.90) | 291; 0.95 (0.92-0.97) |
| Telephone | 286; 0.86 (0.82-0.91) | 286; 0.86 (0.82-0.90) | 286; 0.89 (0.85-0.93) | 286; 0.85 (0.81-0.90) | 286; 0.91 (0.88-0.95) |

**Sensitivity (95% CI)*** (cells show n; estimate)

| Mode | FOBT | SIG | COL | Endoscopy (COL or SIG) | BE |
|---|---|---|---|---|---|
| Overall | 138; 0.82 (0.75-0.89) | 219; 0.76 (0.70-0.83) | 232; 0.91 (0.87-0.94) | 420; 0.89 (0.85-0.92) | 103; 0.56 (0.44-0.69) |
| Face-to-face | 45; 0.80 (0.74-0.86) | 83; 0.78 (0.72-0.85) | 71; 0.92 (0.88-0.96) | 141; 0.91 (0.85-0.96) | 39; 0.51 (0.45-0.58) |
| Mail | 43; 0.81 (0.76-0.87) | 73; 0.74 (0.68-0.80) | 79; 0.94 (0.90-0.97) | 140; 0.86 (0.80-0.92) | 34; 0.68 (0.62-0.73) |
| Telephone | 50; 0.84 (0.79-0.89) | 63; 0.76 (0.70-0.82) | 82; 0.87 (0.82-0.92) | 139; 0.88 (0.83-0.94) | 30; 0.50 (0.44-0.56) |

**Specificity (95% CI)*** (cells show n; estimate)

| Mode | FOBT | SIG | COL | Endoscopy (COL or SIG) | BE |
|---|---|---|---|---|---|
| Overall | 719; 0.86 (0.83-0.88) | 638; 0.89 (0.86-0.91) | 625; 0.91 (0.89-0.93) | 437; 0.82 (0.86-0.91) | 754; 0.97 (0.95-0.98) |
| Face-to-face | 235; 0.84 (0.79-0.89) | 197; 0.86 (0.81-0.91) | 209; 0.91 (0.87-0.95) | 139; 0.78 (0.71-0.86) | 241; 0.95 (0.93-0.98) |
| Mail | 248; 0.86 (0.81-0.91) | 218; 0.91 (0.87-0.95) | 212; 0.92 (0.88-0.95) | 151; 0.85 (0.79-0.91) | 257; 0.98 (0.96-1.00) |
| Telephone | 246; 0.87 (0.82-0.91) | 223; 0.89 (0.84-0.93) | 204; 0.90 (0.86-0.94) | 147; 0.82 (0.76-0.89) | 256; 0.96 (0.94-0.99) |

**Report-to-records ratio (95% CI)***

| Mode | FOBT | SIG | COL | Endoscopy (COL or SIG) | BE |
|---|---|---|---|---|---|
| Overall | 1.57 (1.43-1.72) | 1.10 (0.99-1.21) | 1.15 (1.04-1.26) | 1.07 (0.92-1.23) | 0.82 (0.66-0.97) |
| Face-to-face | 1.62 (1.31-1.93) | 1.12 (0.94-1.30) | 1.18 (0.96-1.40) | 1.12 (0.90-1.34) | 0.79 (0.56-1.03) |
| Mail | 1.63 (1.40-1.86) | 1.01 (0.84-1.13) | 1.16 (0.96-1.36) | 1.03 (0.83-1.23) | 0.82 (0.55-1.10) |
| Telephone | 1.48 (1.24-1.72) | 1.16 (0.94-1.38) | 1.11 (0.94-1.28) | 1.07 (0.90-1.24) | 0.83 (0.51-1.16) |

NOTE: Adherence to guidelines is defined as an annual FOBT or a SIG, COL, or BE within 5 y.

Concordance is the percentage of all respondents whose report of receiving a test, or of receiving no test, agreed with the combined medical record. Sensitivity is the number of individuals who correctly recalled having the test divided by the number of individuals who had a test according to the combined medical record. Specificity is the number of individuals who correctly reported no test divided by the number of individuals with no test documented in the combined medical record. The report-to-records ratio is the percentage of participants reporting a test (true positives plus false positives) divided by the percentage of tests documented in the combined medical record (true positives plus false negatives). It is a measure of net bias in test reporting, with values >1.0 indicating overreporting and values <1.0 indicating underreporting.

* 95% confidence interval.

Tisnado et al. (28) defined sensitivity or specificity as excellent if ≥0.90; good if ≥0.80; fair if ≥0.70; and poor if <0.70.
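These thresholds are straightforward to apply programmatically. The helper below (a hypothetical name, not from the study) reproduces the qualitative ratings used in the following paragraphs from the overall sensitivity estimates reported in the abstract:

```python
def tisnado_rating(estimate):
    """Rate a sensitivity or specificity estimate using the criteria of
    Tisnado et al. (28): excellent >=0.90, good >=0.80, fair >=0.70,
    poor <0.70."""
    if estimate >= 0.90:
        return "excellent"
    if estimate >= 0.80:
        return "good"
    if estimate >= 0.70:
        return "fair"
    return "poor"

# Overall sensitivity estimates from the abstract
for test, est in [("COL", 0.91), ("FOBT", 0.82), ("SIG", 0.76), ("BE", 0.56)]:
    print(test, tisnado_rating(est))
# COL excellent, FOBT good, SIG fair, BE poor
```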

Sensitivity. Estimates were good for FOBT, fair for SIG, good to excellent for COL and endoscopy, and poor for BE. Only SIG and BE did not consistently meet the criterion of >0.7 for the lower confidence limit. Except for BE, sensitivity estimates showed little variation by survey mode for any of the tests (Table 3).

Specificity. Estimates were good for FOBT and endoscopy, good to excellent for SIG, and excellent for COL and BE (Table 3). Specificity was highest for BE. All estimates met the criterion of >0.7 for the lower confidence limit.

Report-to-records ratio. Overall estimates indicated overreporting for FOBT, SIG, COL, and endoscopy, and underreporting for BE (Table 3). In most cases, confidence intervals for SIG, COL, and endoscopy included 1.0. These patterns were generally consistent across survey modes for all tests.

Ancillary analyses. The effect of excluding respondents with missing values on sensitivity and specificity estimates was minimal. Almost all changes occurred in the mail survey mode, which had the most missing values. Compared with the estimates in Table 3, sensitivity consistently increased (range, 0.01 to 0.09), and specificity consistently decreased (range, −0.01 to −0.02).

When we excluded patients with gastrointestinal conditions, estimates were minimally affected compared with those in Table 3. For FOBT, SIG, and COL, there was no consistent pattern of increase or decrease in sensitivity based on test type or survey mode (range, −0.04 to 0.03). Specificity estimates decreased minimally for FOBT (−0.01) but not for SIG or COL.

When we stratified sensitivity estimates on single versus multiple tests, estimates for FOBT were consistently lower among single testers (range, −0.03 to −0.08) and were consistently higher among multiple testers (range, 0.02 to 0.05). For SIG, COL, and BE, there was no consistent increase or decrease within survey mode by test status (range, −0.09 to 0.06). Overall estimates were unchanged or lower among single testers (range, −0.06 to 0) and slightly higher among multiple testers (range, 0.01 to 0.04).

When we stratified specificity estimates on nontester status, estimates among those with no documentation of FOBT or SIG in the combined medical record consistently increased (range, 0.03 to 0.08). Among those with documentation of a test in the combined medical record, estimates consistently decreased (range, −0.03 to −0.11). The pattern was reversed for COL; specificity decreased slightly among those with no documentation for COL (range, −0.02 to 0.00) and increased slightly among those with documentation (range, 0.00 to 0.03). For BE, there was no consistent pattern by nontester status (range, −0.03 to 0.03).

Measuring self-report using the interval question as the primary source of information and month and year as a supplemental source slightly decreased some of the sensitivity estimates for FOBT and BE (range, −0.01 to −0.02). For SIG, sensitivity estimates by survey mode improved (range, 0.02 to 0.05). Almost all specificity estimates for all test types decreased slightly (range, −0.01 to −0.03). However, these changes did not affect the rating (i.e., excellent, good, or fair) based on Tisnado et al.'s criteria (28).

To our knowledge, this is the first study to assess the test-retest reliability of CRCS tests over defined time intervals and to examine, systematically, the effect of mode of administration on the reliability and validity of self-reports. Only one other study (29) has assessed test-retest reliability of CRCS measures. We could not compare results because those researchers used different prevalence definitions (“ever had” versus “adherence to guidelines”) and because the time interval between surveys in their study was not standardized (average, 77 days; range, 40 to 356). Our findings show that although there is some decline over time, participants show reasonably good recall of CRCS tests even after 6 months, particularly for COL.
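The citations to Fleiss (25) and Landis and Koch (26) suggest that test-retest reliability was summarized with a kappa-type agreement statistic. The following is only a generic sketch of Cohen's kappa for a binary item asked at baseline and retest, under that assumption; it is not the study's actual analysis code:

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two binary (0/1) response vectors, e.g. the same
    CRCS question answered at baseline and at retest."""
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement if the two administrations were independent,
    # given each administration's marginal "yes" proportion
    p_a = sum(a) / n
    p_b = sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)

# Toy example: 10 respondents answering a yes/no COL question twice
t1 = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
t2 = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
print(round(cohens_kappa(t1, t2), 2))  # 0.58
```

Kappa corrects raw percent agreement for the agreement expected by chance, which matters here because CRCS prevalence differs by test type and would otherwise inflate apparent reliability.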

Consistent with findings from a mammography study (15), we found no evidence that survey mode affects the validity of self-report. However, as reported in a meta-analysis in this issue of CEBP, Rauscher et al. (30) found that interviews conducted face-to-face tended to be associated with reduced self-report accuracy compared with telephone or self-administered surveys for cancer screening behaviors including mammography, Pap testing, FOBT, and endoscopy (SIG or COL).

Our validity estimates compare favorably with other studies. Rauscher et al. (30) report summary effect estimates of sensitivity and specificity for eight studies of FOBT (14, 31-37) and for four studies of SIG or COL (SIG and COL were combined in the meta-analysis; refs. 14, 32, 35, 37). For FOBT, the summary sensitivity estimate was 82%, the same as our overall estimate, and specificity was 78% compared with our 86%. For endoscopy, sensitivity was 79% compared with 89% in our study; for specificity, it was 90% compared with our 82%. Our sensitivity and specificity estimates for FOBT and BE were also similar to those reported by Partin et al. (38) who used the NCI CRCS questionnaire in a low-income veteran population; however, our report-to-records ratio for BE showed statistically significant underreporting whereas Partin et al. found statistically significant overreporting. Compared with Partin et al.'s results, we observed higher sensitivity and specificity for SIG and higher specificity for COL but similar sensitivity.

Applying Tisnado et al.'s (28) criteria, our sensitivity estimates were mostly excellent for COL, good for FOBT, and fair for SIG. Differences in accurate recall may be due to test characteristics or to respondents' familiarity with the tests. Qualitative research has consistently found that patients are confused about the distinction between SIG and COL (7-12). COL may be more memorable because it has received more attention in the media and because patients are sedated, need to arrange for transportation, and need to take a day off from work. Our high sensitivity estimates for COL support the view that patients accurately recall this experience, whereas the lower sensitivity estimates for SIG may be due to patients confusing it with COL (39). To determine whether patients who had a single documented SIG or COL were more likely to mislabel SIG as COL than vice versa, we compared the frequency of documented SIGs that were self-reported as COL with the frequency of documented COLs that were self-reported as SIG. Few respondents confused these tests: 6 of 57 (10%) self-reported COLs were documented SIGs, whereas 4 of 73 (5%) self-reported SIGs were documented COLs. In contrast, Partin et al. (38) found that 55% of false-positive reports for either endoscopy procedure had documentation for the other procedure. Future studies should examine differences in patterns of misreporting test types to understand the reasons for confusion.

A possible reason for FOBT false-positive reports may be recall bias regarding time period. Respondents may recall FOBT as occurring more recently than it did, often referred to as forward telescoping (40, 41). We examined the distribution of FOBT self-reports, measured both as month and year and within a 1-year interval, against test dates in the combined medical record that were within 12 months of the survey date. Approximately 75% accurately reported an FOBT within the past year in response to each question; however, only 61% provided month and year compared with 98% for the interval question. Given respondents' preference for the interval question and the comparable validity estimates found in our ancillary analysis, the interval question may be the preferred way to assess the recency of FOBT, at least in surveys requiring retrospective recall over long time intervals.

Our ancillary analyses showed that sensitivity and specificity estimates were minimally affected when we excluded respondents with missing values; however, the effect of these exclusions was to consistently increase sensitivity and decrease specificity. This could be a problem in studies that have a high percentage of uncertain or missing responses. Although we originally planned to include only patients who had CRC tests for screening, data on the reason for the test were not consistently available in the KSC database; however, excluding patients with gastrointestinal conditions only minimally affected our estimates. Our stratified ancillary analyses confirmed our conjectures that respondents with multiple tests were more accurate in reporting test types than single testers and that those with no documented tests were better at reporting not having a particular test than those with a CRCS test (other than the one being asked about).

We found that respondents to the telephone survey scored higher on the Marlowe-Crowne Social Desirability Scale than mail or face-to-face respondents, although only the difference between telephone and face-to-face respondents was statistically significant (means were 7.08, 6.69, and 6.64, respectively). Substantively, these differences are small, and the report-to-records ratios did not suggest that overreporting due to social desirability response tendency was more likely to occur in the telephone survey than in the other survey modes. Likewise, although mail respondents reported less healthcare utilization than telephone and face-to-face respondents (they were less likely to report a healthcare visit within the past year and more likely to report fewer than six visits within the past 5 years), there was no pattern in the validity estimates to suggest that these factors differentially influenced reporting of CRCS behaviors.

The 40% participation rate among eligible patients may reduce the generalizability of our results. During recruitment, men and women with no documented record of CRCS were more likely to refuse participation in our study, a finding similar to that reported by others (38, 42-45). Although our sampling strategy ensured an adequate number of participants with no prior CRCS, it is unclear whether participants differed from those without prior CRCS who refused, in ways that might have affected our reliability and validity estimates. Although it may have limited generalizability, we chose a study setting with a stable patient population, strong data systems, and the on-site provision of endoscopy in order to have the most complete ascertainment of our gold standard measure. Only 7% of our study participants reported having CRCS outside KSC, and we attempted to contact those providers to verify self-reports of patients who gave us permission to do so.

Valid and reliable self-report measures are a critical component of cancer prevention and control research. Acceptable values for sensitivity and specificity vary depending on how the measures will be used. A measure with low sensitivity may be acceptable if the purpose is to evaluate the efficacy of a behavior change intervention, but if the purpose is to identify those who need screening, it may be desirable to have high sensitivity, and sacrifice specificity, in order not to miss unscreened persons.

This study provides empirical support for the proposition that the reliability and validity of the NCI CRCS questionnaire is comparable across the three modes of survey administration. Researchers should base their selection of survey mode on their research objectives and on characteristics of the target population such as literacy. Future research should continue to investigate other potential sources of error and bias in self-report as was done by Beebe et al. (46) in this issue of CEBP. Finally, as new screening technologies with different test characteristics are introduced, they will need to be validated.

Grant support: PRC SIP 19-04 U48 DP000057 from the Centers for Disease Control and Prevention (S.W. Vernon, P.M. Diamond, M.E. Fernandez, A. Greisinger, and R.W. Vojvodic) and by RO1 CA97263 (S.W. Vernon).

References

1. Vernon SW, Briss PA, Tiro JA, Warnecke RB. Some methodologic lessons learned from cancer screening research. Cancer 2004;101:1131–45.
2. Hiatt RA, Klabunde CN, Breen NL, Swan J, Ballard-Barbash R. Cancer screening practices from National Health Interview Surveys: past, present, and future. J Natl Cancer Inst 2002;94:1837–46.
3. Luu A. The impact of the HIPAA privacy rule on research participation. J Biolaw Bus 2005;8:68–9.
4. McCarthy D, Shatin D, Drinkard C, Kleinman J, Gardner J. Medical records and privacy: empirical effects of legislation. Health Serv Res 1999;34:417–25.
5. Nosowsky R, Giordano T. The Health Insurance Portability and Accountability Act of 1996 (HIPAA) privacy rule: implications for clinical research. Ann Rev Med 2006;57:575–90.
6. Vernon SW, Meissner HI, Klabunde CN, et al. Measures for ascertaining use of colorectal cancer screening in behavioral, health services, and epidemiologic research. Cancer Epidemiol Biomarkers Prev 2004;13:898–905.
7. Bastani R, Gallardo NV, Maxwell AE. Barriers to colorectal cancer screening among ethnically diverse high- and average-risk individuals. J Psychosoc Oncol 2001;19:65–84.
8. Goel V, Gray RE, Chart PL, Fitch M, Saibil F, Zdanowicz Y. Perspectives on colorectal cancer screening: a focus group study. Health Expect 2004;7:51–60.
9. Greisinger A, Hawley ST, Bettencourt JL, Perz CA, Vernon SW. Primary care patients' understanding of colorectal cancer screening. Cancer Detect Prev 2006;30:67–74.
10. Beeker C, Kraft JM, Southwell BG, Jorgensen CM. Colorectal cancer screening in older men and women: qualitative research findings and implications for intervention. J Community Health 2000;25:263–77.
11. Weitzman ER, Zapka JG, Estabrook B, Goins KV. Risk and reluctance: understanding impediments to colorectal screening. Prev Med 2001;32:502–13.
12. Brouse CH, Basch CE, Wolf RL, Shmukler C, Neugut AI, Shea S. Barriers to colorectal cancer screening with fecal occult blood testing in a predominantly minority urban population: a qualitative study. Am J Public Health 2003;93:1268–71.
13. Madlensky L, McLaughlin JR, Goel V. A comparison of self-reported colorectal cancer screening with medical records. Cancer Epidemiol Biomarkers Prev 2003;12:656–9.
14. Baier M, Calonge BN, Cutter GR, et al. Validity of self-reported colorectal cancer screening behavior. Cancer Epidemiol Biomarkers Prev 2000;9:229–32.
15. Zapka JG, Bigelow C, Hurley TG, et al. Mammography use among sociodemographically diverse women: the accuracy of self-report. Am J Public Health 1996;86:1016–21.
16. Holbrook AL, Green MC, Krosnick JA. Telephone versus face-to-face interviewing of national probability samples with long questionnaires: comparisons of respondent satisficing and social desirability response bias. Public Opin Q 2003;67:79–125.
17. Hawley ST, Vernon SW, Levin B, Vallejo B. Prevalence of colorectal cancer screening in a large medical organization. Cancer Epidemiol Biomarkers Prev 2004;13:314–9.
18. American Cancer Society. Cancer facts and figures, 2007. Report. Atlanta (GA): American Cancer Society; 2007.
19. Crowne DP, Marlowe DA. A new scale of social desirability independent of psychopathology. J Consult Clin Psychol 1960;24:349–54.
20. Marlowe DA, Crowne DP. Social desirability and response to perceived situational demands. J Consult Psychol 1961;25:109–15.
21. Strahan R, Gerbasi KC. Short, homogeneous versions of the Marlowe-Crowne social desirability scale. J Clin Psychol 1972;28:191–3.
22. Dillman DA. Mail and telephone surveys: the total design method. New York (NY): John Wiley & Sons; 1978.
23. Fiscella K, Holt K, Meldrum S, Franks P. Disparities in preventive procedures: comparisons of self-report and Medicare claims data. BMC Health Serv Res 2006;6:1–8.
24. Peabody JW, Luck J, Glassman P, Dresselhaus TR, Lee M. Comparison of vignettes, standardized patients, and chart abstraction: a prospective validation study of 3 methods for measuring quality. JAMA 2000;283:1715–22.
25. Fleiss JL. Statistical methods for rates and proportions. 2nd ed. New York (NY): John Wiley & Sons; 1981.
26. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74.
27. Warnecke RB, Sudman S, Johnson TP, O'Rourke DP, Davis AM, Jobe JB. Cognitive aspects of recalling and reporting health-related events: Papanicolaou smears, clinical breast examinations, and mammograms. Am J Epidemiol 1997;146:982–92.
28. Tisnado DM, Adams JL, Liu H, et al. What is the concordance between the medical record and patient self-report as data sources for ambulatory care? Med Care 2006;44:132–40.
29. Bradbury BD, Brooks DR, Brawarsky P, Mucci LA. Test-retest reliability of colorectal testing questions on the Massachusetts Behavioral Risk Factor Surveillance System (BRFSS). Prev Med 2005;41:303–11.
30. Rauscher GH, Johnson TP, Cho YI, Walk JA. Accuracy of self-reported cancer screening histories: a meta-analysis. Cancer Epidemiol Biomarkers Prev 2008;17:748–57.
31. Sudman S, Warnecke RB, Johnson TP, O'Rourke DP. Cognitive aspects of reporting cancer prevention examinations and tests. Report #6. Hyattsville (MD): National Center for Health Statistics; 1994. p. 1–171.
32. Hiatt RA, Pérez-Stable EJ, Quesenberry CP, Jr., Sabogal F, Otero-Sabogal R, McPhee SJ. Agreement between self-reported early cancer detection practices and medical audits among Hispanic and non-Hispanic White health plan members in northern California. Prev Med 1995;24:278–85.
33. Brown JB, Adams ME. Patients as reliable reporters of medical care process: recall of ambulatory encounter events. Med Care 1992;30:400–11.
34. Mandelson MT, LaCroix AZ, Anderson LA, Nadel MR, Lee NC. Comparison of self-reported fecal occult blood testing with automated laboratory records among older women in a health maintenance organization. Am J Epidemiol 1999;150:617–21.
35. Hall HI, van den Eeden SK, Tolsma DD, et al. Testing for prostate and colorectal cancer: comparison of self-report and medical record audit. Prev Med 2004;39:27–35.
36. Lipkus IM, Samsa GP, Dement J, et al. Accuracy of self-reports of fecal occult blood tests and test results among individuals in the carpentry trade. Prev Med 2003;37:513–9.
37. Gordon NP, Hiatt RA, Lampert DI. Concordance of self-reported data and medical record audit for six cancer screening procedures. J Natl Cancer Inst 1993;85:566–70.
38. Partin MR, Grill J, Noorbaloochi S, et al. Validation of self-reported colorectal cancer screening behavior from a mixed-mode survey of veterans. Cancer Epidemiol Biomarkers Prev 2008;17:768–76.
39. Meissner HI, Breen NL, Klabunde CN, Vernon SW. Patterns of colorectal cancer screening uptake among men and women in the US. Cancer Epidemiol Biomarkers Prev 2006;15:389–94.
40. May DS, Trontell AE. Mammography use by elderly women: a methodological comparison of two national data sources. Ann Epidemiol 1998;8:439–44.
41. Prohaska V, Brown NR, Belli RF. Forward telescoping: the question matters. Memory 1998;6:455–65.
42. Lindholm E, Berglund B, Haglind E, Kewenter J. Factors associated with participation in screening for colorectal cancer with faecal occult blood testing. Scand J Gastroenterol 1995;30:171–6.
43. Vernon SW, Acquavella JF, Yarborough CM, Hughes JI, Thar WE. Reasons for participation and nonparticipation in a colorectal cancer screening program for a cohort of high risk polypropylene workers. J Occup Med 1990;32:46–51.
44. Kelly RB, Shank JC. Adherence to screening flexible sigmoidoscopy in asymptomatic patients. Med Care 1992;30:1029–42.
45. Bastani R, Maxwell AE, Glenn B, Mojica C, Chang C, Ganz PA. Validation of self-reported colorectal cancer screening in a study of ethnically diverse first degree relatives of CRC cases. Cancer Epidemiol Biomarkers Prev 2008;17:791–8.
46. Beebe TJ, Davern ME, McAlpine DD, Call KT, Rockwood TH. Increasing response rates in a survey of Medicaid enrollees: the effect of a prepaid monetary incentive and mixed modes (mail and telephone). Med Care 2005;43:411–20.