Introduction: To determine whether a Web-based survey was an acceptable method of data collection for a clinic-based case-control study of adult brain cancer, the authors compared the reliability of paired responses to a main and resurvey for participants completing surveys by telephone (n = 74) or self-administered on the Web (n = 465) between 2003 and 2006.

Methods: Recruitment of cases was done at the Evanston Northwestern Healthcare Kellogg Cancer Care Center and the Duke University Medical Center Cancer Control division, and controls were friends and siblings of cases. Twenty-five variables were examined, including smoking, oral contraceptive and residential histories, water sources, meat preparation, fruit and vegetable consumption, and pesticide use. Weighted and simple κ's were estimated for categorical and binary variables, respectively.

Results: The number of concordant paired responses was summed for use in linear regression. Respondents were 97% White and 85% had postsecondary education. Kappa's for individual questions ranged from 0.31 (duration of residence in a single family house) to 0.96 (ever smoked), with a median of 0.57 (95% confidence interval, 0.47-0.64). The mean number of concordant responses was 16.2 (range, 5-22). Reliability was greater for controls than cases, Web-based versus telephone responders, females, and higher-income responders. Frequency of e-mail and Internet use was not associated with reliability.

Conclusions: A self-administered, Web-based survey was a feasible and appropriate mode of interview in this study. The comparable reliability of Web and telephone responses suggests that Web-based self-interviews could be a cost-effective alternative to traditional modes of interview. (Cancer Epidemiol Biomarkers Prev 2008;17(10):2639–46)

According to the Pew Internet and American Life Project, Internet usage in the United States has increased over time in all age categories, with 71% of American adults currently using the Internet at least occasionally. Although this percentage is higher among younger adults, 65% of Americans aged 50 to 64 years currently use the Internet and 93% of adults living in a household with an income >$75,000/y are online (1). Web-based technology has been used extensively to collect data in the areas of public opinion and commercial research but has been underutilized as a method of data collection for epidemiologic studies (2), especially in the exploration of the role of human exposures in the causation of cancer. Underutilization of this technology may be due to concern that general population studies would include individuals who are not regular Internet users and may not be comfortable completing a lengthy exposure assessment questionnaire online.

Although it has been recommended to keep the length of Web-administered surveys to <20 min (3), risk factor questionnaires tend to be considerably longer, especially when studying a disease with a complex and largely unknown etiology. Because using the Web for exposure assessment in health research is so new, it is important to assess the reliability of responses obtained from this mode of data collection. Our pilot case-control study on brain cancer provided an opportunity to compare the reliability of paired responses to a questionnaire administered either by telephone or via the Internet.

Between 2003 and 2006, we collected data from brain tumor cases and their friend and sibling controls about their personal histories and self-reported exposures to substances that have been identified as animal neurocarcinogens, including occupational, environmental, and food exposures. Participants had the option of completing the interview either by telephone or self-administered over the Internet. The substudy described here examined the reliability of responses to a subset of items on the main questionnaire. This reliability substudy had three primary goals: (a) to determine whether participants would choose the Web self-interview or would opt for the more traditional telephone interview, (b) to determine whether the reliability of responses from the Web-based survey was comparable with that of responses from the telephone-administered survey, and (c) to determine whether reliability was affected by case-control status and other participant characteristics. To our knowledge, this is the first reliability study to be done for a comprehensive risk factor questionnaire in a sample of brain tumor cases and their friend and sibling controls.

Research Design Overview

Brain cancer patients were ascertained at two specialty clinics (Duke University Medical Center in North Carolina and Evanston Northwestern Healthcare in Illinois) that obtain patient referrals from a broad geographic region. The Duke clinic serves brain tumor patients of all ages with approximately one third coming from North Carolina, South Carolina, and southern Virginia, and the rest coming from other areas of the United States or other countries. The Evanston Northwestern Healthcare clinic draws 93% of its patient referrals from Illinois, with the remainder primarily from other Midwest states such as Wisconsin and Indiana. Cases were ascertained between 2003 and 2006. To be eligible, cases had to (a) have had a histologically confirmed diagnosis of a first primary brain tumor (International Classification of Diseases for Oncology, 3rd edition, sites C70.0-C72.9 and C75.1-C75.3) with a histology of glioblastoma (International Classification of Diseases for Oncology, 3rd edition, histology codes 9440-9442), astrocytoma grades 2 and 3 (9400-9411 and 9420), or oligodendroglial tumor (9382 and 9450-9451); (b) be ≥18 years of age; (c) speak English; and (d) reside within the United States. Cognitive function of cases was assessed by the doctor at the time consent was obtained.

Because it was not possible to identify and draw a sample of controls representing the underlying referral population for the two clinics, patients were asked to provide names and contact information for up to two siblings and three friends who might agree to serve as study controls. Eligible sibling and friend controls had to be at least 18 years old, residents of the United States, and have no history of brain cancer. Institutional review board approval for this study was obtained from the University of Illinois at Chicago, Duke University, and Evanston Northwestern Healthcare institutional review boards. Participant recruitment and consent to participate in each section of the study were done in person with cases at the clinic and by telephone or mail for controls. Case and control interviews were conducted between 2003 and 2006.

Choice of Survey Format

Participants were encouraged to complete the interview via the Internet but were offered a telephone-administered interview if they preferred. Patients who chose to complete the Web-based version were provided with a packet containing a confidential and unique username and password and a set of detailed instructions. The home page for the Web-based survey listed the different sections of the instrument and the survey had a status bar on each page to show the participant what percentage of the survey questions was completed. Participants completing the Web-based survey had the opportunity to complete it at their leisure and at multiple locations, to log in and out multiple times as needed or desired, and to call the toll-free support line if they had questions or problems. Participants who chose the telephone-administered survey were also given the opportunity to complete the survey during more than one call if needed. Although the questions and response choices were identically worded for the Web- and telephone-administered versions of the survey, some transition phrases were added to the telephone version to facilitate the segue from one section to the next.

A total of 679 cases were eligible to participate in the study, 359 of whom consented to the main survey; 269 cases completed the main survey, 49 by telephone and 220 over the Internet, which resulted in an overall response rate of 40% for cases in the main survey. A total of 651 controls were eligible to participate in the study, 532 of whom consented to the main survey; 400 controls completed the main survey, 48 by telephone and 353 over the Internet, which resulted in an overall response rate of 61% for controls in the main survey.

Reliability Substudy

Several weeks after completing the main survey, all participants were recontacted and asked to complete a short resurvey. The resurvey consisted of a subset of 25 questions from the main survey. Brain tumor cases and their sibling and friend controls were included in this analysis if they completed the main survey and the brief resurvey using the same survey mode for each, either by telephone or self-administered on the Web.

Of the 269 cases and 400 controls who completed the main survey, 222 (83%) and 337 (84%) also completed a resurvey, respectively. Twenty participants were excluded from the reliability study because they completed the main survey and resurvey via different modes. Of the 539 remaining, 74 completed both surveys via telephone and 465 completed both online. The median time between completion of the main survey and the resurvey was 36 days via telephone and 48 days via the Web.
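The participant flow described above can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the text; the variable names are illustrative and are not part of the study's analysis code.

```python
# Counts taken from the text of this section.
completed_main = {"cases": 269, "controls": 400}
completed_resurvey = {"cases": 222, "controls": 337}
mixed_mode_excluded = 20          # main survey and resurvey in different modes
telephone_both, web_both = 74, 465


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Share of main-survey completers who also completed the resurvey.
resurvey_pct = {
    group: pct(completed_resurvey[group], completed_main[group])
    for group in completed_main
}

# Analytic sample for the reliability substudy.
analytic_sample = sum(completed_resurvey.values()) - mixed_mode_excluded

print(resurvey_pct)
print(analytic_sample, telephone_both + web_both)
```

The two numbers on the last line should agree, since the analytic sample is the sum of the telephone-only and Web-only respondents.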

The following 25 questions from the main survey were asked during the resurvey: where did the respondent stay as a child (home, relative's home, daycare); primary source of drinking water as a child (city water, household well, bottled water); ever smoked cigarettes, average number of cigarettes smoked per day (continuous); frequency of adult dental X-rays (never or only as a child, at least once a year, 2-3 years, 4-5 years, or less often); ever used oral contraceptives; age at which the respondent began using oral contraception (continuous); and duration of use (continuous). There were separate questions on duration of residence (0, 1-9, 10-19, and ≥20 years) in each of the following housing structures: a mobile home or trailer, 1-family house, 1-family house attached to ≥1 houses, a building with 1 to 9 apartments, a building with 10 to 49 apartments, and a building with ≥50 apartments. Respondents were asked separately how long they had ever lived near an industrial facility or a gas station, using the same scale. There were also separate questions about duration of residence (0, 1-9, 10-19, and ≥20 years) with the following domestic water sources: public or commercial water system, private well, and cistern. Questions were asked about the frequency of consumption (never, rarely, sometimes, often) of broiled food, grilled or barbecued food, and food charred or blackened by burning, as well as the frequency of consumption of home-grown fruits and vegetables during the summertime. Finally, there were questions on the duration of residence on a farm and the number of years residing in a place where pesticides were professionally applied indoors (each with response categories of 0, 1-9, 10-19, and ≥20 years).

Statistical Analysis

We calculated the percentage of participants choosing the Web-based survey, overall and by case-control status, within categories of age, gender, race, education, income, and frequency of Internet and e-mail use. χ2 Tests of association were done overall and by case-control status to determine whether choice of survey mode was significantly associated with these characteristics (α = 0.05). In addition, a separate logistic regression model predicting choice of survey mode was fit for each characteristic, including terms for case-control status and the interaction between case-control status and the characteristic, to determine whether the relationship between survey mode and the characteristic differed by case-control status. The P value from the type 3 test was reported.
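As a minimal sketch of the χ2 test of association described above, the following computes the Pearson statistic and its P value for a 2 × 2 table (for example, a binary characteristic by survey mode). It is an illustration only: the counts are hypothetical, no continuity correction is applied, and the characteristics with more than two categories would need a general chi-square distribution rather than the 1-degree-of-freedom shortcut used here.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (1 df, no continuity correction) for a 2x2 table.

    table = [[a, b], [c, d]]; returns (statistic, p_value).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, r in enumerate(row_totals):
        for j, k in enumerate(col_totals):
            expected = r * k / n
            stat += (table[i][j] - expected) ** 2 / expected
    # With 1 df, P(chi-square > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts, for illustration only.
# Rows: characteristic absent / present; columns: telephone / Web.
example = [[24, 50], [50, 415]]
stat, p = chi_square_2x2(example)
print(round(stat, 2), p < 0.05)  # compare P with alpha = 0.05
```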

We also examined how survey process characteristics varied by survey mode. These included the perceived level of difficulty completing the survey, perceived length of the survey, whether the respondent received help in completing the survey, and the total number of minutes, sessions, and days to complete the survey.

To examine the reliability of responses to individual questions, we recoded the continuous variables into 4 to 6 categories: the number of cigarettes smoked per day (1-4, 5-9, 10-19, ≥20), the number of years oral contraceptives were used (≤1, 2, 3-5, 6-10, >10 years), and the participant's age when she started oral contraceptives (<20, 20-29, 30-39, 40-49, ≥50). We then estimated simple and weighted κ's for binary and ordinal categorical variables, respectively, along with their 95% asymptotic confidence limits (4-6). Estimated κ values for each question were graphed in ascending order. We then re-estimated and plotted κ's after stratifying on survey mode, case-control status, gender, and income level.
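The κ statistic compares observed agreement between the main survey and resurvey with the agreement expected by chance from the marginal distributions. The sketch below is one way to compute simple and weighted κ for paired responses. The weighting scheme is not stated in this section, so linear agreement weights are an assumption here (quadratic weights are another common choice), and confidence limits are omitted.

```python
def cohen_kappa(main, resurvey, categories, weighted=False):
    """Cohen's kappa for paired responses from two administrations.

    categories must be listed in their natural order. With weighted=True,
    agreement weights decrease linearly with the distance between categories
    (an assumed scheme, not necessarily the one used in the study).
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(main)
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(main, resurvey):
        counts[index[a]][index[b]] += 1
    row = [sum(counts[i]) / n for i in range(k)]
    col = [sum(counts[i][j] for i in range(k)) / n for j in range(k)]

    def weight(i, j):
        if weighted and k > 1:
            return 1 - abs(i - j) / (k - 1)
        return 1.0 if i == j else 0.0

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(weight(i, j) * counts[i][j] / n for i, j in cells)
    expected = sum(weight(i, j) * row[i] * col[j] for i, j in cells)
    if expected == 1:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)


# Hypothetical paired answers on a 3-level ordinal item (0 < 1 < 2).
first = [0, 1, 2, 2]
second = [0, 1, 2, 1]
print(cohen_kappa(first, second, [0, 1, 2]))
print(cohen_kappa(first, second, [0, 1, 2], weighted=True))
```

For a binary item the two versions coincide, because with two categories the linear weights reduce to exact agreement versus disagreement.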

In addition, we summed the number of concordant responses between the main survey and resurvey for each participant across the 22 questions, after excluding the three questions on oral contraceptive histories that were not asked of men. The result was an ordinal variable that was approximately normally distributed, with a possible range between 0 and 22. We used this variable as a dependent variable in linear regression analyses to examine participant-level predictors of reliability. First, the dependent variable was regressed one at a time with each independent variable to screen for variables to include in a final model. The only variables that were associated with reliability were gender, case-control status, survey mode, and household income, and these were included together in the final multivariate linear regression model of reliability.
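The participant-level concordance score described above can be sketched as a simple count of matching answers. The question identifiers below are hypothetical, and the handling of unanswered items is not described in this section; here an item missing from the resurvey is assumed to count as non-concordant.

```python
def count_concordant(main, resurvey, excluded=()):
    """Number of questions answered identically on the main survey and resurvey.

    main and resurvey map question identifiers to responses. Questions listed
    in excluded (for example, items asked of women only) are ignored.
    """
    return sum(
        1
        for question, answer in main.items()
        if question not in excluded
        and question in resurvey
        and resurvey[question] == answer
    )


# Hypothetical responses for one participant (identifiers are illustrative).
main_survey = {"ever_smoked": "yes", "farm_years": "1-9", "ever_used_oc": "no"}
resurvey = {"ever_smoked": "yes", "farm_years": "10-19", "ever_used_oc": "no"}
score = count_concordant(main_survey, resurvey, excluded={"ever_used_oc"})
print(score)
```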

Respondents to the main survey were 97% White, with 85% having postsecondary education (data not shown). Controls and participants who were younger, who were college-educated, and who had higher incomes were more likely to choose the Web-based survey over the telephone-administered survey. There was a monotonic increase in choice of Web-based survey with decreasing age, increasing household income, and increasing frequency of e-mail and Internet use (Table 1). There was no difference in the relationship between survey mode and participant characteristics for cases compared with controls.

Table 1.

Percentage of respondents who chose the Web-based survey format, overall and by case-control status, and associations between the choice of survey mode (Web versus telephone–administered) and participant characteristics

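| Characteristic | n | Cases choosing Web (%) | Controls choosing Web (%) | All participants choosing Web (%) | P* |
| --- | --- | --- | --- | --- | --- |
| Case-control status | | n = 209 | n = 330 | | |
| Case | 209 | — | — | 83 | |
| Sibling control | 153 | — | — | 90 | |
| Friend control | 177 | — | — | 88 | |
| P | | — | — | 0.15 | — |
| Age | | | | | |
| <40 | 143 | 95 | 96 | 96 | |
| 40-49 | 125 | 84 | 88 | 86 | |
| 50-59 | 159 | 77 | 89 | 85 | |
| 60+ | 107 | 67 | 80 | 76 | |
| P | | 0.002 | 0.02 | 0.0001 | 0.79 |
| Gender | | | | | |
| Male | 267 | 84 | 92 | 88 | |
| Female | 275 | 82 | 86 | 84 | |
| P | | 0.67 | 0.09 | 0.19 | 0.37 |
| Race or ethnicity | | | | | |
| White | 517 | 85 | 89 | 87 | |
| Non-White | 16 | 82 | 100 | 88 | |
| P | | 0.81 | 0.43 | 0.97 | 0.97 |
| College educated | | | | | |
| No | 74 | 61 | 72 | 68 | |
| Yes | 465 | 87 | 91 | 89 | |
| P | | 0.0006 | 0.0003 | 0.0001 | 0.94 |
| Household income | | | | | |
| <25,000 | 35 | 75 | 53 | 63 | |
| 25,000-50,000 | 72 | 73 | 88 | 83 | |
| 51,000-75,000 | 103 | 73 | 89 | 86 | |
| 76,000-100,000 | 90 | 79 | 94 | 88 | |
| >100,000 | 229 | 90 | 92 | 91 | |
| P | | 0.24 | <0.0001 | 0.0002 | 0.16 |
| Have an e-mail account | | | | | |
| No | 49 | 55 | 59 | 57 | |
| Yes | 490 | 86 | 91 | 89 | |
| P | | 0.0005 | <0.0001 | 0.0001 | 0.52 |
| Check e-mail daily | | | | | |
| Daily | 315 | 92 | 93 | 93 | |
| Few times a week | 111 | 83 | 90 | 87 | |
| Weekly | 46 | 81 | 90 | 87 | |
| Less/never | 18 | 44 | 69 | 44 | |
| P | | <0.0001 | 0.02 | 0.0001 | 0.87 |
| Use the Internet daily | | | | | |
| Daily | 204 | 91 | 96 | 94 | |
| Few times a week | 160 | 88 | 91 | 89 | |
| Weekly | 71 | 86 | 95 | 92 | |
| Less/never | 104 | 60 | 64 | 63 | |
| P | | <0.0001 | <0.0001 | 0.0001 | 0.57 |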
* P value for the interaction term between case-control status and the characteristic, from a logistic regression model predicting survey mode.

P value for the χ2 test of the association between survey choice and characteristics, among cases only, controls only, and overall.

Web respondents were more likely than telephone respondents to report receiving help from another person while completing the survey. In addition, Web respondents were more likely to have >1 login session and complete the main survey during a span of >1 day when compared with telephone respondents (Table 2).

Table 2.

Distribution of interview process characteristics and associations between the choice of survey mode (Web versus telephone–administered) and interview characteristics

Characteristic | n | Telephone (%) | Web (%) | P
Questionnaire too difficult     
    No 496 93 92  
    Yes 39 0.85 
Questionnaire too long     
    No 407 88 82  
    Yes 85 12 18 0.23 
Received help     
    No 440 91 81  
    Yes 97 19 0.04 
Minutes to complete     
    ≤30 16  
    31-60 193 57 57  
    61-90 80 37 22  
    >90 47 15 0.04 
Sessions to complete     
    1 252 65 44  
    2 145 20 28  
    3 78 15  
    4+ 64 13 0.008 
Days to complete     
    Within a day 454 94 83  
    Longer than a day 81 17 0.02 

Across all 25 questions, the overall median value of κ was 0.57 (range, 0.31-0.96). Kappa values were generally higher among Web-based compared with telephone-based respondents. Questions on oral contraceptive use and smoking had the highest κ values overall, ranging from 0.75 to 0.96. The 4 questions on frequency of dietary habits had κ values ranging from 0.40 to 0.50, whereas the questions related to residential histories had a broad range of κ values from 0.31 to 0.81 (Table 3).

Table 3.

Kappa values and 95% confidence intervals for the 25 binary and ordinal variables included in the main survey and resurvey, overall and by survey mode, ordered by increasing κ value overall

Number | Variable | Overall κ | Overall 95% CI | Web κ | Web 95% CI | Telephone κ | Telephone 95% CI
1 Duration of residence in a single family home 0.31 0.18-0.44 0.34 0.20-0.48 0.14 −0.22-0.50 
2 Duration of residence in a midsized apartment building 0.39 0.31-0.47 0.41 0.32-0.50 
3 Duration of drinking water from cistern 0.40 0.23-0.57 0.46 0.27-0.65 
4 Frequency of consuming broiled food 0.40 0.34-0.46 0.42 0.35-0.48 0.32 0.14-0.50 
5 Frequency of consuming charred food 0.43 0.37-0.49 0.38 0.32-0.45 0.67 0.55-0.80 
6 Where stayed as a child 0.47 0.34-0.60 0.48 0.34-0.61 
7 Frequency of consuming grilled food 0.48 0.41-0.55 0.40 0.21-0.58 
8 Duration of residence in a small apartment building 0.49 0.42-0.56 0.52 0.44-0.59 0.31 0.10-0.52 
9 Frequency of consuming fresh fruits and vegetables 0.50 0.44-0.55 0.49 0.43-0.55 0.55 0.41-0.68 
10 Duration of residence near an industrial facility 0.52 0.38-0.66 0.52 0.36-0.67 0.54 0.24-0.84 
11 Frequency of dental X-rays in adulthood 0.52 0.46-0.58 0.49 0.42-0.56 0.71 0.58-0.84 
12 Duration of using professional exterminator 0.55 0.49-0.60 0.55 0.49-0.62 
13 Duration of drinking water from a public source 0.57 0.50-0.64 0.59 0.52-0.67 0.38 0.11-0.65 
14 Duration of residence near a gas station 0.58 0.49-0.66 0.59 0.50-0.68 0.52 0.30-0.74 
15 Duration of residence in a large apartment building 0.60 0.52-0.69 0.64 0.55-0.73 0.30 −0.01-0.61 
16 Duration of residence in an attached home 0.61 0.53-0.68 0.62 0.55-0.70 0.49 0.30-0.69 
17 Source of drinking water as a child 0.63 0.53-0.73 0.64 0.54-0.75 
18 Duration of residence on a farm 0.64 0.56-0.72 0.65 0.57-0.74 0.57 0.36-0.77 
19 Duration of drinking water from a private well 0.72 0.67-0.77 0.76 0.71-0.80 0.52 0.35-0.68 
20 Age started oral contraceptives 0.75 0.68-0.83 0.80 0.72-0.87 0.53 0.25-0.82 
21 Years used oral contraceptives 0.78 0.73-0.82 0.77 0.72-0.82 0.79 0.68-0.90 
22 Duration of residence in a mobile home 0.81 0.74-0.88 0.82 0.74-0.90 0.76 0.59-0.94 
23 Cigarettes smoked per day 0.90 0.88-0.93 0.90 0.87-0.93 0.91 0.86-0.96 
24 Ever used oral contraceptives 0.92 0.85-0.98 0.96 0.91-1.00 0.76 0.53-0.98 
25 Ever smoked cigarettes 0.96 0.94-0.99 0.97 0.95-0.99 0.92 0.83-1.00 

Abbreviation: 95% CI, 95% confidence interval.

* κ was not computed because zero participants chose one of the response values for this question in either the main or resurvey but one or more did choose that value on the other survey.
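The overall median and range of κ reported in the text can be reproduced from the overall column of Table 3. The snippet below is a simple consistency check using those 25 values.

```python
from statistics import median

# Overall kappa values for the 25 questions, copied from Table 3.
overall_kappa = [
    0.31, 0.39, 0.40, 0.40, 0.43, 0.47, 0.48, 0.49, 0.50, 0.52,
    0.52, 0.55, 0.57, 0.58, 0.60, 0.61, 0.63, 0.64, 0.72, 0.75,
    0.78, 0.81, 0.90, 0.92, 0.96,
]

print(len(overall_kappa))                      # 25 questions
print(median(overall_kappa))                   # 0.57
print(min(overall_kappa), max(overall_kappa))  # 0.31 0.96
```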

Kappa values for individual questions tended to be higher for Web respondents, controls, women, and higher-income respondents when compared with telephone respondents, cases, men, and lower-income respondents, respectively (Fig. 1). Within cases, κ values did not seem to be different for patients with more (high-grade tumor or a glioblastoma) versus less severe disease, although data were sparse (data not shown).

Figure 1.

Kappa values for individual questions, ordered from lowest to highest Kappa value, by interview mode, case-control status, gender, age, income, and education.


The mean number of concordant responses across the 22 questions that were common to men and women was 16.2 (median, 16), ranging from 5 to 22. Table 4 shows that average concordance was 0.56 questions higher in Web respondents versus telephone respondents (P = 0.07) and 0.40 questions higher in controls versus cases (P = 0.06). Average concordance was also higher in women versus men and in participants with higher incomes (Table 4). Age, educational attainment, frequency of e-mail use, and frequency of Internet use were not associated with the number of concordant responses. The association between survey mode and the number of concordant responses did not differ by case-control status, given that an interaction term added to the final model was not statistically significant (P = 0.75; data not shown).

Table 4.

Bivariate and multivariate associations between survey mode (Web versus telephone–administered), interview process characteristics, demographics, and the number of concordant responses in the main survey and resurvey

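| Variable | n | Bivariate β | Bivariate P | Multivariate β | Multivariate P |
| --- | --- | --- | --- | --- | --- |
| Status | | | | | |
| Case (ref) | 209 | | | | |
| Control | 330 | 0.44 | 0.038 | 0.40 | 0.061 |
| Survey mode | | | | | |
| Telephone (ref) | 74 | | | | |
| Web | 465 | 0.71 | 0.017 | 0.56 | 0.068 |
| Gender | | | | | |
| Male (ref) | 264 | | | | |
| Female | 275 | 0.37 | 0.073 | 0.40 | 0.051 |
| Income | | | | | |
| Ordinal, 5 categories | 529 | 0.30 | 0.0001 | 0.29 | 0.0002 |
| Age | | | | | |
| Years (ordinal) | 534 | 0.001 | 0.89 | | |
| Any college | | | | | |
| Yes (ref) | 465 | | | | |
| No | 74 | 0.20 | 0.51 | | |
| E-mail frequency | | | | | |
| Ordinal, 4 categories | 490 | −0.08 | 0.53 | | |
| Internet frequency | | | | | |
| Ordinal, 4 categories | 539 | −0.06 | 0.40 | | |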

NOTE: Number of concordant responses was measured on a scale from 0 to 22, excluding 3 questions for oral contraceptive use, which only applied to women.

Abbreviation: ref, reference.
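For a binary predictor coded 0 for the reference group and 1 otherwise, the bivariate β in Table 4 can be read as the difference in mean number of concordant responses between the two groups. The sketch below illustrates this with ordinary least squares on hypothetical data; it does not reproduce the study's estimates, and the multivariate coefficients are additionally adjusted for the other variables in the model.

```python
def simple_ols(x, y):
    """Intercept and slope from ordinary least squares with one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: 0 = telephone (reference), 1 = Web.
mode = [0, 0, 0, 1, 1, 1, 1]
concordant = [14, 16, 15, 17, 16, 18, 15]

intercept, beta = simple_ols(mode, concordant)
mean_tel = sum(c for m, c in zip(mode, concordant) if m == 0) / mode.count(0)
mean_web = sum(c for m, c in zip(mode, concordant) if m == 1) / mode.count(1)

# The slope equals the difference in group means; the intercept is the
# mean of the reference group.
print(round(beta, 2), round(mean_web - mean_tel, 2), round(intercept, 2))
```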

Results from this reliability study suggest that a self-administered, Web-based survey is a feasible and appropriate method for collecting data about environmental and other exposure conditions in case-control studies on malignant brain cancer. Upon encouragement to do so, participants overwhelmingly chose the Web-based survey mode over the telephone-administered survey, and the reliability of responses via the Web exceeded the reliability of responses via telephone, even after controlling for the demographics of participants and frequency of e-mail and Internet use. Increased reliability among Web-based respondents may have been due in part to their ability to complete the main survey over several sessions and a longer period of time, which may have allowed them more opportunity for retrospection, verification, and seeking help from others when recalling events. Alternatively, because participants were not randomized to either the Web-based or telephone-administered survey mode, the apparently greater reliability for Web-based respondents may be due to a tendency for more reliable responders to choose the Web-based survey. Although we attempted to control for differences in socioeconomic status and level of comfort with the Internet, residual confounding may still explain the greater reliability observed for Web-based responses. The relatively low response rate in this study may have inflated estimates of reliability for both modes of interview, if nonresponders tended to also be less reliable reporters. Nonetheless, our results suggest that the reliability of Web-based responders is at least comparable with that of telephone responders.

Studies in other areas of health research have shown better or equal reliability for Web-based surveys compared with telephone-administered or paper-based surveys for exposures such as alcohol intake (7-9), general health status, and smoking cessation (7). Most of these studies randomized participants into one survey mode or the other and were done in younger populations such as college students or in populations of Internet users recruited through Web sites. Our clinic-based sample of brain tumor patients was very different from these populations. Among our healthy adult controls, who are more comparable with populations previously studied, the number of concordant responses between the main survey and the resurvey was significantly higher for the Web-based survey compared with the telephone survey (data not shown).

As anticipated, survey responses were more reliable for controls than for brain cancer cases, consistent with the cognitive decline affecting memory and attention that often occurs in brain tumor cases as a result of their disease (10) or treatment for their disease (11). Contrary to expectation, however, we did not find differentially lower reliability among cases with glioblastomas and other high-grade tumors, for whom cognitive decline might be expected to be more pronounced than those with lower-grade tumors. This may have been the result of data sparseness due to stratifying within cases, which limited our ability to detect effects.

We found that the traditional epidemiologic risk factors had higher reliability than the dietary and environmental measures. Consistent with previous research (12-15), reliability for smoking and oral contraceptive histories was very high in our study. Agreement for the four dietary history items in our study was found to be “moderate,” according to the cutoffs established by Landis and Koch (16), and comparable with those from nutrient intake assessments in studies on other diseases (17-21). The reliability of responses for the environmental exposure items such as housing, residence on a farm, pesticide exposure, and water source was also generally modest. Because these individual food and environmental items will be used to construct participant-specific indices of potential neurocarcinogen exposure, the modest reliability for these responses could lead to exposure misclassification, which may substantially attenuate any associations between exposure and disease.
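The descriptive labels mentioned above can be applied mechanically. The sketch below encodes the agreement bands as they are commonly cited from Landis and Koch; the cutoffs are not listed in this article, so the exact boundaries used here should be treated as an assumption, and the example κ values are taken from Table 3.

```python
def agreement_label(kappa):
    """Descriptive label for a kappa value using commonly cited Landis-Koch bands."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Selected overall kappa values from Table 3.
examples = {
    "Frequency of consuming charred food": 0.43,
    "Duration of residence on a farm": 0.64,
    "Ever smoked cigarettes": 0.96,
}
for item, value in examples.items():
    print(item, value, agreement_label(value))
```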

Previous research suggests that respondents are less likely to underreport sensitive issues in the context of a self-interview such as a paper-based survey, audio computer–assisted interview, or a Web-based survey (9, 22, 23). Although the interview for the current study did not focus on sensitive issues, recent evidence has suggested that exposure to marijuana may be related to the development of brain tumors (24), so sensitive topics such as illicit drug use may be important in future data collection efforts in brain tumor research.

A computer-based self-interview has advantages over traditional paper-based self-interviews. The questionnaire can be programmed with logic so that questions are skipped automatically when not applicable, and error or warning messages can appear when a response falls outside the valid range. These aspects of computer-assisted modes can reduce errors in respondent reports. Web-based interviews offer additional advantages over other computer-based interviews. First, respondents have the option of completing the interview at a time and place of their choosing, and over as many sessions and as long a period of time as needed. This enables respondents to choose moments when they are best able to focus on the topic at hand, which may make them less likely to rush their answers. Second, because respondents can start and stop the interview at any time, they have opportunities to consult relatives and others who are knowledgeable about their exposure history to verify their responses.
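
The skip logic and range checks described above can be sketched as follows. This is a hedged illustration only, not the software used in this study: the `Question` class, the `validate` and `next_question` functions, and the three example items with their ranges are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Answers = Dict[str, int]


@dataclass
class Question:
    key: str
    text: str
    min_value: int
    max_value: int
    # Skip rule: the question is presented only when this returns True.
    applies: Callable[[Answers], bool] = lambda answers: True


def validate(question: Question, value: int) -> Optional[str]:
    """Return a warning message if the value is out of range, else None."""
    if not question.min_value <= value <= question.max_value:
        return (f"{question.key}: enter a value from "
                f"{question.min_value} to {question.max_value}")
    return None


def next_question(questions: List[Question], answers: Answers) -> Optional[Question]:
    """Return the first unanswered question whose skip rule allows it."""
    for q in questions:
        if q.key not in answers and q.applies(answers):
            return q
    return None


# Hypothetical items; the duration question is skipped for never-smokers.
QUESTIONS = [
    Question("ever_smoked", "Have you ever smoked? (0 = no, 1 = yes)", 0, 1),
    Question("years_smoked", "For how many years did you smoke?", 0, 90,
             applies=lambda a: a.get("ever_smoked") == 1),
    Question("used_pesticides", "Have you ever used pesticides? (0 = no, 1 = yes)", 0, 1),
]

# A never-smoker is routed past the duration question.
print(next_question(QUESTIONS, {"ever_smoked": 0}).key)
# An implausible duration triggers a warning instead of being stored.
print(validate(QUESTIONS[1], 150))
```

Keeping each skip rule as a function of the answers collected so far means that routing is decided at the moment a question would be shown, so a respondent who resumes in a later session continues from the first unanswered applicable item.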

Web-based interviews are also more rapid and cost efficient than other interview modes (9, 25, 26). Although there are fees associated with programming the questionnaire online, other costs are saved, such as paper and postage for paper-based surveys and interviewer training and personnel time for telephone-based surveys. In addition, because participants directly enter precoded data into the database, data entry costs are saved and data are available immediately for analysis (27).

The comparable reliability and cost efficiency of Web-based versus telephone interviews in our study suggest that Web-based self-interviews should be considered more often as the primary interview mode when planning epidemiologic studies in populations with Internet access, especially when the study sample is widely distributed geographically (24, 27).

Future studies should focus on the feasibility and reliability of Web-based surveys within studies on other populations and different disease states to fill the knowledge gaps in this area and describe the potential impact of survey mode on resulting measures of association between exposure and disease.

No potential conflicts of interest were disclosed.

Grant support: National Cancer Institute Specialized Programs of Research Excellence in Brain Cancer grant (1P20CA96890-01) and National Cancer Institute (grant 5 P50 CA 106743).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

We thank the research staff at the Neuro-Oncology Program, Evanston Northwestern Healthcare Kellogg Cancer Care Center, and the Cancer Control, Detection and Prevention Research Program, Duke University Medical Center for their cooperation and hard work, and the dedicated participants who have volunteered their time and energy to complete the activities of the study.

1. Demographics of Internet Users (Table). Updated 2007. Available at: http://www.pewinternet.org/trends/User_Demo_4.26.06.htm. Accessed September 20, 2007.
2. Tourangeau R. Survey research and societal change. Annu Rev Psychol 2004;55:775–801.
3. Umbach PD. Web surveys: Best practices. In: Porter SR, ed. Overcoming survey research problems: new directions for institutional research. No. 121 ed. New Jersey: Jossey Bass; 2004. p. 23-38.
4. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas 1960;20:37–46.
5. Fleiss JL, Cohen J, Everitt B. Large sample standard errors of kappa and weighted kappa. Psychol Bull 1969;72:323–7.
6. Fleiss JL, Levine B, Paik MC. Statistical methods for rates and proportions. 3rd ed. New Jersey: Wiley-Interscience; 2003.
7. Graham AL, Papandonatos GD, Bock BC, et al. Internet- vs. telephone-administered questionnaires in a randomized trial of smoking cessation. Nicotine Tob Res 2006;8:S49–57.
8. Miller ET, Neal DJ, Roberts LJ, et al. Test-retest reliability of alcohol measures: is there a difference between internet-based assessment and traditional methods? Psychol Addict Behav 2002;16:56–63.
9. Parks KA, Pardi AM, Bradizza CM. Collecting data on alcohol use and alcohol-related victimization: a comparison of telephone and web-based survey methods. J Stud Alcohol 2006;67:318–23.
10. Tucha O, Smely C, Preier M, Lange KW. Cognitive deficits before treatment among patients with brain tumors. Neurosurgery 2000;47:324–33.
11. Correa DD, DeAngelis LM, Shi W, Thaler HT, Lin M, Abrey LE. Cognitive functions in low-grade gliomas: disease and treatment effects. J Neurooncol 2007;81:175–84.
12. Karatela S, Purdie DM, Green AC, Webb PM, Whiteman DC. Repeatability of self-reported information for population-based studies of cancer. Asian Pac J Cancer Prev 2006;7:303–8.
13. Band PR, Spinelli JJ, Threlfall WJ, Fang R, Le ND, Gallagher RP. Identification of occupational cancer risks in British Columbia. Part I: methodology, descriptive results, and analysis of cancer risks, by cigarette smoking categories of 15,463 incident cancer cases. J Occup Environ Med 1999;41:224–32.
14. Gartner CE, Battistutta D, Dunne MP, Silburn PA, Mellick GD. Test-retest repeatability of self-reported environmental exposures in Parkinson's disease cases and healthy controls. Parkinsonism Relat Disord 2005;11:287–95.
15. Reider CR, Hubble JP. Test-retest reliability of an epidemiological instrument for Parkinson's disease. J Clin Epidemiol 2000;53:863–5.
16. Landis JR, Koch G. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74.
17. Friis S, Kruger Kjaer S, Stripp C, Overvad K. Reproducibility and relative validity of a self-administered semiquantitative food frequency questionnaire applied to younger women. J Clin Epidemiol 1997;50:303–11.
18. Schaffer DM, Coates AO, Caan BJ, Slattery ML, Potter JD. Performance of a shortened telephone-administered version of a quantitative food frequency questionnaire. Ann Epidemiol 1997;7:463–71.
19. Parr CL, Veierod MB, Laake P, Lund E, Hjartaker A. Test-retest reproducibility of a food frequency questionnaire (FFQ) and estimated effects on disease risk in the Norwegian women and cancer study (NOWAC). Nutr J 2006;5:4.
20. Date C, Fukui M, Yamamoto A, et al. Reproducibility and validity of a self-administered food frequency questionnaire used in the JACC study. J Epidemiol 2005;15:S9–23.
21. Margetts B, Nelson M, editors. Design Concepts in Nutritional Epidemiology. 2nd ed. USA: Oxford University Press; 1997.
22. Galesic M, Tourangeau R, Couper MP. Complementing random-digit-dial telephone surveys with other approaches to collecting sensitive data. Am J Prev Med 2006;31:437–43.
23. Tourangeau R, Smith TW. Asking sensitive questions: the impact of data collection mode, question format, and question context. Public Opin Q 1996;60:275–304.
24. Eysenbach G, Wyatt J. Using the internet for surveys and health research. J Med Internet Res 2002;4:E13.
25. Rhodes SD, Bowie DA, Hergenrather KC. Collecting behavioural data using the world wide web: considerations for researchers. J Epidemiol Community Health 2003;57:68–73.
26. Mavis B, Doig K. The value of noncognitive factors in predicting students' first-year academic probation. Acad Med 1998;73:201–3.
27. Van Selm M, Jankowski NW. Conducting online surveys. Qual Quant 2006;40:435–56.