Abstract
Introduction: To determine whether a Web-based survey was an acceptable method of data collection for a clinic-based case-control study of adult brain cancer, the authors compared the reliability of paired responses to a main and resurvey for participants completing surveys by telephone (n = 74) or self-administered on the Web (n = 465) between 2003 and 2006.
Methods: Cases were recruited at the Evanston Northwestern Healthcare Kellogg Cancer Care Center and the Duke University Medical Center Cancer Control division, and controls were friends and siblings of cases. Twenty-five variables were examined, including smoking, oral contraceptive and residential histories, water sources, meat preparation, fruit and vegetable consumption, and pesticide use. Weighted and simple κ's were estimated for categorical and binary variables, respectively. The number of concordant paired responses was summed for use in linear regression.
Results: Respondents were 97% White and 85% had postsecondary education. Kappa values for individual questions ranged from 0.31 (duration of residence in a single family house) to 0.96 (ever smoked), with a median of 0.57 (95% confidence interval, 0.47-0.64). The mean number of concordant responses was 16.2 (range, 5-22). Reliability was greater for controls than cases, Web-based versus telephone responders, females, and higher-income responders. Frequency of e-mail and Internet use was not associated with reliability.
Conclusions: A self-administered, Web-based survey was a feasible and appropriate mode of interview in this study. The comparable reliability of Web and telephone responses suggests that Web-based self-interviews could be a cost-effective alternative to traditional modes of interview. (Cancer Epidemiol Biomarkers Prev 2008;17(10):2639–46)
Introduction
According to the Pew Internet and American Life Project, Internet usage in the United States has increased over time in all age categories, with 71% of American adults currently using the Internet at least occasionally. Although this percentage is higher among younger adults, 65% of Americans aged 50 to 64 years currently use the Internet and 93% of adults living in a household with an income >$75,000/y are online (1). Web-based technology has been used extensively to collect data in the areas of public opinion and commercial research but has been underutilized as a method of data collection for epidemiologic studies (2), especially in the exploration of the role of human exposures in the causation of cancer. Underutilization of this technology may be due to concern that general population studies would include individuals who are not regular Internet users and may not be comfortable completing a lengthy exposure assessment questionnaire online.
Although it has been recommended to keep the length of Web-administered surveys to <20 min (3), risk factor questionnaires tend to be considerably longer, especially when studying a disease with a complex and largely unknown etiology. Because using the Web for exposure assessment in health research is so new, it is important to assess the reliability of responses obtained from this mode of data collection. Our pilot case-control study on brain cancer provided an opportunity to compare the reliability of paired responses to a questionnaire administered either by telephone or via the Internet.
Between 2003 and 2006, we collected data from brain tumor cases and their friend and sibling controls about their personal histories and self-reported exposures to substances that have been identified as animal neurocarcinogens, including occupational, environmental, and food exposures. Participants had the option of completing the interview either by telephone or self-administered over the Internet. The substudy described here examined the reliability of responses to a subset of items on the main questionnaire. This reliability substudy had three primary goals: (a) to determine whether participants would choose the Web self-interview or would opt for the more traditional telephone interview, (b) to determine whether the reliability of responses from the Web-based survey was comparable with that of responses from the telephone-administered survey, and (c) to determine whether reliability was affected by case-control status and other participant characteristics. To our knowledge, this is the first reliability study to be done for a comprehensive risk factor questionnaire in a sample of brain tumor cases and their friend and sibling controls.
Materials and Methods
Research Design Overview
Brain cancer patients were ascertained at two specialty clinics (Duke University Medical Center in North Carolina and Evanston Northwestern Healthcare in Illinois) that obtain patient referrals from a broad geographic region. The Duke clinic serves brain tumor patients of all ages with approximately one third coming from North Carolina, South Carolina, and southern Virginia, and the rest coming from other areas of the United States or other countries. The Evanston Northwestern Healthcare clinic draws 93% of its patient referrals from Illinois, with the remainder primarily from other Midwest states such as Wisconsin and Indiana. Cases were ascertained between 2003 and 2006. To be eligible, cases must (a) have had a histologically confirmed diagnosis of a first primary brain tumor (International Classification of Diseases for Oncology, 3rd edition, sites C70.0-C72.9 and C75.1-C75.3) with the histologies of glioblastoma (International Classification of Diseases for Oncology, 3rd edition, histology codes 9440-9442), astrocytoma grades 2 and 3 (9400-9411 and 9420), or oligodendroglial (9382 and 9450-9451); (b) be ≥18 years of age; (c) speak English; and (d) reside within the United States. Cognitive function of cases was assessed by the doctor at the time consent was obtained.
Because it was not possible to identify and draw a sample of controls representing the underlying referral population for the two clinics, patients were asked to provide names and contact information for up to two siblings and three friends who might agree to serve as study controls. Eligible sibling and friend controls had to be at least 18 years old, residents of the United States, and have no history of brain cancer. Institutional review board approval for this study was obtained from the University of Illinois at Chicago, Duke University, and Evanston Northwestern Healthcare institutional review boards. Participant recruitment and consent to participate in each section of the study were done in person with cases at the clinic and by telephone or mail for controls. Case and control interviews were conducted between 2003 and 2006.
Choice of Survey Format
Participants were encouraged to complete the interview via the Internet but were offered a telephone-administered interview if they preferred. Patients who chose to complete the Web-based version were provided with a packet containing a confidential and unique username and password and a set of detailed instructions. The home page for the Web-based survey listed the different sections of the instrument and the survey had a status bar on each page to show the participant what percentage of the survey questions was completed. Participants completing the Web-based survey had the opportunity to complete it at their leisure and at multiple locations, to log in and out multiple times as needed or desired, and to call the toll-free support line if they had questions or problems. Participants who chose the telephone-administered survey were also given the opportunity to complete the survey during more than one call if needed. Although the questions and response choices were identically worded for the Web- and telephone-administered versions of the survey, some transition phrases were added to the telephone version to facilitate the segue from one section to the next.
A total of 679 cases were eligible to participate in the study, 359 of whom consented to the main survey; 269 cases completed the main survey, 49 by telephone and 220 over the Internet, which resulted in an overall response rate of 40% for cases in the main survey. A total of 651 controls were eligible to participate in the study, 532 of whom consented to the main survey; 400 controls completed the main survey, 48 by telephone and 353 over the Internet, which resulted in an overall response rate of 61% for controls in the main survey.
Reliability Substudy
Several weeks after completing the main survey, all participants were recontacted and asked to complete a short resurvey. The resurvey consisted of a subset of 25 questions from the main survey. Brain tumor cases and their sibling and friend controls were included in this analysis if they completed the main survey and the brief resurvey using the same survey mode for each, either by telephone or self-administered on the Web.
Of the 269 cases and 400 controls who completed the main survey, 222 (83%) and 337 (84%) also completed a resurvey, respectively. Twenty participants were excluded from the reliability study because they completed the main survey and resurvey via different modes. Of the 539 remaining, 74 completed both surveys via telephone and 465 completed both online. The median time between completion of the main survey and the resurvey was 36 days via telephone and 48 days via the Web.
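The participant flow described above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the variable and function names are illustrative and are not taken from the study's own analysis code.

```python
# Counts as reported in the text; names are illustrative only.
completed_main = {"cases": 269, "controls": 400}
completed_resurvey = {"cases": 222, "controls": 337}
mode_switchers = 20  # completed main survey and resurvey via different modes
by_mode = {"telephone": 74, "web": 465}


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    return round(100 * part / whole)


# Share of main-survey completers who also completed the resurvey.
resurvey_rates = {
    group: pct(completed_resurvey[group], completed_main[group])
    for group in completed_main
}

# Participants retained for the reliability analysis.
analytic_n = sum(completed_resurvey.values()) - mode_switchers

print(resurvey_rates)
print(analytic_n, sum(by_mode.values()))
```

The two totals printed on the last line should agree if the reported counts are internally consistent.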
The following 25 questions from the main survey were asked during the resurvey: where did the respondent stay as a child (home, relative's home, daycare); primary source of drinking water as a child (city water, household well, bottled water); ever smoked cigarettes, average number of cigarettes smoked per day (continuous); frequency of adult dental X-rays (never or only as a child, at least once a year, 2-3 years, 4-5 years, or less often); ever used oral contraceptives; age at which the respondent began using oral contraception (continuous); and duration of use (continuous). There were separate questions on duration of residence (0, 1-9, 10-19, and ≥20 years) in each of the following housing structures: a mobile home or trailer, 1-family house, 1-family house attached to ≥1 houses, a building with 1 to 9 apartments, a building with 10 to 49 apartments, and a building with ≥50 apartments. Respondents were asked separately how long they had ever lived near an industrial facility or a gas station, using the same scale. There were also separate questions about duration of residence (0, 1-9, 10-19, and ≥20 years) with the following domestic water sources: public or commercial water system, private well, and cistern. Questions were asked about the frequency of consumption (never, rarely, sometimes, often) of broiled food, grilled or barbecued food, and food charred or blackened by burning, as well as the frequency of consumption of home-grown fruits and vegetables during the summertime. Finally, there were questions on the duration of residence on a farm and the number of years residing in a place where pesticides were professionally applied indoors (each with response categories of 0, 1-9, 10-19, and ≥20 years).
Statistical Analysis
We calculated the percentage of participants choosing the Web-based survey, overall and by case-control status, within categories of age, gender, race, education, income, and frequency of Internet and e-mail use. χ² tests of association were done overall and by case-control status to determine whether choice of survey mode was significantly associated with these characteristics (α = 0.05). In addition, a separate logistic regression model predicting choice of survey mode was fit for each characteristic, including terms for case-control status and the interaction between case-control status and the characteristic, to determine whether the relationship between survey mode and the characteristic differed by case-control status. The P value from the type 3 test was reported.
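As a rough illustration of the χ² test of association described above, the following sketch computes the Pearson statistic for a 2 × 2 table (a binary characteristic by survey mode) without a continuity correction. The counts are hypothetical and are not the study's data, and the function name `chi_square_2x2` is an assumption made for this example. The P value uses the fact that, for 1 degree of freedom, the upper tail of the χ² distribution equals erfc(√(x/2)).

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    `table` is [[a, b], [c, d]]; returns (statistic, p_value) with 1 df.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variate with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows = characteristic (no, yes); columns = (telephone, Web).
example = [[30, 70], [40, 360]]
stat, p_value = chi_square_2x2(example)
print(f"chi-square = {stat:.2f}, P = {p_value:.3g}, significant at 0.05: {p_value < 0.05}")
```

Characteristics with more than two categories, such as age group or income, would need the general r × c form of the test and the corresponding degrees of freedom, which this sketch does not cover.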
We also examined how survey process characteristics varied by survey mode. These included the perceived level of difficulty completing the survey, perceived length of the survey, whether the respondent received help in completing the survey, and the total number of minutes, sessions, and days to complete the survey.
To examine the reliability of responses to individual questions, we recoded the continuous variables, number of cigarettes smoked per day (1-4, 5-9, 10-19, ≥20), the number of years oral contraceptives were used (≤1, 2, 3-5, 6-10, >10 years), and the participant's age when she started oral contraceptives (<20, 20-29, 30-39, 40-49, ≥50), into 4 to 6 categories and estimated simple and weighted κ's for binary and ordinal categorical variables, respectively, along with their 95% asymptotic confidence limits (4-6). Estimated κ values for each question were graphed in ascending order. We then re-estimated and plotted κ's after stratifying on survey mode, case-control status, gender, and income level.
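To make the κ estimates concrete, here is a minimal sketch of simple and weighted κ for paired main-survey and resurvey responses. It assumes linear agreement weights, 1 − |i − j|/(k − 1); the text does not state which weighting scheme was used, so this choice, the function name `cohen_kappa`, and the example responses are all assumptions. Confidence limits are not computed.

```python
def cohen_kappa(first, second, categories, weighted=False):
    """Cohen's kappa for paired responses.

    `categories` lists the response options in their natural order.
    With weighted=True, linear agreement weights are applied (an assumption;
    other weighting schemes exist).
    """
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    n = len(first)

    # Cross-tabulate the paired responses as proportions.
    table = [[0.0] * k for _ in range(k)]
    for a, b in zip(first, second):
        table[index[a]][index[b]] += 1 / n
    row = [sum(table[i]) for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        if weighted and k > 1:
            return 1 - abs(i - j) / (k - 1)
        return 1.0 if i == j else 0.0

    observed = sum(weight(i, j) * table[i][j] for i in range(k) for j in range(k))
    expected = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    # Undefined (division by zero) if chance agreement is exactly 1.
    return (observed - expected) / (1 - expected)


# Hypothetical paired responses on the duration-of-residence scale.
scale = ["0", "1-9", "10-19", ">=20"]
main = ["0", "1-9", "1-9", "10-19", ">=20", "0", "1-9", ">=20"]
resurvey = ["0", "1-9", "10-19", "10-19", ">=20", "1-9", "1-9", "10-19"]
print(cohen_kappa(main, resurvey, scale))                 # simple kappa
print(cohen_kappa(main, resurvey, scale, weighted=True))  # weighted kappa
```

For a binary item the linear weights reduce to exact-match agreement, so the weighted and simple estimates coincide, which is consistent with using simple κ for binary variables and weighted κ for ordinal ones.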
In addition, we summed the number of concordant responses between the main survey and resurvey for each participant across the 22 questions, after excluding the three questions on oral contraceptive histories that were not asked of men. The result was an ordinal variable that was approximately normally distributed, with a possible range between 0 and 22. We used this variable as a dependent variable in linear regression analyses to examine participant-level predictors of reliability. First, the dependent variable was regressed one at a time with each independent variable to screen for variables to include in a final model. The only variables that were associated with reliability were gender, case-control status, survey mode, and household income, and these were included together in the final multivariate linear regression model of reliability.
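The participant-level analysis can be sketched in two steps: count the questions answered identically on both surveys, then regress that count on one predictor at a time. The example below is a simplified illustration with hypothetical data. The helper names are assumptions, and the single-predictor least-squares fit stands in for the bivariate screening step only, not the final multivariate model.

```python
from statistics import mean


def concordance_count(main, resurvey):
    """Number of questions answered identically on both surveys.

    `main` and `resurvey` map question identifiers to responses.
    """
    return sum(1 for q in main if q in resurvey and main[q] == resurvey[q])


def ols_slope_intercept(x, y):
    """Least-squares fit of y = intercept + slope * x for a single predictor."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical responses for one participant on three questions.
main_answers = {"q1": "yes", "q2": "1-9", "q3": "often"}
resurvey_answers = {"q1": "yes", "q2": "10-19", "q3": "often"}
print(concordance_count(main_answers, resurvey_answers))

# Hypothetical concordance scores and survey mode (0 = telephone, 1 = Web).
scores = [14, 15, 17, 16, 18, 17]
web = [0, 0, 0, 1, 1, 1]
slope, intercept = ols_slope_intercept(web, scores)
print(f"beta for Web vs telephone = {slope:.2f}, intercept = {intercept:.2f}")
```

With a binary predictor coded 0/1, the slope is the difference in mean concordance between the two groups, which is how a coefficient for survey mode or case-control status can be read.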
Results
Respondents to the main survey were 97% White, with 85% having postsecondary education (data not shown). Controls and participants who were younger, who were college-educated, and who had higher incomes were more likely to choose the Web-based survey over the telephone-administered survey. There was a monotonic increase in choice of Web-based survey with decreasing age, increasing household income, and increasing frequency of e-mail and Internet use (Table 1). There was no difference in the relationship between survey mode and participant characteristics for cases compared with controls.
Table 1

| Characteristic | n | Total cases choosing Web (%) | Total controls choosing Web (%) | Total participants choosing Web (%) | P* |
|---|---|---|---|---|---|
| Case-control status | | n = 209 | n = 330 | | |
| Case | 209 | — | — | 83 | |
| Sibling control | 153 | — | — | 90 | |
| Friend control | 177 | — | — | 88 | |
| P† | | — | — | 0.15 | — |
| Age (y) | | | | | |
| <40 | 143 | 95 | 96 | 96 | |
| 40-49 | 125 | 84 | 88 | 86 | |
| 50-59 | 159 | 77 | 89 | 85 | |
| 60+ | 107 | 67 | 80 | 76 | |
| P† | | 0.002 | 0.02 | 0.0001 | 0.79 |
| Gender | | | | | |
| Male | 267 | 84 | 92 | 88 | |
| Female | 275 | 82 | 86 | 84 | |
| P† | | 0.67 | 0.09 | 0.19 | 0.37 |
| Race or ethnicity | | | | | |
| White | 517 | 85 | 89 | 87 | |
| Non-White | 16 | 82 | 100 | 88 | |
| P† | | 0.81 | 0.43 | 0.97 | 0.97 |
| College educated | | | | | |
| No | 74 | 61 | 72 | 68 | |
| Yes | 465 | 87 | 91 | 89 | |
| P† | | 0.0006 | 0.0003 | 0.0001 | 0.94 |
| Household income ($) | | | | | |
| <25,000 | 35 | 75 | 53 | 63 | |
| 25,000-50,000 | 72 | 73 | 88 | 83 | |
| 51,000-75,000 | 103 | 73 | 89 | 86 | |
| 76,000-100,000 | 90 | 79 | 94 | 88 | |
| >100,000 | 229 | 90 | 92 | 91 | |
| P† | | 0.24 | <0.0001 | 0.0002 | 0.16 |
| Have an e-mail account | | | | | |
| No | 49 | 55 | 59 | 57 | |
| Yes | 490 | 86 | 91 | 89 | |
| P† | | 0.0005 | <0.0001 | 0.0001 | 0.52 |
| Check e-mail daily | | | | | |
| Daily | 315 | 92 | 93 | 93 | |
| Few times a week | 111 | 83 | 90 | 87 | |
| Weekly | 46 | 81 | 90 | 87 | |
| Less/never | 18 | 44 | 69 | 44 | |
| P† | | <0.0001 | 0.02 | 0.0001 | 0.87 |
| Use the Internet daily | | | | | |
| Daily | 204 | 91 | 96 | 94 | |
| Few times a week | 160 | 88 | 91 | 89 | |
| Weekly | 71 | 86 | 95 | 92 | |
| Less/never | 104 | 60 | 64 | 63 | |
| P† | | <0.0001 | <0.0001 | 0.0001 | 0.57 |
*P value for the interaction term between case-control status and the characteristic, from a logistic regression model predicting survey mode.
†P value for the χ² test of the association between survey choice and characteristics, among cases only, controls only, and overall.
Web respondents were more likely than telephone respondents to report receiving help from another person while completing the survey. In addition, Web respondents were more likely to have >1 login session and complete the main survey during a span of >1 day when compared with telephone respondents (Table 2).
Table 2

| Characteristic | n | Telephone (%) | Web (%) | P |
|---|---|---|---|---|
| Questionnaire too difficult | | | | |
| No | 496 | 93 | 92 | |
| Yes | 39 | 7 | 8 | 0.85 |
| Questionnaire too long | | | | |
| No | 407 | 88 | 82 | |
| Yes | 85 | 12 | 18 | 0.23 |
| Received help | | | | |
| No | 440 | 91 | 81 | |
| Yes | 97 | 9 | 19 | 0.04 |
| Minutes to complete | | | | |
| ≤30 | 16 | 0 | 6 | |
| 31-60 | 193 | 57 | 57 | |
| 61-90 | 80 | 37 | 22 | |
| >90 | 47 | 7 | 15 | 0.04 |
| Sessions to complete | | | | |
| 1 | 252 | 65 | 44 | |
| 2 | 145 | 20 | 28 | |
| 3 | 78 | 8 | 15 | |
| 4+ | 64 | 7 | 13 | 0.008 |
| Days to complete | | | | |
| Within a day | 454 | 94 | 83 | |
| Longer than a day | 81 | 6 | 17 | 0.02 |
Across all 25 questions, the overall median value of κ was 0.57 (range, 0.31-0.96). Kappa values were generally higher among Web-based compared with telephone-based respondents. Questions on oral contraceptive use and smoking had the highest κ values overall, ranging from 0.75 to 0.96. The 4 questions on frequency of dietary habits had κ values ranging from 0.40 to 0.50, whereas the questions related to residential histories had a broad range of κ values from 0.31 to 0.81 (Table 3).
Table 3

| Number | Variable | Overall κ | Overall 95% CI | Web κ | Web 95% CI | Telephone κ | Telephone 95% CI |
|---|---|---|---|---|---|---|---|
| 1 | Duration of residence in a single family home | 0.31 | 0.18-0.44 | 0.34 | 0.20-0.48 | 0.14 | −0.22-0.50 |
| 2 | Duration of residence in a midsized apartment building | 0.39 | 0.31-0.47 | 0.41 | 0.32-0.50 | * | * |
| 3 | Duration of drinking water from cistern | 0.40 | 0.23-0.57 | 0.46 | 0.27-0.65 | * | * |
| 4 | Frequency of consuming broiled food | 0.40 | 0.34-0.46 | 0.42 | 0.35-0.48 | 0.32 | 0.14-0.50 |
| 5 | Frequency of consuming charred food | 0.43 | 0.37-0.49 | 0.38 | 0.32-0.45 | 0.67 | 0.55-0.80 |
| 6 | Where stayed as a child | 0.47 | 0.34-0.60 | 0.48 | 0.34-0.61 | * | * |
| 7 | Frequency of consuming grilled food | 0.48 | 0.41-0.55 | * | * | 0.40 | 0.21-0.58 |
| 8 | Duration of residence in a small apartment building | 0.49 | 0.42-0.56 | 0.52 | 0.44-0.59 | 0.31 | 0.10-0.52 |
| 9 | Frequency of consuming fresh fruits and vegetables | 0.50 | 0.44-0.55 | 0.49 | 0.43-0.55 | 0.55 | 0.41-0.68 |
| 10 | Duration of residence near an industrial facility | 0.52 | 0.38-0.66 | 0.52 | 0.36-0.67 | 0.54 | 0.24-0.84 |
| 11 | Frequency of dental X-rays in adulthood | 0.52 | 0.46-0.58 | 0.49 | 0.42-0.56 | 0.71 | 0.58-0.84 |
| 12 | Duration of using professional exterminator | 0.55 | 0.49-0.60 | 0.55 | 0.49-0.62 | * | * |
| 13 | Duration of drinking water from a public source | 0.57 | 0.50-0.64 | 0.59 | 0.52-0.67 | 0.38 | 0.11-0.65 |
| 14 | Duration of residence near a gas station | 0.58 | 0.49-0.66 | 0.59 | 0.50-0.68 | 0.52 | 0.30-0.74 |
| 15 | Duration of residence in a large apartment building | 0.60 | 0.52-0.69 | 0.64 | 0.55-0.73 | 0.30 | −0.01-0.61 |
| 16 | Duration of residence in an attached home | 0.61 | 0.53-0.68 | 0.62 | 0.55-0.70 | 0.49 | 0.30-0.69 |
| 17 | Source of drinking water as a child | 0.63 | 0.53-0.73 | 0.64 | 0.54-0.75 | * | * |
| 18 | Duration of residence on a farm | 0.64 | 0.56-0.72 | 0.65 | 0.57-0.74 | 0.57 | 0.36-0.77 |
| 19 | Duration of drinking water from a private well | 0.72 | 0.67-0.77 | 0.76 | 0.71-0.80 | 0.52 | 0.35-0.68 |
| 20 | Age started oral contraceptives | 0.75 | 0.68-0.83 | 0.80 | 0.72-0.87 | 0.53 | 0.25-0.82 |
| 21 | Years used oral contraceptives | 0.78 | 0.73-0.82 | 0.77 | 0.72-0.82 | 0.79 | 0.68-0.90 |
| 22 | Duration of residence in a mobile home | 0.81 | 0.74-0.88 | 0.82 | 0.74-0.90 | 0.76 | 0.59-0.94 |
| 23 | Cigarettes smoked per day | 0.90 | 0.88-0.93 | 0.90 | 0.87-0.93 | 0.91 | 0.86-0.96 |
| 24 | Ever used oral contraceptives | 0.92 | 0.85-0.98 | 0.96 | 0.91-1.00 | 0.76 | 0.53-0.98 |
| 25 | Ever smoked cigarettes | 0.96 | 0.94-0.99 | 0.97 | 0.95-0.99 | 0.92 | 0.83-1.00 |
Abbreviation: 95% CI, 95% confidence interval.
*κ was not computed because zero participants chose one of the response values for this question in either the main or resurvey but one or more did choose that value on the other survey.
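The summary statistics quoted for Table 3 can be reproduced from the tabulated point estimates. The sketch below transcribes the overall κ values and the Web and telephone κ pairs for the 19 questions where both were computed. The variable names are illustrative, and the comparison is a simple tally of point estimates, not a statistical test.

```python
from statistics import median

# Overall kappa for the 25 questions, transcribed from Table 3.
overall = [
    0.31, 0.39, 0.40, 0.40, 0.43, 0.47, 0.48, 0.49, 0.50, 0.52,
    0.52, 0.55, 0.57, 0.58, 0.60, 0.61, 0.63, 0.64, 0.72, 0.75,
    0.78, 0.81, 0.90, 0.92, 0.96,
]

# (Web, telephone) kappa pairs for questions where both were computed.
web_vs_phone = [
    (0.34, 0.14), (0.42, 0.32), (0.38, 0.67), (0.52, 0.31), (0.49, 0.55),
    (0.52, 0.54), (0.49, 0.71), (0.59, 0.38), (0.59, 0.52), (0.64, 0.30),
    (0.62, 0.49), (0.65, 0.57), (0.76, 0.52), (0.80, 0.53), (0.77, 0.79),
    (0.82, 0.76), (0.90, 0.91), (0.96, 0.76), (0.97, 0.92),
]

median_kappa = median(overall)
web_higher = sum(1 for web, phone in web_vs_phone if web > phone)

print(median_kappa, min(overall), max(overall))
print(f"Web kappa higher for {web_higher} of {len(web_vs_phone)} questions")
```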
Kappa values for individual questions tended to be higher for Web respondents, controls, women, and higher-income respondents when compared with telephone respondents, cases, men, and lower-income respondents, respectively (Fig. 1). Within cases, κ values did not seem to be different for patients with more (high-grade tumor or a glioblastoma) versus less severe disease, although data were sparse (data not shown).
The mean number of concordant responses across the 22 questions that were common to men and women was 16.2 (median, 16), ranging from 5 to 22. Table 4 shows that average concordance was 0.56 questions higher in Web respondents versus telephone respondents (P = 0.07) and 0.40 questions higher in controls versus cases (P = 0.06). Average concordance was also higher in women versus men and in participants with higher incomes (Table 4). Age, educational attainment, frequency of e-mail use, and frequency of Internet use were not associated with the number of concordant responses. The association between survey mode and the number of concordant responses did not differ by case-control status, given that an interaction term added to the final model was not statistically significant (P = 0.75; data not shown).
Table 4

| Variable | n | Bivariate β | Bivariate P | Multivariate β | Multivariate P |
|---|---|---|---|---|---|
| Status | | | | | |
| Case (ref) | 209 | | | | |
| Control | 330 | 0.44 | 0.038 | 0.40 | 0.061 |
| Survey mode | | | | | |
| Telephone (ref) | 74 | | | | |
| Web | 465 | 0.71 | 0.017 | 0.56 | 0.068 |
| Gender | | | | | |
| Male (ref) | 264 | | | | |
| Female | 275 | 0.37 | 0.073 | 0.40 | 0.051 |
| Income | | | | | |
| Ordinal, 5 categories | 529 | 0.30 | 0.0001 | 0.29 | 0.0002 |
| Age | | | | | |
| Years (ordinal) | 534 | 0.001 | 0.89 | | |
| Any college | | | | | |
| Yes (ref) | 465 | | | | |
| No | 74 | 0.20 | 0.51 | | |
| E-mail frequency | | | | | |
| Ordinal, 4 categories | 490 | −0.08 | 0.53 | | |
| Internet frequency | | | | | |
| Ordinal, 4 categories | 539 | −0.06 | 0.40 | | |
NOTE: Number of concordant responses was measured on a scale from 0 to 22, excluding 3 questions for oral contraceptive use, which only applied to women.
Abbreviation: ref, reference.
Discussion
Results from this reliability study suggest that a self-administered, Web-based survey is a feasible and appropriate method for collecting data about environmental and other exposure conditions in case-control studies on malignant brain cancer. Upon encouragement to do so, participants overwhelmingly chose the Web-based survey mode over the telephone-administered survey, and the reliability of responses via the Web exceeded the reliability of responses via telephone, even after controlling for the demographics of participants and frequency of e-mail and Internet use. Increased reliability among Web-based respondents may have been due in part to their ability to complete the main survey over several sessions and a longer period of time, which may have allowed them more opportunity for retrospection, verification, and seeking help from others when recalling events. Alternatively, because participants were not randomized to either the Web-based or telephone-administered survey mode, the apparently greater reliability for Web-based respondents may be due to a tendency for more reliable responders to choose the Web-based survey. Although we attempted to control for differences in socioeconomic status and level of comfort with the Internet, residual confounding may still explain the greater reliability observed for Web-based responses. The relatively low response rate in this study may have inflated estimates of reliability for both modes of interview, if nonresponders tended to also be less reliable reporters. Nonetheless, our results suggest that the reliability of Web-based responders is at least comparable with that of telephone responders.
Studies in other areas of health research have shown better or equal reliability for Web-based surveys compared with telephone-administered or paper-based surveys for exposures such as alcohol intake (7-9), general health status, and smoking cessation (7). Most of these studies randomized participants into one survey mode or the other and were done in younger populations such as college students or in populations of Internet users recruited through Web sites. Our clinic-based sample of brain tumor patients was very different from these populations. Among our healthy adult controls, who are more comparable with populations previously studied, the number of concordant responses between the main survey and the resurvey was significantly higher for the Web-based survey compared with the telephone survey (data not shown).
As anticipated, survey responses were more reliable for controls than for brain cancer cases, consistent with the cognitive decline affecting memory and attention that often occurs in brain tumor cases as a result of their disease (10) or treatment for their disease (11). Contrary to expectation, however, we did not find differentially lower reliability among cases with glioblastomas and other high-grade tumors, for whom cognitive decline might be expected to be more pronounced than for those with lower-grade tumors. This may have been the result of data sparseness due to stratifying within cases, which limited our ability to detect effects.
We found that the traditional epidemiologic risk factors had higher reliability than the dietary and environmental measures. Consistent with previous research (12-15), reliability for smoking and oral contraceptive histories was very high in our study. Agreement for the four dietary history items in our study was found to be “moderate,” according to the cutoffs established by Landis and Koch (16), and comparable with those from nutrient intake assessments in studies on other diseases (17-21). The reliability of responses for the environmental exposure items such as housing, residence on a farm, pesticide exposure, and water source was also generally modest. Because these individual food and environmental items will be used to construct participant-specific indices of potential neurocarcinogen exposure, the modest reliability for these responses could lead to exposure misclassification, which may substantially attenuate any associations between exposure and disease.
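As a rough illustration of how an agreement statistic maps onto the Landis and Koch descriptors, the following minimal sketch computes a simple (unweighted) Cohen's κ for paired main-survey and resurvey answers to a binary item and labels it with the conventional cutoffs. The function names and the toy responses are hypothetical, and the sketch does not reproduce the weighted κ used in this study for categorical variables.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Simple (unweighted) Cohen's kappa for two paired lists of responses."""
    if len(first) != len(second) or not first:
        raise ValueError("paired responses must be non-empty and of equal length")
    n = len(first)
    # Observed agreement: share of pairs with identical answers.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    counts_first, counts_second = Counter(first), Counter(second)
    expected = sum(counts_first[k] * counts_second[k] for k in counts_first) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when everyone gives one answer
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Descriptive label for kappa using the Landis and Koch cutoffs."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical paired answers to one yes/no item (not study data).
main_survey = ["yes"] * 6 + ["no"] * 4
resurvey = ["no", "yes", "yes", "yes", "yes", "yes", "yes", "no", "no", "no"]

kappa = cohen_kappa(main_survey, resurvey)
print(round(kappa, 2), landis_koch_label(kappa))
```

Under these cutoffs, the median κ of 0.57 reported here falls in the "moderate" band, the lowest item (0.31) in "fair," and the highest (0.96) in "almost perfect."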
Previous research suggests that respondents are less likely to underreport sensitive issues in the context of a self-interview such as a paper-based survey, audio computer–assisted interview, or a Web-based survey (9, 22, 23). Although the interview for the current study did not focus on sensitive issues, recent evidence has suggested that exposure to marijuana may be related to the development of brain tumors (24), so sensitive topics such as illicit drug use may be important in future data collection efforts in brain tumor research.
There are advantages to a computer-based self-interview when compared with traditional paper-based self-interviews. The questionnaire can be programmed with logic so that questions are skipped automatically when not applicable, and error or warning messages can appear if a response falls outside the valid range. These aspects of computer-assisted modes can reduce errors in respondent reports. There are additional advantages to Web-based over computer-based interviews. First, respondents have the option of completing the interview at a time and place of their choosing, and over as many sessions and as long a period of time as needed. This enables respondents to choose moments when they are best able to focus on the topic at hand, which may make them less likely to rush their answers. Because they can start and stop the interview at any time, it also provides opportunities for soliciting information from relatives and others who are knowledgeable about the respondents' exposure history to verify their own responses.
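The skip logic and range checks described above can be sketched in a few lines. This is an illustrative example only, not the software used in this study: the question identifiers, the `ask_if` rule, and the 1 to 90 year range are all hypothetical.

```python
# Hypothetical questionnaire: each item may carry a skip rule and a valid range.
QUESTIONS = [
    {"id": "ever_smoked", "type": "yes_no"},
    {
        "id": "years_smoked",
        "type": "int",
        "range": (1, 90),  # assumed plausible range, for illustration only
        # Skip rule: ask only when the respondent reported ever smoking.
        "ask_if": lambda answers: answers.get("ever_smoked") == "yes",
    },
    {"id": "lived_on_farm", "type": "yes_no"},
]


def should_ask(question, answers):
    """Return True when the question applies given the earlier answers."""
    rule = question.get("ask_if")
    return rule is None or rule(answers)


def validate(question, value):
    """Return an error message, or None when the value is acceptable."""
    if question["type"] == "yes_no":
        return None if value in ("yes", "no") else "Please answer yes or no."
    low, high = question["range"]
    if not isinstance(value, int) or not low <= value <= high:
        return f"Please enter a whole number from {low} to {high}."
    return None


def run_survey(responses):
    """Walk the questions in order using scripted responses.

    Returns the recorded answers and any (question id, message) warnings.
    """
    answers, messages = {}, []
    for question in QUESTIONS:
        if not should_ask(question, answers):
            continue  # not applicable, so the item is skipped automatically
        value = responses.get(question["id"])
        error = validate(question, value)
        if error:
            messages.append((question["id"], error))
        else:
            answers[question["id"]] = value
    return answers, messages


# A never-smoker is not asked about smoking duration.
print(run_survey({"ever_smoked": "no", "years_smoked": 10, "lived_on_farm": "yes"}))
# An out-of-range duration triggers a warning instead of being stored.
print(run_survey({"ever_smoked": "yes", "years_smoked": 200, "lived_on_farm": "no"}))
```

In a live questionnaire the warning would prompt the respondent to re-enter the value; the sketch simply records the message so that the behavior is easy to inspect.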
Web-based interviews are also more rapid and cost efficient than other interview modes (9, 25, 26). Although there are fees associated with programming the questionnaire online, other costs are saved, such as paper and postage for paper-based surveys and interviewer training and personnel time for telephone-based surveys. In addition, because participants directly enter precoded data into the database, data entry costs are saved and data are available immediately for analysis (27).
The comparable reliability and cost efficiency of Web-based versus telephone interviews in our study suggest that Web-based self-interviews should be considered more often as the primary interview mode when planning epidemiologic studies in populations with Internet access, especially when the study sample is widely distributed geographically (24, 27).
Future studies should focus on the feasibility and reliability of Web-based surveys within studies on other populations and different disease states to fill the knowledge gaps in this area and describe the potential impact of survey mode on resulting measures of association between exposure and disease.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant support: National Cancer Institute Specialized Programs of Research Excellence in Brain Cancer grant (1P20CA96890-01) and National Cancer Institute (grant 5 P50 CA 106743).
Acknowledgments
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank the research staff at the Neuro-Oncology Program, Evanston Northwestern Healthcare Kellogg Cancer Care Center, and the Cancer Control, Detection and Prevention Research Program, Duke University Medical Center for their cooperation and hard work, and the dedicated participants who have volunteered their time and energy to complete the activities of the study.