Abstract
In studies of skin cancer, participants are often classified into risk groups based on self-reported history of sun exposure or skin characteristics. We sought to determine the reliability of self-reported skin characteristics among participants of a study to evaluate markers for nonmelanoma skin cancer (NMSC). Multiple questionnaires and screening protocols were administered over a 3-month period to individuals from three risk groups: existing sun damage on forearms but no visible actinic keratoses (n = 91), visible actinic keratoses (n = 38), and history of resected squamous cell skin cancer in the last 12 months (n = 35). We assessed consistency of risk group assignment between telephone screen and study dermatologist assignment, self-reported sun sensitivity (telephone recruitment form versus participant completed profile), and self-reported history of NMSC skin lesions (telephone recruitment form versus health history). There was substantial agreement between probable risk group and final assignment (κ = 0.76; 95% confidence interval, 0.65-0.85) and agreement did not differ by gender. Agreement for self-reported sun sensitivity was moderate (κ weighted = 0.46; 95% confidence interval, 0.36-0.56) with higher agreement for women. For self-reported NMSC lesion history between two interviews, 24 days apart, κ estimates ranged from 0.66 to 0.78 and were higher for women than men. Overall, there was evidence for substantial reproducibility related to risk group assignment and self-reported history of NMSC, with self-reported sun sensitivity being less reliable. In all comparisons, women had higher κ values than men. These results suggest that self-reported measures of skin cancer risk are reasonably reliable for use in screening subjects into studies. (Cancer Epidemiol Biomarkers Prev 2006;15(11):2292–7)
Introduction
Skin cancer is the most common form of cancer in the United States (1). Skin cancer afflicts ∼20% of the general population at some point in life, and 50% of Americans who live to be 65 years old will have skin cancer at least once (2, 3). Skin cancer, including melanoma, has been mainly associated with particular skin phenotypes (fair complexion, tendency to sunburn, freckles) and sun exposure (4).
Participant self-report is heavily relied upon in epidemiologic studies; however, the reliability of this information may vary. In cancer prevention studies, self-report is frequently used to classify participants into risk groups for future disease and to identify potential risk factors. Self-report is useful because it reduces time and length of recruitment. It is important to determine the reliability of items included in questionnaires because this variability will affect the validity of measurements and comparability between studies.
In the current analysis, we first sought to determine how reliably participants were being placed in risk groups by comparing the probable risk group determined by trained staff interviewers during initial telephone screening with the final risk assignment made by the study dermatologist. Second, we sought to determine the level of consistency at two different time points for participant perception of their sun sensitivity. Last, we examined the consistency of participant self-report of their nonmelanoma skin cancer (NMSC) and actinic keratosis (AK) history between the initial telephone screening interview and a later self-administered health history form.
Materials and Methods
Study Design
These data resulted from the Skin Biomarkers Study conducted by the University of Arizona Cancer Center over an ∼3-month period. This study was designed to assess the reproducibility of various surrogate end point biomarkers within the skin carcinogenesis pathway, specifically the variability of polyamine levels, p53 expression, and proliferating cell nuclear antigen expression. Subjects were recruited from university and community dermatology clinics, advertisements, and a skin cancer registry. Eligible subjects were males and females at least 18 years of age who were willing to apply sun protection factor 50 sunscreen daily. Subjects included three different probable risk groups: sun damage on forearms with no visible AKs (the pre-AK group), visible AKs (the AK group), and history of resected squamous cell carcinoma (SCC) in the last 12 months (the SCC group).
The Biomarkers Study assessed 851 people via telephone for eligibility. Of those, 199 seemed to be eligible, agreed to participate, and consented at the eligibility clinic visit. At some point after consent, 29 were found to be ineligible. Of those who remained eligible, 91 were assigned to the pre-AK group, 38 to the AK group, and 35 to the SCC group (Fig. 1). Of the 164 subjects assigned to probable risk groups, 143 completed the 3-month study and are the focus of these analyses. Information on the design and some results of the study have been previously published (5, 6).
Questionnaires/Assessments
Throughout the 3-month study, subjects completed various questionnaires to provide information about history of skin cancers, sun damage to skin, and sun sensitivity. Figure 1 shows the study flow and when various forms were collected. The telephone recruitment form was used to assess a potential participant's initial eligibility as well as gather basic information about past diagnosis of skin conditions, current medications, sun exposure, and sun damage. The interviewer used this information to assign a potential participant into one of the previously described risk groups.
At the eligibility visit, the participant was assessed by the study dermatologist to confirm eligibility and to assign a final risk group. Participants also returned a completed self-administered health history form. This form asked about medical conditions and included a dermatology history section with questions about past diagnoses of AK, skin cancer, and skin biopsy results. A self-administered participant profile form was returned at the time of visit 1, ∼43 days after the initial telephone screen. Phenotypic characteristics, demographics, sun exposure, use of sunscreen, history of sunburn, occupational and environmental exposures, residential history, medical exposures, and smoking history were included on this profile.
Statistical Analysis
To compare basic demographic characteristics between the three final risk groups, χ2 tests were used. ANOVA and Bonferroni multiple comparisons were used to look at potential differences in mean age between groups.
To test the reliability between independent groups, Cohen's κ was used. For comparison between the multiple levels of self-reported sun sensitivity, weighted κ was estimated. This statistic took advantage of the ordered categories so that small disagreements received partial credit relative to large ones (7). Because there was no clear-cut “gold standard” for any of the interviews or questionnaires, equal weight was applied to both sets of readings (7). A κ statistic <0 would suggest poor agreement, 0 to 0.20 slight, 0.21 to 0.40 fair, 0.41 to 0.60 moderate, 0.61 to 0.80 substantial, and 0.81 to 1.00 almost perfect (8, 9). Confidence intervals (CI) were calculated for the κ statistic using the STATA command “kapci.” STATA uses an analytic method for simple two-by-two comparisons (dichotomous variables) and a bootstrap method for variables with more than two categories. When the bootstrap method was used, STATA was asked to perform 1,000 repetitions.
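As a rough illustration of the bootstrap approach to a κ confidence interval, the following Python sketch resamples subjects with replacement 1,000 times and takes the 2.5th and 97.5th percentiles of the resampled κ values. It is not the “kapci” implementation: the percentile method, the random seed, and the helper names `cohen_kappa` and `bootstrap_ci` are assumptions made for this example. The input is the nine cell counts reported in Table 2, expanded to one pair of ratings per subject.

```python
import random

def cohen_kappa(pairs, categories):
    """Unweighted Cohen's kappa for a list of (rating_a, rating_b) pairs."""
    n = len(pairs)
    observed = sum(1 for a, b in pairs if a == b) / n
    expected = 0.0
    for c in categories:
        p_a = sum(1 for a, _ in pairs if a == c) / n
        p_b = sum(1 for _, b in pairs if b == c) / n
        expected += p_a * p_b
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1.0 - expected)

def bootstrap_ci(pairs, categories, reps=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (an assumed method, not necessarily kapci's)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(pairs)
    estimates = []
    for _ in range(reps):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        k = cohen_kappa(sample, categories)
        if k == k:  # drop NaN replicates, if any
            estimates.append(k)
    estimates.sort()
    low = estimates[int((alpha / 2) * len(estimates))]
    high = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return low, high

# Cell counts from Table 2: rows = dermatologist, columns = telephone screen.
CATEGORIES = ["pre-AK", "AK", "SCC"]
COUNTS = [[57, 12, 0],
          [4, 28, 1],
          [0, 3, 28]]

# Expand the table into one (dermatologist, telephone) pair per subject.
pairs = [(CATEGORIES[i], CATEGORIES[j])
         for i in range(3) for j in range(3)
         for _ in range(COUNTS[i][j])]

kappa = cohen_kappa(pairs, CATEGORIES)
ci_low, ci_high = bootstrap_ci(pairs, CATEGORIES, reps=1000)
print(f"kappa = {kappa:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

The interval produced this way depends on the seed and on the bootstrap variant, so it should be close to, but need not match, the interval reported by STATA.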
Results
Table 1 shows the baseline demographic characteristics, phenotypic characteristics, skin characteristics, and sun-protective habits of participants by assigned risk group for the 143 participants who completed the study and are included in our analysis. There were few differences between assigned risk groups, with the exception of gender. There were more men in both the AK (75.8%) and SCC (62.5%) groups than in the pre-AK group (33.3%). In addition, those in the SCC group tended to be older, mean age 65.8 years compared with 58.1 years in the pre-AK group. Similarly, there were few differences for phenotypic characteristics, skin characteristics, and sun-protective habits with the exception of “ever use of sun lamps.” Self-reported usage was very low in the AK group (3.0%) compared with the other groups.
| Characteristic, by risk group assignment | Pre-AK (n = 78) | AK (n = 33) | SCC (n = 32) | Total (N = 143) | P* |
|---|---|---|---|---|---|
| Age (y, mean) | 58.1 | 61.0 | 65.8 | 60.5 | 0.068 |
| <40-49 | 19 (24.4%) | 5 (15.2%) | 1 (3.1%) | 25 (17.5%) | |
| 50-59 | 18 (23.1%) | 6 (18.2%) | 6 (18.8%) | 30 (21.0%) | |
| 60-69 | 30 (38.5%) | 16 (48.5%) | 14 (43.8%) | 60 (42.0%) | |
| >70 | 11 (14.1%) | 6 (18.2%) | 11 (34.4%) | 28 (19.6%) | |
| Male | 26 (33.3%) | 25 (75.8%) | 20 (62.5%) | 71 (49.7%) | <0.001 |
| Ethnicity | | | | | 0.790 |
| White | 75 (96.2%) | 32 (97.0%) | 30 (93.8%) | 137 (95.8%) | |
| Other | 3 (3.8%) | 1 (3.0%) | 2 (6.3%) | 6 (4.2%) | |
| Skin tanning characteristics | | | | | 0.207 |
| Always burns, no tan | 11 (14.1%) | 7 (21.2%) | 5 (15.6%) | 23 (16.1%) | |
| Always burns, tans minimally | 16 (20.5%) | 11 (33.3%) | 9 (28.1%) | 36 (25.2%) | |
| Burns moderately | 28 (35.9%) | 7 (21.2%) | 13 (40.6%) | 48 (33.6%) | |
| Burns minimally | 16 (20.5%) | 8 (24.2%) | 2 (6.3%) | 26 (18.2%) | |
| Rarely burns | 7 (9.0%) | 0 (0.0%) | 3 (9.4%) | 10 (7.0%) | |
| Experienced painful sunburn (yes) | 57 (73.1%) | 28 (84.9%) | 21 (65.6%) | 106 (74.1%) | 0.174 |
| Exposure | | | | | |
| Hours per week in sun during past month (Monday-Friday 9:00 a.m.-4:00 p.m.; mean ± SD) | 7.41 ± 14.6 | 14 ± 18.2 | 8.66 ± 12.1 | 9.2 ± 15.1 | 0.10 |
| Hours per week in sun during past month (Saturday-Sunday 9:00 a.m.-4:00 p.m.; mean ± SD) | 4.41 ± 6.64 | 6.45 ± 9.98 | 4.84 ± 7.72 | 5.0 ± 7.8 | 0.45 |
| Ever use of sun lamps (yes) | 16 (20.5%) | 1 (3.0%) | 7 (21.9%) | 24 (16.8%) | 0.054 |
| Use of sunscreen in past year | | | | | 0.649 |
| Never | 10 (12.8%) | 1 (3.0%) | 4 (12.5%) | 15 (10.5%) | |
| Rarely used | 13 (16.7%) | 4 (12.1%) | 6 (18.8%) | 23 (16.1%) | |
| Sometimes | 25 (32.1%) | 12 (36.4%) | 9 (28.1%) | 46 (32.2%) | |
| Usually | 21 (26.9%) | 9 (27.3%) | 6 (18.8%) | 36 (25.2%) | |
| Always† | 9 (11.5%)† | 7 (21.2%) | 7 (21.9%) | 23 (16.1%) | |
*P value for χ2 test for difference between risk groups or ANOVA for hours per week.
†One person stated that she rarely used sunscreen but always used it on the face.
Table 2 shows the agreement between risk group assignment by staff telephone interviewer and study dermatologist. Agreement between the two assignments was substantial (κ = 0.76; 95% CI, 0.65-0.85) and did not differ by gender of the participant. The telephone interviewers misclassified 12 (17.4%) of true pre-AK subjects as AK. Only five (15.6%) of the true AK subjects were misclassified, four as pre-AK and one as SCC. A total of three (9.7%) SCC subjects were misclassified, and all three were placed in the AK group.
| Group assigned by dermatologists | Telephone screening: Pre-AK | Telephone screening: AK | Telephone screening: SCC | Total |
|---|---|---|---|---|
| Pre-AK | 57 | 12 | 0 | 69 |
| AK | 4 | 28 | 1 | 32 |
| SCC | 0 | 3 | 28 | 31 |
| Total | 61 | 42 | 29 | 133* |
NOTE: κ = 0.76; percentage agreement = 85.0%; 95% CI, 0.65-0.85.
*There were 10 subjects for whom the telephone interviewer made no determination as to the probable group.
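The point estimates in the note to Table 2 can be retraced from the nine cell counts. The Python sketch below is illustrative only: the function names `kappa_from_table` and `agreement_label` are hypothetical, the marginal totals are recomputed from the cells rather than read from the Total row and column, and the labels follow the interpretation scale quoted under Statistical Analysis.

```python
def kappa_from_table(table):
    """Return (percentage agreement as a proportion, Cohen's kappa) for a square table."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: share of subjects on the diagonal.
    p_obs = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of matching marginal proportions.
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    return p_obs, (p_obs - p_exp) / (1 - p_exp)

def agreement_label(kappa):
    """Map kappa to the verbal scale quoted in the Statistical Analysis section."""
    if kappa < 0:
        return "poor"
    for upper, name in [(0.20, "slight"), (0.40, "fair"),
                        (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return name
    return "almost perfect"

# Rows = dermatologist assignment, columns = telephone screening (Table 2).
table2 = [[57, 12, 0],
          [4, 28, 1],
          [0, 3, 28]]

p_obs, kappa = kappa_from_table(table2)
label = agreement_label(kappa)
print(f"agreement = {100 * p_obs:.1f}%, kappa = {kappa:.2f} ({label})")
```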
Questions on the interviewer telephone recruitment form and the self-administered participant profile asked subjects to describe what happens to their skin after a specified amount of sun exposure. Responses fell into one of five categories related to burning and tanning. Table 3 shows the responses to both questionnaire administrations and details the wording differences between questionnaires. There was an average of 43 days between the administration of the two instruments. Using weighted κ, agreement was moderate (κ weighted = 0.46; 95% CI, 0.36-0.56), with higher agreement for women than for men (κ weighted = 0.53 versus 0.36).
| Participant profile (rows) versus telephone eligibility (columns) | Always burn easily with blistering and peeling | Usually burn, no blistering and some peeling | Burn moderately, some degree of tanning or freckling | Burn minimally, tan easily | Rarely or never burn, tan easily | Total |
|---|---|---|---|---|---|---|
| Always burn, never tan, extremely sun sensitive | 11 | 8 | 4 | 0 | 0 | 23 |
| Always burn, tan minimally, very sun sensitive | 6 | 14 | 13 | 2 | 1 | 36 |
| Burn moderately, tan gradually and uniformly to light brown, sun sensitive | 3 | 13 | 26 | 6 | 0 | 48 |
| Burn minimally, tan well to moderate brown, minimal sun sensitivity | 0 | 3 | 9 | 13 | 1 | 26 |
| Rarely burn, tan well to dark brown, minimal sun sensitivity and never burn, deeply pigmented, sun insensitive | 0 | 0 | 1 | 7 | 2 | 10 |
| Total | 20 | 38 | 53 | 28 | 4 | 143 |
NOTE: κ = 0.46; percentage agreement = 84%; 95% CI, 0.36-0.56.
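To show how partial credit for near misses enters the weighted κ, the Python sketch below applies linear weights, 1 − |i − j|/(k − 1), to the Table 3 counts. The choice of linear weights is an assumption, since the weighting scheme is not stated in the text, and the function name `weighted_kappa` is hypothetical. Under that assumption the counts give a weighted agreement of about 84% and a weighted κ of about 0.46, in line with the note to Table 3.

```python
def weighted_kappa(table):
    """Return (weighted agreement, weighted kappa) using linear weights (assumed)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_obs = 0.0
    p_exp = 0.0
    for i in range(k):
        for j in range(k):
            # Full credit on the diagonal, less credit the further apart
            # the two ordered categories are, none at the extremes.
            w = 1 - abs(i - j) / (k - 1)
            p_obs += w * table[i][j] / n
            p_exp += w * row_tot[i] * col_tot[j] / n ** 2
    return p_obs, (p_obs - p_exp) / (1 - p_exp)

# Rows = participant profile, columns = telephone eligibility (Table 3),
# both ordered from most to least sun sensitive.
table3 = [[11, 8, 4, 0, 0],
          [6, 14, 13, 2, 1],
          [3, 13, 26, 6, 0],
          [0, 3, 9, 13, 1],
          [0, 0, 1, 7, 2]]

p_obs_w, kappa_w = weighted_kappa(table3)
print(f"weighted agreement = {100 * p_obs_w:.0f}%, weighted kappa = {kappa_w:.2f}")
```

Quadratic weights would give a different value, so the match with the reported figures supports, but does not prove, that linear weights were used.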
Table 4 shows agreement between self-reported history of NMSC at the two questionnaire administrations 24 days apart. The κ value was 0.66 (95% CI, 0.54-0.78) for AK history, 0.78 (95% CI, 0.65-0.91) for SCC, and 0.75 (95% CI, 0.55-0.94) for basal cell carcinoma (BCC). All three indicate substantial agreement and all differ significantly from 0. The values of κ differed by gender, but the differences were not significant because the CIs overlap. In addition, although not shown, individuals were more likely to report a diagnosis of NMSC at the telephone recruitment than on the self-reported health history.
| Telephone recruitment questionnaire versus health history questionnaire | κ | Percentage agreement (%) | 95% CI |
|---|---|---|---|
| History of AK | 0.659 | 83 | 0.54-0.78 |
| By gender | | | |
| Male | 0.514 | 76 | 0.32-0.71 |
| Female | 0.804 | 90 | 0.67-0.94 |
| By group | | | |
| Pre-AK | 0.651 | 83 | 0.48-0.82 |
| AK | 0.526 | 82 | 0.22-0.83 |
| SCC | 0.460 | 84 | 0.07-0.85 |
| History of SCC* | 0.781 | 93 | 0.65-0.91 |
| By gender | | | |
| Male | 0.701 | 88 | 0.51-0.89 |
| Female | 0.891 | 97 | 0.74-1.00 |
| History of basal cell carcinoma* | 0.745 | 96 | 0.55-0.94 |
| By gender | | | |
| Male | 0.682 | 94 | 0.39-0.97 |
| Female | 0.817 | 97 | 0.57-1.00 |
*Information from biopsy table on the health history form.
Discussion
In epidemiologic research, as in any scientific endeavor, the interpretability of observed results depends, to a considerable extent, on the accuracy of the measurements (10). In most situations, “truth” is not known, and, to judge the quality of measurement, one must settle for an assessment of agreement between multiple imperfect sources of information or between multiple measurements using a single imperfect source of information (10).
In this study, the objectives of the analyses were 3-fold. First, we sought to determine how reliably participants were placed into risk groups by trained telephone interviewers compared with a study dermatologist's assessment. Second, the consistency of self-reported sun sensitivity was assessed. Last, we examined the consistency of participant self-reported history of skin lesions, specifically NMSC and AK. We noted substantial agreement (κ = 0.76; 95% CI, 0.65-0.85) between the classification of potential study participants into risk groups by trained telephone interviewers and the final assignment by a study dermatologist. During the recruitment phase of a clinical study, large numbers of people are often screened to find the few who qualify. Recruitment and screening are time-consuming, and study costs increase dramatically if study dermatologist time is necessary for initial screening. In addition, if the study seeks to recruit specific numbers into each risk group, each participant's risk group must be identified immediately and accurately.
Despite the good level of agreement, misclassification existed and, not surprisingly, centered on the pre-AK and AK risk groups. The telephone interviewers misclassified 17.4% of true pre-AK subjects as AK. Similarly, 15.6% of the true AK subjects were misclassified, with four people (12.5%) classified by the screener as pre-AK and one (3.1%) as SCC. Among the final SCC group, three subjects (9.7%) were misclassified, and all three had been placed in the AK group by the screener. The screeners placed subjects into risk groups based on information they collected during the interview, whereas the dermatologists made group assignments based on skin examinations. Because the telephone interviewers based their decisions on participant report, it would seem that subjects were more likely to report having AKs when they did not actually have any. This could be due to a lack of knowledge about how to identify an AK, or to the difficulty an untrained person with sun-damaged skin has in differentiating an AK from other sun damage. It may also be that potential participants exaggerated their skin damage on the telephone because they had a strong desire to be eligible and participate in the study.
As risk group classification increased in seriousness, misclassification decreased. There was more misclassification in the pre-AK risk group and much less in the SCC risk group. If only the telephone interviewer was used to classify subjects into risk groups, there would be more true pre-AK subjects in the AK group. This could then make it more difficult to distinguish between groups during analysis of biomarkers. For example, if there was a specific biomarker associated with development of SCC, this type of misclassification could decrease the likelihood of detecting any gradient between the disease groups.
In our study, agreement was not as strong for self-reported sun sensitivity measures (κ weighted = 0.46; 95% CI, 0.36-0.56). One caveat needs to be highlighted: the sun sensitivity questions on the interviewer-administered telephone recruitment form and the self-administered participant profile were not worded in precisely the same manner. The telephone recruitment form focused more on whether an individual's untanned skin burns in the sun, whereas the participant profile focused on descriptions of tanning in addition to burning.
The concept of sun-reactive skin typing was created in 1975 to classify persons with white skin to select the correct initial doses of UVA (in joules per square centimeter) for the treatment of psoriasis with oral methoxsalen photochemotherapy (11). It was decided that a brief personal interview regarding the history of the person's sunburn and suntan experience was one approach to estimate the skin tolerance to UV radiation exposure, and the Fitzpatrick skin-typing system was created (11). The Fitzpatrick skin-typing system has been used by the Food and Drug Administration in its guidelines for sunscreen products for over-the-counter human use (11).
Self-reported sun sensitivity is used to assess skin type and, therefore, risk for skin cancer. Only a few studies of the reliability of these measures are available in the literature, and they report better reliability than our study. Reliability, assessed by comparing answers to the same question at different time points, is used because the measures do not have a gold standard. In the multicenter South European case-control study, a subsample of participants was reinterviewed and reaction to sun exposure was assessed on a four-level scale (4). Weighted κ for skin reaction to sun exposure was 0.61 (95% CI, 0.53-0.70), which is slightly higher than the five-level weighted κ from our current study (κ = 0.46; 95% CI, 0.36-0.56). (Recall that the value of κ is affected by the number of categories.) In a case-control study of melanoma that included test-retest reliability of self-reported exposure to sun sensitivity, there was good consistency with κ values for ability to tan and tendency to burn of 0.66 and 0.62, respectively (12).
In a case-control study nested within the Nurses' Health Study cohort, Weinstock et al. (13) reported that test-retest reliability of tanning questions was high in the prevalent case group (Spearman's r = 0.78) and control group (Spearman's r = 0.76), but lower in the incident case group (Spearman's r = 0.59). Their study had a similar caveat in that the questions were not worded identically between the two questionnaires. Weinstock et al. (14) also found that, among women diagnosed with melanoma after the first questionnaire and before the second, there was a substantial shift toward reporting a reduced ability to tan.
This highlights an important issue for development of study questionnaires. Burnability and tannability are separate issues to subjects and need to be considered separately. In a study by Rampen et al., neither tannability nor burnability was linked very closely to the minimal erythemal dose, which would be the gold standard of sun sensitivity (15). Rampen et al. (15) investigated burning and tanning histories in 790 White students, 18 to 30 years old, with a self-administered questionnaire to classify them into skin types based on the Fitzpatrick scheme (burning tendency after 1 hour of sun exposure in early summer and the tanning ability after regular sun exposure during summer were recorded as follows: 0, none; 1, mild; 2, moderate; and 3, severe/intense). Minimal erythemal dose was measured in a subgroup of this population. There was no statistically significant correlation between the self-reported burning tendency and the minimal erythemal dose. Skin typing on the basis of self-reported burning tendency and tanning ability may be subjective because subjects tended to overrecord no burning and underrecord no tanning. The correlation with biological complexion factors, such as hair and eye color and freckling tendency, was somewhat better for self-reported tanning than for the burning propensity (15).
The authors concluded that self-reported burning-tanning histories do not provide a valid means of skin typing when compared with the minimal erythemal dose (15). It may be that a better way to characterize sun sensitivity would be through proxy measures, such as hair and eye color and freckling tendency, which seem to be more reliably reported by subjects. Weinstock et al. found that test-retest reliability of hair color assessment by questionnaire was high, with Spearman correlation coefficients between 0.76 and 0.87. Sun sensitivity may be subject to recall bias when assessed by ability to tan, but not when assessed by hair color (14). There is a need for further studies to look at the issue of skin-type classification more closely.
Based on Weinstock et al.'s results, we might have expected that the SCC risk group would have been more reliable reporters of sun sensitivity or that there would be a gradient of response, with the pre-AK group being less reliable reporters than the AK or SCC groups. However, we found that all risk groups were equally reliable when reporting sun sensitivity (data not shown).
Results from previous studies are consistent with our findings on agreement for self-reported history of NMSC. We found that more serious skin conditions had higher agreement (for AK history, κ = 0.66; 95% CI, 0.54-0.78; for SCC, κ = 0.78; 95% CI, 0.65-0.91; and for BCC, κ = 0.75; 95% CI, 0.55-0.94). In a study by Ming et al. (16), self-reported history of skin cancer was compared with the gold standard of chart documentation. Patients were found to recall their cancer history quite well, with correct identification highest for melanoma (95% of cases) and lowest for basal cell carcinoma (84% of cases). In a study by Bergmann et al. (17), assessing agreement of self-reported medical history using an in-person interview versus a self-administered questionnaire, κ values of 0.83-0.88 were found for cancer reporting. Lower values were found for less severe or more transient disease, with the disease being reported at the interview but not on the questionnaire. Our study also found that NMSC diagnosis was more likely to be reported to the phone interviewer than on the self-administered health history form.
The results of our study suggest that women may be more reliable reporters than men; however, the literature does not always support this finding. In an Australian study of ocular melanoma that gathered information on sun exposure in the first four decades of a person's life, questionnaires administered 1 year apart gave an intraclass correlation coefficient of 0.65 for ranked total sun exposure between the two interviews. The coefficient was higher for men (0.73) than for women (0.54), although, as in our study, the difference was not statistically significant (18).
The use of κ to measure reliability can be problematic because κ values will depend on the prevalence of the condition and distribution of the marginals (7). We used a weighted κ for ordered categories so that partial credit would be given to small error versus large error (7). Additionally, κ values depend on the number of categories with more categories resulting in lower values (19). The value of κ does, however, account for agreement that may occur by chance alone. The use of κ is more appropriate than the use of percentage agreement. Percentage agreement is the simplest method of summarizing agreement for categorical variables and has the advantage of being useful for any number of categories. Percentage agreement is artificially increased when the proportion of negative-negative results is high or when the prevalence of the condition is high.
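The dependence of κ on prevalence can be seen in a small worked example. The two 2 × 2 tables in the Python sketch below are hypothetical and were constructed for illustration only; they are not study data. Both have 94% raw agreement, but in the second table the condition is rare, so most of that agreement is expected by chance and κ is much lower.

```python
def agreement_and_kappa(table):
    """Return (percentage agreement as a proportion, Cohen's kappa) for a 2 x 2 table."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [table[0][j] + table[1][j] for j in range(2)]
    p_obs = (table[0][0] + table[1][1]) / n
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(2)) / n ** 2
    return p_obs, (p_obs - p_exp) / (1 - p_exp)

# Hypothetical counts: rows = first report (yes, no), columns = second report.
balanced = [[47, 3],
            [3, 47]]   # condition reported by about half of subjects
rare = [[2, 3],
        [3, 92]]       # condition reported by about 5% of subjects

po_balanced, kappa_balanced = agreement_and_kappa(balanced)
po_rare, kappa_rare = agreement_and_kappa(rare)

print(f"balanced: agreement = {po_balanced:.2f}, kappa = {kappa_balanced:.2f}")
print(f"rare:     agreement = {po_rare:.2f}, kappa = {kappa_rare:.2f}")
```

Identical percentage agreement therefore does not imply identical reliability, which is why κ is reported alongside percentage agreement in Tables 2 to 4.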
Conclusions
Overall, there was evidence for substantial reproducibility for factors related to assignment into skin cancer risk group and self-reported history of skin lesions, with self-reported sun sensitivity questions being somewhat less reliable. In all comparisons, women were more consistent reporters than men. These results suggest that self-reported measures of skin cancer risk should be reasonably reliable for use in screening subjects into studies. Further studies are required to identify the characteristics of individuals with poor reliability of self-reported measures.
Grant support: National Cancer Institute grant P01 CA27502.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.