Abstract
Purpose: Latinas and African-Americans with breast cancer, especially those of lower socioeconomic status and acculturation, have been underrepresented in studies assessing treatment satisfaction, decision-making, and quality of life. A study was designed to recruit a large and representative sample of these subgroups.
Materials and Methods: Incident cases were selected by rapid case ascertainment (RCA) in the Los Angeles Surveillance, Epidemiology, and End Results Registry from 2005 to 2006, with oversampling of Latinas and African-Americans. Patients were mailed a questionnaire and $10 incentive 5 to 6 months after diagnosis; nonrespondents were contacted by telephone. Multivariate analysis was used to assess possible response bias. The RCA definition of Hispanic origin was validated by self-reports. The Short Acculturation Scale for Hispanics index for Latina respondents was used.
Results: One thousand six hundred and ninety-eight eligible breast cancer cases were selected and 1,223 participated, for a response rate of 72.0%, which varied little by race/ethnicity. Age, race/ethnicity, and clinical factors were not associated with response; however, respondents were slightly more likely to be married and from higher socioeconomic status census tracts than nonrespondents. The RCA definition of Hispanic identity was highly sensitive (94.6%) and specific (90.0%). Lower acculturation was associated with lower education and literacy among Latinas.
Discussion: High response rates among all subgroups were achieved due to the use of RCA, an incentive, extensive telephone follow-up, a native Spanish-speaking interviewer, and a focused questionnaire. The low acculturation index category identified a highly vulnerable subgroup. This large sample representing subgroups with greater problems will provide a basis for developing better interventions to assist these women. (Cancer Epidemiol Biomarkers Prev 2009;18(7):2022–9)
Introduction
The use of regional cancer registries for oncology quality of care and outcomes research is growing because they can draw on large population-based samples with sufficient power and representation to evaluate important issues in vulnerable populations. Additionally, regional registries have experience in identifying and enrolling patients into studies shortly after diagnosis (1, 2). These advantages have fueled a research agenda that addresses important clinical and health policy issues including quality of treatments, patient perspectives about care, transitions to survivorship, outcomes such as quality of life and survival, and racial/ethnic disparities in all of these issues (3-8). As the demand for more research in this area increases, methods to improve the representativeness of results are especially important. Key methodologic challenges using cancer registries include sampling and identifying appropriate patients shortly after diagnosis, achieving high response rates particularly for minority patients, and deploying valid measures of ethnicity.
Latinos have been a particularly underrepresented group in oncology quality of care and outcomes studies (9-15). Few studies have included sufficient numbers of Latinos to evaluate care experiences despite their increasing proportion among cancer survivors in the United States (10, 13, 16, 17). For example, there are virtually no studies with sufficient representation of Latinas in the large literature of quality of care and survivorship in breast cancer. This largely reflects the particular challenges of identifying and engaging this patient population in research. Sampling Latinos is challenging because ethnic identity (versus race) is not reliably available through information such as hospital and medical records. A second challenge is enrolling and engaging sufficient numbers of Latino patients in studies, especially patients with low acculturation who often have lower socioeconomic status, a group as a whole which is less likely to participate in studies. A third challenge is developing valid measures of acculturation that capture the rich diversity of this population.
In this article, we describe how we addressed these challenges in a large population-based study of quality of care and outcomes for patients with breast cancer diagnosed in Los Angeles County from June 2005 to May 2006. The aims of the overall study were to evaluate racial/ethnic differences in patterns of treatment, patients' perspectives about decision-making and care support, and quality of life outcomes for women with breast cancer. We present results that inform methodologies for successful population-based registry research, including how we (a) selected a representative stratified sample of patients as they were being diagnosed, (b) achieved a high response rate among all racial/ethnic subgroups, (c) assessed possible bias in survey response, and (d) assessed the validity of registry-based ethnic identification and defined levels of acculturation among Latinas. Our methods and results will be useful to other researchers developing studies to evaluate cancer-related outcomes in diverse populations.
Materials and Methods
Study Population
Los Angeles County resident women ages 20 to 79 years diagnosed with primary ductal carcinoma in situ or invasive breast cancer (stage I-III) between June 1, 2005 and May 31, 2006 were eligible for sample selection. Spanish-surnamed white, African-American, and non–Spanish-surnamed white (NSSW) women were included, whereas Asian and other racial/ethnic groups were excluded due to their participation in other studies.
Sample Selection
The cases were selected by rapid case ascertainment (RCA) by the Los Angeles Cancer Surveillance Program, the Surveillance, Epidemiology, and End Results (SEER) Cancer Registry for Los Angeles County. RCA is a method whereby cases are sampled, usually within a month of diagnosis, by registry field staff who manually review pathology reports at hospitals prior to their submission to the cancer registry. During the period August 1, 2005 to May 31, 2006 all pathology reports at Los Angeles County hospitals were reviewed monthly for breast cancer cases meeting the eligibility criteria. Because the pathology reports could include diagnostic biopsies as well as surgical pathology reports, multiple pathology reports per case could be obtained. Using historical incidence counts to estimate the number of cases that would be diagnosed during the ascertainment time period, we determined that all Latina and African-American cases should be selected as well as an 11% random sample of NSSWs to obtain the desired sample size.
Different RCA strategies were used for each racial/ethnic group. Because Latina (or Hispanic) status is not uniformly collected by the hospital at the time of diagnosis, we selected all pathology reports from women who were designated as Hispanic or Latina by the hospital as well as selecting pathology reports from women whose surname indicated a high probability of being Latina based on a list generated from the 1980 U.S. Census. To select African-Americans, we checked demographic information from the hospital database on all non-Latina or non–Asian-surnamed women and selected all pathology reports from women designated as Black or African-American. To select the random sample of NSSW from all of the remaining pathology reports, we provided all field technicians with lists of random numbers (generated using SAS RANUNI) that were programmed to accept 11% of all NSSW pathology reports consecutively screened. In total, 2,024 unique cases were sampled from the three racial/ethnic groups (Fig. 1).
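The acceptance rule for the NSSW stratum can be sketched as follows. This is a minimal illustration in Python, not the SAS RANUNI procedure the study used; the function name, the seed, and the number of screened reports are assumptions made for the example.

```python
import random

def accept_reports(n_screened, fraction=0.11, seed=2005):
    """Return one accept/skip flag per consecutively screened report.

    Each report is accepted independently with probability `fraction`,
    mimicking a pre-generated list of uniform random numbers.
    """
    rng = random.Random(seed)  # fixed seed so the same list can be regenerated
    return [rng.random() < fraction for _ in range(n_screened)]

# 3000 is a hypothetical number of screened NSSW pathology reports
flags = accept_reports(3000)
print(sum(flags), "of", len(flags), "reports accepted")
```

Because each report is accepted independently, the realized sampling fraction varies around 11% rather than matching it exactly.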
Determination of Eligibility of Sampled Patients
Because cases selected at the time of RCA are not quality-controlled to eliminate ineligible patients (i.e., those that are not primary breast cancer diagnoses, out of county residents, first diagnosed prior to the recruitment period, or with an ineligible histology or stage of disease), the sampled cases were linked ∼6 to 8 months later to the quality-controlled registry file to eliminate the ineligible cases (Fig. 1). This was done after the survey had been sent and thus resulted in the disqualification of some survey responses. In addition, we excluded cases that died within 8 months after diagnosis because this was the average time between diagnosis and patient contact. The earliest eligible diagnosis date was extended back to June 1, 2005, because many of the selected cases from when ascertainment began on August 1, 2005 had been diagnosed in the 2 months beforehand. As a result of the linkage, 258 (12.8%) of the 2,024 initially sampled patients were determined to be ineligible. The reasons included out of county residence (n = 55), stage IV disease (n = 67), ineligible histology code (e.g., lymphoma; n = 17), diagnosis prior to June 1, 2005 (n = 87), or deceased within 8 months of diagnosis (n = 32).
The remaining registry-eligible sample of 1,766 cases included 796 Latina cases (based on the registry's definition), 459 African-American cases, 478 NSSW cases, and 33 cases listed by the registry under other racial categories. Based on the registry's quality-controlled database, the total number of incident cases who would have been eligible for the study during the period from June 1, 2005 to May 31, 2006 included 1,093 Latinas, 615 African-Americans, and 3,012 NSSWs. Thus, the final sample included 69.3% of Latina, 69.8% of African-American, and 14.2% of NSSW cases that would have been eligible for our study.
After initial contact efforts, another 68 patients were found to be ineligible for participation because (a) the physician refused to allow contact (n = 7), (b) the patient did not speak English or Spanish (n = 8), (c) the patient was too ill or incompetent to participate (n = 30), or (d) the patient denied having cancer (n = 23; Fig. 1). Thus, using these criteria, we defined 1,698 patients from the sample as eligible for analysis.
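The sample flow described in this section can be checked with simple arithmetic from the itemized exclusion counts. The sketch below only restates counts given in the text; the variable names are illustrative.

```python
# Counts reported in the text
initially_sampled = 2024
registry_ineligible = {
    "out of county residence": 55,
    "stage IV disease": 67,
    "ineligible histology": 17,
    "diagnosed before June 1, 2005": 87,
    "died within 8 months of diagnosis": 32,
}
contact_ineligible = {
    "physician refusal": 7,
    "no English or Spanish": 8,
    "too ill or incompetent": 30,
    "denied having cancer": 23,
}

# Remove registry-linkage exclusions, then exclusions found at patient contact
registry_eligible = initially_sampled - sum(registry_ineligible.values())
eligible_for_analysis = registry_eligible - sum(contact_ineligible.values())

print(registry_eligible)      # 1766
print(eligible_for_analysis)  # 1698
```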
Assessment of Representativeness of Sample Selection
Bias in sample selection may arise from (a) cases reported only from doctors' offices and free-standing pathology laboratories, which are not surveyed during RCA because it obtains only cases reported in hospitals; (b) delays in the availability of pathology reports at the time of RCA; and (c) any bias in the randomized selection of NSSW cases. The representativeness of the sampled cases compared with all eligible cases diagnosed during the ascertainment period was assessed ∼7 to 8 months after the sample was selected using registry variables (age, stage, vital status, date of diagnosis, race/ethnicity). Multivariate logistic regression was used to determine if any variables not linked to the sampling design independently predicted sample selection.
Patient Contact Methods
Physicians were notified of our intent to contact patients by a courtesy letter. If no objection was received, the selected patients were sent an introductory letter, survey materials, and $10 cash as compensation for the time and effort to complete the questionnaire. The patient survey instrument and introductory letter were translated into Spanish, using previously translated versions of the validated questions that were included (e.g., the FACT-B). All patients who had a Spanish surname based on the U.S. census list were sent materials in both English and Spanish. The Dillman survey method was used to encourage response (18). Specifically, if no response was received, the subjects were called a minimum of five times within 2 to 3 weeks of the mailing, sent second copies of materials, and eventually offered a phone interview. Spanish-speaking interviewers were used. Extensive tracing methods were used to locate cases with incorrect addresses or nonworking phone numbers, and those from whom no response was received and with whom no contact was established. The study protocol was approved by the institutional review boards of the University of Michigan and the University of Southern California.
Acculturation Measure
In addition to race/ethnicity, we also assessed acculturation among Latinas in the survey using the Short Acculturation Scale for Hispanics (SASH; ref. 19) which consisted of four items that asked respondents (who indicated that they could speak any Spanish) to indicate their language preference on a five-point scale that included the categories “only English,” “English better than Spanish,” “both equally,” “Spanish better than English,” and “only Spanish.” The four questions were (a) What language(s) do you read and speak?, (b) What language do you usually speak at home?, (c) In what language do you usually think?, and (d) What language do you usually speak with your friends? Using the results from the individual items, a polychoric correlation matrix was constructed to examine inter-item correlations. A summary score was calculated for each respondent that was based on the mean of their responses to all four of the items. Confirmatory factor analysis and validity studies, including bivariate distributions of categories for education attainment, country of origin, and parents' origin by categories of the SASH summary score, were used to characterize the acculturation-related differences identified by this scale for the Latina respondents in our population.
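The summary score described above can be sketched in a few lines of Python. The function name and the handling of incomplete responses (returning `None`) are assumptions for illustration; assigning a score of 1 to respondents screened out as speaking only English follows the coding reported in the Results.

```python
from statistics import mean

def sash_summary_score(items, speaks_any_spanish=True):
    """Mean of the four SASH items (1 = only English ... 5 = only Spanish).

    Respondents screened out as English-only receive 1.0.
    Returns None when any of the four items is missing or out of range
    (an assumption; the source does not say how incomplete scales were scored).
    """
    if not speaks_any_spanish:
        return 1.0
    if len(items) != 4 or any(i is None or not 1 <= i <= 5 for i in items):
        return None
    return mean(items)

print(sash_summary_score([5, 5, 4, 4]))                     # 4.5
print(sash_summary_score([], speaks_any_spanish=False))     # 1.0
```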
Calculation of Response Rate to Survey and Representativeness of Respondents
The response rate was based on the number of respondents from among the 1,698 patients determined to be eligible. Subjects who refused to participate or who could not be located or contacted (after extensive tracing and multiple mailings of the questionnaire and follow-up phone calls) were considered nonrespondents. Using available registry variables, the χ² statistic was used to identify univariate differences in characteristics between the respondents and nonrespondents, and multivariable logistic regression was used to determine independent variables that predicted response.
Results
Representativeness of RCA Sample Selection
Using a multivariable logistic regression model (Table 1), we found that the only nondesign factors associated with RCA sample selection were related to treatment received, age, and tumor differentiation. Those with no surgery, older age, and unknown differentiation were less likely than their counterparts to be selected, whereas those receiving radiation or hormone therapy were more likely than those not receiving these treatments.
Significant variables* | Adjusted odds ratio (95% confidence limits) | P |
---|---|---|
No surgery | 0.65 (0.45-0.93) | 0.02 |
Received radiation therapy | 1.23 (1.05-1.44) | 0.01 |
Received hormone therapy | 1.23 (1.01-1.50) | 0.03 |
Age (y) | | |
<45 | 1.00 (reference) | |
45-54 | 0.79 (0.62-0.99) | 0.04 |
55-64 | 0.70 (0.56-0.88) | 0.002 |
65+ | 0.59 (0.47-0.73) | <0.0001 |
P trend | | <0.0001 |
Unknown tumor differentiation grade | 0.63 (0.49-0.80) | 0.0001 |
*Adjusted for race/ethnicity.
Response Rates
The survey was completed by 1,223 or 72.0% of the 1,698 eligible patients (97.8% of whom completed a written survey and 2.2% of whom completed a telephone survey). A total of 296 or 17.4% of the patients refused participation and 179 (10.5%) were lost to follow-up after extensive tracing efforts. Based on the race/ethnicity definition at the time of sample selection, the response rate for Latina women was similar to that for all non-Latina women (i.e., African-American and NSSW combined; 71.4% versus 72.7%; P = 0.56). However, within the non-Latina group, the response rate was significantly higher among NSSW than among African-American women (77.6% versus 68.2%; P = 0.002; Table 2). Among the 601 Hispanic women completing the questionnaire, 300 completed the English version and 301 completed the Spanish version. Although use of the telephone was relatively low (<3% of surveys were completed by phone overall), it was an option used more often by those responding in Spanish than in English (6% versus 2%). All respondents who returned an incomplete questionnaire were also re-contacted by telephone in order to complete critical missing questionnaire items. In total, 24.1% of respondents sent back incomplete questionnaires and missing data were obtained for 262 or 88.8% of this group.
Response | Hispanic*, n (%) | NSSW, n (%) | African-American, n (%) | Total, n (%) |
---|---|---|---|---|
Completed questionnaire (total) | 601 (71.4) | 318 (77.6) | 304 (68.2) | 1,223 (72.0) |
English | 300 | 316 | 304 | 920 |
Spanish | 301 | 2 | 0 | 303 |
Refused | 149 (17.7) | 57 (13.9) | 90 (20.2) | 296 (17.4) |
Lost to follow-up/out of country | 92 (10.9) | 35 (8.5) | 52 (11.7) | 179 (10.6) |
Total | 842 (100.0) | 410 (100.0) | 446 (100.0) | 1,698 (100.0) |
*Based on RCA sample definition including both Spanish surname and/or hospital Hispanic identity.
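The comparison of response rates between NSSW and African-American women can be illustrated with a Pearson χ² test on the Table 2 counts. This is a sketch under the assumption that an uncorrected 2 × 2 test was used (the source does not state the exact test); under that assumption it gives a P value close to the reported 0.002. The function name is illustrative.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and P value for a 2x2 table.

    Rows are groups; columns are responded / did not respond.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail probability of a chi-square distribution with 1 df
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Table 2: NSSW 318 of 410 responded; African-American 304 of 446 responded
chi2, p = chi_square_2x2(318, 410 - 318, 304, 446 - 304)
print(f"chi-square = {chi2:.2f}, P = {p:.3f}")
```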
Respondents versus Nonrespondents
Registry information on all eligible selected patients was used to compare differences between respondents and nonrespondents. There were no bivariate differences based on age at diagnosis, race/ethnicity, and Spanish surname (Table 3). However, nonrespondents were more likely to be never married than respondents, and patients from lower SES census tracts were slightly less likely to complete the survey than those from higher SES tracts. There were no differences in response based on receipt of chemotherapy or hormone therapy or tumor grade; however, nonrespondents were more likely to have stage II or III disease compared with respondents, were slightly more likely to have had a mastectomy or no surgery, and were less likely to have received radiation therapy. When these variables were included in a multivariable logistic regression model to determine independent predictors of response, we found that response was higher among older patients (P for trend = 0.02), those with lower stage disease (P for trend = 0.003), and among currently married women (P = 0.03). Socioeconomic status of census tract, type of surgery, and radiation therapy were unrelated to response in the multivariable model.
Selected characteristics | Respondents, N | Respondents, % | Nonrespondents, N | Nonrespondents, % |
---|---|---|---|---|
Total | 1,223 | 100.0 | 475 | 100.0 |
Age at diagnosis (y) | | | | |
<45 | 235 | 19.2 | 97 | 20.4 |
45-54 | 305 | 24.9 | 138 | 29.0 |
55-64 | 344 | 28.1 | 125 | 26.3 |
65+ | 339 | 27.7 | 115 | 24.2 |
Race/ethnicity (registry definition) | | | | |
Hispanic | 558 | 45.6 | 207 | 43.6 |
African-American | 302 | 24.7 | 141 | 29.7 |
NSSW | 343 | 28.1 | 116 | 24.4 |
Other | 20 | 1.6 | 11 | 2.3 |
Spanish surname | | | | |
Yes | 572 | 46.8 | 218 | 45.9 |
No | 651 | 53.2 | 257 | 54.1 |
Marital status* | | | | |
Single, never married | 231 | 18.9 | 119 | 25.2 |
Married | 657 | 53.7 | 224 | 47.2 |
Divorced, separated, widowed | 298 | 24.4 | 107 | 22.5 |
Unknown | 37 | 3.0 | 25 | 5.3 |
Socioeconomic status of census tract† | | | | |
High (1) | 199 | 16.3 | 61 | 12.9 |
(2) | 245 | 20.1 | 84 | 17.8 |
(3) | 285 | 23.4 | 98 | 20.7 |
(4) | 275 | 22.5 | 124 | 26.2 |
Low (5) | 216 | 17.7 | 106 | 22.4 |
Stage of disease* | | | | |
In situ | 238 | 19.5 | 93 | 19.6 |
I | 445 | 36.4 | 132 | 27.8 |
II | 350 | 28.7 | 142 | 30.0 |
III | 142 | 11.6 | 73 | 15.4 |
Unknown | 46 | 3.8 | 34 | 7.2 |
Type of surgery | | | | |
None | 34 | 2.8 | 46 | 9.7 |
Lumpectomy | 787 | 64.4 | 266 | 56.0 |
Mastectomy | 402 | 32.9 | 163 | 34.3 |
Chemotherapy | | | | |
Yes | 401 | 32.8 | 164 | 34.5 |
No | 822 | 67.2 | 311 | 65.5 |
Hormone therapy | | | | |
Yes | 223 | 18.2 | 69 | 14.5 |
No | 999 | 81.8 | 406 | 85.5 |
Radiation therapy† | | | | |
Yes | 503 | 41.1 | 166 | 35.0 |
No | 720 | 58.9 | 309 | 65.0 |
Tumor grade | | | | |
Grade 1 | 179 | 14.6 | 74 | 15.6 |
Grade 2 | 447 | 36.6 | 147 | 31.0 |
Grade 3-4 | 484 | 39.6 | 214 | 45.0 |
Unknown | 113 | 9.2 | 40 | 8.4 |
*χ² P < 0.005.
†χ² P < 0.05.
Hispanic Identification among Respondents
Using the self-reported identification of Hispanic ethnicity from the survey as the gold standard, the sensitivity and specificity of RCA sample definition and three registry-based definitions were assessed (Table 4). The first comparison was made with the initial determination of Hispanic identity at the time of RCA sample selection based on a combination of Spanish surname and hospital-defined Hispanic ethnicity. The sensitivity of the approach was 94.6% compared with a specificity of 90.0%. The predictive value positive was 89.2% and the predictive value negative was 95.0%. Sensitivity was slightly lower (89.5%) and specificity a bit higher (92.3%) when we based the definition solely on the RCA determination of Spanish surname. Similar results were found when just using the SEER Spanish surname definition. However, when the SEER North American Association of Central Cancer Registries (NAACCR) Hispanic Identification Algorithm variable was used, the sensitivity was 97.7%, and specificity was 90.7%. The NAACCR Hispanic Identification Algorithm (NHIA) variable was developed by the NAACCR and uses both direct and indirect methods of determining Hispanic origin (20). The algorithm is based on the NAACCR variables for Spanish/Hispanic origin, last name, maiden name, birthplace, race, and sex.
Definition of Hispanic identity | Sensitivity (%) | Specificity (%) | Predictive value positive (%) | Predictive value negative (%) |
---|---|---|---|---|
RCA Spanish surname and/or Hispanic identity from hospital database | 94.6 | 90.0 | 89.2 | 95.0 |
RCA Spanish surname (only) | 89.5 | 92.3 | 91.3 | 90.6 |
SEER Spanish surname | 90.2 | 90.4 | 89.0 | 91.4 |
SEER NHIA* | 97.7 | 90.7 | 90.1 | 97.8 |
*NAACCR Hispanic Identification Algorithm.
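The four validity measures in Table 4 follow from a 2 × 2 cross-classification of the registry-based definition against self-report. The sketch below shows the standard formulas; the function name is illustrative and the counts are hypothetical, chosen only to demonstrate the calculation (the underlying cell counts are not given in the text).

```python
def validity_measures(tp, fp, fn, tn):
    """Validity of a classification against a gold standard (self-report).

    tp: classified Hispanic and self-reported Hispanic
    fp: classified Hispanic but self-reported non-Hispanic
    fn: classified non-Hispanic but self-reported Hispanic
    tn: classified non-Hispanic and self-reported non-Hispanic
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "predictive value positive": tp / (tp + fp),
        "predictive value negative": tn / (tn + fn),
    }

# Hypothetical counts for illustration only; these are not the study's data
m = validity_measures(tp=90, fp=10, fn=5, tn=95)
for name, value in m.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity are conditioned on self-reported identity, whereas the predictive values are conditioned on the registry classification, so the predictive values depend on the proportion of Latinas in the sample.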
Latina Acculturation Measure
Among the 601 respondents who were initially identified as Hispanic at the time of sample selection, 565 self-identified as Latina based on their survey responses. Forty-five of them did not answer the four items of the SASH because they indicated that they spoke only English on a screening question (and were given the code of 1 = “Only English”); only 4 of the remaining 520 failed to complete all four items. Responses to each of the four items in the SASH were based on a five-point Likert scale (ranging from 1 = “Only English” to 5 = “Only Spanish”). Given the ordinal nature of this scale, a polychoric correlation matrix was constructed, which showed very high inter-item correlations (all >0.94). The factor analysis of the polychoric correlation matrix showed one dominant factor with very high loadings and uniqueness components of <0.07 for all items.
A SASH summary score was calculated by averaging responses from the individual items. Analysis using a nonparametric multilevel ordinal factor model showed that the summary scale values fell into a bimodal pattern separating the population into two groups that spoke predominantly English or Spanish. The bimodal distribution is evident in Fig. 2, which shows the distribution of mean SASH summary scores (overall mean was 3.4, SD 1.5). The red line indicates the cutoff (2.99) recommended by Marín et al. (19) to discriminate less acculturated from more acculturated Latinos. However, approximately half of the respondents in our sample (53.8%) had average scores of 4 or higher on the five-point scale, and we therefore used 4.0 as the cutoff for our subsequent analyses. About one quarter of the Latina sample (27.9%) had scores <2, indicating that they spoke English more than Spanish, whereas only about one fifth (18.3%) had intermediate values between 2 and 4.
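Grouping respondents by summary score can be sketched as below. The band labels follow the score categories used in Table 5, and treating scores at or above the 4.0 cutoff as the low-acculturation group is our reading of the text; the function names are illustrative.

```python
def sash_category(score):
    """Map a SASH summary score (1.0 to 5.0) to the bands used in Table 5."""
    if score < 2.0:
        return "<2.0"
    if score < 3.0:
        return "2.0-2.9"
    if score < 4.0:
        return "3.0-3.9"
    if score < 5.0:
        return "4.0-4.9"
    return "5"

def is_low_acculturation(score, cutoff=4.0):
    """Scores at or above the cutoff indicate a strong preference for Spanish."""
    return score >= cutoff

for s in (1.0, 2.75, 3.5, 4.0, 5.0):
    print(s, sash_category(s), is_low_acculturation(s))
```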
We validated the SASH score categories by examining the distribution of other variables likely to be related to acculturation (Table 5). The table also suggests that there are two distinct groups identified according to our cutpoint of 4.0. Respondents with scores of 4 or more (strongly preferring Spanish) reported the lowest levels of education, of being born in the United States, and of having either parent born in the United States. By contrast, respondents with scores below 3 (strongly preferring English) reported much higher levels of education, of being born in the United States, and of having one or more parents born in the United States. Respondents with scores between 3.0 and 3.9 seem to form an intermediate group: largely foreign born (similar to those with scores of 4.0 or greater), but with education and literacy levels similar to the English-preferring group with scores of <3.0. Approximately 20% of these women said they needed help with written information (similar to those with scores of <3.0), whereas 44% or more of the respondents with scores of ≥4.0 needed such help.
SASH score category* (n) | <2.0 (134) | 2.0-2.9 (68) | 3.0-3.9 (58) | 4.0-4.9 (146) | 5 (162) |
---|---|---|---|---|---|
Education‡ | | | | | |
Grade school or less | 4.3 | 6.0 | 11.5 | 37.5 | 62.7 |
Some high school | 17.1 | 15.0 | 23.1 | 22.9 | 18.7 |
High school graduate | 20.0 | 22.4 | 19.2 | 25.0 | 12.7 |
Some college | 45.7 | 29.9 | 30.8 | 11.1 | 4.0 |
College graduate | 12.9 | 26.7 | 15.4 | 3.5 | 2.0 |
Country of origin‡ | | | | | |
United States | 54.3 | 33.8 | 5.7 | 2.0 | 1.3 |
Mexico | 32.9 | 51.5 | 66.0 | 68.5 | 68.4 |
Central America | 5.7 | 1.5 | 7.6 | 21.3 | 20.1 |
Other Latin American | 6.1 | 11.7 | 16.9 | 6.9 | 8.9 |
Other | 1.0 | 1.5 | 3.8 | 1.3 | 1.3 |
Years in the United States‡ | | | | | |
<10 | 0.7 | 1.5 | 3.8 | 8.2 | 18.4 |
10-20 | 2.1 | 0.0 | 11.3 | 17.8 | 27.2 |
≥20 | 97.2 | 98.5 | 84.9 | 74.0 | 54.4 |
Parents born in the United States‡ | | | | | |
Neither | 42.2 | 60.3 | 90.6 | 97.9 | 98.1 |
One | 19.3 | 14.7 | 7.6 | 2.1 | 2.0 |
Both | 42.2 | 25.0 | 1.9 | 0.0 | 0.0 |
Literacy†,‡ | | | | | |
Difficulty with written information | 2.4 | 5.8 | 5.8 | 6.0 | 11.6 |
Needed help with written information | 14.8 | 20.6 | 19.0 | 44.1 | 57.2 |
Problems filling out forms | 3.3 | 2.9 | 1.9 | 12.3 | 32.2 |
*Scale ranged from 1 (always speaking English) to 5 (always speaking Spanish).
†Percentage often/always.
‡P < 0.001 for differences across SASH categories.
Discussion
We have shown that a highly representative sample of patients with cancer can be selected as they are being diagnosed using RCA, that a high response rate can be achieved among all racial/ethnic groups, and that Latina ethnicity is a valid variable in registry databases. Furthermore, we found a bimodal distribution of Latinas in Los Angeles County with regard to level of acculturation, which was highly related to educational level and health literacy. Over half of the self-identified Latinas indicated that they preferred to speak Spanish over English. The representative nature of these respondents is supported by data from the California Health Interview Survey, which found that 50% of Latina women in Los Angeles County in 2003 spoke English not well or not at all (21). The study of this vulnerable group will be especially important because previous research has indicated that low acculturation is related to reduced cancer screening and poorer quality of life among Latinas (8, 22-26).
The results of this study have already found large differentials in decision-making and perceptions of information and care support between the low and high acculturated Latina women (27, 28). This highlights the importance of achieving a representative sample of minority patients.
Studying cancer survivors can be difficult due to illness, loss to follow-up, and unwillingness to participate when confronted with all of the stresses of cancer survivorship. Minorities and those with lower income and education tend to be less likely to participate in research studies and clinical trials (9, 29, 30), and there may be factors associated with a greater lack of trust in research studies among African-Americans and Hispanics (31). Our study shows that, despite these obstacles, a representative sample can be achieved. Aspects of this study that ensured its success included ascertainment of cases by RCA, which allowed contact relatively soon after diagnosis (thus minimizing loss to follow-up); a professionally formatted and translated survey instrument with content that was engaging to recipients; a rigorous data collection method based on the Dillman method, including follow-up mailings of surveys, calls to nonrespondents, and callbacks for missing data; and the dedicated commitment of staff, including a native Spanish-speaking study coordinator. Of note, despite the availability of a full telephone interview, nearly all respondents preferred to complete a mailed version of the survey. Despite this preference, having the telephone available as an option was important to achieving our high response rate, especially among the Spanish speakers. Studies that rely on a single option for response, or on a lengthier survey, may achieve a lower response rate. For example, another study based on an RCA sample of Los Angeles patients that involved a 90-minute telephone interview achieved a 64% response rate (3).
The validation of the registry Spanish ethnicity variable in Los Angeles is also an important finding from this study because sample selection was based on this information. Our predictive value positive of 89.2% when using both Spanish surname and hospital ethnicity, or 91.3% when using Spanish surname alone, was higher than that found in a previous registry-based breast cancer study in Utah and New Mexico, where a predictive value positive of 82.3% was found for Spanish surname (32).
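As a hedged illustration of how these validation measures relate to one another, the following minimal sketch computes sensitivity, specificity, and predictive value positive from a 2 × 2 table that cross-classifies the registry ethnicity indicator against self-report. The class name, field names, and counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationTable:
    """2 x 2 table: registry classification vs. self-report (the reference)."""
    true_pos: int   # registry Hispanic, self-reported Hispanic
    false_pos: int  # registry Hispanic, self-reported non-Hispanic
    false_neg: int  # registry non-Hispanic, self-reported Hispanic
    true_neg: int   # registry non-Hispanic, self-reported non-Hispanic

    def sensitivity(self) -> float:
        # Share of self-reported Hispanic patients flagged by the registry.
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        # Share of self-reported non-Hispanic patients not flagged.
        return self.true_neg / (self.true_neg + self.false_pos)

    def predictive_value_positive(self) -> float:
        # Share of registry-flagged patients who self-report as Hispanic.
        return self.true_pos / (self.true_pos + self.false_pos)


# Hypothetical counts for illustration only; NOT the study's data.
table = ValidationTable(true_pos=90, false_pos=10, false_neg=5, true_neg=95)
print(f"Predictive value positive: {table.predictive_value_positive():.1%}")
```

Unlike sensitivity and specificity, predictive value positive depends on the share of Hispanic patients in the sampled population, so comparisons across registries with different ethnic compositions should be read with that in mind.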
Our study is also the first to evaluate the reliability and validity of the SASH in a large population-based sample of Latinas with breast cancer in Los Angeles County. The reliability of the four-item scale was very high, and confirmatory factor analysis indicated one principal component. Indeed, the between-item correlation was so high that one or two questions (perhaps those related to general language use) could be used in place of the four-item scale. However, it is important to note that the short scale was very easy to complete, as we had incomplete data for only 4 of 565 respondents (<1%). There was a broad distribution of scores across the five-point range, with obvious clustering of respondents around predominantly Spanish-speaking and predominantly English-speaking, and about a quarter of patients with intermediate scores. Higher scale scores were associated with lower educational attainment, being born in Mexico or Central America, living in the United States for fewer years, having parents born outside the United States, and lower health literacy. One key issue raised by our study is the choice of cutoffs for the five-point scale. The developers of the SASH have recommended a cutoff of 2.99 with no midpoint representing biculturalism (19). However, our results suggest caution in selecting the cutoff point for different Latino populations in different studies. In our study, we drew the cutoff point at 4 or more because this identified a large group (half the Latina study population) that had particularly low education and health literacy. In some studies, this group may be particularly important to identify (27, 28). Investigators may need to establish other cutoff points on the SASH scale that optimize its use with regard to the research questions and target population.
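The scoring and cutoff logic discussed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's analysis code: it assumes the scale score is the mean of the four items, each coded from 1 (always speaking English) to 5 (always speaking Spanish), and that respondents with any missing item are left unscored. The function names and the missing-data rule are hypothetical; only the cutoff of 4 or more comes from the text.

```python
from statistics import mean
from typing import Optional, Sequence

# Cutoff used in this study: a score of 4 or more marks low acculturation.
LOW_ACCULTURATION_CUTOFF = 4.0


def sash_score(items: Sequence[Optional[int]]) -> Optional[float]:
    """Return the mean of four items coded 1-5, or None if any item is missing.

    Assumption: the scale score is the item mean; incomplete responses
    are not scored (a hypothetical rule, not taken from the study).
    """
    if len(items) != 4:
        raise ValueError("expected exactly four items")
    if any(item is None for item in items):
        return None
    if any(not 1 <= item <= 5 for item in items):
        raise ValueError("each item must be coded from 1 to 5")
    return float(mean(items))


def is_low_acculturation(score: float,
                         cutoff: float = LOW_ACCULTURATION_CUTOFF) -> bool:
    """Classify a score; the cutoff is a parameter so others can be tested."""
    return score >= cutoff


# Hypothetical respondents for illustration only.
for answers in ([5, 5, 4, 4], [1, 2, 1, 1], [3, None, 3, 3]):
    score = sash_score(answers)
    label = "unscored" if score is None else is_low_acculturation(score)
    print(answers, score, label)
```

Keeping the cutoff as a parameter reflects the point made above: investigators working with other Latino populations may need to examine alternative cutoff points rather than adopt a single fixed value.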
An important limitation of our study related to the acculturation measure is that the target population was restricted to Latinas with breast cancer in Los Angeles County. Obviously, men were excluded from the sample. Furthermore, the immigrant experience in Los Angeles County (predominantly Mexican and Central American) might differ from that of other Latino communities with greater representation from Caribbean or South American countries. We expect that the SASH would perform well in these populations, but that question cannot be addressed in our study.
The limitations of the RCA method should be noted. This method may not be available in all settings due to different capabilities of cancer registries. When it can be implemented, there are some methodologic limitations. Because cases are selected prior to registry-based quality control measures, some patients are initially included (and resources spent contacting and interviewing them) who are later found to be ineligible. However, in our study, this proportion was <13%. Furthermore, not all potentially eligible patients can be selected in the RCA process. Nevertheless, we have shown that the sample obtained in Los Angeles was highly representative of the total incident cases.
Finally, despite our high response rate, we did find some differences in response according to age, marital status, and stage of disease, with older, married, and lower-stage patients more likely to participate. Because these women may be expected to have more social support and less disease burden than nonrespondents, the findings from the study may understate the extent of the problems being faced by patients with breast cancer.
The study methods are applicable to other studies of cancer survivors with a goal of oversampling minority patients, and our findings indicate the need to assess acculturation in Latinas. Long-term follow-up of our cohort will prove to be especially valuable, given other research indicating continuing racial/ethnic disparities in breast cancer survival over time (4, 7, 33).
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant support: National Cancer Institute grants R01 CA109696 and R01 CA088370 (University of Michigan); and an Established Investigator Award in Cancer Prevention, Control, Behavioral, and Population Sciences Research from the National Cancer Institute (K05CA111340; S.J. Katz). The collection of cancer incidence data used in this study was supported by the California Department of Health Services as part of the statewide cancer reporting program mandated by California Health and Safety Code Section 103885; the National Cancer Institute's Surveillance, Epidemiology, and End Results Program under contract N01-PC-35136 (Northern California Cancer Center), contract N01-PC-35139 (University of Southern California), and contract N02-PC-15105 (Public Health Institute); and the Centers for Disease Control and Prevention's National Program of Cancer Registries, under agreement no. U55/CCR921930 (Public Health Institute).
Note: The ideas and opinions expressed herein are those of the author(s), and endorsement by the State of California, Department of Public Health, the National Cancer Institute, and the Centers for Disease Control and Prevention or their Contractors and Subcontractors is not intended nor should be inferred.
Acknowledgments
Alma Acosta, Marlene Caldera, Norma Caldera, Maria Isabel Gaeta, Mary Lo, and Urduja Trinidad participated in data collection and processing.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.