Hispanics in the U.S. Southwest have genetic ancestry from Europeans and from American Indians, two groups with markedly different breast cancer incidence rates. Genetic admixture may therefore bias estimates of associations between candidate cancer susceptibility genes and breast cancer in Hispanics. We estimated genetic admixture using 15 ancestry-informative markers for 1,239 Hispanics and 2,505 non-Hispanic Whites in a breast cancer case-control study in the Southwest, the Four Corners Study. Confounding risk ratios (CRR) were calculated to quantify potential bias due to admixture. Genetic admixture was strongly related to self-reported race and ethnic background (P < 0.0001). Among Hispanic controls, admixture was significantly associated with allele frequency for 5 of 11 candidate gene single nucleotide polymorphisms (SNP) examined. Hispanics in the highest versus the lowest quintile of American Indian admixture had higher mean body mass index at age 30 years (25.4 versus 23.6 kg/m²; P = 0.003), shorter mean height (1.56 versus 1.58 m; P = 0.01), higher prevalence of diabetes (14.8% versus 7.2%; P = 0.04), and a larger proportion with less than a high school education (38.5% versus 23.2%; P = 0.001). Admixture was not associated with breast cancer risk among Hispanics (P = 0.65). CRRs for potential bias to candidate SNP-breast cancer risk ratios ranged from 0.99 to 1.01. Thus, although genetic admixture in Hispanics was associated with exposures, confounding by admixture was negligible due to the null association between admixture and breast cancer. CRRs from simulated scenarios indicated that appreciable confounding by admixture would occur only when within-group candidate SNP allele frequency differences are much larger than any that we observed. (Cancer Epidemiol Biomarkers Prev 2007;16(1):142–50)

Hispanic populations in the U.S. Southwest have genetic heritage from Europeans and from American Indians (1, 2), and this background influences their incidence of chronic diseases (3-6). Age-adjusted breast cancer incidence among non-Hispanic White (i.e., European-American) women in New Mexico is 134.8 per 100,000 per year, about two and a half times higher than the incidence among American Indians, 53.4 per 100,000. Incidence in Hispanics, 89.9 per 100,000, is intermediate between the two (7). The Four Corners Breast Cancer study is a case-control study in four Southwest states investigating candidate cancer susceptibility genes, as well as obesity, physical activity, and other exposures, and breast cancer risk among Hispanic and non-Hispanic White women (8). Within the Hispanic population, each woman's breast cancer risk may differ depending on the amount of her European versus American Indian genetic heritage, and if so, genetic admixture may be a source of bias for within-group estimates of exposure-disease associations.

Bias from genetic admixture can occur in studies of a population that is admixed, with genetic heritage from one population with low incidence of a disease and a second population with high incidence. In this situation, also referred to as population stratification, a genetic variant that is more common in the high-incidence population can seem to be associated with disease in the admixed population, even if no causal association exists (9, 10). Genetic admixture can be assessed by determining genotypes for ancestry-informative markers (AIM), alleles that differ in frequency between the ancestor populations, and by applying computer algorithms to estimate proportion ancestry (11-16). The amount of potential bias to risk ratios due to uncontrolled confounding from genetic admixture can be quantified by the confounding risk ratio (CRR; refs. 17, 18). The strength of the exposure-disease relationship does not determine or influence the CRR (17). Rather, the CRR depends on the strength of the association between the confounder and the exposure of interest and on the strength of the association between the confounder and disease.

The objective of this analysis is to report on factors that vary in relation to genetic admixture in a Hispanic population and to examine potential for admixture to bias candidate gene-breast cancer risk ratios. We assessed genetic admixture using AIMs known to differ in prevalence between American Indian and European-American populations (11, 19). We evaluated associations between estimated genetic admixture and the allele frequencies for single nucleotide polymorphisms (SNP) in several candidate cancer susceptibility genes and between genetic admixture and characteristics known to differ among non-Hispanic White, Hispanic, and American Indian women, including body mass index (BMI), height, and diabetes. We estimated the association between genetic admixture and breast cancer among Hispanics in the Four Corners Study population to calculate CRRs for the amount of confounding to candidate gene-breast cancer associations. We further explored the issue of bias due to admixture in studies of breast cancer in Hispanics by estimating CRRs for simulated scenarios based on more extreme possible associations between genetic admixture, disease, and exposures.

Study Subjects and Interview

The Four Corners Study is a population-based case-control study of breast cancer in the U.S. states of Arizona, Colorado, New Mexico, and Utah. The study methods and study population have been described in detail in previous reports (8, 20, 21). Women ages 25 to 79 years with a first diagnosis of breast cancer in the years 1998 to 2004 were identified through population-based cancer registries in each state. All eligible Hispanic cases and a random sample of non-Hispanic cases frequency matched to Hispanics on age were selected for the study. Control subjects ages 64 years and younger were randomly selected from computerized drivers' license lists in New Mexico and Utah and from commercially available lists in Arizona and Colorado. Control subjects ages 65 years and older were selected from Center for Medicare Studies lists. Controls were frequency matched to cases on 5-year age group and ethnicity. Hispanic ethnicity was initially determined from cancer registry data and by application of the Generally Useful Ethnic Search (22, 23) computer program to identify Spanish surnames. Final classification of race and ethnicity was based on self-report. Women who reported Hispanic, non-Hispanic White, or American Indian racial or ethnic background were eligible for the study. The study did not recruit subjects on Indian reservations, but American Indians who were selected and did not reside on reservations were eligible for the study. Additional eligibility criteria were having had no prior breast cancer diagnosis and being capable of answering interview questions in English or Spanish.

Subjects completed an in-person interview, in English or Spanish, reporting on exposure history for breast cancer risk factors, including reproductive history, diet, and physical activity. Hispanic subjects were asked about ability to read and speak English and Spanish, and responses were used to define three categories of language acculturation among Hispanic women: low, women reporting reading and speaking Spanish only or Spanish better than English; moderate, women who read or spoke both languages equally well; and high, those who reported speaking and writing English better than Spanish or English only. Anthropometric measurements were taken by the interviewer according to a standardized protocol; BMI was calculated as body weight (kg) divided by height squared (m²). All aspects of the study were reviewed and approved by human subjects' research Institutional Review Boards at each of the collaborating institutions, and subjects signed informed consent documents. Participation rates have been reported (8); among cases, the response rate and cooperation rate were 49% and 68%, respectively, and among controls, these rates were 30% and 42%. Of interviewed subjects, 76.6% of cases and 82.4% of controls provided blood samples for genotyping.
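The BMI calculation described above can be written out as a short function. This is a minimal sketch; the function name `body_mass_index` and the example weight and height are illustrative and are not taken from the study.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI in kg/m^2: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Illustrative values only (not study data).
example_bmi = body_mass_index(weight_kg=70.0, height_m=1.60)
print(round(example_bmi, 1))
```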

Race and Ethnicity Variables

Each subject was asked to identify which best described her race or ethnic background, selecting one or more categories from a card showing the following: Hispanic/Latino/Spanish Origin; White; American Indian or Alaska Native; Black or African American; Asian; Native Hawaiian or other Pacific Islander; and Other. The first three categories, the groups eligible for study, are used for most analyses in this report. If a woman stated that her background included more than one of these groups, she was assigned to a primary category as follows: if Hispanic was among her responses, Hispanic; if American Indian and not Hispanic was reported, American Indian. Each subject also was asked to describe the race and ethnic background of her parents. These data were used to create a second ethnicity variable with seven categories, one for each of the possible combinations of Hispanic, non-Hispanic White, and American Indian, taking into account backgrounds reported for self and parents.

Genotyping

Fifteen genetic admixture markers were selected from the markers described by Collins-Schramm et al. (11, 19) as informative for Mexican-Americans. We selected AIMs reported as having large allele frequency differences for an American Indian population (Pima) compared with Americans of European heritage; δ values reported for the markers that we used had a mean of 0.63, median of 0.64, and range of 0.46 to 0.76 (19). African heritage in Southwest Hispanics (1) or Mexican-Americans (2) has been estimated to be low, 0% to 3.7%, indicating that genotyping markers of African heritage would be of no value in the Southwest Hispanic population. Markers designated with ‘MID’ are from a set of Marshfield biallelic Insertion/Deletion polymorphisms. PCR primers used were as described at the Marshfield Clinic Web site. Markers MID94, MID142, MID185, MID218, MID558, MID577, MID743, MID919, and MID1656 were PCR amplified using fluorescently labeled primers and size fractionated on an ABI 3700. Markers MID152, MID237, MID856, MID944, and MID1469 were PCR amplified, size fractionated on 3% to 4% Nusieve agarose gels and visualized using ethidium bromide staining. Marker TSC001075 (rs713366) represents a C/T SNP that is not located within a gene and was genotyped in a Taqman assay (ABI Assays on Demand).

We genotyped 11 SNPs in candidate genes for breast cancer susceptibility that had been selected for case-control analyses in the parent study. The candidate SNPs were selected based on their hypothesized role in a pathway linking energy balance and obesity to breast cancer through estrogen, insulin, and insulin-like growth factors. For most of these SNPs, as is true for many candidate cancer susceptibility gene polymorphisms, there was no prior information about allele frequency in Hispanic or American Indian populations. The β3-adrenergic receptor (ADRB3) W64R (T>C) variant was assessed by PCR and BstNI restriction digest as described previously (24). Estrogen receptor α (ESR1) 351 A>G was evaluated by PCR and XbaI restriction (25, 26). The insulin-like growth factor binding protein 3 (IGFBP3) −202 C>A substitution was detected by PCR and digestion with the Alw21I restriction enzyme (27). Interleukin 6 (IL6) SNPs rs1800797 (−596 G/A), rs1800796 (−572 G/C), and rs2069849 (C/T, exon 5) were genotyped by Taqman assays (ABI Assays on Demand). The G to A transition in the IRS1 gene that results in an amino acid change, G972R, was assayed using PCR and digestion with BstNI (28). The IRS2 G1057D polymorphism was detected using a Taqman assay (29) with modifications as described (30). The sex hormone binding globulin (SHBG) D327N (G>A) variant was genotyped using PCR and a HinfI restriction digest (31). Vitamin D receptor (VDR) Bsm1 (rs1544410, located in introns 8 and 9) and Fok1 (T>C SNP, rs10735810, affecting the first of two transcription start codons) polymorphisms were assayed by published PCR and restriction fragment length polymorphism methods (32, 33) with modifications as described (34).

Data Analysis

We summarized allele frequencies for each genetic admixture marker by self-reported ethnicity among control subjects and calculated 95% confidence intervals (95% CI). Genetic ancestry estimates were computed from genotypes for the 15 AIMs using the program STRUCTURE 2.0 (12, 13). STRUCTURE was used because it does not assume prior knowledge of the population but allows the genotype data to define the population structure. STRUCTURE was run using 50,000 replicates for both the burn-in and the simulation phases without any prior assumption of population allele frequencies under the linkage model assuming correlated allele frequencies (19). We ran models to test for one to four distinct groups. The STRUCTURE program estimates, for each individual, the proportion of genetic heritage from each assumed ancestor population. We calculated descriptive statistics for these proportion admixture values for controls within each self-reported ethnicity category and determined quintile cut points for Hispanics based on the control distribution. Admixture estimates based on the entire study population are used for most analyses in this report. For some analyses, we used a second set of admixture estimates based on the same AIMs but estimated from a run of the STRUCTURE program for self-reported Hispanic subjects only, excluding non-Hispanics; we again selected the best-fitting model after testing for one to four groups.
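The quintile step described above, in which cut points come from the control distribution and are then applied to all Hispanic subjects, can be sketched as follows. The admixture values here are randomly generated placeholders, and the names `control_admixture`, `case_admixture`, and `assign_quintile` are hypothetical; the handling of values exactly equal to a cut point is also an assumption.

```python
import bisect
import random
import statistics

random.seed(0)
# Hypothetical admixture proportions in [0, 1]; placeholders, not study data.
control_admixture = [random.betavariate(4, 3) for _ in range(500)]
case_admixture = [random.betavariate(4, 3) for _ in range(400)]

# Four cut points divide the control distribution into five equal-sized groups.
cut_points = statistics.quantiles(control_admixture, n=5)


def assign_quintile(value, cuts):
    """Return a quintile from 1 to 5; a value equal to a cut point goes to the lower quintile."""
    return bisect.bisect_left(cuts, value) + 1


# The same control-based cut points are applied to controls and cases alike.
control_quintiles = [assign_quintile(v, cut_points) for v in control_admixture]
case_quintiles = [assign_quintile(v, cut_points) for v in case_admixture]
print([control_quintiles.count(q) for q in range(1, 6)])
```

Deriving the cut points from controls only means that cases need not fall evenly across quintiles, which is what allows a case-control comparison by quintile.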

The frequencies (and 95% CIs) of the minor allele of each candidate gene SNP were summarized for controls in each ethnic group and within quintiles of admixture among Hispanics. We estimated means and SDs for BMI and other anthropometric measures for controls within each ethnic group and for Hispanics by quintile of genetic admixture. We used generalized linear models to estimate slopes and to test for trends in association of SNP alleles and anthropometric measures with genetic admixture, adjusting for study center and age. Anthropometric variables were transformed to conform better to a normal distribution, if necessary, for significance testing. For other characteristics, including age, educational attainment, and language acculturation, we constructed contingency tables for categories of each characteristic according to quintiles of genetic admixture among Hispanic controls and tested for associations across the ordered categories using ordinal logistic regression models.

The odds ratios (OR) for association between breast cancer and quintile of genetic admixture among Hispanics were calculated using logistic regression. Models were adjusted for covariates selected a priori that were expected to differ in relation to culture in a minority group and also may be related to breast cancer risk: age, study center, participation in mammography, education, physical activity, consumption of alcohol, and total calories consumed in diet.
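For orientation, a crude (unadjusted) odds ratio for each admixture quintile relative to quintile 1 can be computed directly from the case and control counts in Table 4. The sketch below uses a Wald confidence interval on the log-odds scale, which is an assumption of this example and not the authors' adjusted logistic regression, so its results are not expected to match the adjusted ORs reported in the table. The function name `crude_odds_ratio` is hypothetical.

```python
import math


def crude_odds_ratio(cases_exposed, controls_exposed, cases_ref, controls_ref, z=1.96):
    """Crude OR with a Wald 95% CI computed on the log-odds scale."""
    odds_ratio = (cases_exposed * controls_ref) / (controls_exposed * cases_ref)
    se_log_or = math.sqrt(
        1 / cases_exposed + 1 / controls_exposed + 1 / cases_ref + 1 / controls_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Case and control counts by admixture quintile (Q1..Q5), from Table 4.
cases = [99, 124, 121, 95, 116]
controls = [138, 135, 137, 139, 135]

# Compare each of Q2..Q5 against Q1 as the referent.
crude = [
    crude_odds_ratio(cases[q], controls[q], cases[0], controls[0])
    for q in range(1, 5)
]
for q, (or_, lo, hi) in zip(range(2, 6), crude):
    print(f"Q{q} vs Q1: crude OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```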

We calculated CRRs to describe the amount of potential confounding in exposure-breast cancer associations among Hispanics attributable to genetic admixture. The CRR represents the ratio of the crude exposure-disease risk ratio to the adjusted exposure-disease risk ratio. Its value is determined by two associations: that between the confounder and disease and that between the confounder and the exposure. We calculated the CRR from the estimated prevalence of the minor allele of each candidate gene SNP among Hispanics in each quintile of admixture and the breast cancer OR for that quintile, according to the equation used by Wacholder et al. (18).
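A worked example may help make the CRR concrete. The sketch below assumes the usual stratified form of the CRR: the average baseline risk among the exposed divided by the average baseline risk among the unexposed, each taken over confounder strata, for an exposure with no true effect. Whether this matches the cited equation term for term is an assumption. It also treats the quintile ORs from Table 4 as relative risks, treats minor allele frequency as exposure prevalence, and gives each quintile a weight of 0.2; the function name `confounding_risk_ratio` is hypothetical.

```python
def confounding_risk_ratio(stratum_weights, relative_risks, exposure_prevalence):
    """Crude risk ratio expected for an exposure with no true effect.

    Each argument has one entry per confounder stratum (here, admixture quintile).
    """
    strata = list(zip(stratum_weights, relative_risks, exposure_prevalence))
    # Average baseline risk among exposed subjects, weighted by where they sit.
    risk_exposed = sum(w * r * p for w, r, p in strata) / sum(w * p for w, _, p in strata)
    # Average baseline risk among unexposed subjects.
    risk_unexposed = sum(w * r * (1 - p) for w, r, p in strata) / sum(
        w * (1 - p) for w, _, p in strata
    )
    return risk_exposed / risk_unexposed


quintile_weights = [0.2] * 5                          # assumed equal-sized quintiles
quintile_or = [1.00, 1.17, 1.13, 0.96, 1.14]          # Table 4, Q1 as referent
esr1_allele_freq = [0.35, 0.30, 0.29, 0.30, 0.26]     # Table 2, ESR1 Xba, Q1..Q5

crr_esr1 = confounding_risk_ratio(quintile_weights, quintile_or, esr1_allele_freq)
print(round(crr_esr1, 3))
```

Under these assumptions the result stays very close to 1, because the quintile ORs show no consistent trend even though the allele frequency does.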

We also generated CRRs for several simulated scenarios based on extreme possible associations between genetic admixture and disease and between genetic admixture and exposures. The strongest possible association between genetic admixture and breast cancer among Hispanics would be a scenario in which Hispanics in the highest quintile of American Indian admixture have a breast cancer incidence rate equal to that of American Indians, whereas Hispanics in the lowest quintile of admixture have a breast cancer incidence rate equal to that of non-Hispanic Whites. We used this scenario, a risk ratio of 2.5 for the first quintile relative to the fifth, for our calculations. We evaluated several scenarios for variation in prevalence of candidate gene SNPs across admixture categories.
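The simulated scenarios can be illustrated with a small calculation. This sketch again assumes the stratified form of the CRR (average baseline risk among the exposed divided by that among the unexposed). The linear decline in risk from 2.5 in quintile 1 to 1.0 in quintile 5 and all three allele-frequency patterns are hypothetical choices for this example; they are not the scenarios the authors evaluated.

```python
def confounding_risk_ratio(stratum_weights, relative_risks, exposure_prevalence):
    """Crude risk ratio expected for an exposure with no true effect."""
    strata = list(zip(stratum_weights, relative_risks, exposure_prevalence))
    risk_exposed = sum(w * r * p for w, r, p in strata) / sum(w * p for w, _, p in strata)
    risk_unexposed = sum(w * r * (1 - p) for w, r, p in strata) / sum(
        w * (1 - p) for w, _, p in strata
    )
    return risk_exposed / risk_unexposed


weights = [0.2] * 5
# Assumed: risk falls linearly from 2.5 (Q1) to 1.0 (Q5), relative to Q5.
extreme_risks = [2.5, 2.125, 1.75, 1.375, 1.0]

# Hypothetical allele-frequency patterns across quintiles Q1..Q5.
scenarios = {
    "flat": [0.30, 0.30, 0.30, 0.30, 0.30],      # no admixture-allele association
    "modest": [0.35, 0.30, 0.29, 0.30, 0.26],    # similar in size to observed trends
    "steep": [0.80, 0.625, 0.45, 0.275, 0.10],   # far larger than any observed trend
}

crr_by_scenario = {
    name: confounding_risk_ratio(weights, extreme_risks, freqs)
    for name, freqs in scenarios.items()
}
for name, crr in crr_by_scenario.items():
    print(f"{name}: CRR = {crr:.2f}")
```

Even with the strongest admixture-disease gradient, the CRR departs noticeably from 1 only in the pattern with a very steep allele-frequency gradient, which is the qualitative point of the simulated scenarios.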

Self-Reported Ethnicity

DNA for genotyping was available for a total of 1,239 participants who reported Hispanic ethnicity (Table 1). Among these were 205 (16.5%) women who reported themselves or a parent to have a background from another racial group, in addition to Hispanic. Of the four study centers, the largest number of Hispanic participants, 478, were recruited in New Mexico. From our population-based recruitment that excluded Indian reservations, only 64 of the subjects who completed the interview and provided a blood sample reported their race as American Indian, and the majority of these women reported that they or a parent had a background from another race in addition to American Indian.

Table 1.

Characteristics of Four Corners Study participants who were evaluated for genetic admixture, by self-reported race and ethnic background

| Characteristic | White, non-Hispanic: case, n (%) | White, non-Hispanic: control, n (%) | Hispanic: case, n (%) | Hispanic: control, n (%) | American Indian: case, n (%) | American Indian: control, n (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Total | 1,175 | 1,330 | 555 | 684 | 21 | 43 |
| Center | | | | | | |
| AZ | 164 (14.0) | 263 (19.8) | 114 (20.5) | 156 (22.8) | 2 (9.5) | 8 (18.6) |
| CO | 237 (20.2) | 218 (16.4) | 108 (19.5) | 128 (18.7) | 4 (19.0) | 4 (9.3) |
| NM | 465 (39.6) | 496 (37.3) | 243 (43.8) | 235 (34.4) | 9 (42.9) | 15 (34.9) |
| UT | 309 (26.3) | 353 (26.5) | 90 (16.2) | 165 (24.1) | 6 (28.6) | 16 (37.2) |
| Age (y) | | | | | | |
| 25-39 | 74 (6.3) | 101 (7.6) | 57 (10.3) | 81 (11.8) | 1 (4.8) | 1 (2.3) |
| 40-49 | 348 (29.6) | 347 (26.1) | 190 (34.2) | 188 (27.5) | 9 (42.9) | 16 (37.2) |
| 50-59 | 334 (28.4) | 352 (26.5) | 158 (28.5) | 183 (26.8) | 7 (33.3) | 11 (25.6) |
| 60-69 | 282 (24.0) | 300 (22.6) | 105 (18.9) | 158 (23.1) | 3 (14.3) | 9 (20.9) |
| 70-79 | 137 (11.7) | 230 (17.3) | 45 (8.1) | 74 (10.8) | 1 (4.8) | 6 (14.0) |
| Another race/ethnicity reported | | | | | | |
| No | 1,152 (98.0) | 1,296 (97.4) | 465 (83.8) | 569 (83.2) | 6 (28.6) | 8 (18.6) |
| Yes, one of these three | 22 (1.9) | 30 (2.3) | 87 (15.7) | 109 (15.9) | 15 (71.4) | 35 (81.4) |
| Yes, other | 1 (0.1) | 4 (0.3) | 3 (0.5) | 6 (0.9) | 0 (0) | 0 (0) |

NOTE: “Yes” indicates that, in addition to the primary race/ethnicity category (column variable), the subject also reported herself or a parent to have race or ethnic background from another of the three groups shown or to have other (Black or African-American, Asian, Pacific Islander, or other) racial background.

Abbreviations: AZ, Arizona; CO, Colorado; NM, New Mexico; UT, Utah.

Genetic Admixture

The minor allele frequency for each genetic admixture marker differed between Hispanics and non-Hispanic Whites, with nonoverlapping 95% CIs on the allele frequencies. The STRUCTURE program estimated log probabilities of −68844.2, −64299.2, −64437.4, and −65509.1 for one, two, three, and four populations, respectively, indicating that a two-population model fit the study data best. In the Hispanic-only STRUCTURE runs, a two-population model was again the best fit. For control subjects in our three primary ethnicity categories, the median (interquartile range) estimated membership in the population that STRUCTURE designated as population 1 was 0.21 (0.14-0.30) for non-Hispanic White controls, 0.63 (0.48-0.74) for Hispanic controls, and 0.30 (0.13-0.55) for American Indian controls. When the admixture values were summarized across seven categories, taking into account multiple race and ethnic backgrounds reported for self and parents (Fig. 1), the admixture estimates differed strongly by group (in an ANOVA adjusted for study center, 6 degrees of freedom; P < 0.0001). American Indians who had reported no other race or ethnic background had the highest membership in population 1 (median, 0.88; interquartile range, 0.80-0.89), and women who reported only non-Hispanic White background had the lowest membership in population 1 (median, 0.21; interquartile range, 0.14-0.30), consistent with the interpretation that a higher proportion membership in population 1 represents relatively (within our study population) higher American Indian ancestry.
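The model comparison reported above amounts to choosing the number of populations with the highest estimated log probability. A minimal sketch using the four values from the text; the variable names are illustrative, and picking the maximum is a simplification of model selection in general.

```python
# Estimated log probabilities reported in the text for K = 1 to 4 populations.
log_prob_by_k = {1: -68844.2, 2: -64299.2, 3: -64437.4, 4: -65509.1}

# The best-fitting model is taken to be the K with the highest log probability.
best_k = max(log_prob_by_k, key=log_prob_by_k.get)

# Difference in log probability between each model and the best one.
gap_from_best = {k: round(v - log_prob_by_k[best_k], 1) for k, v in log_prob_by_k.items()}
print(best_k, gap_from_best)
```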

Figure 1.

Genetic admixture estimated from the STRUCTURE program, using 15 AIMs, by self-reported race and ethnic background, Four Corners Study. Vertical bar, estimated proportion membership in population 1 (blue) and population 2 (yellow) for an individual in the control population. Proportions are based on a two-population model (K = 2), which best fit the data. Horizontal line, median membership in population 1 for that group.

Candidate Gene Allele Frequencies

Allele frequencies for the minor allele for each of the 11 candidate gene SNPs differed by self-reported race and ethnicity in control subjects (Table 2). Differences in allele frequency between Hispanics and non-Hispanic Whites ranged from 0.03 to 0.24 for these 11 SNPs, which had been selected without prior knowledge of frequencies in either American Indian or Hispanic populations. Proportion membership in population 1 was associated with a trend in prevalence of each SNP. When both genetic admixture estimates and self-reported race and ethnicity were included in the same model, the associations of each with candidate SNPs were attenuated, but the trend with admixture remained significant for four of the SNPs. Among Hispanic controls, the allele frequency in the lowest quintile of membership in population 1 (Q1) tended to be closest to that of non-Hispanic Whites. The trend in allele prevalence with genetic admixture among Hispanics was statistically significant for five of the SNPs.

Table 2.

Candidate cancer susceptibility gene polymorphisms: allele frequencies (with 95% CIs) by self-reported race and ethnicity and by genetic admixture, among control subjects

| Gene, minor allele | Non-Hispanic White, n = 1,330 (95% CI) | Hispanic, n = 684 (95% CI) | American Indian, n = 43 (95% CI) | P* | Trend with genetic admixture: P | Trend with genetic admixture: ethnicity-adjusted P§ | Hispanics Q1 | Hispanics Q2 | Hispanics Q3 | Hispanics Q4 | Hispanics Q5 | Hispanics: P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ADRB3 | 0.08 (0.07-0.09) | 0.13 (0.11-0.14) | 0.19 (0.09-0.28) | <0.001 | <0.001 | 0.009 | 0.13 | 0.09 | 0.11 | 0.10 | 0.20 | 0.15 |
| ESR1 Xba | 0.35 (0.33-0.37) | 0.30 (0.28-0.32) | 0.30 (0.20-0.41) | 0.002 | 0.002 | 0.33 | 0.35 | 0.30 | 0.29 | 0.30 | 0.26 | 0.01 |
| IGFBP3 | 0.46 (0.44-0.48) | 0.35 (0.32-0.37) | 0.29 (0.17-0.41) | <0.001 | <0.001 | 0.10 | 0.36 | 0.36 | 0.38 | 0.36 | 0.26 | 0.03 |
| IL6 rs1800797 | 0.43 (0.41-0.45) | 0.19 (0.17-0.21) | 0.26 (0.18-0.36) | <0.001 | <0.001 | 0.04 | 0.23 | 0.19 | 0.19 | 0.19 | 0.15 | 0.04 |
| IL6 rs1800796 | 0.05 (0.05-0.06) | 0.26 (0.24-0.28) | 0.20 (0.13-0.28) | <0.001 | <0.001 | 0.001 | 0.23 | 0.26 | 0.27 | 0.24 | 0.30 | 0.04 |
| IL6 rs2069849 | 0.02 (0.02-0.03) | 0.07 (0.06-0.08) | 0.02 (0.00-0.06) | <0.001 | <0.001 | 0.38 | 0.06 | 0.05 | 0.08 | 0.10 | 0.06 | 0.20 |
| IRS1 | 0.07 (0.06-0.08) | 0.04 (0.03-0.05) | 0.05 (0.01-0.09) | 0.001 | 0.002 | 0.47 | 0.04 | 0.05 | 0.04 | 0.05 | 0.03 | 0.87 |
| IRS2 | 0.37 (0.35-0.39) | 0.40 (0.38-0.43) | 0.35 (0.24-0.45) | 0.03 | 0.002 | 0.02 | 0.39 | 0.40 | 0.36 | 0.39 | 0.47 | 0.11 |
| SHBG | 0.11 (0.10-0.13) | 0.07 (0.06-0.08) | 0.10 (0.05-0.16) | <0.001 | <0.001 | 0.11 | 0.09 | 0.08 | 0.07 | 0.06 | 0.04 | 0.06 |
| VDR Bsm1 | 0.40 (0.38-0.42) | 0.27 (0.25-0.29) | 0.34 (0.23-0.44) | <0.001 | <0.001 | 0.09 | 0.32 | 0.23 | 0.29 | 0.27 | 0.23 | 0.04 |
| VDR Fok1 | 0.38 (0.36-0.40) | 0.42 (0.39-0.44) | 0.47 (0.38-0.55) | 0.03 | 0.04 | 0.22 | 0.40 | 0.42 | 0.44 | 0.40 | 0.43 | 0.82 |

* P value for a difference in frequencies of the variant allele by self-reported ethnicity, adjusted for age and study center.

Proportion membership in each of two assumed populations was estimated based on ancestry informative markers and the STRUCTURE program. For Hispanics, allele frequencies are shown for subgroups, categorized according to quintile of membership in population 1.

P value for a trend in prevalence of the minor allele with genetic admixture among control subjects, from a regression model adjusted for age and study center.

§ P value for a trend in prevalence of minor allele with genetic admixture, from a model as described above, additionally adjusted for reported race and ethnicity of self and parents.

Body Size Measures

Means for BMI, adult weight gain, waist circumference, and waist-to-hip ratio were higher for Hispanic women in the control group than non-Hispanic Whites, whereas mean height was lower (Table 3); these differences were statistically significant in our large study population, although for some measures, the magnitude of the difference was small (e.g., mean waist-to-hip ratio of 0.80 for non-Hispanic Whites compared with 0.84 among Hispanics). There was a significant trend in each anthropometric measure with estimated genetic admixture. Associations of admixture with body size measures were attenuated when adjusted for self-reported race and ethnicity, but proportion membership in population 1 remained significantly positively associated with BMI at interview, BMI at age 30 years, waist circumference, and waist-to-hip ratio and significantly inversely associated with height. The trends in each body size measure with genetic admixture among Hispanics were in the same direction as for the overall population. Hispanic controls in the highest quintile of American Indian admixture had a higher mean BMI at age 30 years, 25.4 versus 23.6 kg/m², and lower mean height, 1.56 versus 1.58 m, than those in the lowest quintile. The slopes in mean values of each body size measure with admixture among Hispanics were not steep (e.g., 0.34 kg/m² in mean of current BMI for each quintile of admixture), but this translates to a potentially meaningful increase in the prevalence of obesity, from 42.3% in quintile 1 to 49.6% in quintile 5. In regression models using the second set of admixture estimates based on a run of the STRUCTURE program for Hispanic subjects only, the slopes of the associations were stronger [e.g., 0.90 units of BMI at interview per quintile of admixture (P = 0.09) and −0.015 m of height per quintile of admixture (P = 0.01)].

Table 3.

Body size measures among control subjects by self-reported ethnicity and by genetic admixture, Four Corners Study

| Measure | Non-Hispanic White (n = 1,327), mean ± SD | Hispanic (n = 682), mean ± SD | American Indian (n = 43), mean ± SD | P* | Trend with genetic admixture, all subjects: slope | All subjects: P | All subjects: ethnicity-adjusted P | Trend with genetic admixture, Hispanics: slope | Hispanics: P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BMI at interview (kg/m²) | 27.9 ± 6.4 | 30.0 ± 6.3 | 29.6 ± 7.7 | <0.0001 | 0.84 | <0.001 | 0.02 | 0.34 | 0.12 |
| BMI at age 30 y (kg/m²) | 22.9 ± 4.1 | 24.4 ± 4.5 | 24.4 ± 6.9 | <0.0001 | 0.64 | <0.001 | <0.001 | 0.56 | 0.003 |
| BMI at age 15 y (kg/m²) | 19.9 ± 2.9 | 20.2 ± 3.2 | 21.5 ± 9.4 | 0.13 | 0.13 | 0.08 | 0.30 | 0.15 | 0.49 |
| Height (m) | 1.63 ± 0.06 | 1.57 ± 0.06 | 1.62 ± 0.07 | <0.0001 | −0.02 | <0.001 | 0.01 | −0.006 | 0.014 |
| Adult weight gain (kg) | 18.7 ± 15.0 | 22.1 ± 14.5 | 17.5 ± 26.7 | <0.0001 | 1.23 | <0.001 | 0.19 | 0.17 | 0.78 |
| Waist (cm) | 86.6 ± 15.0 | 91.8 ± 14.4 | 91.2 ± 17.3 | <0.0001 | 2.12 | <0.001 | 0.03 | 0.87 | 0.096 |
| Waist-to-hip ratio | 0.80 ± 0.07 | 0.84 ± 0.07 | 0.82 ± 0.08 | <0.0001 | 0.015 | <0.001 | 0.04 | 0.003 | 0.18 |

* P value for differences in body size measure by self-reported race and ethnicity, adjusted for age and study center.

Proportion membership in each of two assumed populations was estimated based on ancestry informative markers and the STRUCTURE program. Slope values represent the change in mean of the body size measure per quintile of membership in population 1, and P value is the significance of this trend, among control subjects, from a regression model adjusted for age and study center.

P value for trend in body size measure with genetic admixture from a regression model as described above, further adjusted for reported race and ethnic background of self and parents.

Subject Characteristics of Hispanics in Relation to Admixture

Quintile 5, representing highest proportion membership in population 1, had the highest proportion of women with low language acculturation (i.e., reporting speaking and writing Spanish only or Spanish better than English; Table 4) and the highest proportion of women with less than a high school education. The associations with language acculturation and education persisted (P = 0.009 and 0.001, respectively) when the group was restricted to controls who reported only Hispanic background for themselves and both parents. The prevalence of self-reported diabetes increased with genetic admixture among Hispanic controls. The statistical significance of these associations became stronger if the second, Hispanic-only, admixture estimates were used. There was no evidence that reproductive history characteristics associated with breast cancer, including age at menarche, age at first birth, or number of births, differed by genetic admixture.

Table 4.

Characteristics of Hispanic subjects in relation to genetic admixture, Four Corners Study

Hispanics, by quintile of genetic admixture*
Q1, n (%) Q2, n (%) Q3, n (%) Q4, n (%) Q5, n (%) P
Cases 99 (41.8) 124 (47.9) 121 (46.9) 95 (40.6) 116 (46.2) 0.65 
Controls 138 (58.2) 135 (52.1) 137 (53.1) 139 (59.4) 135 (53.8)  
    OR 1.17 1.13 0.96 1.14  
    (95% CI) Referent (0.81-1.68) (0.78-1.63) (0.65-1.40) (0.78-1.66)  
Controls, by selected characteristics       
Age (y)       
    25-39 7 (5.1) 14 (10.4) 16 (11.7) 21 (15.1) 23 (17.0) 0.06 
    40-49 41 (29.7) 42 (31.1) 32 (23.4) 42 (30.2) 31 (23.0)  
    50-59 39 (28.3) 28 (20.7) 40 (29.2) 34 (24.5) 42 (31.1)  
    60-69 33 (23.9) 42 (31.1) 30 (21.9) 26 (18.7) 27 (20.0)  
    70-79 18 (13.0) 9 (6.7) 19 (13.9) 16 (11.5) 12 (8.9)  
Language acculturation       
    Low 20 (14.9) 22 (16.3) 21 (15.4) 28 (20.1) 44 (32.6) 0.009 
    Moderate 49 (36.6) 51 (37.8) 53 (39.0) 51 (36.7) 47 (34.8)  
    High 65 (48.5) 62 (45.9) 62 (45.6) 60 (43.2) 44 (32.6)  
Education       
    Less than high school graduate 32 (23.2) 31 (23.0) 31 (22.6) 43 (30.9) 52 (38.5) 0.001 
    High school graduate or GED 40 (29.0) 41 (30.4) 35 (25.5) 35 (25.2) 34 (25.2)  
    Some college or trade school 38 (27.5) 37 (27.4) 40 (29.2) 42 (30.2) 31 (23.0)  
    Bachelor's degree or higher 28 (20.3) 26 (19.3) 31 (22.6) 19 (13.7) 18 (13.3)  
History of diabetes       
    No 128 (92.8) 115 (85.2) 122 (89.1) 115 (82.7) 115 (85.2) 0.04 
    Yes 10 (7.2) 20 (14.8) 15 (10.9) 24 (17.3) 20 (14.8)  
Age at menarche (y)       
    ≤11 32 (23.4) 29 (21.5) 25 (18.4) 25 (18.1) 25 (18.5) 0.48 
    12 36 (26.3) 27 (20.0) 34 (25.0) 42 (30.4) 32 (23.7)  
    13 31 (22.6) 32 (23.7) 34 (25.0) 31 (22.5) 42 (31.1)  
    ≥14 38 (27.7) 47 (34.8) 43 (31.6) 40 (29.0) 36 (26.7)  
Age at first birth (y)       
    <20 32 (24.8) 37 (30.1) 32 (26.7) 41 (32.8) 35 (28.2) 0.28 
    20-24 55 (42.6) 52 (42.3) 55 (45.8) 56 (44.8) 49 (39.5)  
    25-29 27 (20.9) 32 (26.0) 23 (19.2) 21 (16.8) 31 (25.0)  
    ≥30 15 (11.6) 2 (1.6) 10 (8.3) 7 (5.6) 9 (7.3)  
    None 7 (5.1) 12 (8.9) 16 (11.8) 14 (10.1) 11 (8.1)  
No. births       
    None 7 (5.1) 12 (8.9) 16 (11.8) 14 (10.1) 11 (8.1) 0.45 
    1-2 51 (37.5) 45 (33.3) 54 (39.7) 34 (24.5) 54 (40.0)  
    3-4 57 (41.9) 56 (41.5) 44 (32.4) 58 (41.7) 50 (37.0)  
    5+ 21 (15.4) 22 (16.3) 22 (16.2) 33 (23.7) 20 (14.8)  

Abbreviation: GED, General Education Development.

*Proportion membership in each of two assumed populations was estimated based on ancestry informative markers and the STRUCTURE program. Characteristics of Hispanics are shown for subgroups, categorized according to quintile of membership in population 1.

P value for association of genetic admixture with ordered categories of row variable, from age- and center-adjusted ordinal logistic regression.

Acculturation category is based on self-report of language spoken and written. Low represents speaking and writing Spanish only or Spanish better than English; moderate, both Spanish and English; and high, English better than Spanish or English only.
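
As a rough check on the trends in Table 4, the sketch below applies a crude Cochran-Armitage test for linear trend to the diabetes counts among Hispanic controls. This is not the authors' method: the published P value comes from an age- and center-adjusted ordinal logistic regression, so the unadjusted result here need not match it. The function name and the equally spaced quintile scores (1 to 5) are assumptions made for illustration.

```python
import math

# Hispanic controls with self-reported diabetes, and all controls, by
# admixture quintile Q1..Q5 (counts copied from Table 4).
diabetes_yes = [10, 20, 15, 24, 20]
n_controls = [138, 135, 137, 139, 135]


def cochran_armitage(events, totals, scores=None):
    """Two-sided Cochran-Armitage test for a linear trend in proportions."""
    if scores is None:
        # Assumption: equally spaced scores 1..K for the ordered quintiles.
        scores = list(range(1, len(events) + 1))
    n = sum(totals)
    p = sum(events) / n  # pooled proportion under the null hypothesis
    # Score-weighted sum of observed minus expected events.
    t = sum(s * (r - m * p) for s, r, m in zip(scores, events, totals))
    s1 = sum(m * s for s, m in zip(scores, totals))
    s2 = sum(m * s * s for s, m in zip(scores, totals))
    var = p * (1 - p) * (s2 - s1 * s1 / n)
    z = t / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


z_stat, p_trend = cochran_armitage(diabetes_yes, n_controls)
print(f"z = {z_stat:.2f}, two-sided P = {p_trend:.3f}")
```

A positive z indicates that diabetes prevalence rises with quintile of admixture, the same direction as the adjusted result reported in the table.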

Breast Cancer Risk and CRRs

The proportions of breast cancer cases and controls were similar across quintiles of admixture among Hispanics (Table 4), ORs were close to one, and there was no evidence of a trend in breast cancer risk with genetic admixture (P = 0.65). Subgroup analyses of premenopausal and postmenopausal women and of estrogen receptor–positive and estrogen receptor–negative breast cancers did not reveal significant associations between genetic admixture and breast cancer risk, nor did addition or removal of variables representing potential confounders from the model. Estimates of admixture from the second, Hispanic-only model were not associated with breast cancer.
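
The case and control counts in Table 4 allow a quick, unadjusted check of these ORs. The sketch below computes crude ORs for quintiles 2 to 5 versus quintile 1 with Wald 95% confidence intervals. The published ORs may come from an adjusted model (an assumption; the Methods are not shown in this section), so the crude values need not equal those in the table. The helper name is hypothetical.

```python
import math

# Case and control counts per admixture quintile Q1..Q5, copied from Table 4.
cases = [99, 124, 121, 95, 116]
controls = [138, 135, 137, 139, 135]


def crude_or_with_ci(a, b, c, d, z=1.96):
    """Crude OR (a/b)/(c/d) with a Wald 95% CI computed on the log scale.

    a, b: cases and controls in the index quintile
    c, d: cases and controls in the referent quintile
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Quintile 1 is the referent, as in Table 4.
crude = [
    crude_or_with_ci(cases[k], controls[k], cases[0], controls[0])
    for k in range(1, 5)
]
for k, (odds_ratio, lower, upper) in enumerate(crude, start=2):
    print(f"Q{k} vs Q1: crude OR = {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Every crude interval spans 1.0, which agrees with the null association reported above.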

The CRRs based on our observed prevalence of candidate SNPs in each quintile of admixture among Hispanics and ORs for breast cancer in each quintile fell between 0.99 and 1.01 (Fig. 2A), indicating almost no confounding due to genetic admixture of the associations between candidate SNPs and breast cancer in this Southwest Hispanic population.
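
To make the CRR concrete, the sketch below uses a common form of the calculation: the crude risk ratio for carriers versus noncarriers divided by the stratum-adjusted risk ratio, in the spirit of refs. 17 and 18. The exact expression used in the Methods is not reproduced in this section, so this form is an assumption. The quintile ORs are taken from Table 4 and treated as relative risks. The carrier frequencies are hypothetical, chosen to span a difference of 0.10 across quintiles.

```python
def confounding_risk_ratio(carrier_freq, risk_ratio, weights=None):
    """CRR = crude risk ratio / stratum-adjusted risk ratio.

    Assumes the variant has the same relative effect in every stratum, so
    that effect cancels and only the confounding component remains.
    carrier_freq: proportion carrying the variant in each stratum
    risk_ratio:   baseline disease risk in each stratum, relative to stratum 1
    weights:      population share of each stratum (default: equal shares)
    """
    k = len(carrier_freq)
    if weights is None:
        weights = [1.0 / k] * k  # quintiles are equal-sized by construction
    carriers = sum(w * f for w, f in zip(weights, carrier_freq))
    noncarriers = sum(w * (1 - f) for w, f in zip(weights, carrier_freq))
    # Average baseline risk experienced by carriers and by noncarriers.
    risk_carriers = sum(
        w * f * r for w, f, r in zip(weights, carrier_freq, risk_ratio)
    ) / carriers
    risk_noncarriers = sum(
        w * (1 - f) * r for w, f, r in zip(weights, carrier_freq, risk_ratio)
    ) / noncarriers
    return risk_carriers / risk_noncarriers


quintile_or = [1.00, 1.17, 1.13, 0.96, 1.14]          # ORs from Table 4
hypothetical_freq = [0.30, 0.325, 0.35, 0.375, 0.40]  # hypothetical, not study data
crr = confounding_risk_ratio(hypothetical_freq, quintile_or)
print(f"CRR = {crr:.3f}")
```

When the carrier frequency is identical in every quintile the function returns 1.0, because admixture is then unrelated to the exposure and cannot confound it.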

Figure 2.

CRR for influence of genetic admixture on associations between candidate gene alleles and breast cancer in Hispanics, calculated from observed data in the Four Corners Study population (A) and from simulated scenarios (B-D). A. CRRs based on observed risk ratios across quintiles of genetic admixture and observed proportion with a minor allele in each quintile, for 11 candidate gene SNPs. Labels for each data point are gene (and SNP) names. B. CRRs based on the observed risk ratios across quintiles of genetic admixture and a simulated more extreme distribution of minor allele frequencies. Labels for each data point indicate the lowest value of frequency of the minor allele used for that calculation. C. CRRs based on a simulated extreme possible range of risk ratios and the observed distribution of candidate SNP allele frequencies. D. CRRs based on simulated extreme distributions of both risk ratios and minor allele frequencies.

For a simulated scenario of a more extreme possible distribution of allele frequencies across genetic admixture, combined with our observed admixture-breast cancer ORs (Fig. 2B), or a more extreme possible distribution of breast cancer risk across genetic admixture, combined with our observed allele frequency distributions (Fig. 2C), the range of CRRs became wider, but all fell within the interval 0.90 to 1.10. Only in scenarios that combined the two extremes of confounder-allele frequency and confounder-breast cancer associations did CRRs fall beyond this range. Our calculations indicate that CRRs fall outside the range of 0.80 to 1.20 if the difference in allele frequencies between quintiles 1 and 5 is large (>0.20) or if the frequency of the minor allele is very small (<0.05) in one quintile of admixture (Fig. 2D).

Discussion

We found that genetic admixture as estimated from 15 AIMs was strongly related to self-reported race and ethnicity and to allele frequencies for candidate gene SNPs in women from the U.S. Southwest. The allele frequencies of candidate gene SNPs in the control population varied in a trend across genetic admixture for every SNP examined, and the significance of trends with genetic admixture for several SNPs persisted after adjustment for race and ethnicity. The trends in SNP allele frequencies with genetic admixture represent an association between the confounder and exposures, one component of confounding, as predicted by those concerned about population stratification and its effects on epidemiologic studies (2, 9, 35). However, an essentially null association between admixture and breast cancer was observed, so that CRRs calculated from these observed data ranged from 0.99 to 1.01, indicating no appreciable confounding of breast cancer risk ratio estimates by genetic admixture in Hispanics. Thus, in this specific example, using genetic admixture measured for a large population, the bias due to genetic admixture or population stratification seems to be too small to be of concern for most exposures, consistent with what others have predicted based on simulations and theory (10, 18, 36).

The values of the admixture estimates in our study population (e.g., Hispanics having a median of 0.63 membership in population 1) should not be compared with those from other investigators who have estimated 30% to 35% Native American heritage for Hispanics in the Southwest (1, 11, 37). Our study does not include an American Indian reference population. The great majority of genotyped individuals in our population were self-reported non-Hispanic Whites or Hispanics. Our estimated admixture values therefore represent relative position within this study population, not estimates of true percent American Indian ancestry.

A potential limitation of our study is that the number of genetic admixture markers genotyped, 15, is smaller than what several authors have proposed be used to estimate proportion ancestry (16, 38, 39). Rosenberg et al. (16) estimated that when using highly informative markers (δ of 0.6, which is the mean for our markers; ref. 19), 9 markers provide a SD of 0.2 and 35 markers provide a SD of 0.1. Another group came to a similar conclusion, where 40 markers with δ of 0.6 would provide a SD of 0.1 (39). These authors emphasize precision of assignment of individual ancestry as a goal. In contrast, Wang et al. (40) asserted that a single marker, if well chosen, would reduce bias to OR estimates. Only a few disease association studies incorporating measurement of genetic admixture in U.S. Hispanics have been carried out to date. The number of markers used by some recent studies has ranged from 6 to 44 (1, 2, 4, 6, 19). A study using 17 markers in a population of >1,000 (4) described strong associations between admixture and diabetes and other characteristics among Hispanics. Bertoni et al. (1), using only six markers, were able to detect distinct differences in proportion of ancestry from European, American Indian, and African heritage for Hispanics from different regions of the United States.
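
The marker counts quoted above are roughly what one would expect if the SD of an individual ancestry estimate shrank with the inverse square root of the number of markers. The sketch below applies that heuristic, anchored at the quoted value of SD 0.2 for 9 markers. The scaling rule is an assumption made for illustration, not a formula stated by the cited authors, and the value it gives for 15 markers is not a result from this study.

```python
import math


def approx_sd(n_markers, ref_markers=9, ref_sd=0.2):
    """Heuristic SD of an individual ancestry estimate.

    Assumes SD scales as 1/sqrt(number of markers) for markers of equal
    informativeness, anchored at the quoted reference point (9 markers, SD 0.2).
    """
    return ref_sd * math.sqrt(ref_markers / n_markers)


for m in (9, 15, 35, 40):
    print(f"{m:>2} markers: approximate SD = {approx_sd(m):.3f}")
```

Under this assumption, 35 and 40 markers both give an SD close to the quoted 0.1, and a 15-marker panel falls between the two reference points.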

We know of no other studies that have evaluated breast cancer in Hispanics by genetic admixture. Practical considerations (i.e., the cost and the amount of DNA needed) limit the number of admixture markers that can be genotyped for large epidemiologic studies. It is possible that studies genotyping larger numbers of AIMs may detect an association between admixture and breast cancer in Hispanics. However, the observations that the admixture estimates from this study were associated with candidate SNP allele frequencies and with body size and diabetes in a direction consistent with prior research (4, 5), but were not associated with breast cancer, suggest to us that the admixture measure has validity but that the true association of admixture with breast cancer among Hispanics is relatively weak. A second set of admixture estimates, based on Hispanics only, seemed to fit the data better in terms of revealing stronger associations with SNPs and body size but also was not associated with breast cancer. Cultural and environmental factors, in addition to genetics, play a role in differences in breast cancer incidence between Hispanics and non-Hispanic Whites (41-43). Among Hispanics, the within-group difference in genetic factors inherited from Europeans and American Indians may have a relatively minor role in determining breast cancer risk.

Another possible limitation affecting the estimates of association between breast cancer and admixture is low participation rates. If factors influencing participation were related to admixture and if these factors differed between Hispanic cases and Hispanic controls, the ORs could be biased. An analysis of census characteristics of communities of residence for individuals selected for the Four Corners Study suggests that characteristics affecting participation were similar for cases and controls (unpublished data). Low participation should not influence our ability to correctly describe associations between admixture and exposures among participating controls.

We addressed the possible limitation that CRRs based on our observed data may have been underestimated by carrying out CRR calculations for simulated scenarios. These estimates place boundaries on the amount of confounding by genetic admixture that might be present. An extreme scenario would be one in which Hispanic women in the lowest quintile of genetic admixture have a breast cancer risk equal to that of non-Hispanic Whites, and women in the highest quintile have a breast cancer risk equal to that of Southwest American Indians, 2.5 times lower. For the allele frequency differences that we observed for 11 candidate gene SNPs, with a maximum of 0.10 across quintiles of admixture, CRRs would remain negligible even with this extreme range of breast cancer risks. CRRs from the simulated scenarios indicate that for candidate genes with allele frequency differences >0.20 across quintiles of admixture, or with very low prevalence (<0.05) in one quintile, investigators should be concerned about bias to candidate gene-breast cancer risk estimates due to admixture.
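
The extreme scenario described above can be sketched numerically. The code below is illustrative only and is not the authors' simulation: it assumes a common form of the CRR (crude over stratum-adjusted risk ratio), equal-sized quintiles, and a linear gradient in both relative risk (1.0 in quintile 1 down to 1/2.5 = 0.4 in quintile 5) and carrier frequency. The function names and the two frequency ranges are hypothetical.

```python
def linear_steps(start, stop, n=5):
    """n equally spaced values from start to stop, one per quintile."""
    return [start + (stop - start) * k / (n - 1) for k in range(n)]


def confounding_risk_ratio(carrier_freq, risk_ratio):
    """Crude / stratum-adjusted risk ratio for equal-sized strata,
    assuming the variant has the same relative effect in every stratum."""
    risk_carriers = sum(f * r for f, r in zip(carrier_freq, risk_ratio)) / sum(
        carrier_freq
    )
    risk_noncarriers = sum(
        (1 - f) * r for f, r in zip(carrier_freq, risk_ratio)
    ) / sum(1 - f for f in carrier_freq)
    return risk_carriers / risk_noncarriers


# Assumed extreme gradient: quintile 5 has 2.5 times lower risk than quintile 1.
extreme_risk = linear_steps(1.0, 1.0 / 2.5)

# Hypothetical carrier frequencies: a difference of 0.10 across quintiles
# (the largest difference reported for the observed SNPs) versus a much
# larger difference starting from a very low frequency in quintile 1.
crr_small_diff = confounding_risk_ratio(linear_steps(0.30, 0.40), extreme_risk)
crr_large_diff = confounding_risk_ratio(linear_steps(0.02, 0.40), extreme_risk)

print(f"difference 0.10: CRR = {crr_small_diff:.2f}")
print(f"difference 0.38: CRR = {crr_large_diff:.2f}")
```

Under these assumptions the CRR moves further from 1.0 as the frequency gradient steepens, which is the qualitative pattern the simulated scenarios describe. The specific thresholds reported in the text depend on the authors' own scenario definitions.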

Allele frequency differences of 0.20 between race and ethnic groups have been observed (44), but for the candidate SNPs genotyped in this study, selected without prior knowledge of allele frequencies in Hispanics or American Indians, most had smaller allele frequency differences between Hispanics and non-Hispanic Whites, the exception being the IL6 gene SNPs. If between-group allele frequency differences of this magnitude are infrequently observed for Hispanics and non-Hispanic Whites, we would expect that candidate genes with within-group allele frequency differences of 0.20 or greater across admixture categories in the Hispanic population can occur but will not be common.

In the Hispanic population, the genetic admixture estimates were associated with language acculturation and with educational attainment, characteristics that are not thought to be genetically determined. These associations persisted when the comparison was limited to women reporting only Hispanic heritage. Chakraborty et al. (4) reported associations between genetic admixture and socioeconomic status for Mexican-Americans in Texas. These associations between genetic admixture and cultural factors likely reflect the diverse backgrounds of Hispanics in the Southwest, who include communities who have resided in New Mexico and southern Colorado since the establishment of Spanish-speaking settlements in the 1690s (45), as well as more recent migrants. Populations within Mexico are recognized to vary substantially in their amount of European heritage, with some isolated groups having almost no European admixture (46) so that groups of Hispanic migrants to the U.S. Southwest also may vary in their proportion of genetic heritage from populations native to the Americas. It is recognized that Hispanics in the U.S. cannot be thought of as a homogeneous group when assessing factors influencing health (1, 47). The role of societal and cultural influences versus genetics in determining health differences between groups defined by self-identified race and ethnicity has been the subject of discussion (48). The data from the Four Corners Study provide an example of the expected complexity (49) present when genetic background and cultural factors vary, and are mutually correlated, within one self-identified group.

AIMs were genotyped in the Four Corners Study population to provide a measure of genetic admixture, which could be used to address potential effect modification and confounding of exposure-breast cancer risk estimates. The present analysis addresses confounding. The absence of confounding does not rule out the possibility of effect modification by genetic admixture. Several studies have reported that the influence of exposures on breast cancer risk differs between non-Hispanic White and Hispanic women (8, 43, 50-52), so it follows that the strength of exposure-disease associations among Hispanics may vary according to the amount of a woman's American Indian versus European genetic heritage. In analyses of exposure-breast cancer associations for the Four Corners Study, the genetic admixture estimates will be used as a stratifying variable to assess effect modification.

Considering the strong association between self-reported race/ethnicity and admixture and the essentially null association within the Hispanic group for admixture and breast cancer risk, it seems that self-reported race and ethnicity will, in most situations, be an adequate measure for adjustment for genetic admixture in studies of breast cancer that include Hispanic populations. Candidate genes with potential for residual confounding can be recognized by large differences in allele frequency between Hispanics and non-Hispanic Whites. Further studies should consider whether application of larger panels of AIMs for Hispanics affects the strength of association of admixture with exposures and with breast cancer and whether the associations that we observed are similar or different in other Hispanic populations.

Grant support: U.S. NIH grants CA 078682, CA 078762, CA 078552, and CA 078802. National Cancer Institute contract #N01-PC-67000, with additional support from the State of Utah Department of Health (Utah Cancer Registry).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

We thank Sandra Edwards, Karen Curtin, Roger Edwards, Leslie Palmer, Betsy Riesendal, Tara Patton, Jason Witter, and Kelly May for their contributions to this study.

1. Bertoni B, Budowle B, Sans M, Barton SA, Chakraborty R. Admixture in Hispanics: distribution of ancestral population contributions in the Continental United States. Hum Biol 2003;75:1–11.
2. Choudhry S, Coyle NE, Tang H, et al. Population stratification confounds genetic association studies among Latinos. Hum Genet 2006;118:652–64.
3. Samet JM, Coultas DB, Howard CA, Skipper BJ, Hanis CL. Diabetes, gallbladder disease, obesity, and hypertension among Hispanics in New Mexico. Am J Epidemiol 1988;128:1302–11.
4. Chakraborty R, Ferrell RE, Stern MP, Haffner SM, Hazuda HP, Rosenthal M. Relationship of prevalence of non-insulin-dependent diabetes mellitus to Amerindian admixture in the Mexican Americans of San Antonio, Texas. Genet Epidemiol 1986;3:435–54.
5. Mitchell BD, Williams-Blangero S, Chakraborty R, et al. A comparison of three methods for assessing Amerindian admixture in Mexican Americans. Ethn Dis 1993;3:22–31.
6. Parra EJ, Hoggart CJ, Bonilla C, et al. Relation of type 2 diabetes to individual admixture and candidate gene polymorphisms in the Hispanic American population of San Luis Valley, Colorado. J Med Genet 2004;41:e116.
7. National Cancer Institute DCCPS Surveillance Research Program Cancer Statistics Branch. Surveillance, Epidemiology, and End Results (SEER) Program. SEER*Stat Databases: Incidence—SEER 13 Regs Public-Use. 2006. Available from: http://www.seer.cancer.gov.
8. Slattery ML, Sweeney C, Edwards S, et al. Body size, weight change, fat distribution and breast cancer risk in Hispanic and non-Hispanic white women. Breast Cancer Research and Treatment. In press 2006.
9. Thomas DC, Witte JS. Point: population stratification: a problem for case-control studies of candidate-gene associations? Cancer Epidemiol Biomarkers Prev 2002;11:505–12.
10. Wacholder S, Rothman N, Caporaso N. Counterpoint: bias from population stratification is not a major threat to the validity of conclusions from epidemiological studies of common polymorphisms and cancer. Cancer Epidemiol Biomarkers Prev 2002;11:513–20.
11. Collins-Schramm HE, Phillips CM, Operario DJ, et al. Ethnic-difference markers for use in mapping by admixture linkage disequilibrium. Am J Hum Genet 2002;70:737–50.
12. Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 2003;164:1567–87.
13. Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics 2000;155:945–59.
14. Pritchard JK, Rosenberg NA. Use of unlinked genetic markers to detect population stratification in association studies. Am J Hum Genet 1999;65:220–8.
15. Pritchard JK, Donnelly P. Case-control studies of association in structured or admixed populations. Theor Popul Biol 2001;60:227–37.
16. Rosenberg NA, Li LM, Ward R, Pritchard JK. Informativeness of genetic markers for inference of ancestry. Am J Hum Genet 2003;73:1402–22.
17. Miettinen OS. Components of the crude risk ratio. Am J Epidemiol 1972;96:168–72.
18. Wacholder S, Rothman N, Caporaso N. Population stratification in epidemiologic studies of common genetic variants and cancer: quantification of bias. J Natl Cancer Inst 2000;92:1151–8.
19. Collins-Schramm HE, Chima B, Morii T, et al. Mexican American ancestry-informative markers: examination of population structure and marker characteristics in European Americans, Mexican Americans, Amerindians, and Asians. Hum Genet 2004;114:263–71.
20. Rogers A, Murtaugh MA, Edwards S, Slattery ML. Contacting controls: are we working harder for similar response rates, and does it make a difference? Am J Epidemiol 2004;160:85–90.
21. Slattery ML, Sweeney C, Edwards S, et al. Physical activity patterns and obesity in Hispanic and non-Hispanic white women. Med Sci Sports Exerc 2006;38:33–41.
22. Buechley RW. Generally Useful Ethnic Search System: GUESS. Cancer Research and Treatment Center. Albuquerque (NM): University of New Mexico; 1976.
23. Howard CA, Samet JM, Buechley RW, Schrag SD, Key CR. Survey research in New Mexico Hispanics: some methodological issues. Am J Epidemiol 1983;117:27–34.
24. Elbein SC, Hoffman M, Barrett K, et al. Role of the β3-adrenergic receptor locus in obesity and noninsulin-dependent diabetes among members of Caucasian families with a diabetic sibling pair. J Clin Endocrinol Metab 1996;81:4422–7.
25. Slattery ML, Sweeney C, Murtaugh M, et al. Associations between ERα, ERβ, and AR genotypes and colon and rectal cancer. Cancer Epidemiol Biomarkers Prev 2005;14:2936–42.
26. van Meurs JB, Schuit SC, Weel AE, et al. Association of 5′ estrogen receptor α gene polymorphisms with bone mineral density, vertebral bone area, and fracture risk. Hum Mol Genet 2003;12:1745–54.
27. Deal C, Ma J, Wilkin F, et al. Novel promoter polymorphism in insulin-like growth factor-binding protein-3: correlation with serum levels and interaction with known regulators. J Clin Endocrinol Metab 2001;86:1274–80.
28. Almind K, Bjorbaek C, Vestergaard H, Hansen T, Echwald S, Pedersen O. Aminoacid polymorphisms of insulin receptor substrate-1 in non-insulin-dependent diabetes mellitus. Lancet 1993;342:828–32.
29. Ehrmann DA, Tang X, Yoshiuchi I, Cox NJ, Bell GI. Relationship of insulin receptor substrate-1 and -2 genotypes to phenotypic features of polycystic ovary syndrome. J Clin Endocrinol Metab 2002;87:4297–300.
30. Slattery ML, Samowitz W, Curtin K, et al. Associations among IRS1, IRS2, IGF1, and IGFBP3 genetic polymorphisms and colorectal cancer. Cancer Epidemiol Biomarkers Prev 2004;13:1206–14.
31. Hardy DO, Carino C, Catterall JF, Larrea F. Molecular characterization of a genetic variant of the steroid hormone-binding globulin gene in heterozygous subjects. J Clin Endocrinol Metab 1995;80:1253–6.
32. McClure L, Eccleshall TR, Gross C, et al. Vitamin D receptor polymorphisms, bone mineral density, and bone metabolism in postmenopausal Mexican-American women. J Bone Miner Res 1997;12:234–40.
33. Harris SS, Eccleshall TR, Gross C, Dawson-Hughes B, Feldman D. The vitamin D receptor start codon polymorphism (FokI) and bone mineral density in premenopausal American black and white women. J Bone Miner Res 1997;12:1043–8.
34. Slattery ML, Yakumo K, Hoffman M, Neuhausen S. Variants of the VDR gene and risk of colon cancer (United States). Cancer Causes Control 2001;12:359–64.
35. Heiman GA, Hodge SE, Gorroochurn P, Zhang J, Greenberg DA. Effect of population stratification on case-control association studies. I. Elevation in false positive rates and comparison to confounding risk ratios (a simulation study). Hum Hered 2004;58:30–9.
36. Wang Y, Localio R, Rebbeck TR. Evaluating bias due to population stratification in case-control association studies of admixed populations. Genet Epidemiol 2004;27:14–20.
37. Bonilla C, Parra EJ, Pfaff CL, et al. Admixture in the Hispanics of the San Luis Valley, Colorado, and its implications for complex trait gene mapping. Ann Hum Genet 2004;68:139–53.
38. Tsai HJ, Choudhry S, Naqvi M, Rodriguez-Cintron W, Burchard EG, Ziv E. Comparison of three methods to estimate genetic ancestry and control for stratification in genetic association studies among admixed populations. Hum Genet 2005;118:424–33.
39. McKeigue PM, Carpenter JR, Parra EJ, Shriver MD. Estimation of admixture and detection of linkage in admixed populations by a Bayesian approach: application to African-American populations. Ann Hum Genet 2000;64:171–86.
40. Wang Y, Localio R, Rebbeck TR. Bias correction with a single null marker for population stratification in candidate gene association studies. Hum Hered 2005;59:165–75.
41. John EM, Phipps AI, Davis A, Koo J. Migration history, acculturation, and breast cancer risk in Hispanic women. Cancer Epidemiol Biomarkers Prev 2005;14:2905–13.
42. Chlebowski RT, Chen Z, Anderson GL, et al. Ethnicity and breast cancer: factors influencing differences in incidence and outcome. J Natl Cancer Inst 2005;97:439–48.
43. Gilliland FD, Hunt WC, Baumgartner KB, et al. Reproductive risk factors for breast cancer in Hispanic and non-Hispanic white women: the New Mexico Women's Health Study. Am J Epidemiol 1998;148:683–92.
44. Goddard KA, Hopkins PJ, Hall JM, Witte JS. Linkage disequilibrium and allele-frequency distributions for 114 single-nucleotide polymorphisms in five populations. Am J Hum Genet 2000;66:216–34.
45. Gonzalez NL. The Spanish-Americans of New Mexico: A heritage of pride. Albuquerque (NM): University of New Mexico Press; 1969.
46. Bonilla C, Gutierrez G, Parra EJ, Kline C, Shriver MD. Admixture analysis of a rural population of the state of Guerrero, Mexico. Am J Phys Anthropol 2005;128:861–9.
47. Lara M, Gamboa C, Kahramanian MI, Morales LS, Bautista DE. Acculturation and Latino health in the United States: a review of the literature and its sociopolitical context. Annu Rev Public Health 2005;26:367–97.
48. Rebbeck TR, Sankar P. Ethnicity, ancestry, and race in molecular epidemiologic research. Cancer Epidemiol Biomarkers Prev 2005;14:2467–71.
49. Risch N, Burchard E, Ziv E, Tang H. Categorization of humans in biomedical research: genes, race, and disease. Genome Biol 2002;3:comment2007.1–comment2007.12.
50. Wenten M, Gilliland FD, Baumgartner K, Samet JM. Associations of weight, weight change, and body mass with breast cancer risk in Hispanic and non-Hispanic white women. Ann Epidemiol 2002;12:435–4.
51. Li R, Gilliland FD, Baumgartner KB, Samet J. Family history and risk of breast cancer in Hispanic and non-Hispanic women: the New Mexico Women's Health Study. Cancer Causes Control 2001;12:747–53.
52. Baumgartner KB, Hunt WC, Baumgartner RN, et al. Association of body composition and weight history with breast cancer prognostic markers: divergent pattern for Hispanic and non-Hispanic White women. Am J Epidemiol 2004;160:1087–97.