To the Editors: Of course we agree with Kristal et al. (1) that validity is an issue of degree rather than a dichotomy and that multiple methods will usually be useful when evaluating validity because a perfect standard is rarely available. However, their examples based on face validity, construct validity, and predictive validity do not provide evidence that food frequency questionnaires should be abandoned.
Face validity may have some value when data are not available, but their example based on a single item from our questionnaire (frequency of consuming “beef, pork, or lamb as a sandwich or mixed dish, such as, stew, casserole, lasagna, etc.”) shows how this subjective assessment can be abused. Certainly, we would never draw any conclusions from this question alone. However, when used in combination with our main meat question (frequency of consumption of beef, pork, or lamb as a main dish, such as a steak or roast) and four other questions on red meat, this question helps identify individuals with very different intakes of red meat and, in conjunction with other questions, very different intakes of saturated fat and cholesterol. The item on meat in a sandwich or mixed dish (which we consider a “cleaner question”) reduces the inclusion of persons who rarely use beef, pork, or lamb as a main dish in a low red meat category if they frequently use these foods in other circumstances. Contrary to their implications, we do not use this question to estimate intakes of vegetables, cheese, or fats in food preparation, which are assessed by other items. Kristal et al. argue that the face validity of actual intakes as assessed by diet records is far greater. However, highly precise details about the constituents of a mixed dish are not useful if this dish is eaten only occasionally or perhaps never again. Thus, the value of this information as a measure of long-term diet is questionable. If many days of diet records are collected, we agree that this would be a superior method of dietary assessment, but even carefully weighed 7-day diet records provide intraclass correlations on the order of 0.6 for nutrient intakes (2), indicating substantial error from within-person variation alone.
Surprisingly, Kristal et al. ignore the use of weighed diet records for assessing construct validity even as they argue that these provide a superior method of dietary assessment. Although they are not perfect for this purpose, with an adequate number of days and statistical adjustments for day-to-day variation, they can provide useful estimates of validity for food frequency questionnaires (FFQs) because the cognitive processes involved are very different, which minimizes correlated errors. As we noted earlier (3), by using the means of three FFQs over a 6-year period, correlations with energy-adjusted intakes from multiple diet records can reach 0.7 to 0.9. The findings from the Observing Protein and Energy Nutrition study (4) cited by Kristal et al. addressed intake of only one energy-adjusted nutrient, protein, and, as they acknowledge, the findings need to be interpreted in light of the within-person error of the biomarker, which was not assessed in the Observing Protein and Energy Nutrition study. Contrary to their “conservative” assumption of 0.7 for the within-person correlation of the biomarker, results presented by Neuhouser et al. (5) based on findings from the Women's Health Initiative indicated that the within-person correlation for energy-adjusted or absolute protein intake assessed by repeated urinary nitrogen and doubly labeled water measurements was less than 0.4. This will result in a serious underestimation of the correlation with dietary intake and a serious overestimation of correlated error. As a secondary note, we disagree with the use of shared variance as a measure of validity; as pointed out by Olli Miettinen, variance has no biological meaning (think of the biological meaning of (grams of fat)²/(energy)²); simple correlation coefficients provide more useful information on a biologically interpretable scale.
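To make the effect of the biomarker's within-person error concrete, Spearman's classic correction for attenuation can be sketched as follows; the observed correlation used here is a hypothetical value, chosen only to show how much the assumed biomarker reliability (0.7 versus the less than 0.4 reported by Neuhouser et al.) changes the corrected estimate.

```python
import math

def deattenuate(r_obs: float, reliability: float) -> float:
    """Spearman's correction for attenuation: estimate the correlation
    that would be observed if the reference measure (here, the biomarker)
    had no within-person error. `reliability` is the within-person
    (replicate-to-replicate) correlation of the biomarker."""
    return r_obs / math.sqrt(reliability)

# Hypothetical observed FFQ-biomarker correlation, for illustration only
r_obs = 0.45

print(round(deattenuate(r_obs, 0.7), 2))  # assuming reliability 0.7 -> 0.54
print(round(deattenuate(r_obs, 0.4), 2))  # assuming reliability 0.4 -> 0.71
```

The lower the biomarker's reliability, the larger the correction; assuming 0.7 when the true value is below 0.4 therefore understates the FFQ's correlation with true intake.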
We agree with Kristal et al. that predictive validity has value, but the use of fat intake and breast cancer incidence as a criterion is highly questionable, as this is far from an established relationship. Although they say that the Women's Health Initiative results for the low-fat diet (relative risk, 0.91; 95% confidence interval, 0.83-1.01) were compatible with the expected effect given less-than-expected compliance, the results are also compatible with the expected effect of the modest weight loss experienced in the trial (6) or with no effect at all. If the findings of the observational studies by Bingham et al. (7) or Freedman et al. (8) based on the Women's Health Initiative control group were adjusted for the known attenuation due to using just 4 days of diet records, their results would have been quite inconsistent with the Women's Health Initiative low-fat trial. In the Freedman et al. (8) report cited by Kristal et al., relative risks of breast cancer for the highest versus lowest quintiles of total fat intake were 2.09 (95% confidence interval, 1.31-3.61) using the food record and 1.71 (95% confidence interval, 0.70-4.18) using the FFQ. The 95% confidence interval for the FFQ was extremely wide because the 42% of the population with the lowest fat intake as assessed by the FFQ was excluded from follow-up, greatly reducing the statistical power; the two relative risks were neither very different nor statistically distinguishable. Although the authors claim that their findings were consistent with those of Bingham et al., in the small Bingham study the association with breast cancer appeared to be with saturated fat, and in the Freedman study with unsaturated fat. These would represent very different dietary patterns, and the most likely explanation, given the extreme differences from a vastly larger literature using other dietary assessments, is chance.
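The size of the attenuation from using only 4 days of diet records can be sketched with the standard regression-dilution calculation; the one-day intraclass correlation below is a hypothetical value for illustration, not an estimate from any of the cited studies.

```python
import math

def attenuation_factor(icc_one_day: float, n_days: int) -> float:
    """Regression-dilution factor for the mean of n_days of records,
    given the one-day intraclass correlation (between-person variance
    as a fraction of total variance)."""
    return n_days * icc_one_day / (1 + (n_days - 1) * icc_one_day)

def deattenuate_rr(rr_obs: float, lam: float) -> float:
    """Correct an observed relative risk for attenuation factor lam,
    using log(RR_true) = log(RR_obs) / lam."""
    return math.exp(math.log(rr_obs) / lam)

# Hypothetical one-day intraclass correlation of 0.35 for fat intake,
# with 4 days of records as in Freedman et al.
lam = attenuation_factor(0.35, 4)    # attenuation factor, about 0.68
rr_true = deattenuate_rr(2.09, lam)  # 2.09 is the observed food-record RR
print(round(lam, 2), round(rr_true, 2))
```

Under these illustrative assumptions, the corrected relative risk is near 3, far from the Women's Health Initiative trial result of 0.91, which is the sense in which adjustment for attenuation would make the observational and trial findings quite inconsistent.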
Notably, animal studies of breast carcinogenesis designed to distinguish between the effects of energy balance and dietary fat have not supported an effect of dietary fat (9), and no compelling mechanism has been shown for the hypothesized effect. Far more direct and credible criteria for predictive validity include incidence of coronary heart disease and type 2 diabetes, for which many lines of evidence, including well-established biological mechanisms, support causal effects of specific dietary factors, as well as biomarkers of nutrient intake. As we noted earlier (3), dietary intakes predicted by FFQs and by diet records tend to be similarly correlated with concentrations of these biomarkers if an appropriate temporal relation is considered. In a recent presentation (10), sucrose intake estimated by FFQ was more strongly correlated with a new biomarker for sucrose than was intake estimated by diet records.
Using the criteria for validity espoused by Kristal et al., FFQs appear to perform well, and comparably with 4 to 7 days of diet records, for intakes of most energy-adjusted nutrients, which are the most relevant intakes for epidemiologic studies (11). In deciding among options for dietary assessment, we would add that the ability to assess intakes of foods as well as nutrients is highly desirable for a full understanding of disease relationships, and diet records do relatively less well for foods because of greater day-to-day variability (12). In addition, the ability to collect repeated measurements over time is important because the food supply and the diets of individuals are constantly evolving; here the FFQ has major advantages because of its low cost and low burden on participants. Nutritional epidemiologists need a variety of tools in their chest of dietary assessment methodologies, including biochemical indicators and short-term methods. Carefully developed food frequency questionnaires have a proven record of construct and predictive validity; Kristal and Potter provide no good evidence why they should be abandoned, and they provide no superior alternative.