Abstract
The development of molecular pathologic components in epidemiologic studies offers opportunities to relate etiologic factors to specific tumor types, which in turn may allow the development of better overall risk prediction and provide clues about mechanisms that mediate risk factors. In addition, this research may help identify or validate tissue biomarkers related to prognosis and prediction of treatment responses. In this mini review, we highlight specific considerations related to the incorporation of pathology in epidemiologic studies, using breast cancer research as a model. Issues related to ensuring the representativeness of cases for which research tissue is available and understanding limitations resulting from variable procedures for tissue collection, fixation, and processing are discussed. The growing importance of molecular pathology in clinical medicine has led to increased emphasis on optimized tissue preparation, which should enhance this type of research. In addition, the availability of new technologies including tissue microarrays, image scanning, and automated analysis to achieve high-throughput standardized assessment of immunohistochemical markers, and potentially other assays, is enabling consistent scoring of a growing list of markers in large studies. Concurrently, methodologic research to extend the range of assays that can be done on fixed tissues is expanding possibilities for molecular pathologic studies in epidemiologic research. Cancer Epidemiol Biomarkers Prev; 19(4); 966–72. ©2010 AACR.
Background
Large epidemiologic studies offer important opportunities to relate etiologic exposures to cancer subtypes defined by molecular pathologic characteristics. Such investigations may identify specific relationships between risk factors and tumor subtypes; enable improved overall risk prediction; reveal mechanisms that mediate risk factors; and discover and validate biomarkers for early detection, prognosis, or prediction of treatment responses (1). Analyses of cancer incidence and mortality rates stratified by tumor subtypes may provide etiologic clues, highlight racial/ethnic disparities, and improve surveillance (2-4). The promise of this research is potentiated by a National Cancer Institute–sponsored initiative to optimize tissue collection and storage through the Biorepository Research Network (5). Despite the enthusiasm for using molecular pathology in epidemiologic studies, this approach presents some specific challenges. In this mini review, we discuss key principles related to adding pathology components to epidemiologic studies, drawing on examples from breast cancer research.
An overview of steps involved in developing the pathology components in epidemiologic studies is presented in Fig. 1. As described below, internal and external validity and quality assurance are important considerations throughout the process and technical improvements are anticipated to meet growing demands for use of molecular assays in clinical medicine.
Figure 1. The set of patients for whom research tissues are available in epidemiologic studies reflects multiple factors related to diagnosis and treatment. These practices may vary between populations, geographic regions, and calendar periods, and by patient and tumor features. Practices of tissue sampling and preparation for diagnosis may vary with the type of procedure used to collect the sample, the clinical context, and institutional protocols. All of the above may affect assay performance. In large studies, efforts to standardize, optimize, and automate performance and scoring of assays are important to control costs, speed completion of studies, and minimize variability. Computer infrastructure is important throughout the “cycle” to provide tracking and data management and to facilitate downstream statistical analysis.
Availability of Tissue: Ensuring Representativeness of Samples
Ensuring the representativeness of tissue specimens collected in epidemiologic investigations presents concerns that differ somewhat from those of biological specimens that researchers obtain directly from patients. Specifically, differential patterns of screening, diagnosis, and treatment may be related to both patient and tumor characteristics and may affect the associations between the two. For example, in breast cancer, mammographically detected tumors differ from cancers that are either missed or found without screening (6, 7). Radiologically guided biopsies done in outpatient facilities may remove small tumors (8), potentially skewing the availability of tissues in hospital-based studies toward larger cancers. Digital mammography is more sensitive than film-based techniques among women with dense breasts (9), and high density is related to younger age, specific exposures, higher growth fractions, and interval cancers (10). Similarly, use of magnetic resonance imaging for screening high-risk women may lead to diagnosis at younger ages and detection of smaller tumors while paradoxically showing more frequent multicentricity (11).
Access to tissue is related to tumor size, which varies by population, risk factors, and subtype. Use of neoadjuvant therapy, either by necessity to render large tumors operable or by preference, may limit availability of nontreated specimens (12), although it provides opportunities to analyze risk factors for breast cancer phenotypes defined by a combination of marker expression and therapeutic response. Some tumors detected by screening may be too small to use for research. Both patient characteristics (e.g., younger age, premenopausal status, and obesity) and tumor pathology (e.g., higher grade and marker expression) may be related to larger tumor size, suggesting the need to consider pathologic factors in risk analyses (13-15).
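One way to examine the concern about representativeness is to tabulate the proportion of cases with research tissue within strata such as tumor size or mode of detection. The following is a minimal sketch under stated assumptions: the record layout, the field names (`has_tissue`, `size`), and the function name are hypothetical and are not taken from any study described here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def tissue_availability(
    cases: Iterable[Mapping[str, object]], stratum_field: str
) -> Dict[str, float]:
    """Return the proportion of cases with research tissue in each stratum.

    Each case is a mapping with a boolean "has_tissue" entry (hypothetical
    field name) and a stratum value stored under `stratum_field`.
    """
    with_tissue: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for case in cases:
        stratum = str(case[stratum_field])
        total[stratum] += 1
        if case["has_tissue"]:
            with_tissue[stratum] += 1
    return {stratum: with_tissue[stratum] / total[stratum] for stratum in total}


if __name__ == "__main__":
    # Made-up records for illustration only.
    example_cases = [
        {"size": "<=2cm", "has_tissue": False},
        {"size": "<=2cm", "has_tissue": True},
        {"size": ">2cm", "has_tissue": True},
        {"size": ">2cm", "has_tissue": True},
    ]
    print(tissue_availability(example_cases, "size"))
```

Marked differences in availability across strata would suggest that cases with tissue are not representative of all cases and that pathologic factors may need to be considered in the analysis.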
Effect of Study Design
Prospective studies may permit collection of research tissues according to protocols, but diagnostic requirements and batch processing for histology may be limiting. Tissue that is not needed for diagnosis, such as fat, normal tissue, or excess tumor, may provide research opportunities but requires collaborations with pathology laboratories, especially if frozen tissue is sought. Developing a “virtual tumor bank” of fixed tissue may ensure later availability if prospective assembly is impossible. Retrospective studies may permit efficient targeted collection of tissues related to known outcomes, enabling the inclusion of pathology in “nested” retrospective case-control studies assessing questions such as risk of progression of cancer precursors (16). The Surveillance, Epidemiology and End Results program has established Residual Tissue Repositories that can provide access to discarded blocks that are linked to registry data, which can potentially permit the evaluation of population-based rates by molecular characteristics (17); however, retrieval of representative tumors from a high percentage of cases is required to impute data for missing tumors. In retrospective studies, access, retrieval, variable routine processing, and degradation of tissues over time present challenges.
Tissue Preservation and Fixation
Optimized tissue preservation and fixation are critical for tissue-based research; unfortunately, factors that affect tissue quality, such as prior biopsy; intraoperative hypoxia and ischemia; time to fixation; type, concentration, and adequacy of fixative; and method of paraffin block preparation, are poorly standardized and may introduce unanticipated variation (18). Collection of fresh or frozen tissues averts some of these difficulties but is expensive and infeasible in most epidemiologic studies.
Fortunately, optimizing tissue fixation to maximize its utility is a major area of research (19-22); however, unless improved practices are widely adopted, such strategies will not be available for routine diagnostic specimens or archived tissues. Advances will allow improvements in immunohistochemistry and potentially wider application of proteomics, spectroscopy, and DNA and RNA assays (19-22). Given that histology laboratories use batch processing, the pace of clinical advances may be rate limiting for epidemiologic research. Fixation in neutral buffered formalin is anticipated to remain the gold standard because it provides morphologic appearances familiar to diagnosticians.
Assembling Tissues for Research
Most pathology reports specify blocks and their corresponding slides using designations linked to the diagnostic impression of the tissue at dissection. However, microscopic examination is needed to confirm the presence of histologic targets of interest and localize them within blocks. Off-site storage may increase time and costs for retrieval. Older tissues have declining value for some assays (23) and are frequently discarded after a specified time period, whereas access to newer blocks may be limited by clinical demands. Applying bar code labels is helpful for tracking, but labels often adhere poorly to blocks, which has prompted some laboratories to embed radio tags into blocks (24).
Tissue Microarrays for High-Throughput Analysis
Tissue microarray (TMA) construction has been reviewed elsewhere (24-28). Briefly, TMAs are constructed by removing microscopically identified tissue targets from routinely prepared tissue (“donor”) blocks and transferring them into a new (“recipient”) TMA block in preconfigured arrangements. Sections of the resulting TMA contain cores of many tumors, enabling rapid, cost-effective, high-throughput performance of assays (Fig. 2). Selected concerns specific to molecular epidemiology are discussed below.
Figure 2. TMAs facilitate rapid, cost-effective high-throughput analysis of large tumor sets. Target areas within individual donor blocks are identified by examining corresponding H&E-stained tissue sections. Cylindrical cores are removed with semiautomated devices from each donor block and arrayed in a precise matrix in the recipient block to create the TMA. Assays done on single sections cut from TMA blocks enable the batched analysis of multiple tumors. Reprinted with permission: Kallioniemi O-P et al., TMA technology for high-throughput molecular profiling of cancer, Human Molecular Genetics, 2001, volume 10, issue 7: 657-62 (27).
Designing TMAs
Optimal TMA design varies by project aims and tumor site. Considerations include the following: (a) numbers of cores per target, (b) core diameter (typically 0.6 or 1.0 mm), (c) placement of replicate cores in single or multiple blocks, (d) total number of cores per block, (e) core arrangement, and (f) segregation of cores from tumors with specified characteristics in dedicated blocks (Table 1).
Table 1. TMAs: selected design considerations
Design consideration | Pros | Cons |
---|---|---|
Increased cores/sample | Reduces missing data; captures heterogeneity | Increases effort |
Larger core diameter | Better target representation | Fewer cores/TMA |
Replicate cores separated in different TMAs | Protects material if one TMA block lost | Need to ensure consistency |
More cores/TMA | Reduces assay costs | Mapping more difficult |
Grouping cores within TMA blocks by subject or tumor characteristics | Conserves material for relevant assays | Need to ensure consistency |
Redundant sampling reduces missing data per subject secondary to loss of cores, resulting from tumor depletion in deeper sections, nonadherence of sections, or loss during vigorous antigen retrieval. However, high-level redundancy increases time and cost of preparation and is impossible for small targets. Theoretically, replicate sampling should improve scoring accuracy by accounting for heterogeneity in marker expression. However, as has been shown for estrogen receptor (ER), there is probably more variation in expression between than within blocks because sampling more distant topographical regions is more likely to capture biological heterogeneity (e.g., hypoxic tumor center versus growing edge; ref. 29). In the end, useful markers must be robust to sampling variability in research and clinical applications.
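The trade-off between redundancy and effort can be illustrated with a simple calculation. The sketch below assumes, purely for illustration, that each core is lost independently and with the same probability. Because losses such as tumor depletion in deeper sections are likely to be correlated within a tumor, the resulting figures are optimistic. The function name and the 20% loss rate are hypothetical.

```python
def prob_any_evaluable(core_loss_rate: float, n_cores: int) -> float:
    """Probability that at least one of n_cores is evaluable.

    Assumes independent, identically distributed core losses, which is a
    simplifying assumption and not a claim about real TMAs.
    """
    if not 0.0 <= core_loss_rate <= 1.0:
        raise ValueError("core_loss_rate must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    # All cores are lost with probability loss_rate ** n_cores.
    return 1.0 - core_loss_rate ** n_cores


if __name__ == "__main__":
    # Illustrative loss rate of 20% per core.
    for n in (1, 2, 3, 4):
        print(n, round(prob_any_evaluable(0.2, n), 4))
```

Under these assumptions, the gain from each additional core shrinks quickly, which is consistent with the point that high-level redundancy adds time and cost for diminishing benefit.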
Using larger cores facilitates histologic characterization and minimizes missing results, but as with multiplicity of cores, target size and efficiency are limiting. For tumors with low cellularity (e.g., lobular carcinoma), larger cores may improve representation throughout the length of cores. TMA cores are often targeted to cellular tumor areas, which theoretically may bias marker levels (e.g., proliferation or cell cycling). Caution may be required when using larger needles to avoid damaging blocks.
In our experience, placing replicate cores per tumor in one TMA block may offer modest advantages for standardization, particularly for immunofluorescence assays (30). For chromogenic assays, batch staining is probably sufficient, provided that sections are cut and stored consistently. Placing cores from the same specimen into different blocks prevents the loss of all cores from a subject if a block is lost or damaged and ultimately permits more potential assays per tumor.
Many TMAs contain hundreds of cores, permitting representation of large collections in a few blocks. Preparation of high-density TMAs raises the risk that damage or errors in preparation will affect many cases and poses greater challenges for mapping cores. Constructing TMA blocks by key characteristics (e.g., ER) may be advantageous by allowing preservation of tissues for relevant assays only.
Increasingly, stained TMA sections are scanned as digital images that can be scored visually or by computer algorithms. Central placement of sections on slides is essential to ensure quality scanning and optimal staining when using automated immunostainers. Asymmetrical TMA designs or placement of sentinel cores outside the main grid facilitate orientation. Use of a single scanner and maintaining a consistent orientation is also helpful.
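Several of the design choices above (replicate cores placed in different blocks, a fixed grid, and an orientation marker outside the grid) can be captured in a core map that links each grid position to a subject. The following is a minimal sketch; the names `CorePosition`, `build_tma_map`, `invert_map`, and `SENTINEL` are hypothetical, and the layout rule (one series of blocks per replicate) is only one of many possible arrangements.

```python
from dataclasses import dataclass
from math import ceil
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class CorePosition:
    block: int  # TMA block number (0-based)
    row: int    # row within the grid (0-based)
    col: int    # column within the grid (0-based)


# Hypothetical orientation marker: one core placed a row above the grid at
# the first column, so that a rotated section is recognizable.
SENTINEL = (-1, 0)


def build_tma_map(
    subject_ids: Sequence[str], n_rows: int, n_cols: int, replicates: int = 2
) -> Dict[str, List[CorePosition]]:
    """Assign each subject one core per replicate, each in a different block."""
    if n_rows < 1 or n_cols < 1 or replicates < 1:
        raise ValueError("n_rows, n_cols, and replicates must be positive")
    if len(set(subject_ids)) != len(subject_ids):
        raise ValueError("subject IDs must be unique")
    capacity = n_rows * n_cols
    blocks_per_series = ceil(len(subject_ids) / capacity)
    layout: Dict[str, List[CorePosition]] = {}
    for index, subject in enumerate(subject_ids):
        block_in_series, slot = divmod(index, capacity)
        row, col = divmod(slot, n_cols)
        # Replicate r goes to the r-th series of blocks, same grid position.
        layout[subject] = [
            CorePosition(rep * blocks_per_series + block_in_series, row, col)
            for rep in range(replicates)
        ]
    return layout


def invert_map(layout: Dict[str, List[CorePosition]]) -> Dict[CorePosition, str]:
    """Map each core position back to its subject for scoring-time lookups."""
    return {pos: subject for subject, positions in layout.items() for pos in positions}
```

Keeping the map in a machine-readable form makes it easier to verify subject links after scanning, when scores are recorded by block, row, and column.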
Tissue Cores for Molecular Studies
Loose TMA cores provide an enriched source of target tissue for molecular extraction and banking. Punching blank paraffin in between targets is probably sufficient to avert cross-contamination for DNA assays and possibly others (31); however, we recommend saving and testing blank cores to exclude carryover. The availability of disposable TMA needles presents the opportunity to receive loose tissue cores for molecular extraction without borrowing blocks. Techniques to build TMAs using loose cores as starting materials have not been widely applied.
Test Arrays and Cores
Preparing test arrays from representative specimens is useful for pilot testing. Inclusion of tissue cores from well-characterized tissues or cell lines in TMAs may allow the evaluation of assay procedures but does not address concerns about preservation or fixation of individual samples. The development of sentinel markers of tissue preservation/fixation for breast tissue, akin to the suggested use of p27 for prostate specimens, would be useful, allowing either comparison of each marker to an internal standard or discounting of poor quality results (32).
Sectioning and Staining TMAs
Minimizing the number of times a block is sectioned conserves tissue. However, given that cut sections suffer degradation, manifested as loss of antigenicity by immunohistochemistry during storage (33-36), it is desirable to limit the number of sections to those needed for immediate use. Therefore, cutting and staining in batches is recommended. The mechanism of degradation of cut sections remains unknown. Paraffin dipping, oxygen-free storage, and cooling may slow degradation but do not completely eliminate the effect for all tissues and markers (33-35).
Cutting TMA blocks and mounting sections on slides is challenging; each core is potentially unique, so folds and irregularities result in lost data. Some laboratories favor the use of tape transfer methods to maintain core alignment, whereas others prefer standard techniques, which limit background. Use of coated slides that enhance section adherence helps prevent loss of cores with antigen retrieval for immunohistochemistry and nucleic acid denaturation for in situ hybridization. Standardized sectioning is important for accurate semiquantitative or quantitative analyses. Given that typical sections are 4 to 5 microns thick, slight absolute differences in thickness translate into large percentage differences; for example, a 0.5-micron deviation in a 4-micron section is a 12.5% difference. All scoring methods reflect light penetration, which is affected by tissue thickness.
Scoring
Microscopic scoring of TMAs is laborious and has largely been replaced by the use of digitized images for visual or automated analysis. Technologies to improve the speed, automation, magnification, and resolution of scanning methods and to increase image storage, enhance analysis, and create flexible databases are under continuous development. In particular, the computer infrastructure required for this work is complex.
Quality assurance procedures to establish TMA orientation, core alignment, and correct subject links are critical. Visual inspection of TMAs may reveal missing, misaligned, or obscured cores. Computer algorithms to assess cores for area of tissue coverage or cell number may identify poor quality spots.
TMA data should be analyzed across rows and columns to exclude unequal staining. Highly disparate results for replicate cores per subject or differences in marker frequency distributions by study center should prompt review. Evaluation of expected associations is also useful. For example, in breast cancer, higher ER levels should be related to older patient age, postmenopausal status, lobular or tubular histology, and low tumor grade. Similarly, a high frequency of coexpression of ER and progesterone receptor (PR) is expected. In general, percentage of cells and staining intensity are strongly related for most markers. Exploring results for correlative assays for cases tested by different methods [e.g., amplification of the human epidermal growth factor receptor 2 (HER2) gene and immunohistochemical analysis of protein] may substantiate quality. Similarly, highly correlated mRNA and immunohistochemical levels are encouraging, but this is only expected for some markers (e.g., ER, PR), reflecting both biological and technical factors. Demonstrating expected correlations between markers within a pathway also provides validation.
Options for scoring TMAs include visual assessment, semiautomated scoring, and fully automated scoring. Software that displays images of cores for visual scoring matched to subject records for data entry offers organizational efficiency and reduced recording errors. However, availability of microscopic slides for review of individual cores as needed is desirable.
Automated assessment may permit rapid throughput, improved precision, especially compared with multiple readers, and theoretically, greater accuracy, although this has not been clearly established for most markers. Currently some staining patterns remain challenging for automated quantification. Human assessment has relative strengths related to recognition of tumor and artifacts, such as overstaining at edges of structures. Instrumentation has advantages for counting cells and assessing gradations of stain intensity. Semiautomated approaches in which a trained observer “segments” the region for computer analysis offer complementary strengths but little efficiency. Automated scoring to triage cases for visual review may also have value for some markers. Specifically, an automated algorithm that achieves excellent negative predictive value but suboptimal positive predictive value for a marker may facilitate a strategy in which only cores presumptively scored positive by the computer require visual review.
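The triage strategy can be evaluated on a validation set of cores scored both by the algorithm and visually. The sketch below, which treats visual scoring as the reference (an assumption made for illustration), computes the negative and positive predictive values of the automated call and the fraction of cores that would be sent for visual review. The function name and data layout are hypothetical.

```python
from typing import Dict, Optional, Sequence


def triage_summary(
    auto_positive: Sequence[bool], visual_positive: Sequence[bool]
) -> Dict[str, Optional[float]]:
    """Summarize automated calls against visual calls (taken as reference)."""
    if len(auto_positive) != len(visual_positive):
        raise ValueError("both sequences must have the same length")
    if not auto_positive:
        raise ValueError("at least one core is required")
    tp = fp = tn = fn = 0
    for auto, visual in zip(auto_positive, visual_positive):
        if auto and visual:
            tp += 1
        elif auto and not visual:
            fp += 1
        elif not auto and visual:
            fn += 1
        else:
            tn += 1
    return {
        # Proportion of automated negatives that are visually negative.
        "npv": tn / (tn + fn) if (tn + fn) else None,
        # Proportion of automated positives that are visually positive.
        "ppv": tp / (tp + fp) if (tp + fp) else None,
        # Only automated positives are reviewed under the triage strategy.
        "review_fraction": (tp + fp) / len(auto_positive),
    }
```

With a high negative predictive value, automated negatives can be accepted without review, and the review fraction indicates how much visual scoring effort remains.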
Marker data should be formatted to reflect precision and biological effects. For markers such as HER2, for which biological effects seem linked to gene amplification, dichotomous scoring with an optimal threshold for distinguishing strong positive immunohistochemical staining may be best. For other markers, categorical or continuous analyses of stain intensity, percentage of positive cells, or a combinatorial metric may be optimal. If multiple cores are scored, one can often use average or maximum values with similar results, but confirmation is required.
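Whether average and maximum values across replicate cores give similar results can be checked directly. The sketch below is illustrative: it uses the product of an intensity grade (0 to 3) and the percentage of positive cells as one possible combinatorial metric, which is an assumption and not a metric specified in this review, and it lists subjects whose positive or negative call at a given threshold differs between the mean-based and maximum-based summaries. All names and the threshold are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple

# (intensity grade 0-3, percentage of positive cells 0-100)
Core = Tuple[int, float]


def core_score(intensity: int, percent_positive: float) -> float:
    """Illustrative combinatorial metric: intensity grade times percent positive."""
    return intensity * percent_positive


def summarize_subject(cores: Sequence[Core]) -> Dict[str, float]:
    """Mean and maximum core score for one subject (needs at least one core)."""
    scores = [core_score(intensity, percent) for intensity, percent in cores]
    return {"mean": mean(scores), "max": max(scores)}


def discordant_subjects(
    per_subject: Dict[str, Sequence[Core]], threshold: float
) -> List[str]:
    """Subjects classified differently by mean-based and max-based summaries."""
    discordant = []
    for subject, cores in per_subject.items():
        summary = summarize_subject(cores)
        if (summary["mean"] >= threshold) != (summary["max"] >= threshold):
            discordant.append(subject)
    return discordant
```

A short list of discordant subjects supports using either summary; a long list suggests that the choice of summary matters for that marker and should be examined further.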
Future Directions
Recently, progress has been made toward developing and validating methods for methylation, mRNA, and miRNA profiling on fixed tissues (37-40). Many of these strategies compensate for the biochemical modifications and fragmentation that occur with fixation. Increasingly, newer technologies rely on sequencing strategies instead of detection using fixed probe sets. However, most carcinogenic mechanisms operate by affecting protein levels within normal, precancerous or cancerous cells, and their surrounding microenvironments. Therefore, application of robust marker-specific assays in morphologically defined contexts may be required for marker validation, biological understanding, and clinical translation.
Methods for relating immunohistochemical results to objective standards such as absolute numbers of molecules per cell could permit further advances. Development of robust assays for activated markers, such as phosphorylated proteins, may permit analysis of critical final steps in molecular pathways. Multiplexed labeling would increase the utility of individual tissue sections and enable characterization of marker coexpression within single cells. Finally, development of new statistical methods to model exposures and end points concurrently is important for analyzing tumor markers, many of which are highly associated (41).
Relating epidemiologic exposures to molecular tumor markers is important for increasing biological understanding and clinical translation. Knowing a subject's risk for cancer is a starting point, but knowing the type of tumor that is likely to develop, the threat it poses to health, and the mechanisms that mediate its development is also important. This detailed information may aid the implementation of prevention measures. For example, tamoxifen and raloxifene have established utility in preventing ER-positive breast cancers, but our modest ability to predict risk for ER-positive tumors specifically (similar to that for breast cancer overall) has limited implementation in practice (42). As treatment for breast cancer and other tumors moves increasingly toward the exploitation of molecular targets, risk modeling and prevention should pursue similar approaches in parallel. Incorporation of molecular pathology in epidemiologic studies will be required to realize this goal.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Acknowledgments
Grant Support: Intramural Research Program of the National Cancer Institute, the Applied Molecular Pathology laboratory, and the Tissue Array Research Program. Also supported by the Breast Cancer Association Consortium.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.