Abstract
Molecular assays have been routinely applied to improve diagnosis for the last 25 years. Assays that guide therapy have a similar history; however, their evolution has lacked the focus on analytic integrity that is required for the molecularly targeted therapies of today. New molecularly targeted agents require assays of greater precision and quantitation to predict the likelihood of response, i.e., to identify patients whose tumors will respond while excluding and protecting those patients whose tumors will not respond or in whom treatment will cause unacceptable toxicity. The handling of tissue has followed a fit-for-purpose approach focused on appropriateness for diagnostic needs, an approach less rigorous than what new molecular assays, which interrogate DNA, RNA, and proteins in a quantitative, multiplex manner, demand. There is a new appreciation of the importance and fragility of tissue specimens as the source of analytes to direct therapy. By applying a total test paradigm and by defining and measuring sources of variability in specimens, we can develop a set of specifications that can be incorporated into the clinical-care environment to ensure that a specimen is appropriate for analysis and will return a true result. Clin Cancer Res; 18(6); 1524–30. ©2012 AACR.
Introduction
The examination of tissue for diagnosis and guidance of optimal therapy has been practiced since the advent of surgery. The late 19th century brought the microscopic examination of tissue and substantial advances in the classification of disease (1). The basic form of histomorphologic examination of excised tissue stained with the contrast agents hematoxylin and eosin has been in widespread use for more than a century and remains the cornerstone of diagnostic anatomic pathology. Beyond defining a tumor, this approach provides prognostic information that clinicians routinely rely on to guide therapy. The most common examples include the status of surgical margins, spread of disease, and differentiation state of the tumor (grade), which can be combined to define the stage of a tumor and predict outcome. Histopathology depends on an accumulated fund of knowledge linking morphologic features to clinical behavior. With the advent of molecular biology, it is now possible to extend beyond the histomorphology of a tumor and probe it for specific molecular alterations that portend behavior or can serve as targets of therapy. Some of these alterations are observable at the level of histomorphology; however, most are more accurately measured at the DNA, RNA, or protein level. Many analytes can also be assessed in body fluids; however, this approach introduces additional challenges. Ultimately, the goal is to use biomarkers not merely for prognosis but to predict response to therapy. The capacity to monitor response at the molecular level will provide new tools to fine-tune therapy, prevent toxicity, and identify treatment failure earlier than disease progression as measured by tumor mass.
Because histopathology predates molecular pathology, the handling and processing of specimens have been optimized for histopathology, with little or no reference to molecular biology (2). Protocols for biospecimens have evolved to meet the needs of the assays performed. Conversely, the handling of biospecimens is well recognized as a contributor to assay variability and to problems in assay validation (3, 4). Some specimens are amenable to repeated sampling, without concern for substantial tissue heterogeneity or sampling issues, such that every sample can be assumed to be identical; in these cases, "molecular friendly" means of preserving and optimizing the analyte of interest, rather than histomorphologic examination, can be applied. Flow cytometry is the most common example of an approach that uses such reagents, but unfortunately it cannot be applied to the vast majority of solid tumors. Many of the methods commonly employed to preserve tissue in a clinical setting are optimized in a manner that damages or destroys the biomolecules of interest. A class of molecular-friendly fixatives and preservatives has been developed in an attempt to overcome these issues, but none has proved feasible as a wholesale replacement.
Investigators seeking to develop, validate, and apply integral biomarkers face a number of challenges (5, 6). One challenge is the demands placed on the biospecimen by new assays; it is impractical to replace current methods of biospecimen collection, processing, and handling entirely. Improving integral biomarker assays therefore requires a combination of (i) evolution of biospecimen protocols; (ii) appreciation of the limitations of currently collected biospecimens, with efforts to harmonize new biomarker assays to perform within this context; and (iii) an integrated approach to the development and validation of integral biomarker assays. The difference between how a biospecimen is handled in a clinical setting and in a research setting must be reduced (5). The discovery pipeline for new biomarkers is strong; validation continues to be the rate-limiting step. Bringing the laboratorian actively into assay validation early in development should help address this bottleneck.
Total Test
The application of the total test paradigm has substantially improved the manner in which tests are developed and validated (7, 8). A test is divided into 3 phases: preanalytic, analytic, and interpretive (postanalytic; Fig. 1; ref. 8). The preanalytic phase is discussed further below, but it can be briefly summarized as everything that occurs before the assay. The analytic phase is the assay itself (i.e., the performance of the test and the reagents involved). In the interpretive (postanalytic) phase, the test results are interpreted by means of a standardized method, within the context of a particular disease. This paradigm is equally applicable to all diagnostic procedures in clinical and anatomic pathology, ranging from histomorphologic examinations of tissue to measurements of the specific gravity of urine. The goal is to qualify a specimen for a defined assay and obtain a result within the context of the patient and the underlying disease.
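For readers who think in code, the gating logic of the total test paradigm can be expressed as a minimal sketch. The phase definitions follow the text; the data structure and function are our own illustration, not part of any published standard:

```python
from enum import Enum

class Phase(Enum):
    PREANALYTIC = "everything that occurs before the assay"
    ANALYTIC = "the assay itself: test performance and reagents"
    POSTANALYTIC = "standardized interpretation in the disease context"

def reportable(in_spec: dict[Phase, bool]) -> bool:
    """A result is trustworthy only if every phase stayed within
    specification: a qualified specimen, a valid assay run, and a
    standardized interpretation."""
    return all(in_spec.get(phase, False) for phase in Phase)

# usage: a specimen that failed preanalytic qualification blocks reporting
# reportable({Phase.PREANALYTIC: False, Phase.ANALYTIC: True,
#             Phase.POSTANALYTIC: True})  # -> False
```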
Preanalytic phase
Within the laboratory, the preanalytic phase involves everything pertaining to the specimen before the performance of the assay. This oversimplification is meant to capture the concept that the biospecimen represents the patient: the physiology of the patient is as important as the collection and handling of the specimen once it is obtained. The preanalytic phase can be divided into 2 elements: factors pertaining to the patient, which cannot be controlled, and factors pertaining to the specimen from collection until the analytic phase, for which specification and control are feasible (7). Patient-derived elements include the patient's medical history, genetic and environmental factors, and medications. The U.S. Food and Drug Administration (FDA) has become much more aware of and concerned about patient-derived factors (9), and test developers will need to investigate these issues more fully. The act of collecting specimens (method and setting), the containers used, time in transit, temperature of the specimen, intervening preparations/separations and stabilizations, and storage until the assay is conducted are additional common elements of the preanalytic phase. Unfortunately, there is no exhaustive or prioritized list of preanalytic factors, because they vary according to the biospecimen and the analyte. From the perspective of assay development, it is essential to test the preanalytic variables that are anticipated in the clinical setting, encompassing both patient-derived and specimen-handling factors. The challenge is to determine when an assay is sufficiently robust that additional testing of preanalytic factors is unnecessary.
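As an illustration of how such variables might be captured in practice, the following sketch records the two classes of preanalytic elements described above. All field names and the 60-minute limit are hypothetical placeholders, since no exhaustive or prioritized list exists:

```python
from dataclasses import dataclass

@dataclass
class PreanalyticRecord:
    # Patient-derived factors: documented but not controllable
    medical_history: str
    medications: list[str]
    # Collection-to-assay factors: specifiable and controllable
    collection_method: str      # e.g., "core biopsy", "venipuncture"
    container: str              # e.g., "EDTA tube", "formalin pot"
    cold_ischemia_min: float    # time from collection to stabilization
    transit_temp_c: float
    storage_temp_c: float

def qualifies(rec: PreanalyticRecord,
              max_cold_ischemia_min: float = 60.0) -> bool:
    """Qualify a specimen against an assay-specific handling limit.
    The 60-minute default is an illustrative assumption."""
    return rec.cold_ischemia_min <= max_cold_ischemia_min
```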
Another challenge for the test developer is that the preanalytic variables arising from the time a specimen is collected until the analytic phase are poorly appreciated and poorly standardized in current clinical practice (2, 10). Test developers may believe they are following a standard practice when, in fact, it is better characterized as a practice with substantial undocumented variability. For example, an instruction such as "invert tube to mix" does not indicate how many times or how rapidly the tube should be inverted. This variability may be present not only among laboratories but also within a laboratory or among instruments in the same laboratory. In some instances, the specification may lack appropriate detail and merely summarize a set of actions or conditions, introducing unrecognized variables. This can be perceived as the difference between how a protocol is written in the guidelines and how it is performed on a daily basis in the laboratory.
A commonly encountered notation is "tissue was fixed and paraffin-embedded per standard protocol." Even within a single facility, investigators routinely use multiple protocols for tissue impregnation (the formal term for embedding). Chung and colleagues (10) observed substantial variation in RNA quality based on differences in fixation time and tissue impregnation conditions, both for total RNA and for specific mRNAs. The authors showed definitively that fixation is a critical element of tissue processing and that underfixation is as detrimental as overfixation (with a final recommendation of 24 hours). They also showed that longer tissue-processing times result in improved biomolecule preservation, presumably as the result of better extraction of water. The authors concurrently explored differences in the buffers used in the manufacture of neutral buffered formalin. In sum, they uncovered multiple variables that have a direct impact on new assays performed on formalin-fixed, paraffin-embedded (FFPE) tissue and that had been unappreciated confounding factors in previous studies.
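A simple quality gate built on these findings might flag both failure modes. The 24-hour target comes from Chung and colleagues (10), whereas the tolerance window is an illustrative assumption:

```python
def fixation_within_spec(fixation_hours: float,
                         target_hours: float = 24.0,
                         tolerance_hours: float = 6.0) -> bool:
    """Flag underfixation and overfixation symmetrically.
    target_hours reflects the 24-hour recommendation cited in the text;
    tolerance_hours is a placeholder each laboratory would set itself."""
    return abs(fixation_hours - target_hours) <= tolerance_hours

# fixation_within_spec(4.0)  -> False (underfixed)
# fixation_within_spec(26.0) -> True
# fixation_within_spec(72.0) -> False (overfixed)
```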
In the case of serum specimens, the differences in preanalytic handling are substantial. The lack of specifications for blood collection, handling, and preparation has the potential to alter results, as noted in several previous reviews (11–13). Commonly encountered factors with the capacity to introduce relevant differences include (i) instructions to invert the specimen at the time of collection, (ii) centrifugation conditions (e.g., g-force and temperature), and (iii) storage conditions. Although the majority of specifications are empirically derived, validation of those specifications is frequently narrow and may be based on analysis of a single analyte, despite the fact that the preservation method is intended for multiple analytes. This reflects both the cost of comprehensive validation and the rapid evolution of testing protocols (an existing specification is often adopted to expedite development). Moreover, even where specifications for handling are highly standardized, actual practice is poorly monitored; only in the case of assay failure is a retrospective (site-specific) analysis carried out, and such an analysis will often reveal a deviation from the protocol specifications.
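To make such deviations visible before an assay fails, a handling specification can be encoded and checked prospectively. The sketch below covers the three factor classes named above; every numeric limit is a hypothetical placeholder, not a published requirement:

```python
from dataclasses import dataclass

@dataclass
class SerumHandlingSpec:
    min_inversions: int = 8                  # gentle inversions at collection
    spin_g_range: tuple = (1000.0, 2000.0)   # acceptable g-force
    spin_temp_range_c: tuple = (2.0, 8.0)    # acceptable spin temperature
    max_hours_to_spin: float = 2.0           # delay before centrifugation

def handling_deviations(inversions: int, g_force: float, temp_c: float,
                        hours_to_spin: float,
                        spec: SerumHandlingSpec = SerumHandlingSpec()) -> list[str]:
    """Return every deviation, so a retrospective (site-specific) review
    can pinpoint where practice departed from the protocol."""
    issues = []
    if inversions < spec.min_inversions:
        issues.append("insufficient inversions at collection")
    if not spec.spin_g_range[0] <= g_force <= spec.spin_g_range[1]:
        issues.append("centrifugation g-force out of range")
    if not spec.spin_temp_range_c[0] <= temp_c <= spec.spin_temp_range_c[1]:
        issues.append("centrifugation temperature out of range")
    if hours_to_spin > spec.max_hours_to_spin:
        issues.append("delay to centrifugation exceeded limit")
    return issues
```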
Substantial Challenges
Collection of biospecimens is not a new effort; analysis of tissue and fluids for diagnostic purposes is well over a century old. Procedures for specimen handling and preparation have evolved over this period in response to the increasing demands of the laboratory. Matching an assay to a biospecimen should start with an examination of the current clinical guidelines and a determination of their adequacy for the designated purpose. The introduction of new biospecimen-handling protocols is expensive and faces substantial barriers to validation and widespread adoption; the goal should be to minimize new protocols and, when necessary, align them as closely as feasible with current practice. In the case of tissue, formalin was adopted because of the need for an aseptic, nonflammable means of storing tissue for microscopic examination, and buffers were later added to formalin to improve histomorphology (2). The advent of immunohistochemistry and the desire to obtain RNA from FFPE tissue have driven substantial efforts to understand and improve tissue processing, with new recommendations on neutral buffered formalin, the length of fixation, and tissue-processing protocols. However, efforts to replace neutral buffered formalin have not had much success (8, 10).
The forensics of preanalytic failure forms the basis of ongoing advances in specimen handling. The instability of RNA in FFPE tissue (14, 15) led to new guidelines for tissue storage (16). New blood-collection tubes for proteomics were developed by BD (Franklin Lakes, NJ) to address the failings of edetic acid (EDTA)–containing tubes for serum proteomics (11, 17); these tubes contain protease inhibitors that facilitate mass-spectrometry analysis of the serum proteome, and the removal of EDTA reduced artifacts introduced into the specimen from the cellular components of whole blood. The challenge is to provide salient guidance to investigators and laboratorians as to which variables matter, and to develop a risk-stratification approach to ensure that biospecimens collected for an assay are of sufficient quality and quantity to permit accurate testing.
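One possible shape for such a risk-stratification approach is sketched below; the factor names and the two-tier cutoff are our own illustrative assumptions rather than published criteria:

```python
def biospecimen_risk(factors_in_spec: dict[str, bool]) -> str:
    """Stratify a specimen by counting high-impact handling factors
    that fell out of specification."""
    deviations = sum(1 for ok in factors_in_spec.values() if not ok)
    if deviations == 0:
        return "low risk: proceed to assay"
    if deviations == 1:
        return "moderate risk: assay with caveat, or recollect if possible"
    return "high risk: reject specimen for this assay"

# biospecimen_risk({"cold_ischemia_ok": True,
#                   "fixation_time_ok": False,
#                   "storage_temp_ok": True})
# -> "moderate risk: assay with caveat, or recollect if possible"
```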
The selection of an assay for replication rarely takes into account the quality of the test's specification (preanalytic, analytic, or interpretive); instead, it is driven by the strength of the purported utility. An investigator who seeks to replicate a previously described biomarker must know the details of the specimen specification, the assay itself, and the interpretation. Historically, such information has come to light only when assays fail, and investigators must have the integrity, resources, and opportunity to determine why an assay failed. Unfortunately, investigators often lack the resources and opportunity, and either drop the potential biomarker or report that the results could not be replicated. Determining the root cause of a failure to replicate or validate an assay is often beyond the skills and resources of those performing the replication/validation. Too often it is assumed that an assay is "ready out of the box," when in fact the assay remains in evolution through the validation phase (3). Often, the samples used in the initial description of a biomarker are specimens of convenience (i.e., preexisting specimens that were not collected for the study), and the specification of biospecimen-related factors is more a historical recounting of what the specimens were than a description of their appropriateness. Application of the "fit-for-purpose" paradigm is a useful means of addressing the frequently encountered variables that limit the utility of a biospecimen; however, it may not be readily apparent that some variables have never been tested. As noted above, Chung and colleagues (10) showed a substantial difference in RNA quality based on tissue impregnation time, a variable that had not been investigated previously.
Multiple parallel efforts are under way to assist investigators in this regard. The Office of Biorepositories and Biospecimens of the National Cancer Institute (NCI) has produced guidelines under the title of “NCI Best Practices for Biospecimen Resources” (18), which are referenced to empirical data and provide a starting place for any specimen-collection effort. These guidelines are now in their second generation and function as a tool for researchers to understand the current recommendations for clinical specimen collection. These guidelines attempt to be comprehensive for all biospecimens and provide a framework for the collection, management, and use of biospecimens in research. Applied correctly, they also serve as a checklist to examine deviations from best practices.
Working with the NCI's Division of Cancer Treatment and Diagnosis, a consortium of assay experts has developed template-based questionnaires to assist investigators in defining and addressing questions about biospecimens that may come up in the process of a pre–Investigational Device Exemption meeting with the FDA (19). These questions can be categorized as first-order questions, which an investigator should be able to address without having to consult test developers (Table 1), and second-order questions, which involve matters that are routinely tested and evaluated in the process of assay hardening (Table 2). As with many aspects of medicine, this process involves substantial specialization, and the development of a new, clinically validated assay suitable for introduction into patient care is no exception. Clinical-laboratory–ready assays require collaboration between laboratorians and the investigator who initially discovers the biomarker: the laboratorians specify the assay and interrogate sources of variability, thereby fully defining the assay before clinical validation and the demonstration of clinical utility (7). In the following sections, we summarize the strengths and weaknesses of some commonly tested bioanalytes. The current data are insufficient for the list to be exhaustive, or even to prioritize these issues effectively.
Table 1. First-order questions: issues an investigator should be able to address without consulting test developers

What is the intended use of the assay?
What is the clinical utility (how useful is the diagnostic for its intended clinical use)?
What additional information is required for an accurate test result?
What biomolecule or analyte is being analyzed?
What types of tissue/fluid will be used for analysis?
Does a clinical specification exist for this biospecimen?
Is this specification adequate for this test?
Have the specimens been serially tested to determine whether they are stable in storage?
Can the assay be replicated in another laboratory with the original specimens?
Can the assay be replicated in the original laboratory with submitted specimens?
Table 2. Second-order questions: matters routinely tested and evaluated during assay hardening

What is the minimum specimen size?
What quantities are involved (e.g., microliters of fluid, micrograms of tissue, number of cells)?
Are nonstandard specimen-handling procedures required?
  Are frozen tissues, alternative fixatives, or nonroutine blood-collection tubes required?
Have common variables of biospecimen handling been tested as a source of assay variability?
  What are some issues involving fluids (e.g., temperature, time, separation technologies)?
  What are some issues involving tissues (e.g., fixative, fixation times, processing times, specimen storage)?
Have other analytes been shown to interfere with the assay?
Is a document available that specifies an adequate vs. inadequate biospecimen for this assay?
Fluids as a biospecimen
Fluids and their nonproteinaceous components are the best-characterized analytes in reference to preanalytic variables, and generally fall within the specialty of clinical pathology. Serum/plasma is the most common fluid analyzed (20), followed by urine (18, 21). This diverse universe of analytes is the starting point for understanding analytes in tissue.
DNA
DNA is clearly the most robust analyte with reference to preanalytic variables. Quality issues typically center on the length of the DNA fragments obtained from the specimen and on whether fragment length affects assay performance (14). For in situ–based assays, few qualitative data are available. Array comparative genomic hybridization (aCGH) has gained substantial ground as a means of examining copy-number variations. DNA fragment length has been shown to affect call rates on aCGH platforms, resulting in modification of the cutoffs used to call a genomic alteration. The net effect is that aCGH with shorter DNA fragments provides less information and is less capable of defining small regions of loss or gain (22). At this time, we have few data regarding preanalytic factors and epigenetic modifications; however, the development of single-molecule sequencers may enable investigators to interrogate the genome at a level that was not previously obtainable.
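As a sketch of how a laboratory might act on the fragment-length finding, the gate below ties specimen acceptance and the calling cutoff to measured DNA fragment length. Both thresholds are illustrative placeholders that each platform would need to derive empirically (22):

```python
def acgh_specimen_gate(median_fragment_bp: float,
                       platform_min_bp: float = 300.0) -> dict:
    """Qualify a specimen for aCGH and choose a log2-ratio cutoff.
    Shorter fragments reduce call rates, so the cutoff for calling a
    gain or loss is widened when the DNA is degraded. All numbers
    here are illustrative, not platform specifications."""
    degraded = median_fragment_bp < platform_min_bp
    return {
        "acceptable": median_fragment_bp >= 0.5 * platform_min_bp,
        "log2_ratio_cutoff": 0.45 if degraded else 0.30,
    }
```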
Sequence-derived biomarkers are a rapidly developing field. Numerous techniques have been widely applied in a screening modality for the last 2 decades, but all of them relied on Sanger sequencing as a confirmatory step. With the advent of next-generation sequencing technologies, Sanger confirmation becomes impractical because it turns into a bottleneck. False-positive mutations are frequently attributed to specimen fixation and processing, especially in tissue (23); with tissue, the central theme is purity of target. These false-positive mutations are believed to arise from chemical alteration of the DNA by crosslinking fixatives. Here, the connection between the preanalytic and analytic phases of the assay plays a substantial role in defining the baseline false-discovery rate. Each instrument/assay is anticipated to have its own false-positive rate, specific to the instrumentation and protocol applied, and this rate can only be defined empirically (24). This is a subject of substantial concern in the development of sequence-based biomarkers and will have a direct impact on assay sensitivity, specificity, and false-positive rates. There is now greater interest in the development of clinically feasible, molecular-friendly, noncrosslinking fixatives for tissue; however, this approach faces other challenges.
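The empirical definition of a per-assay false-positive rate can then feed directly into variant calling. The sketch below uses a simple binomial model to convert a measured artifact rate (e.g., the C>T frequency seen in normal FFPE controls on a given instrument/protocol) into a minimum read-support threshold; the model and the alpha value are illustrative simplifications:

```python
from math import comb

def min_supporting_reads(depth: int, artifact_rate: float,
                         alpha: float = 1e-6) -> int:
    """Smallest k such that P(X >= k) <= alpha when
    X ~ Binomial(depth, artifact_rate), i.e., the read support needed
    before a call is unlikely to be a fixation artifact alone."""
    def upper_tail(k: int) -> float:
        return sum(comb(depth, i)
                   * artifact_rate ** i
                   * (1.0 - artifact_rate) ** (depth - i)
                   for i in range(k, depth + 1))
    k = 1
    while k <= depth and upper_tail(k) > alpha:
        k += 1
    return k

# e.g., at 500x depth with a measured 0.1% artifact rate:
# min_supporting_reads(500, 0.001)  # -> assay-specific calling threshold
```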
RNA
RNA is the most labile analyte that is commonly analyzed. Cellular processes have evolved such that mRNAs are encoded with multiple elements that modulate RNA stability as a mechanism of regulating protein production (2). RNA is nonetheless an inviting biomarker target: PCR and hybridization techniques make its measurement relatively easy, and its highly regulated expression is tightly linked to pathophysiology. Key elements of assay design are probe placement and (in the case of PCR) amplicon size.
Assays for the presence or absence of specific mRNAs can be constructed to avoid the majority of commonly encountered preanalytic challenges (10). More importantly, concurrent measurement of housekeeping RNAs can provide the essential quality-control check for both the analyte and the assay. In situ assays have improved in recent years with the development of chromogenic in situ assays for HER2 amplification (e.g., SPOT-Light HER2 chromogenic in situ hybridization; Life Technologies), and traditional Northern blot analysis and reverse transcriptase (RT)-PCR assays can be specified to provide binary (present or absent) information.
As the desire for information from RNA increases, the demands on the specimen increase concurrently. Quantitative RT-PCR is a powerful tool but is very dependent on the quality of the starting RNA, measured in terms of both integrity and quantity (2). In addition to quantitative RNA assays, multiplex assays can be performed; fundamentally, both present the same complexity: the lack of a defined denominator of quality for the starting RNA. The general approach has been to increase the number of control genes (most commonly housekeeping genes) in an effort to better define the quality of the RNA being analyzed. In the case of Oncotype DX, 5 of the 21 genes are used to ensure the adequacy of the assay results and provide a form of denominator against which the target RNAs are measured (25). In contrast, miRNAs appear to pose fewer challenges than mRNAs; their small size appears to be the salient feature (26).
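The reference-gene "denominator" strategy reduces to a delta-Ct calculation, sketched below. Normalizing against the mean of a reference panel mirrors the general approach described for the 21-gene assay; the gene names, Ct values, and the adequacy cutoff are hypothetical illustrations:

```python
from statistics import mean

def rna_adequate(reference_cts: dict[str, float],
                 max_ct: float = 35.0) -> bool:
    """Reject the specimen if any reference (housekeeping) gene amplifies
    too late, i.e., the starting RNA lacks an adequate quality denominator.
    The Ct cutoff is a placeholder assumption."""
    return all(ct <= max_ct for ct in reference_cts.values())

def normalized_expression(target_ct: float,
                          reference_cts: dict[str, float]) -> float:
    """Delta-Ct of a target gene against the mean of the reference panel;
    a higher value indicates higher relative expression."""
    return mean(reference_cts.values()) - target_ct

# hypothetical run:
# refs = {"REF1": 22.1, "REF2": 23.0, "REF3": 24.2, "REF4": 26.5, "REF5": 27.1}
# if rna_adequate(refs):
#     score = normalized_expression(28.3, refs)
```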
Protein
Protein is the analyte with which we have the most experience as a clinical diagnostic in tissue. Immunohistochemistry as a technique is >70 years old (27) and has seen widespread clinical application with FFPE tissue for >20 years. This substantial experience has both confirmed the general utility of the assay and unmasked its weaknesses. The advent of heat-mediated antigen retrieval has increased the number of potential targets and the sensitivity of immunohistochemistry in general (28).
In the clinical setting, the overwhelming preanalytic variable in immunohistochemistry remains fixation time (10). Although other variables have been shown to have a substantial impact on assay performance, fixation times remain poorly controlled in general practice (2). Antigen retrieval has also been shown to offset differences in fixation time by increasing antigen availability in tissues fixed for longer periods. However, although this approach can improve assay sensitivity, it has a negative effect on assay calibration and specificity (29).
Only recently addressed in a systematic fashion, and of substantial concern, is the veracity of data from clinical samples that have been stored for prolonged periods (2). It is well documented that RNA degrades in FFPE samples over storage periods of 5 to 10 years (15), and at present no means of preventing this degradation has been shown to halt the process entirely (16). The measurement of modified proteins (most commonly phosphoproteins) as indicators of cellular signaling offers many opportunities. Phosphoproteins are best envisioned as being as delicate as mRNA and are susceptible both to hypoxic/ischemic effects and to damage from specimen handling and fixation (10).
Although nucleic acids resulting from the degradation of cellular components can be assayed in body fluids, they are a less common analyte than secreted proteins. Because the methods for assaying the nonprotein components of fluids are well developed, only slight modifications have been required to ensure robust preanalytic performance of proteins found in the serum/plasma. Tubes with EDTA are employed most commonly; however, specialized tubes are being used with increasing frequency. It should be noted that the temperature of storage/transportation and means of centrifugation have been shown to be important preanalytic variables.
Conclusions
Biospecimen handling is a process that continues to evolve to meet the demands of the clinical environment. The development of robust biomarkers based on nucleic acids and proteins requires a more rigorous evaluation of the factors that affect analyte stability and variability. The NCI's Office of Biorepositories and Biospecimens recently updated its NCI Best Practices (18), which provide current, data-driven guidelines while acknowledging the limitations of current practice. Unfortunately, there is no comprehensive set of recommendations regarding biospecimens and methods for evaluating the preanalytic phase of a test. The factors that determine the optimal processing of patient samples are biospecimen, analyte, and assay specific. Our current understanding of the factors that determine the quality of a biospecimen is limited, and guidelines require constant evaluation and revision. To date, the only unified set of guidelines is that put forth for breast cancer clinical trials by Leyland-Jones and colleagues (14); those guidelines are now 4 years old and becoming dated, and for other malignant diseases they provide only a framework.

Ultimately, success in translational research will require standardization of specimen handling along the lines of what is currently seen in clinical chemistry, which will in turn improve biomarker test performance. One goal of these efforts is to move away from the ad hoc nature of specimen collection for research and to introduce, and adhere to, the specifications for collection and analyte stabilization that are found in the clinical diagnostic setting. A second is the adoption of a "blood-tube paradigm" for the collection of biospecimens beyond blood: the immediate segregation of biospecimens into analyte-optimized collection protocols will enhance the diversity and quality of the end assays. Lastly, elevating the collection and handling of biospecimens in clinical research beyond an "add-on task," with appropriate budget and staffing support, is critical for new biomarker development. Biospecimen collection will continue to advance and enable more robust assays, but this progress will be driven by a fit-for-purpose model; standardization and performance standards will be required to obtain the uniform quality of biospecimens that emerging biomarkers demand.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.