The use of microarray technology to measure gene expression has created optimism for the feasibility of using molecular assessments of tumors routinely in the clinical management of cancer. Gene expression arrays have been pioneers in the development of standards, both for research use and now for clinical application. Some of the existing standards have been driven by the early perception that microarray technology was inconsistent and perhaps unreliable. More recent experimentation has shown that reproducible data can be achieved, and clinical standards are beginning to emerge. For the transcriptional assessment of tumors, this means a system that correctly samples a tumor, isolates RNA and processes it for microarray analysis, evaluates the data, and communicates findings in a consistent and timely fashion. The most important standard is to show that a clinically important assessment can be made with microarray data. The standards emerging from work on various parts of the entire process could guide the development of a workable system. However, the final standard for each component of the process depends on the accuracy required when the assay becomes part of the clinical routine: a routine that now includes the molecular evaluation of tumors. Cancer Epidemiol Biomarkers Prev; 19(4); 1000–3. ©2010 AACR.
Scientists have been using hybridization-based assays for the quantitation of mRNA amounts since the 1970s (1). The measurement of one gene by Northern blotting quickly shifted to the measurement of several genes in a single sample (2), and by the early 1990s, some laboratories were looking at the expression of several hundred genes at once (3-5). In the mid-1990s, two reports showed the use of tiny arrays of probes to measure the expression of thousands of genes in a single experiment (4, 5). These “micro” arrays quickly became a revolutionary, as well as controversial, technique for the investigation of biological systems. Today, microarrays mass-produced by companies can assess nearly the entire complement of genes expressed in an organism in a single assay. This whole transcriptome analysis has generated considerable optimism about the personalization of medical treatment, particularly with respect to cancer treatment based on the assessment of gene expression in tumors (6-8).
The early years of microarray experimentation yielded novel insights into diseases and biological systems, but also created a perception that the technology might not be reliable. Groups performing similar experiments published different gene lists (9-11). Perhaps contributing to this was the high cost of early microarray experimentation and the fact that limited space in journals restricted the first microarray reports to the most significant observations and only superficial details of the experimental process. For this reason, one of the first “standards” established for microarray experimentation was the set of MIAME (Minimum Information About a Microarray Experiment) standards outlined by the Microarray Gene Expression Data Society (12). The intent of the MIAME standards was to provide enough experimental details to allow researchers to reproduce published microarray-based experiments. However, differences in the experimental details did not fully explain the poor reproducibility observed in early experiments. In direct, laboratory-controlled experimental comparisons, many groups had difficulty obtaining reproducible results (13-15). In most of these reports, the blame for poor replication was cast on the uniqueness of the arrays, the variability of individuals, and the nuances of the processing or statistical methods used for the microarray process. The net result was that microarrays were gaining a reputation for poor reliability, and validation was therefore considered necessary. Some interpreted these results to mean that microarray technology might only be useful for rough-draft discovery of important genes, not as a definitive assay (16).
Fortunately, a second phase of experimentation supports a more optimistic view of the reproducibility of microarray technology. The main reason for poor reproducibility in early experiments was poor gene identification. A series of articles showed that probes frequently did not detect the suggested genes (17-20). Better bioinformatics support led to the ability to identify genes based on the sequence of the probes and to account for biological considerations like splice variants, gene families, and pseudogenes. Sequence-based comparisons of array data produced more favorable replication (21, 22). Another contributing factor is that the processing of microarray samples is now fairly routine; therefore, experienced laboratories can generate more consistent results. Several large multi-laboratory projects directly tested the reproducibility of microarray processing and now suggest that the technology is sufficiently reproducible for clinical applications (23-26).
During the brief history of microarrays, few standards have been established for clinical applications. One reason for this is that microarrays can be used in multiple creative ways, and standards established for one application could restrict the suitability of the technology for other experimental purposes. The basic standards for the industry were established early by the companies that market arrays, and the general assessment of a hybridized array has been reported in several reviews (27-29). These standards involve checking the quality of the RNA isolated from the samples (30), evaluating the images of the scanned arrays for gross defects (31), evaluating the overall intensity of the sample hybridization to the array (29, 32), and some form of global assessment of the probes with detection values above background (29). This last measure is a crude measure of the distribution of gene expression values for detected genes and gives a rough estimate of the similarity of any two samples in the set of samples used for an experiment. These assessments are not really standards, in that they do not define good or bad, but rely on the majority of the samples to set the standard for each experiment and suggest removing the outliers. However, cancer is a heterogeneous disease. Even tumors that have the same pathologic appearance can be distinctly different in gene expression. Assays that rely on outliers will have great difficulty in separating technological outliers from biological heterogeneity. The basic standards also focus on the identification of grossly inferior or damaged arrays rather than on the identification of subtle flaws that might affect the reproducibility of a single clinical application. Furthermore, these standards only address a small portion of the entire process that would be involved during clinical use (33). Lastly, the tests cannot be done on a single array, as might be necessary in a clinical setting.
For these reasons, many laboratories have begun to assess additional components of the microarray process to establish more standards.
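The batch-relative nature of these basic checks can be illustrated with a minimal sketch. It assumes a single quality metric per array (here, the percentage of probes detected above background), and uses a median-based outlier rule with a cutoff of 3; the function name, the metric values, and the cutoff are all hypothetical choices made for illustration, not values taken from any vendor guideline or cited study.

```python
from statistics import median

def flag_outliers(values, cutoff=3.0):
    """Flag arrays whose QC metric lies far from the batch median.

    values: mapping of array ID -> QC metric (e.g. percent of probes detected).
    cutoff: hypothetical threshold in robust z-score units (an assumption).
    """
    if len(values) < 3:
        # With one or two arrays there is no "majority" to define typical.
        raise ValueError("batch-relative QC needs several arrays")
    center = median(values.values())
    # Median absolute deviation, scaled to be comparable to a standard deviation.
    mad = median(abs(v - center) for v in values.values()) * 1.4826
    if mad == 0:
        return []
    return [name for name, v in values.items() if abs(v - center) / mad > cutoff]

# Invented example values, for illustration only.
percent_detected = {"T01": 48.2, "T02": 51.0, "T03": 49.5, "T04": 50.3, "T05": 31.7}
print(flag_outliers(percent_detected))
```

In this invented batch, array T05 is flagged, but the metric alone cannot say whether T05 is a failed hybridization or a biologically distinct tumor. The function also refuses to run on a single array, which mirrors the limitation noted above for a clinical setting.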
Clinically Directed Standards
The Food and Drug Administration (FDA) began the process with a series of position reports detailing the need for additional standards (34, 35). Scientists from the FDA, the National Institute of Standards and Technology, and laboratories in both academia and industry developed a MicroArray Quality Control project that uses reference samples to allow the coordinated comparison of the processing and analysis of microarray data (26, 34). This is a necessary beginning for the establishment of laboratory proficiency in microarray processing. The External RNA Controls Consortium outlined the need for adding RNA controls to the microarray process (36). These supplemental molecules can be used to verify that the microarray process occurred appropriately for each individual sample. Several studies looked at the effect of the overall process on the final quality measures or classification success (24, 37, 38). However, the net conclusion of these studies is that reproducibly good microarray data can be generated; they do not establish standards that can be applied to the next batch of arrays. These guidelines are yet to come. As array data accumulate, it will eventually be possible to set defined standards based on the cumulative performance of many well-processed samples.
The quality of clinical microarrays can be adversely affected long before the processing begins. Studies have shown that the time between surgical removal and sample processing (or storage) can affect gene expression (39, 40). The method of sample preservation can influence the integrity of the RNA (41). But the most important factor to consider is the correct sampling of the tumor (28, 42, 43). Tumors are heterogeneous, and it is important to minimize the amount of normal tissue or nontumorigenic hyperplasia passed on to the microarray process; this ensures that the measured sample represents the in vivo tumor. Therefore, one standard that has not been explicitly stated is that a pathologic review of the sample headed for microarray analysis is essential. It is also vital to have a good clinical process for handling the samples to avoid sample mix-ups (44). This applies at every step, from the operating room through pathologic review of the tissue to RNA isolation and microarray processing. It has already been shown that the overall process can be done accurately and produce useful clinical data (37, 45). What remains is the standardization of these good clinical practices. A parallel consideration is the documentation of the clinical progress of the patient from whom the sample was collected. Several questions are often lost in the concept of personalized medicine: who will interpret the microarray data and what information will be returned to the primary physician; when the analysis will be done with respect to the treatment of a patient; and whether this is a singular event or whether further analysis will be required as additional clinical information is acquired. The interpretation of microarray data may also require additional information about the patient (45, 46). How the microarray analysis is incorporated into the entire clinical process will define the need for further standards.
Perhaps the biggest concern in the pipeline of tumor assessment by microarrays is the classification process. There are many articles reporting the effect of array platform, processing kit, and data preprocessing methods on the final gene expression values (47), and just as many studies demonstrating how gene selection techniques can influence both the gene list and the classification of tumor samples (48, 49). Additionally, the nature of the classification algorithm can influence the success rate of a classifier (45, 50, 51). In all, there are more than 2,000 articles published on the molecular classification of cancer using microarray data, and in most cases, the array type, gene list, and classification techniques are unique. Therefore, it is almost impossible to establish standards to manage this process. Fortunately, this is not a concern when the ultimate goal is considered: does the classifier work? For medical uses, another consideration is whether the classification contributes to the healthcare of the patient. The only real standard is whether the classification process accurately produces a clinically important metric. The FDA has already established this standard in its initial recommendations for the use of gene expression profiling (52). This is the normal expectation for any in vitro diagnostic, which is what the FDA considers all clinical microarray tests to be. The FDA has further set standards for how the classification accuracy must be shown. Basically, the assays should be validated in an independent population consistent with the defined use of the assay. Metaphorically, it doesn't matter how a device is built; only that it works when used in the normal usage environment.
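As a rough illustration of what validation in an independent population measures, the sketch below scores a classifier purely by its output on samples that were not used to build it, treating the classifier itself as a black box. The function name, the "high"/"low" risk labels, and the ten example outcomes are invented for illustration; they are not drawn from any cited study or from FDA guidance, which may require additional or different measures.

```python
def validation_metrics(true_labels, predicted_labels, positive):
    """Summarize classifier performance on an independent validation cohort."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("need one prediction per validation sample")
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("validation cohort must contain both classes")
    return {
        "sensitivity": tp / (tp + fn),   # positives correctly identified
        "specificity": tn / (tn + fp),   # negatives correctly identified
        "accuracy": (tp + tn) / len(pairs),
    }

# Invented clinical outcomes and classifier calls for ten validation patients.
observed = ["high"] * 4 + ["low"] * 6
predicted = ["high", "high", "high", "low"] + ["low"] * 5 + ["high"]
metrics = validation_metrics(observed, predicted, positive="high")
print(metrics)
```

Nothing in this calculation depends on the array platform, gene list, or algorithm behind the predictions, which is the sense in which the standard concerns whether the device works rather than how it is built.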
In developing a clinical application for microarrays, it is good to keep this standard in mind. Once the classification can be done with the best quality possible at all steps in the process, it is possible to work back through the process to determine how much noise is acceptable while still yielding good classification success. This raises another important consideration: although it is useful to know when to reject poor-quality data, it is also important to know how to recapture the necessary information when a sample is rejected. This has consequences for the patient if it means collecting another sample of the tumor. During the development of the gene expression–based assay, one should determine the likelihood that clinically useful data can be successfully obtained from any given patient. This is a standard that has been ignored in most microarray studies thus far. If a classifier works well with high-quality data, but this is only achieved in 10% of the patients, the test is not likely to be well regarded. Conversely, if a classifier can work well with marginal-quality data that is achievable in 95% of the patients, it is more likely to be widely implemented. Therefore, during the development of a clinical assay, one should define how frequently good data is acquired in the average hospital. This will take a lot of samples to establish, creating plenty of opportunities to define standards for each step of the entire clinical process.
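The trade-off between data quality requirements and the fraction of patients served can be made concrete with a small calculation. The sketch below assumes that the share of patients who receive a correct result is simply the fraction of patients yielding usable data multiplied by the classifier accuracy on usable data. The 10% and 95% figures come from the example above; the two accuracy values (0.95 and 0.85) and the function name are assumptions chosen only for illustration.

```python
def expected_clinical_yield(usable_fraction, accuracy_when_usable):
    """Fraction of all patients expected to receive a correct classification.

    usable_fraction: share of patients whose samples pass the quality cutoff.
    accuracy_when_usable: classifier accuracy on samples that pass (assumed).
    """
    for name, value in (("usable_fraction", usable_fraction),
                        ("accuracy_when_usable", accuracy_when_usable)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return usable_fraction * accuracy_when_usable

# Strict quality cutoff: few patients qualify, accuracy assumed to be 0.95.
strict = expected_clinical_yield(0.10, 0.95)
# Tolerant cutoff: most patients qualify, accuracy assumed to be 0.85.
tolerant = expected_clinical_yield(0.95, 0.85)
print(f"strict: {strict:.3f}, tolerant: {tolerant:.3f}")
```

Under these assumed accuracies, the tolerant assay delivers a correct result to far more patients overall, even though it is less accurate on any individual sample that passes.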
Other Uses for Microarrays
Microarrays have given true confidence to the concept of personalizing the treatment of cancer patients. The analysis of gene expression is leading the way, but microarrays can be used for many other purposes in the assessment of a cancer patient. Microarrays can be used for genotyping, which will allow the pharmacogenomic or toxicogenomic evaluation of a patient (53). Microarrays can also be used for sequencing to perform a mutational analysis of key oncogenes and tumor suppressors. Arrays can be used to identify amplifications and deletions through comparative genome analysis and assess microRNAs and epigenetic changes that play a role in cancer (54-56). The comprehensive evaluation of a cancer patient will probably involve many of these sources of information. Therefore, additional standards will be required for these other applications should a clinical assay emerge. In developing a new clinical assay, the most important standards are: does the assay produce important clinical information and does the assay accurately produce the necessary information? It is also important to consider whether the assay can be done routinely in a clinical setting. All other standards required of the clinical process will be dictated by the need to make sure these concerns are met.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.