The loco-regional treatment of breast cancer is surgery (lumpectomy or mastectomy), often with adjuvant radiotherapy (RT) and hormone therapy and/or chemotherapy. The efficacy of RT varies among patients, and treatment is occasionally accompanied by acute skin toxicity. Late adverse effects also occur, manifested as telangiectasias and atrophy of the skin, subcutaneous fibrosis, rib fractures, thickening of the pleura, and lung fibrosis. Identification of individual factors influencing treatment response will eventually lead to tailor-made treatments and minimize the strain of unnecessary therapy. The main goal of this study is to identify relationships between gene expression patterns in primary breast tumors and response to treatment. However, prior to the biological analysis, it is of utmost importance to investigate and ensure the quality of the whole-genome microarray data that are used.

In the Danish Breast Cancer Cooperative Group (DBCG) 82 b & c trials, 3,083 patients with stage II and III disease were randomized to receive post-mastectomy RT or no RT in addition to systemic therapy. During follow-up, clinical parameters, treatment response, and various side effects were recorded.

High-quality RNA was obtained from 200 of 273 available fresh-frozen tumor samples. Sections were cut from each tumor, and the content of tumor cells, adipocytes, and stroma was determined by a pathologist. Experimental handling procedures were randomized across the clinical parameters to minimize potential experimental bias. Amplified and digoxigenin-labeled cRNA was hybridized to Applied Biosystems Human Genome Survey microarrays. A total of 232 array experiments were conducted, including some replicates.
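The randomization of handling order described above can be illustrated with a minimal sketch. It assumes a simple seeded shuffle of sample identifiers followed by splitting into processing batches; the abstract does not state the actual randomization scheme, and the sample IDs, seed, and batch size of 8 are hypothetical.

```python
import random

def randomized_processing_order(sample_ids, seed=None):
    """Return the sample IDs in a random processing order (simple shuffle)."""
    rng = random.Random(seed)  # seeded so the order can be reproduced
    order = list(sample_ids)
    rng.shuffle(order)
    return order

def assign_batches(order, batch_size):
    """Split an ordered list of samples into consecutive processing batches."""
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

# Hypothetical identifiers for the 200 tumors with high-quality RNA.
samples = [f"T{i:03d}" for i in range(1, 201)]
order = randomized_processing_order(samples, seed=2007)
batches = assign_batches(order, batch_size=8)  # batch size is an assumption
```

Because the order is drawn independently of the clinical annotations, no clinical group is expected to be systematically confined to a single amplification or hybridization date.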

Statistical methods were used to investigate whether tumor cell percentages, amplification dates, hybridization dates, cRNA amplification yield, RNA quality, or microarray lot numbers showed any systematic effects on the microarray data.
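The abstract does not name the statistical methods that were used. As one possible illustration, a one-way ANOVA F statistic can be used to ask whether a per-array summary (for example, median log signal intensity) differs between lot numbers. The sketch below is an assumption rather than the authors' method, and the intensity values are invented for demonstration.

```python
def one_way_f(groups):
    """One-way ANOVA F statistic for a list of groups of numeric values."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean (between lots).
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of arrays around their own lot mean (within lots).
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical median log2 intensities per array, grouped by lot number.
lot_a = [10.1, 10.3, 10.2, 10.4]
lot_b = [10.9, 11.0, 10.8, 11.1]
lot_c = [10.2, 10.1, 10.3, 10.2]
f_stat = one_way_f([lot_a, lot_b, lot_c])
```

A large F value relative to the F distribution with the corresponding degrees of freedom would suggest a lot-related shift in intensity; the same function could be applied to groups defined by amplification or hybridization date.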

A relationship between array production lot numbers and signal intensities was found, indicating array manufacturing-related batch effects. However, replicate samples analyzed on microarrays with different lot numbers showed satisfactory Pearson correlation (r = 0.960 - 0.985) after filtering and normalisation. The number of undetected genes (y) on an array was shown to relate to cRNA yield (x) by the equation: y = 68.37 - 4.91 log(x) + 0.01x.
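The two quantities reported above can be sketched in a few lines. The Pearson function is the standard definition; the second function simply evaluates the reported equation. The base of the logarithm and the units of x and y are not given in the abstract, so the natural logarithm used by default here is an assumption, and the replicate values are hypothetical.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def predicted_undetected(crna_yield, log=math.log):
    """Evaluate y = 68.37 - 4.91 log(x) + 0.01 x from the abstract.

    The log base is not stated in the source; natural log is an assumption.
    Pass log=math.log10 to evaluate the base-10 reading instead.
    """
    return 68.37 - 4.91 * log(crna_yield) + 0.01 * crna_yield

# Hypothetical normalized log intensities for one sample on two array lots.
replicate_1 = [8.2, 9.1, 10.4, 7.7, 11.3]
replicate_2 = [8.0, 9.3, 10.1, 7.9, 11.0]
r = pearson_r(replicate_1, replicate_2)
```

Under either reading of the logarithm, the log term dominates at low yields, so the predicted number of undetected genes falls as cRNA yield increases from small values.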

Conclusion: Batch effects exist and are probably intrinsic to microarray technology, but their impact can be minimized by careful experimental planning and by statistical analysis of the raw data prior to the biological analyses.

98th AACR Annual Meeting, Apr 14-18, 2007; Los Angeles, CA