Given the scarcity of human cell lines from under-represented populations available for study, it is crucial that these cell lines be accurately characterized with respect to genetic ancestry so that findings can be properly contextualized. Mischaracterization of a cell line's race/ethnicity can lead to wasted time and resources and, potentially, to publication retractions. Here we calculated genetic ancestry proportions for 22 commercially available cell lines to test the accuracy of the “race/ethnicity” categorization assigned to these cells and to provide previously unknown genetic ancestry estimates to the scientific community. To determine genetic ancestry proportions, DNA was extracted from the cell lines and genotyped for ancestry informative markers using the Agena MassARRAY platform with iPLEX chemistry. Genotypes for each cell line were analyzed with STRUCTURE software using a burn-in length of 30,000, 70,000 replications, and K = 3 ancestral populations, representing West African (WA), Native American (NA), and European (EA) input genotype data. Commercially available cell lines designated as “Caucasian” in ethnicity were accurately described, with a mean European ancestry proportion of 97% (range 92-99%). A recent study using a much smaller panel of markers reported 41% WA ancestry for the 22Rv1 cell line. Our analysis found the 22Rv1 cell line to have predominantly EA ancestry (91%), with a non-negligible proportion of WA ancestry (8%). The “Black” designation appears accurate for three of the commercially available cell lines (HeLa, MDA-MB-PCa2, MDA-MB-468), although two of the three, HeLa and MDA-MB-PCa2, with 60% and 66% WA ancestry, respectively, fall far below the mean of ~80% WA ancestry for US-born African Americans. Interestingly, the MDA-MB-468 “Black” cell line had an appreciable amount of NA ancestry (23% NA and 77% WA), which could suggest Afro-Caribbean ethnicity.
Most notably, the E006 cell line, designated as “Black,” has > 92% EA genetic ancestry, making it as European as the cell lines designated as Caucasian. Cell lines with unassigned ethnicity varied widely in genetic ancestry, with WA, NA, and EA ancestry proportions ranging over 1%-92%, 0%-30%, and 2%-96%, respectively. Our results suggest predominantly European ancestry for the Caucasian-designated cell lines and high variance in genetic ancestry proportions for the Black-designated cell lines. However, the E006 cell line is an example of extreme misclassification, which has led to erroneous findings in the disparities literature and leaves open the possibility that other, unanalyzed cell lines are also mischaracterized. Genetic ancestry estimates add more detailed ancestral characterization to these cell lines and allow for better contextualization of comparisons, applicability, and significant findings. We recommend that robust genetic ancestry estimates be required for all current and novel cell lines used in research.
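
As an illustration of how estimated proportions might be compared against an assigned designation, the following is a minimal sketch, not the procedure used in this study. The function name, the label-to-component mapping, and the 0.5 cutoff are all hypothetical assumptions introduced here; the input values are the proportions reported above.

```python
# Hypothetical mapping from an assigned designation to the ancestry
# component it would be expected to reflect (an assumption for illustration).
EXPECTED_COMPONENT = {"Black": "WA", "Caucasian": "EA"}


def label_is_concordant(label, proportions, threshold=0.5):
    """Return True if the expected ancestry component meets the cutoff.

    `proportions` maps component codes (WA, NA, EA) to fractions in [0, 1].
    `threshold` is an arbitrary illustrative cutoff, not a value from the study.
    """
    component = EXPECTED_COMPONENT[label]
    return proportions[component] >= threshold


# Values taken from the abstract, expressed as fractions.
hela = {"WA": 0.60}
mda_mb_468 = {"WA": 0.77, "NA": 0.23}
# E006 is reported with > 92% EA, so WA can be at most 8%; the bounds
# are used here purely for illustration.
e006 = {"EA": 0.92, "WA": 0.08}

for name, props in [("HeLa", hela), ("MDA-MB-468", mda_mb_468), ("E006", e006)]:
    print(name, label_is_concordant("Black", props))
```

A single cutoff is a deliberately crude choice: under this sketch HeLa counts as concordant even though its WA proportion falls well below the ~80% mean noted above, so any real threshold would need justification.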

Citation Format: Stanley E. Hooker Jr., Madhavi Bathina, Stacy Lloyd, Priyatham Gorjala, Ranjana Mitra, Kevin S. Kimbro, Rick A. Kittles. Estimating genetic ancestry of commonly used cancer cell lines [abstract]. In: Proceedings of the Eleventh AACR Conference on the Science of Cancer Health Disparities in Racial/Ethnic Minorities and the Medically Underserved; 2018 Nov 2-5; New Orleans, LA. Philadelphia (PA): AACR; Cancer Epidemiol Biomarkers Prev 2020;29(6 Suppl):Abstract nr B073.