Deep learning has enabled major advances in cancer diagnosis, prognosis, and treatment. In this issue of Cancer Research, Wang and colleagues develop a deep learning algorithm that digitally stains histologic images, achieving reliable nuclei segmentation and cell classification. They use this tool to study the tumor morphologic microenvironment in tissue pathology images of patients with lung adenocarcinoma. On the basis of the image features, they develop a prognostic model and find correlations with the transcriptional activities of biological pathways.
See related article by Wang et al., p. 2056
Stained images allow pathologists to visualize cells and tissues in histology, and examining them is usually the first step in the diagnosis and molecular characterization of many pathologies. Wang and colleagues developed a deep learning–based analytic tool to visualize and quantify the spatial organization of the main cell types in hematoxylin and eosin (H&E)-stained pathology images from patients with lung cancer (1). Lung cancer is the leading cause of cancer-related death and a major economic burden worldwide. According to the latest published global cancer data (GLOBOCAN 2018), there were an estimated 2.094 million new lung cancer cases worldwide, representing 11.6% of all new cancers. It was also the most common cause of death from cancer, with 1.8 million deaths (18.4% of total cancer deaths; ref. 2). About 80%–85% of lung cancers are non–small cell lung cancer (NSCLC). The overall 5-year survival rate for advanced NSCLC is 2%–4%, which places it, along with liver, pancreatic, and esophageal cancers, among the cancers with the worst prognosis. More than half of the patients with NSCLC already have metastases at diagnosis, which directly contributes to poorer survival outcomes. The histology-based digital (HD)-staining methodology developed by Wang and colleagues is applied to lung adenocarcinoma, the most common NSCLC histologic subtype.
Genetic changes with prognostic and/or predictive significance in NSCLC include EGFR gene mutations and ALK gene rearrangements, among others. Patients with these molecular alterations have better overall survival (OS) and progression-free survival because targeted therapies are available for them (3). Immune checkpoint inhibitors are now an alternative therapeutic approach for patients with cancer. These inhibitors include antibodies that bind to the programmed cell death-1 (PD-1) receptor and block its interaction with its ligands PD-L1 and PD-L2. The PD-1/PD-L1 interaction regulates the immune response. The PD-1 receptor (also known as CD279) is expressed on the surface of activated T cells. Its ligands, PD-L1 (B7-H1; CD274) and PD-L2 (B7-DC; CD273), are commonly expressed on the surface of dendritic cells or macrophages and also on the surface of tumor cells (4). In patients treated with immune checkpoint inhibitors, PD-L1 overexpression predicts better survival with a high positive predictive value but a low negative predictive value (5). These data suggest that most patients still do not benefit from these agents. It is therefore critical to keep working to define the patient population most likely to benefit from PD-1/PD-L1 inhibition and to identify markers with predictive value for combined immunotherapies.
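As a reminder of how the two predictive values mentioned above are defined, the following minimal Python sketch computes them from a 2x2 table of biomarker status against treatment benefit. The function name and the patient counts are hypothetical and chosen only for illustration; they are not taken from reference 5 or from any study.

```python
def predictive_values(tp, fp, fn, tn):
    """Return (PPV, NPV) from the four cells of a 2x2 table.

    tp: biomarker-positive patients who benefit from treatment
    fp: biomarker-positive patients who do not benefit
    fn: biomarker-negative patients who benefit
    tn: biomarker-negative patients who do not benefit
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one biomarker-positive and one biomarker-negative patient")
    ppv = tp / (tp + fp)  # fraction of biomarker-positive patients who benefit
    npv = tn / (tn + fn)  # fraction of biomarker-negative patients who do not benefit
    return ppv, npv


# Hypothetical counts, for illustration only (not from any study).
ppv, npv = predictive_values(tp=40, fp=10, fn=30, tn=20)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")  # PPV = 0.80, NPV = 0.40
```

In this made-up example, a positive test is informative (80% of positive patients benefit), whereas a negative test is not (only 40% of negative patients fail to benefit), which is the pattern of high positive and low negative predictive value described in the text.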
Immunotherapies guided by predictive biomarkers, such as PD-L1 expression or tumor mutational burden, have brought a paradigm shift in NSCLC treatment. The identification, development, and incorporation of validated biomarkers into clinical practice are the building blocks of personalized care in oncology. The genetic and molecular basis of the disease guides decisions about prevention, diagnosis, and treatment. For NSCLC, there was one biomarker test for patient stratification and a few therapies in 2006, whereas in 2020 there are 10 predictive biomarker tests and numerous treatment options (6). This greater range of therapeutic options helps to improve the survival of patients with this disease.
Immunotherapeutics and immuno-oncology are rapidly developing areas in which the tumor is not analyzed as an isolated entity, because its environment defines and characterizes it. Characterizing a tumor's immune/genetic profile is essential when classifying it for personalized and precision medicine. Implementing the HD-staining algorithm would support molecular profiling from a deep learning perspective by evaluating the tumor microenvironment (TME) in standard H&E pathology images from patients with lung adenocarcinoma, because the image-derived TME features correlate with the gene expression of biological pathways and are associated with patient OS (1). Recent studies have shown that RNA analysis can inform treatment choice in precision medicine; the information provided by HD-staining is therefore relevant because it allows the tumor to be characterized from both transcriptomic and survival perspectives (7). Thus, TME characterization would enable noninvasive prognostic studies from routine H&E pathology images and more effective selection of potential treatments.
Among the most interesting advances in health management are the applications of artificial intelligence (AI): technologies such as machine learning and, more recently, deep neural networks are already a reality in routine clinical practice. In fact, the use of AI in the clinical and genomic fields has been validated by the FDA in a few cases, especially in imaging-based diagnostics. In other fields, however, it still has a long way to go (8). The HD-staining tool was developed using the Mask Regional Convolutional Neural Network (Mask R-CNN), a deep learning architecture. Deep learning algorithms have been used extensively in histopathologic image analysis, with convolutional neural networks standing out in classification and object recognition (9).
In this machine learning approach, the algorithm must be trained on a training dataset and subsequently validated on a separate dataset. The quality of the results depends on the training and test data, as well as on choosing the most appropriate algorithm for the problem. However, to assess whether deep learning can be used in routine clinical practice, we must be aware of the extensive methodologic deficiencies that exist in the field. These include problems with reproducibility, a lack of prospective studies with real clinical data, a lack of independent validation tests that exclude data from the training sets, and incorrect application of diagnostic performance metrics, among others. Unfortunately, few studies comparing human and AI performance strictly meet all these requirements, and it is premature to affirm that deep learning algorithms have a sensitivity and specificity equivalent to those of health care professionals (10).
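Two of the requirements discussed above, a validation set that shares no data with the training set and correctly computed sensitivity and specificity, can be illustrated with a short Python sketch. It uses only the standard library; the function names, the patient identifiers, and the labels are hypothetical and serve only as an example, not as the procedure used in any of the cited studies.

```python
import random


def split_by_patient(patient_ids, test_fraction=0.3, seed=0):
    """Split at the patient level so no patient appears in both sets.

    With very few patients, the training set may end up empty.
    """
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_test = max(1, round(len(unique_ids) * test_fraction))
    return set(unique_ids[n_test:]), set(unique_ids[:n_test])


def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: several images may come from the same patient.
image_patients = ["p1", "p1", "p2", "p3", "p3", "p4", "p5", "p6"]
train_ids, test_ids = split_by_patient(image_patients)
print(train_ids.isdisjoint(test_ids))  # True

# Hypothetical reference labels and model predictions on the test images.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(sensitivity_specificity(y_true, y_pred))  # (0.75, 0.75)
```

Splitting by patient rather than by image is one simple way to keep images from the same individual out of both sets, which would otherwise inflate the apparent performance.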
Therapeutic innovation based on improved understanding of biology and translational research has contributed to the changing paradigm of cancer treatment over the past two decades. Identifying genetic diagnostic profiles in cancer and finding personalized therapies are priorities in modern medicine. The information extracted from medical images has enabled tumor characterization in patients with NSCLC and other cancer types, but the factors determining whether a patient will respond remain unclear. AI is now an integral part of cancer research, and recent advances in this field use deep learning methods to stratify patients by prognosis or response to treatment, which in many instances has led to improved clinical outcomes.
The deep learning model developed by Wang and colleagues is a version of Mask R-CNN adapted for pathologic image analysis. The architecture segments and classifies nuclei simultaneously and adapts to different staining conditions: it detects a bounding box for each object and assigns the pixels within that box to foreground or background (1). The authors have created a publicly accessible website that allows users to replicate the work and apply the model to new datasets, so the model can be applied and evaluated in other pathologies. Deep learning analysis of histopathologic images allows cancer prognosis prediction and TME evaluation without a pathology expert, which significantly improves the accessibility of this information and complements current diagnosis in patients with cancer. The authors suggest using HD-staining (a Mask R-CNN algorithm) together with routine clinical methods to increase knowledge of the TME, which will assist in the development of personalized precision medicine.
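The idea of a bounding box combined with a foreground/background assignment of the pixels inside it can be sketched in a few lines of Python. This is a simplified illustration and not the authors' implementation: the function name, the 0.5 threshold, and the toy probabilities are assumptions, and the mask is assumed to be already at the resolution of the box, whereas Mask R-CNN implementations typically predict a fixed-size mask that is resized to the box.

```python
def paste_instance_mask(image_height, image_width, box, box_probs, threshold=0.5):
    """Place a per-box foreground mask into a full-size binary image mask.

    box: (x0, y0, x1, y1) in pixel coordinates, with x1 and y1 exclusive.
    box_probs: rows of foreground probabilities, one value per pixel in the box.
    """
    x0, y0, x1, y1 = box
    if len(box_probs) != y1 - y0 or any(len(row) != x1 - x0 for row in box_probs):
        raise ValueError("box_probs must match the box height and width")
    full_mask = [[0] * image_width for _ in range(image_height)]
    for row_offset, prob_row in enumerate(box_probs):
        for col_offset, prob in enumerate(prob_row):
            if prob >= threshold:  # pixel is labeled foreground (part of the nucleus)
                full_mask[y0 + row_offset][x0 + col_offset] = 1
    return full_mask


# Toy example: one detected nucleus in a 4 x 5 pixel image (hypothetical values).
mask = paste_instance_mask(
    image_height=4,
    image_width=5,
    box=(1, 1, 4, 3),
    box_probs=[[0.9, 0.2, 0.7],
               [0.1, 0.8, 0.6]],
)
for row in mask:
    print(row)
# [0, 0, 0, 0, 0]
# [0, 1, 0, 1, 0]
# [0, 0, 1, 1, 0]
# [0, 0, 0, 0, 0]
```

Each detected nucleus would yield one such mask together with a class label, and it is this pairing that lets segmentation and classification be produced in a single pass.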
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
The author thanks Rocío Rosas-Alonso for her helpful comments.