Abstract
Analysis of extracellular vesicles (EV) is a promising noninvasive liquid biopsy approach for breast cancer detection, prognosis, and therapeutic monitoring. A comprehensive understanding of the characteristics and proteomic composition of breast cancer–specific EVs from human samples is required to realize the potential of this strategy. In this study, we applied a mass spectrometry–based, data-independent acquisition proteomic approach to characterize human serum EVs derived from patients with breast cancer (n = 126) and healthy donors (n = 70) in a discovery cohort and validated the findings in five independent cohorts. Examination of the EV proteomes enabled the construction of specific EV protein classifiers for diagnosing breast cancer and distinguishing patients with metastatic disease. Of note, TALDO1 was found to be an EV biomarker of distant metastasis of breast cancer. In vitro and in vivo analysis confirmed the role of TALDO1 in stimulating breast cancer invasion and metastasis. Finally, high-throughput molecular docking and virtual screening of a library consisting of 271,380 small molecules identified a potent TALDO1 allosteric inhibitor, AO-022, which could inhibit breast cancer migration in vitro and tumor progression in vivo. Together, this work elucidates the proteomic alterations in the serum EVs of breast cancer patients to guide the development of improved diagnosis, monitoring, and treatment strategies.
Significance: Characterization of the proteomic composition of circulating extracellular vesicles in breast cancer patients identifies signatures for diagnosing primary and metastatic tumors and reveals tumor-promoting cargo that can be targeted to improve outcomes.
Introduction
Breast cancer is one of the most common cancers worldwide and accounts for 30% of female cancers (1, 2). Distant metastasis is the major cause of breast cancer–related mortality (3). Early detection and dynamic assessment of the metastatic status of breast cancer patients are of great value for the treatment and longitudinal analysis of cancer evolution in response to therapy. To achieve this, liquid biopsies utilizing molecular classifiers detected in blood from patients, such as extracellular vesicles (EV), offer minimal invasiveness, fewer complications, and an increased ability for longitudinal monitoring compared with traditional tumor tissue biopsies (4).
EVs are 30 to 200 nm in size and carry a restricted set of nucleic acids, lipids, and proteins that contribute to intercellular communication in normal physiology and pathology (5–7). The functional importance of EVs has been intensively studied in multiple human cancers, including breast cancer (8). Increasing evidence suggests that EVs are actively released from cancer cells and facilitate cancer growth and metastasis (9–11). Of note, the feature of membrane-encapsulation of EVs promotes their structural integrity, and cargos located within EVs are more stable than other serological proteins based on protection against degradation by circulating proteases and other enzymes (12). Considering their facilitated retrieval and their relatively ubiquitous presence and abundance in serum, EVs can provide ample materials for downstream analysis in breast cancer detection, prognosis, and therapeutic monitoring as a promising, noninvasive liquid biopsy approach (13).
The EV proteome offers unique advantages as an informative readout for the detection and stratification of breast cancer. Despite the availability of several public EV protein databases [e.g., Vesiclepedia (www.microvesicles.org; ref. 14), EVpedia (www.evpedia.info; ref. 15), and ExoCarta (www.exocarta.org; ref. 5)], a comprehensive understanding of the characteristics and proteomic composition of breast cancer–specific EVs from human samples is still lacking.
In this study, we collected a discovery cohort and five validation cohorts involving a total of 560 samples. The discovery cohort was designed as a case–control study and involved 196 samples, including 70 healthy donors (HD) and 126 breast cancers. Furthermore, we conducted a five-step validation based on five independent cohorts. We applied a mass spectrometry-based, data-independent acquisition (DIA) quantitative approach to characterize the proteomic profiles of human serum EVs derived from patients with breast cancer and HDs in the discovery cohort. By examining the EV proteomes, we constructed specific EV protein classifiers that could serve as a liquid biopsy tool for the diagnosis of breast cancer, as well as the assessment of lymph node (LN) metastasis. The integrated tissue-serum EV proteomic approach identified TALDO1 as a novel EV biomarker associated with poor prognosis for distant metastases of breast cancer. Finally, high-throughput molecular docking and virtual screening of a library consisting of 271,380 small molecules identified a potent TALDO1 allosteric inhibitor, AO-022, which could inhibit breast cancer migration in vitro and tumor progression in vivo. This work may provide reference value for the accurate diagnosis and monitoring of breast cancer progression using serum EVs, and the identification of novel molecules packaged in EVs offers an opportunity for the targeted therapy of breast cancer in the future.
Materials and Methods
Chemicals and reagents
The chemicals and reagents in this work can be divided into three groups: reagents for sample preparation, antibodies for protein detection, and critical commercial assays.
For sample preparation, the acetonitrile (cat# 9829-03), methanol (cat# 9830-03), and HPLC-grade water (cat# 4218-03) were purchased from J.T. Baker. The formic acid (cat# F0507) was purchased from Sigma-Aldrich. The trypsin (cat# V528A) was purchased from Promega.
For protein detection, the anti-CD63 (cat# 25682-1-AP, RRID: AB_2783831), anti-CD9 (cat# 60232-1-Ig, RRID: AB_11232215), anti-TSG101 (cat# 28283-1-AP, RRID: AB_2881104), anti-ALIX (cat# 12422-1-AP, RRID: AB_2162467), anti-N-cadherin (cat# 66219-1-Ig, RRID: AB_2881610), and anti-TALDO1 (cat# 67816-1-Ig, RRID: AB_2918579) were purchased from Proteintech. The anti-calnexin (cat# sc-23954, RRID: AB_626783) was purchased from Santa Cruz Biotechnology. The anti-E-cadherin (cat# ab201499, RRID: AB_2910587) and anti-Snail (cat# ab216347, RRID: AB_2910593) were purchased from Abcam. The anti-β-actin (cat# ET1702-67, RRID: AB_2890197) was purchased from HUABIO.
For the critical commercial assays, the BCA Protein Assay Kit (cat# 23227) and Bradford Protein Assay Kit (cat# 23236) were purchased from Thermo Fisher Scientific. The immunohistochemistry kit (cat# SP-9000) was purchased from Zhongshan Jinqiao.
Human breast cancer cell lines
Human breast cancer cell lines MCF7 (RRID: CVCL_0031) and MDA-MB-231 (RRID: CVCL_0062), and murine breast cancer cell lines 4T1 (RRID: CVCL_0125) and PY8119 (RRID: CVCL_AQ09) were obtained from the ATCC. All cell lines were used within 15 passages. All cell lines were authenticated by short tandem repeat profiling: PCR amplification was performed with the Short Tandem Repeat Multi-Amplification Kit (Microreader 21 ID System), and PCR products were detected on an ABI 3130xl DNA Analyzer (Applied Biosystems). All cell lines were confirmed to be free of Mycoplasma contamination using the Mycoplasma Stain Assay Kit (Beyotime Institute of Biotechnology). Cells were grown in RPMI 1640 Medium (Gibco, cat# 22400089) or DMEM (Gibco, cat# 11965092) supplemented with 10% FBS (Gibco, cat# 10099141) and 100-U/mL penicillin–streptomycin solution (Gibco, cat# 15140122). Cells were maintained in a 5% CO2-humidified atmosphere at 37°C until ready for use.
Clinical subjects and sample harvesting
This study was conducted in accordance with the principles of the Declaration of Helsinki, and the study protocol was approved by the ethical committee of Fudan University Shanghai Cancer Center (FUSCC; Shanghai, China, permit number 050432-4-2018). Serum samples were collected from FUSCC, and signed informed consents were obtained from participants or their authorized representatives prior to all study procedures.
Blood samples from breast cancer patients were taken on an empty stomach before surgery and anesthesia. Blood samples from HDs met the following criteria: α-fetoprotein <20 ng/mL, carcinoembryonic antigen < 4 ng/mL, carbohydrate antigen (CA)199 < 40 U/mL, neuron-specific enolase < 12.5 U/mL, CA125 ≤ 35 U/mL, and CA153 ≤ 25 U/mL in serum.
Blood samples from HDs were also taken on an empty stomach. To harvest blood samples, the skin of the subject was first disinfected, and 6 mL of peripheral blood was drawn into a yellow-top coagulant tube containing separator gel, which allows complete separation of the serum. Collected blood samples were handled with care; tubes were capped and kept upright with the opening facing upward to prevent cross-contamination between samples. After the coagulant tube had stood for 30 minutes, the serum was separated by centrifugation at 3,000 rpm for 5 minutes. The upper serum layer was transferred to an Eppendorf (EP) tube, which was immediately stored at −80°C. Each tube of serum was carefully labeled, with the number marked on both the tube wall and the tube cover. Each number corresponded one-to-one with the patient’s name and hospital number to ensure that the specimen matched the patient. Repeated freeze–thaw cycles were avoided. To avoid sampling heterogeneity, both undiluted and diluted serum samples were homogenized with a vortex mixer before dilution and loading onto the plate. Serum samples were collected between March 2011 and August 2019. Detailed information is shown in Supplementary Table S1.
Validation cohort
We conducted a five-step validation based on five independent cohorts: our Fujian validation cohort, our Shanghai validation cohort #1, our Shanghai validation cohort #2, the Clinical Proteomic Tumor Analysis Consortium (CPTAC) breast cancer cohort (16), and the breast cancer cohort of Tang and colleagues (17). For our Fujian validation cohort, a prospective follow-up cohort study was conducted to verify the specificity of the breast cancer diagnostic classifier and of TALDO1 as a potential serum EV biomarker for breast cancer metastasis. This cohort involved 42 serum EV samples, including 17 liver cancers, nine lung cancers, seven gastric cancers, and nine esophageal cancers. For Shanghai validation cohort #1, a prospective follow-up cohort study was conducted to verify the effectiveness of the breast cancer diagnostic classifier. This cohort involved 41 serum EV samples, including 17 benign breast tumors, 12 other cancer types, and 12 breast cancers. For Shanghai validation cohort #2, a retrospective cohort study was conducted to verify TALDO1 as a potential serum EV biomarker for breast cancer metastasis. This cohort involved 88 tissue samples, including 36 normal tissues, 36 primary tissues, and 16 distant metastatic tissues [bone metastasis (n = 9), liver metastasis (n = 3), and lung metastasis (n = 4)]. For the CPTAC breast cancer cohort, a retrospective cohort study was conducted to verify the effectiveness of the classifier for diagnosing LN metastasis. This cohort involved 75 tissue samples, including 36 breast cancers with LN metastasis and 39 breast cancers without LN metastasis. For the Tang and colleagues’ breast cancer cohort, a retrospective cohort study was conducted to verify TALDO1 as a potential biomarker associated with poor prognosis in breast cancer. This cohort involved 118 tissue samples.
EVs purification from serum samples
In this study, EVs were isolated by ultracentrifugation (UC) combined with subsequent size exclusion chromatography (SEC) for all samples (18–21). In the first step, we extracted EVs from serum samples by UC. In the second step, we further purified EVs by SEC and finally obtained EV samples for proteomic analysis. In detail, UC is generally regarded as the “gold standard” of EV separation methods (22). A cleaning step was first conducted to eliminate large bioparticles by low-speed centrifugation (300 × g), followed by multiple cycles of centrifugation with centrifugal force from 2,000 × g up to 100,000 × g, to sequentially remove contaminants such as cell debris, apoptotic bodies, and protein aggregates for EV isolation. However, under a given centrifugal force, all components (including EVs, proteins, and lipoproteins) that reach a certain threshold of density, size, and mass can be precipitated at the bottom of the tube (22). SEC is widely applied to the high-resolution separation of large molecules or aggregates of macromolecules such as proteins, polymers, and various liposome particles (23, 24). To further improve the purity of EVs isolated by UC, crude EV samples were further purified using an SEC kit (Izon’s qEV column; ref. 25). After removal of the buffer volume (fractions 1–6, 3.0 mL), the EV-rich fractions 7 to 9 (purified collection volume, 1.5 mL) were pooled. Then, purified EV samples (50 μL) were obtained by concentration using an Amicon Ultra-4 10K centrifugal filter device (Merck Millipore).
The morphology and purity of the EVs were verified using transmission electron microscopy (TEM) according to previously published methods (26, 27). As previously described, purified EVs were resuspended in 2% paraformaldehyde and adsorbed onto carbon-coated formvar EM grids for 20 minutes. Grids were then washed in physiological saline and transferred to 50-mmol/L glycine/PBS for 3 minutes, repeated thrice. Finally, grids were embedded in 30 μL of uranyl-oxalate solution for 90 seconds and air-dried. Images were captured using an FEI Talos F200C TEM (Thermo Fisher Scientific). The size distribution and concentration of EVs were measured using a nanoparticle tracking analysis (NTA) device, the NanoSight NS300. EVs were suspended in PBS and diluted 100-fold prior to analysis. A 60-second video was recorded and subsequently analyzed using the NTA software.
EV samples were also verified by immunoblotting analyses using the conventional markers CD9, CD63, TSG101, ALIX, and calnexin. The presence of EV-enriched proteins was determined by western blotting in 10 mg of lysates using the following antibodies: CD63 (Proteintech, cat# 25682-1-AP, RRID: AB_2783831, dilution 1:600), CD9 (Proteintech, cat# 60232-1-Ig, RRID: AB_11232215, dilution 1:5,000), ALIX (Proteintech, cat# 12422-1-AP, RRID: AB_2162467, dilution 1:1,000), TSG101 (Proteintech, cat# 28283-1-AP, RRID: AB_2881104, dilution 1:5,000), and calnexin (Santa Cruz Biotechnology, cat# sc-23954, RRID: AB_626783, dilution 1:1,000).
EVs protein extraction and tryptic digestion
EV samples (typically 5 μg, adjusted based on BCA measurements) were dried by vacuum centrifugation and redissolved in 30 to 50 μL of 8-mol/L urea/50-mmol/L ammonium bicarbonate/10-mmol/L dithiothreitol. Following lysis and reduction, proteins were alkylated using 20-mmol/L or 30-mmol/L iodoacetamide (Sigma). Proteins were digested with trypsin (Promega) at an enzyme-to-protein mass ratio of 1:50 overnight at 37°C, and peptides were then extracted and dried (SpeedVac, EP). Peptides were desalted and concentrated using Empore C18-based solid phase extraction prior to analysis by high-resolution/high-mass accuracy reversed-phase (C18) nano-LC-MS/MS.
Liquid chromatography
We employed an EASY-nLC 1200 ultrahigh-pressure liquid chromatography system (Thermo Fisher Scientific). Peptides were separated within 75 minutes at a flow rate of 600 nL/minute on a 150 μm I.D. × 15 cm column with a laser-pulled electrospray emitter packed with 1.9-μm ReproSil-Pur 120 C18-AQ particles (Dr. Maisch). Mobile phases A and B were water and acetonitrile with 0.1 vol% FA, respectively. The %B was linearly increased from 15% to 30% within 75 minutes.
Mass spectrometry
Samples were analyzed on a Q-Exactive-HF mass spectrometer (Thermo Fisher Scientific) via a nanoelectrospray ion source (Thermo Fisher Scientific). The mass spectrometer was operated in the data-independent mode for ion mobility-enhanced spectral library generation. Typically, 75% of each sample was injected: the peptides were dissolved in 12 μL of loading buffer (0.1% formic acid), and 9 μL was loaded onto a 100 μm I.D. × 2.5 cm C18 trap column at a maximum pressure of 280 bar with 14 μL of solvent A (0.1% formic acid). The DIA method consisted of an MS1 scan from 300 to 1,400 m/z at 60k resolution (AGC target 4e5 or 50 milliseconds). Then, 30 DIA segments were acquired at 15k resolution with an AGC target of 5e4 or 22 milliseconds for maximal injection time. The setting “inject ions for all available parallelizable time” was enabled. HCD fragmentation was set to a normalized collision energy of 30%. The spectra were recorded in profile mode. The default charge state for the MS2 scan was set to 3.
Peptide identification and protein quantification
All data were processed using Firmiana (28). The DIA data were searched against the UniProt human protein database (RRID: SCR_002380) using FragPipe (RRID: SCR_022864; v.12.1) with MSFragger (2.2; ref. 29). The mass tolerances were 20 ppm for precursor and 50 mmu for product ions. Up to two missed cleavages were allowed. The search engine set cysteine carbamidomethylation as a fixed modification and N-acetylation and oxidation of methionine as variable modifications. Precursor ion charge states were limited to +2, +3, and +4. The data were also searched against a decoy database so that protein identifications were accepted at a protein- and peptide-level FDR of less than 1%. The DIA data were analyzed using DIA-NN (v.1.7.0; ref. 30). The quantification of identified peptides was calculated as the average chromatographic fragment ion peak area across all reference spectral libraries.
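The decoy-based FDR control mentioned above can be illustrated with a minimal sketch. It assumes the common target-decoy estimate, FDR ≈ (number of decoy matches) / (number of target matches) above a score cutoff. The function name `fdr_threshold` and the input layout are hypothetical and are not taken from Firmiana, FragPipe, or MSFragger, whose internal procedures may differ.

```python
def fdr_threshold(matches, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    matches: list of (score, is_decoy) tuples; a higher score is a better match.
    The FDR at a cutoff is estimated as decoys / targets among all matches
    scoring at or above that cutoff (assumed target-decoy estimate).
    Returns None if no cutoff satisfies the limit.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = 0
    decoys = 0
    best_cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep extending the accepted list while the estimate stays in bounds
        if targets > 0 and decoys / targets <= max_fdr:
            best_cutoff = score
    return best_cutoff
```

Matches scoring at or above the returned cutoff would be accepted; ties between target and decoy scores are not handled specially in this sketch.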
Measurement of metabolite levels
As previously described (31), to ensure that the same amount of sample was used for targeted metabolomics analysis, 1 × 10⁷ cells/sample were taken. All standards were separately prepared and mixed to form a standard solution containing 20-μg/mL nucleotides. This mixed standard solution was serially diluted and finally mixed isometrically to obtain a standard curve. The cell sample with a 5-μL standard solution in 50% aqueous methanol was subjected to three cycles of ultrasonication for 1 minute and intervals for 1 minute in an ice-water bath. After incubation at −20°C for 30 minutes and centrifugation at 15,000 × g for 15 minutes at 4°C, the supernatant was dried and reconstituted in 100 μL of 50% aqueous acetonitrile, followed by UPLC–MS/MS. The samples were injected into a Waters UPLC BEH C18 column (100 mm × 2.1 mm, 1.8 μm) at a flow rate of 0.25 mL/minute. The mobile phase consisted of (A) water and (B) 90% aqueous acetonitrile, both with 15-mmol/L ammonium acetate (pH = 9). Chromatographic separation was performed using a gradient elution program: 0 to 1 minute, 90% B; 1 to 4 minutes, 90% B–85% B; 4 to 8 minutes, 85% B–80% B; 8 to 15 minutes, 80% B–65% B; 15 to 15.2 minutes, 65% B–40% B; 15.2 to 16.9 minutes, 40% B; and then back to the initial gradient at 17.1 minutes and equilibrated for 20 minutes. Mass spectrometry was performed in multiple reaction monitoring mode. UPLC–tandem MS (UPLC–MS/MS) analysis was performed on an AB SCIEX ExionLC UPLC system interfaced with an AB SCIEX QTRAP 6500. Raw data were processed using default parameters, with manual inspection to ensure the qualitative and quantitative accuracy of each compound. A standard curve was constructed for each nucleotide standard and used to determine the nucleotide concentration of each unknown sample.
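Quantification against a standard curve can be sketched as follows. This is a minimal illustration that assumes a linear relationship between peak area and concentration, fitted by ordinary least squares; the text does not state the fitting model or any weighting, and the function names are hypothetical.

```python
def fit_line(concentrations, peak_areas):
    """Ordinary least-squares fit of peak_area = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(peak_areas) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_area(peak_area, slope, intercept):
    """Invert the fitted line to estimate the concentration of an unknown sample."""
    return (peak_area - intercept) / slope
```

In practice, the curve would be fitted separately for each nucleotide standard, and unknowns falling outside the calibrated range would need to be flagged rather than extrapolated.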
Data analysis and statistics
Label-free protein quantifications were calculated using a label-free, intensity-based absolute quantification (iBAQ) approach (32). We calculated the peak area values as parts of the corresponding proteins. The fraction of total (FOT) was used to represent the normalized abundance of a particular protein across samples. FOT was defined as a protein’s iBAQ divided by the total iBAQ of all identified proteins within a sample. The FOT values were multiplied by 10⁵ for ease of presentation.
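The FOT definition above amounts to a per-sample normalization, which can be sketched as follows. The function name and the dictionary layout (protein identifier mapped to iBAQ value for one sample) are assumptions made for illustration.

```python
def fraction_of_total(ibaq_by_protein, scale=1e5):
    """Convert one sample's iBAQ values to FOT, scaled by 10^5 for presentation.

    ibaq_by_protein: dict mapping a protein identifier to its iBAQ in the sample.
    """
    total = sum(ibaq_by_protein.values())
    if total <= 0:
        raise ValueError("total iBAQ must be positive")
    return {protein: value / total * scale
            for protein, value in ibaq_by_protein.items()}
```

Because each sample is divided by its own total, the scaled FOT values of a sample always sum to 10⁵, which makes abundances comparable across samples with different total signal.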
For differential expression analysis, the data were divided into two parts according to the protein detection rate in each group. For proteins detected in more than 50% of samples, missing values were imputed using the R package “impute” (https://git.bioconductor.org/packages/impute) based on the K-NN algorithm. When the protein detection rate was <50%, the missing value was replaced with one-tenth of the minimum value. Then, differential expression analysis between groups was conducted using a simple linear model and moderated t statistics, implemented in R, to identify upregulated and downregulated proteins. The threshold of significance was set to P < 0.05 and log2 ratio of average abundance between groups ≥1.
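The low-detection imputation rule and the significance thresholds can be sketched as follows. This is an illustration only: it does not reproduce the K-NN imputation or the moderated t statistics, and the P value is taken as an input. Two points are assumptions rather than statements from the text: that "the minimum value" refers to the minimum observed abundance in the dataset, and that the log2 ratio cutoff is applied to the absolute value so that both upregulated and downregulated proteins are captured. All function names are hypothetical.

```python
import math


def detection_rate(values):
    """Fraction of samples in which the protein was quantified (None = missing)."""
    return sum(v is not None for v in values) / len(values)


def fill_low_detection(values, dataset_min):
    """Replace missing values with one-tenth of the minimum observed value.

    Assumption: dataset_min is the smallest observed abundance in the dataset.
    """
    floor = dataset_min / 10
    return [floor if v is None else v for v in values]


def log2_ratio(case_values, control_values):
    """log2 of the ratio of average abundances (case over control)."""
    case_mean = sum(case_values) / len(case_values)
    control_mean = sum(control_values) / len(control_values)
    return math.log2(case_mean / control_mean)


def is_significant(p_value, log2_fc, alpha=0.05, min_log2_fc=1.0):
    """Apply P < 0.05 and |log2 ratio| >= 1 (absolute value is an assumption)."""
    return p_value < alpha and abs(log2_fc) >= min_log2_fc
```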
Each gene expression value in the global proteomic expression matrix was transformed to a z-score across all the samples. The z-score-transformed matrix was clustered using the R package “pheatmap” (https://cran.r-project.org/web/packages/pheatmap/index.html).
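The z-score transformation can be sketched for a single protein (one row of the expression matrix) as follows. The sketch assumes the sample standard deviation (n − 1 denominator), which is the default behavior of R's `scale`; the text does not specify which estimator was used, and the function name is hypothetical.

```python
from statistics import mean, stdev


def zscore_row(values):
    """Z-score one protein's abundances across all samples.

    Uses the sample standard deviation (assumption). A constant row has no
    spread, so it is mapped to zeros instead of dividing by zero.
    """
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]
```

Applying this row by row puts every protein on a common scale before clustering, so that highly abundant proteins do not dominate the heatmap.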
Pathway enrichment analysis was performed by DAVID (https://david.ncifcrf.gov), ConsensusPathDB (http://cpdb.molgen.mpg.de), and gene set enrichment analysis (http://software.broadinstitute.org/gsea/index.jsp).
Immunohistochemistry
First, the sections were baked at 65°C for 1 hour and incubated in xylene three times for 10 minutes each. Then, the sections were rehydrated through a graded ethanol series (100% ethanol, 95% ethanol, 75% ethanol, 50% ethanol, and ddH2O) for 5 minutes per step. Antigen retrieval was conducted in 10-mmol/L sodium citrate buffer (pH 6.0) using a microwave oven: 3 minutes at 100% power and 15 minutes at 20% power. After the sections had cooled naturally to room temperature and been washed in ddH2O, we blocked them with 5% normal goat serum for 10 minutes, incubated them in 3% H2O2 for 10 minutes at room temperature, and washed them twice in PBS for 5 minutes. The primary antibody against TALDO1 (Proteintech, cat# 67816-1-Ig, RRID: AB_2918579) was diluted 1:500 and incubated with the sections overnight at 4°C.
The next day, after washing the sections twice in PBS, we used an IHC kit (Zhongshan Jinqiao, cat# SP-9000) and incubated the sections with a biotin-labeled secondary antibody for 15 minutes. After two further washes in PBS, the sections were incubated with horseradish peroxidase–labeled streptavidin working solution for 15 minutes. The tissues were then stained with DAB solution, counterstained with hematoxylin to label nuclei, and washed in ddH2O. Finally, the sections were dehydrated through graded ethanol (50% ethanol, 75% ethanol, 95% ethanol, and 100% ethanol). We dried the slides in a fume cupboard for at least 20 minutes and mounted coverslips.
The IHC scoring was analyzed using an IHC profiler–compatible plugin with integrated options for the quantitative analysis of digital IHC images stained for cytoplasmic or nuclear proteins (33).
In vitro cell migration assays
A transwell migration assay was performed using the Costar transwell system (Corning, cat# CLS3364). Briefly, MCF7 and MDA-MB-231 cells (1 × 10⁵ cells) were suspended in 200-μL serum-free medium and seeded in the upper insert chamber, and 600-μL medium was added to the lower chamber. At 24 hours after the cells were seeded, media in both the upper insert and lower chambers were removed. Cells that had migrated into the lower chamber through the 8-μm pore membrane were stained using crystal violet, and then, the migrated cells were visualized using a microscope.
Western blotting
Cells and tumor tissues were harvested and lysed. Proteins were separated by SDS-PAGE and transferred onto polyvinylidene difluoride membranes. After blocking with 5% (w/v) fat-free milk, the membrane was stained with primary antibodies against TALDO1 (Proteintech, cat# 67816-1-Ig, RRID: AB_2918579, 1:1,000), CD63 (Proteintech, cat# 25682-1-AP, RRID: AB_2783831, 1:600), CD9 (Proteintech, cat# 60232-1-Ig, RRID: AB_11232215, 1:5,000), TSG101 (Proteintech, cat# 28283-1-AP, RRID: AB_2881104, 1:5,000), ALIX (Proteintech, cat# 12422-1-AP, RRID: AB_2162467, 1:1,000), calnexin (Santa Cruz Biotechnology, cat# sc-23954, RRID: AB_626783, 1:1,000), E-cadherin (Abcam, cat# ab201499, RRID: AB_2910587, 1:1,000), N-cadherin (Proteintech, cat# 66219-1-Ig, RRID: AB_2881610, 1:1,000), Snail (Abcam, cat# ab216347, RRID: AB_2910593, 1:1,000), and β-actin (HUABIO, cat# ET1702-67, RRID: AB_2890197, 1:5,000), followed by incubation with the appropriate secondary antibodies (Abcam, cat# ab205718, RRID: AB_2819160, 1:20,000), and development with enhanced chemiluminescence assay (Tanon). The signals were captured by Tanon 5200 control software. ImageJ software (RRID: SCR_003070) was used to quantify the western blotting results by densitometry.
TALDO1 shRNA interference assay
The TALDO1 shRNA (5′-GCAAGGACCGAATTCTTATA-3′) was synthesized by GenePharma. Synthetic scrambled shRNAs were used as negative controls. Transfections were performed using Lipofectamine 3000 (Invitrogen).
Tumor models and studies
All animal experiments were conducted in accordance with the Institutional Animal Care and Use Committee of FUSCC (IACUC: FUSCC-IACUC-2024071). For fat-pad injection assays, 5 × 10⁵ PY8119 cells or 4T1 cells expressing the control shRNA or TALDO1 shRNA vector were injected into the mammary fat pads on one side of 6-week-old female nude mice (Animal Unit, FUSCC) or BALB/c mice (Animal Unit, FUSCC), respectively. Mice were sacrificed 15 days later before being processed for subsequent IHC and tumor volume measurement.
To further investigate the role of EV-TALDO1 in promoting breast cancer progression in vivo, EVs from 4T1 cells overexpressing TALDO1 (TALDO1-EV group) and from control 4T1 cells (control-EV group) were isolated and DiR-labeled. 4T1 cells were injected subcutaneously into BALB/c mice to develop a breast cancer xenograft model. Seven days later, we treated BALB/c mice with DiR-labeled TALDO1-EVs or DiR-labeled control-EVs by tail vein injection. DiR-labeled TALDO1-EVs or DiR-labeled control-EVs were administered every other day once the tumor volume reached 20 mm³. Fifteen days later, the fluorescence intensity of DiR in the tumor region was evaluated using the IVIS Lumina live animal biophotonic imaging system.
To determine whether AO-022 might suppress breast cancer cells in vivo, we implanted MDA-MB-231 cells into nude mice. We treated nude mice with AO-022 (50 mg/kg) by intraperitoneal injection. AO-022 (50 mg/kg) was administered every other day. Fourteen days later, the tumors were harvested by surgical removal.
Prediction of potential allosteric sites of TALDO1 and virtual screening
The potential allosteric sites were predicted based on the structure of TALDO1 (PDB ID: 1F05) using the Allosite server (http://mdl.shsmu.edu.cn/AST/). Grid-based ligand docking from energetics (GLIDE, RRID: SCR_000187) software (Schrödinger Maestro 11.4, RRID: SCR_016748) was used to virtually screen three commercial compound libraries containing over 65,000 compounds to target the predicted sites. The 10 highest-scoring compounds were purchased from MedChemExpress for experimental use.
In vitro viability assay of patient-derived organoid
Patient-derived organoid (PDO) samples were collected freshly from patients who underwent surgery between May 2023 and July 2023 at FUSCC. Overall, two HR+/HER2− samples and an HR−/HER2− PDO sample were eligible for further experiments. The generation of PDO was conducted as described in a previous study (34). Briefly, breast cancer tissue was cut into 1 to 3 mm³ pieces that were subsequently digested using collagenase (Sigma-Aldrich). The resulting organoids were then cultured in 24-well plates, suspended in basement membrane extract (BME; Corning, cat# 356231). Organoids were diluted to a concentration of 40 organoids per mL in breast cancer organoid medium supplemented with 10% basement membrane extract. A volume of 25 μL of the organoid suspension was then added to cell-repellent surface black, clear bottom 384-well plates (Corning, cat# 3764) and cultured for an additional 5 days before initiating drug treatments. Organoid cell viability was assessed by a CellTiter-Lumi II 3D Cell viability assay (Beyotime, cat# C0062L) following the manufacturer’s instructions. Compound AO-022 was purchased from MedChemExpress.
Data availability
The proteomics data generated in this study are hosted by the integrated proteome resources center (iProX; https://www.iprox.cn) and can be accessed at Project ID IPX0003429000. The breast cancer proteomics data analyzed in this study are available in the ProteomeXchange (http://proteomecentral.proteomexchange.org) at Project ID PXD005692 and in the Proteomic Data Commons at https://proteomics.cancer.gov/data-portal. All other raw data generated in this study are available upon request from the corresponding author.
Results
Proteomic characterization of breast cancer-derived EVs
To elucidate the proteomic profile of breast cancer–derived EVs, we purified EVs from 196 serum samples derived from breast cancers (n = 126) and age-matched HDs (n = 70) in the discovery cohort by UC combined with subsequent SEC (Materials and Methods; Fig. 1A; Supplementary Fig. S1A–C). Furthermore, we conducted five-step validation based on five independent cohorts from our Fujian validation cohort, our Shanghai validation cohort #1, our Shanghai validation cohort #2, CPTAC breast cancer cohort, and Tang and colleagues’ breast cancer cohort (Fig. 1B; Supplementary Table S1; Materials and Methods). All breast cancer samples were collected prospectively from treatment-naive stage I-IV breast cancer patients. Clinical data, including gender, age at diagnosis, tumor stage, subtype, and metastatic status, were summarized in Supplementary Fig. S1A and Supplementary Table S1.
Under TEM in combination with NTA, the isolated EVs appeared as morphologically uniform vesicular structures ranging from 30 to 200 nm in size, surrounded by a double-layer membrane (Fig. 1C and D; Supplementary Fig. S1D and S1E). By comparing the proteins extracted from EVs with whole blood and serum lysates on SDS-PAGE, we observed distinct proteomic profiles among the three lysates, indicating a different EV protein composition (Supplementary Fig. S1F). EV samples were also verified by immunoblotting analyses using conventional markers. As shown in Fig. 1E and Supplementary Fig. S1G, the EV markers CD9, CD63, TSG101, and ALIX were detected in the isolated EVs, whereas the negative exosome marker calnexin was not detected. Taken together, these results indicated that the EVs were well separated and highly purified from the serum.
A proteomic database of serum EVs was constructed using label-free LC-MS/MS analysis, identifying 9,699 proteins in total from the 196 analyzed samples (Supplementary Fig. S2A). The mean number of proteins detected per EV sample was 1,721 (range 793 to 2,366 proteins; Supplementary Fig. S2B). Globally, the dynamic range of proteins detected spanned eight orders of magnitude (Supplementary Fig. S2C and S2D). To map previously reported EV proteins, we took our dataset of EV proteins and cross-referenced it with a publicly available EV proteome database Vesiclepedia (version 5.1, 2023; ref. 14). This analysis showed that >90% of our identifications (7,446 EV proteins identified in HD, 90.4%; 8,282 EV proteins identified in breast cancer, 94%) have been previously implicated in extracellular function (Fig. 1F). Similarly, in the breast cancer cell–derived EV proteome dataset identified by Rontogianni and colleagues (35), there was an overlap with 4,274 proteins (91.8%) in the Vesiclepedia protein list. Notably, of the top 100 EV-associated proteins from Vesiclepedia, 99% were identified in this study, which spanned six orders of intensity (Fig. 1G). Similarly, in the breast cancer cell–derived EV proteome dataset identified by Shen and colleagues (36), 94 of the top 100 proteins were quantified in their dataset, which spanned six orders of intensity. In addition, the identification frequencies of 24 EV markers in our mass spectrometry data were analyzed and compared with the serum EV cohort of Hoshino and colleagues (Supplementary Fig. S2E–G; ref. 8). Among them, the identification frequencies of exosome markers HSPA8 (HSC70), FN1, LGALS3BP, A2M, HBB, GSN, and JCHAIN were all above 90%, which was consistent with the report of Hoshino and colleagues (Supplementary Fig. S1E–G). 
Notably, the identification frequencies of the EV markers HSP90AB1, HSP90AA1, FLOT1, HSPA4 (HSP70), FLOT2, CD63, ACTB, FLNA, MSN, PRDX2, and TSG101 were much higher than those in the serum EV cohort of Hoshino and colleagues (Supplementary Fig. S1E–G). These results suggested that the majority of proteins quantified in our study are well documented in the extracellular fraction, supporting the value of this dataset.
Furthermore, the cellular composition was analyzed to explore the protein location (Supplementary Fig. S1H). Proteins identified from EVs were strongly enriched in the following cellular components (CC): cytoplasm (n = 3,012; P = 5.33E−130), plasma membrane (n = 2,225; P = 4.34E−4), and extracellular exosome (n = 1,816; P = 0), further confirming the efficient EV isolation. Similarly, in the breast cancer cell–derived EV proteome dataset identified by Shen and colleagues (36), a significant enrichment of proteins in the cytoplasm, membrane, and extracellular exosome Gene Ontology CC terms, both with >1,000 proteins, was observed.
EVs are lipid-sealed vesicles whose surfaces expose the extracellular domains of transmembrane proteins and whose lumen contains various types of cytoplasmic proteins. In 2020, Kalluri and LeBleu reviewed the composition of EVs, including cargo proteins such as heat shock proteins, enzymes, and cytoskeletal proteins, as well as surface proteins such as integrins, tetraspanins, and membrane transport/fusion proteins (37). To provide information for future diagnosis and treatment, we used the GeneCards dataset to classify 4,547 cargo proteins (heat shock proteins, n = 65; enzymes, n = 888; cytoskeletal proteins, n = 2,387; ESCRT components, n = 12; and cytosolic proteins, n = 1,195) and 3,218 surface proteins (tetraspanins, n = 1,552; MHC, n = 20; integrins, n = 24; membrane transport/fusion proteins, n = 104; and other transmembrane proteins, n = 1,518) in our EV proteome dataset (Supplementary Fig. S2I).
Breast cancer–derived EVs exhibited specific signatures related to immune response, metabolism, and metastasis
In general, 1,463 and 485 unique EV proteins were identified in breast cancer and HD samples, respectively (Fig. 1F). Next, proteomic data were analyzed to determine the characteristics of breast cancer–derived EVs. We identified 1,130 proteins that were significantly differentially enriched between breast cancer and HD samples (BCmean/HDmean > 2-fold or < 0.5-fold, Student t test, P < 0.05; Fig. 2A; Materials and Methods). Clustering and cluster-specific enrichment analyses using DAVID pathway annotations showed that these differentially enriched proteins were involved in distinctive biological processes and pathways (Fisher exact test; Fig. 2B and C; Supplementary Table S2). Specifically, proteolysis (P = 8.68e−03), receptor-mediated endocytosis (P = 1.30e−02), phagocytosis, recognition (P = 7.70e−05), and vesicle-mediated transport (P = 3.86e−03) were enriched in HD samples (Fig. 2B; Supplementary Table S2). Of note, BC-EVs exhibited specific signatures related to the immune response, metabolism, and metastasis, potentially reflecting the functional roles and molecular heterogeneity of EVs during breast cancer tumorigenesis and progression (Fig. 2C; Supplementary Fig. S3; Supplementary Table S2). These findings suggested that BC-EV and HD-EV proteins are distinct, and EVs released by breast cancer cells may carry more encapsulated cargos for signal transfer to induce the malignant transformation of recipient cells.
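The differential-enrichment rule described above (mean ratio > 2-fold or < 0.5-fold, Student t test P < 0.05) can be illustrated with a minimal, standard-library-only sketch. The function names, the toy intensity values, and the use of linear-scale abundances are assumptions made for illustration; the original analysis pipeline is described in Materials and Methods and may differ (for example, in normalization or missing-value handling).

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided P value via Simpson integration of the density over [0, |t|]."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * total * h / 3  # probability that |T| <= |t|
    return max(0.0, 1.0 - central_mass)


def student_t_test(a, b):
    """Pooled-variance (Student) two-sample t test; returns (t, P)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t_stat = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t_stat, two_sided_p(t_stat, df)


def is_differential(bc, hd, fold=2.0, alpha=0.05):
    """Apply the fold-change and P-value thresholds to one protein."""
    ratio = mean(bc) / mean(hd)
    _, p = student_t_test(bc, hd)
    return (ratio > fold or ratio < 1 / fold) and p < alpha


# Hypothetical intensities for two proteins (not data from this study).
hd_toy = [4.0, 5.0, 5.0, 6.0, 5.0]
bc_enriched = [10.0, 12.0, 11.0, 13.0, 12.0]   # mean ratio 2.32
bc_unchanged = [5.0, 6.0, 5.5, 4.5, 5.0]       # mean ratio 1.04

print(is_differential(bc_enriched, hd_toy))
print(is_differential(bc_unchanged, hd_toy))
```

Requiring both thresholds at once filters out proteins with a statistically detectable but small shift as well as proteins with a large but noisy fold change.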
Seven-protein diagnostic model to distinguish breast cancer
To further assess whether EV proteomic information could be used as a liquid diagnostic readout to discriminate cancers from noncancers, we applied the XGBoost classifier, which is relatively robust to noise and overfitting, to identify a distinct EV protein subset that can accurately distinguish breast cancer samples from HD samples (Fig. 2D). By comparing the breast cancer– and HD-derived EV proteomes, we discovered that the best partition was achieved with seven proteins (IGHV3-23, MMP9, AHNAK, PAICS, VWF, ANGPTL6, and PSME1; Fig. 2E; Supplementary Table S2).
To train and subsequently test the seven-protein classifier, EV samples were evenly partitioned based on the sample source (breast cancer vs. HD), and 70% of samples were used as a training set, with the remaining 30% used as an internal validation set. Applying five-fold cross-validation, this seven-protein classifier reached an AUC of 98% based on the receiver operating characteristic curve analysis with 93.2% sensitivity and 93.9% specificity in the training set (Fig. 2F and G). We then tested the seven-protein classifier in the internal validation set, resulting in a promising AUC of 91%, with 81.6% sensitivity and 85.7% specificity for distinguishing breast cancers from HDs (Fig. 2F and H). In addition, we collected an additional 41 serum EV samples (breast benign tumor, n = 17; other cancer types, n = 12; breast cancers, n = 12) from Shanghai validation cohort #1 as an external validation test set and achieved 91.7% sensitivity and 93.1% specificity for distinguishing breast cancers from non-breast cancers (Fig. 2I).
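As a minimal sketch of the evaluation scheme described above, the following standard-library Python code performs a class-stratified 70/30 split of the 196 discovery samples and computes sensitivity and specificity from predicted labels. The function names and the random seed are hypothetical, and the prediction counts in the usage example (31 of 38 breast cancers and 18 of 21 HDs called correctly) are back-calculated from the reported percentages under the assumption that the split rounds to 88/38 and 49/21; they are not taken from the study's confusion matrices.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), splitting each class separately."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))
        train += idx[:cut]
        test += idx[cut:]
    return train, test


def sensitivity_specificity(y_true, y_pred, positive="BC"):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Discovery cohort sizes from the text: 126 breast cancers, 70 HDs.
labels = ["BC"] * 126 + ["HD"] * 70
train_idx, test_idx = stratified_split(labels)
n_test_bc = sum(labels[i] == "BC" for i in test_idx)
n_test_hd = sum(labels[i] == "HD" for i in test_idx)
print(len(train_idx), len(test_idx), n_test_bc, n_test_hd)

# Assumed (back-calculated) predictions on the internal validation set.
y_true = ["BC"] * 38 + ["HD"] * 21
y_pred = ["BC"] * 31 + ["HD"] * 7 + ["HD"] * 18 + ["BC"] * 3
sens, spec = sensitivity_specificity(y_true, y_pred)
print(round(100 * sens, 1), round(100 * spec, 1))
```

Splitting within each class keeps the breast cancer to HD ratio the same in the training and validation sets, so sensitivity and specificity are estimated on comparable class mixes.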
To clarify whether this seven-protein classifier is specific to breast cancer and could distinguish breast cancer from other cancer types, we collected an additional 42 serum EV samples from patients with other cancer types as an external validation cohort (our Fujian validation cohort; Materials and Methods). We tested the seven-protein classifier for distinguishing other cancers from HDs, resulting in 26.2% sensitivity and 80.0% specificity (Supplementary Fig. S4A; Supplementary Table S2). This result suggested that the seven-protein classifier was specific to breast cancer rather than a broad-spectrum classifier for diagnosing other cancers. Furthermore, we randomly sampled 10 HDs from the discovery cohort to balance the proportions of HD samples and samples from each of the other cancer types (Supplementary Fig. S4B–E). We then tested the seven-protein classifier for distinguishing each cancer type from HDs. The classifier achieved 22.2% sensitivity and 60.0% specificity for lung cancer (Supplementary Fig. S4B), 23.5% sensitivity and 70.0% specificity for liver cancer (Supplementary Fig. S4C), 22.2% sensitivity and 60.0% specificity for esophageal cancer (Supplementary Fig. S4D), and 42.9% sensitivity and 60.0% specificity for gastric cancer (Supplementary Fig. S4E). These results suggested that the seven-protein classifier was not suitable for diagnosing other cancer types but was specific for diagnosing breast cancer.
Proteomic characteristics of EVs derived from four clinical subtypes of breast cancer
Breast cancer is a highly heterogeneous disease. Based on the expression of estrogen receptor, progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), and the proliferation marker Ki67, breast cancer is classified into four major clinical subtypes: luminal A, luminal B, HER2-enriched, and TNBC (38). To distinguish proteomic landscapes among the clinical subtypes of breast cancer, we analyzed EV samples from luminal A (n = 20), luminal B (n = 50), HER2-enriched (n = 21), and TNBC (n = 23) patients in our cohort. Principal component analysis demonstrated a clear distinction among the subtypes, further highlighting the distinct proteomic patterns among clinical subtypes of breast cancer samples (Fig. 3A).
Next, we identified 110, 40, 67, and 139 EV proteins that were significantly overrepresented in luminal A, luminal B, HER2-enriched, and TNBC samples, respectively (Student t test, P < 0.05; Fig. 3B; Supplementary Table S3). Clustering and cluster-specific enrichment analyses of the enriched proteins using the ConsensusPathDB and DAVID databases showed the distinctive biological processes and pathways represented in luminal A, luminal B, HER2-enriched, and TNBC samples (Fisher exact test; Fig. 3C and D). Specifically, luminal A-derived EVs were characterized by proteolysis (e.g., SDCBP, COLEC11, and LTF; P = 0.0038), protein folding (e.g., CCT2, HSP90AA1, and PPIG; P = 0.0191), regulation of necroptotic cell death (HSP90AA1 and SDCBP; P = 0.0092), cellular responses to stress (e.g., HSBP1, IL6, and HSP90AA1; P = 0.03), and cellular responses to external stimuli (e.g., HSBP1, IL6, and HSP90AA1; P = 0.0344; Fig. 3C and D). Luminal B-derived EVs were characterized by tight junction (e.g., MYL6 and AFDN; P = 0.0009), ECM–receptor interaction (RELN and GP1BB; P = 0.0219), fructose and mannose metabolism (SLC2A14 and SLC2A3; P = 0.0013), glucose metabolism (SLC2A14 and SLC2A3; P = 0.0017), and the insulin signaling pathway (PKLR and PYGM; P = 0.0493; Fig. 3C and D). Proteins enriched in HER2-enriched EVs were related to cellular response to hydrogen peroxide (SRC, ARG1, and MAPK13; P = 0.0217), ErbB2/ErbB3 signaling events (DOCK7 and SRC; P = 0.0137), keratinization (CDSN, KRT78, and KRT23; P = 0.0199), tyrosine metabolism (FAH and ADHFE1; P = 0.0095), and arginine and proline metabolism (ARG1 and RARS1; P = 0.025; Fig. 3C and D). TNBC samples were characterized by platelet activation (e.g., TLN2, VASP, and GNAS; P = 0.0056), antigen processing and presentation (B2M and PSMB9; P = 0.0061), regulation of actin cytoskeleton (e.g., CFL2, ITGB4, and GIT1; P = 0.05), angiogenesis (e.g., CEACAM1, COL4A2, and CALD1; P = 0.0327), and cell motility (DST, ITGB4, and CFL2; P = 0.0422; Fig. 3C and D). Collectively, these data suggested that the proteomic profiles of serum-derived EVs reflect selective packaging, which represents an informative readout and differs among the subtypes of breast cancer.
Twelve-protein diagnostic model for LN metastasis
To generate a protein signature that stratifies patients with or without LN metastases, we performed random forest classification to identify a subset of EV proteins that accurately discriminates between BC-LN+ and BC-LN− samples. EV samples were evenly partitioned based on sample type (BC-LN+ vs. BC-LN−), and 70% of samples were used as a training set, with the remaining 30% used as an internal validation set. By comparing the BC-LN+- and BC-LN−-derived EV proteomes, we discovered that the best partition was achieved with 12 proteins (TTYH3, KPNB1, RANBP2, PEPD, NCL, PARP1, ACTA2, ACTG2, TBCA, MATR3, KRT16, and CCT6A; Fig. 4A and B; Supplementary Table S4). Applying five-fold cross-validation, receiver operating characteristic curve analysis showed that this 12-protein model reached an AUC of 98% with 100% sensitivity and 89.4% specificity in the training set (Fig. 4C and D). We then tested the 12-protein model on the internal validation set, resulting in a promising AUC of 85% with 93.8% sensitivity and 81.3% specificity for distinguishing BC-LN+ from BC-LN− (Fig. 4C and E). In addition, we used the CPTAC breast cancer dataset (n = 75; ref. 16), which is based on proteomic profiling of tumor cells, as an external validation test set and achieved 100% sensitivity and 100% specificity (Fig. 4F).
Potential EV survival biomarkers for distant metastases
To identify universal biomarkers associated with distant metastasis, we performed further analysis based on the proteomic profiles of seven ductal carcinoma in situ (DCIS) and 21 distant metastasis (D-MET) samples [M-multiple (n = 5), M-lung (n = 3), M-liver (n = 4), M-bone (n = 7), M-chest-wall (n = 1), and M-soft tissue (n = 1)] in our cohort. Clustering and cluster-specific enrichment analyses of the upregulated proteins using DAVID pathway annotations clearly showed distinctive biological processes and pathways enriched in D-MET samples compared with DCIS samples (Fisher exact test; Fig. 5A; Supplementary Table S5). Compared with DCIS samples, D-MET samples showed an upregulation of focal adhesion (FLNA and VTN; P = 5.45E−03), metabolism-related pathways [e.g., carbon metabolism (PKM, G6PD, and TALDO1; P = 5.42E−05)], and complement and coagulation cascades (CPB2, SERPINA1, and CFH; P = 3.58E−02; Fig. 5A and B; Supplementary Table S5). We found that 24 EV proteins were significantly overrepresented in D-MET samples (D-METmedian/DCISmedian > 2-fold, Student t test, P < 0.05), suggesting that they may be potential serum EV protein markers for distant metastasis of breast cancer (Fig. 5B).
We also compared the serum EV proteome profiles of patients with breast cancer with the breast cancer tissue proteome, reasoning that ideal biomarkers should be overexpressed in the corresponding tumor tissue and released into the blood. To this end, we used an additional tissue cohort from Tang and colleagues (n = 118; ref. 17). We cross-checked the 24 EV proteins in this cohort and found that 10 EV proteins (FLNA, VTN, PKM, PDHB, G6PD, TALDO1, LDHB, ACACA, C7, and F2) were highly expressed in breast cancer tissues and were associated with poor prognosis (Fig. 5C; Supplementary Fig. S5A). Among them, four EV proteins (TALDO1, FLNA, VTN, and C7) were potential specific biomarkers of distant metastasis of breast cancer (Fig. 5D). Notably, TALDO1 was significantly upregulated in serum EVs of breast cancer patients with distant metastasis compared with HD, DCIS, BC-LN−, and BC-LN+ samples (Fig. 5D).
To further verify whether TALDO1 was associated with distant metastasis of breast cancer, we compared TALDO1 expression in normal (n = 36), primary tumor (n = 36), and distant metastatic (n = 16) tissues from Shanghai validation cohort #2 by IHC (Fig. 5E and F). The analysis showed that a high percentage of normal and primary tissues displayed negligible (60% and 50%) or low (40% and 27%) TALDO1 staining intensities when compared with distant metastatic tissues (P < 0.001). Moreover, a larger proportion of distant metastatic tissues displayed increased TALDO1 protein expression when compared with primary tumor (70% vs. 23%, P < 0.001) and normal (70% vs. 0%, P < 0.001) tissues. Taken together, the increased TALDO1 protein expression suggested that TALDO1 may be a novel biomarker for breast cancer distant metastasis.
To validate EV-TALDO1 as a specific marker for breast cancer distant metastasis, we further analyzed 54 samples from other cancer types in our Fujian validation cohort and Shanghai validation cohort #1. There was no significant difference in the expression level of EV-TALDO1 between metastatic and nonmetastatic esophageal cancer, metastatic and nonmetastatic liver cancer, metastatic and nonmetastatic lung cancer, or metastatic and nonmetastatic gastric cancer (Supplementary Fig. S5B). These results suggested that EV-TALDO1 was a potential marker specific to distant metastasis of breast cancer.
TALDO1 is a critical regulator of breast cancer metastasis
To test whether TALDO1 possesses a metastasis-promoting effect, we employed two human breast cancer cell lines, MCF7 and MDA-MB-231, with differential TALDO1 expression levels for functional analysis (Fig. 6A). MCF7 cells, which express low levels of TALDO1, were transduced with a lentiviral vector carrying full-length TALDO1 cDNA to achieve stable expression of TALDO1, whereas shRNA against TALDO1 was applied to knock down TALDO1 expression in MDA-MB-231 cells (Fig. 6B). Immunoblot analysis showed that TALDO1 knockdown significantly increased the expression of E-cadherin and decreased the expression of N-cadherin and Snail1 in MDA-MB-231 cells. Conversely, overexpressing TALDO1 significantly decreased the expression of E-cadherin and increased N-cadherin and Snail1 expression in MCF7 cells. Therefore, high TALDO1 expression can promote epithelial–mesenchymal transition in breast cancer cells (Fig. 6B). Transwell and wound healing assays revealed that TALDO1 knockdown significantly reduced the invasion ability (Fig. 6C–F) of MDA-MB-231 cells, which express high levels of TALDO1. The ratio indicates the percentage of invasive cells in the TALDO1 knockdown group relative to the control group (set as 100%) in MDA-MB-231 cells, and the percentage of invasive cells in the TALDO1 overexpression group relative to the control group (set as 100%) in MCF7 cells. Conversely, overexpression of TALDO1 significantly increased the capacity for invasion (Fig. 6C and D) and migration (Fig. 6G and H) in MCF7 cells.
TALDO1 expression in EVs was measured by western blotting, revealing higher expression in EVs from MDA-MB-231 cells than in EVs from MCF7 cells and normal MCF10A cells (Fig. 6I). Cell migration assays showed that supplementing with TALDO1-enriched EVs significantly increased the migration ability of TALDO1-depleted cells (Fig. 6J and K) and wild-type MDA-MB-231 cells (Fig. 6L and M). These data suggested that TALDO1 plays a key role in breast cancer invasion and migration.
To further assess the impact of TALDO1 on breast cancer progression and metastasis in vivo, PY8119 cells and 4T1 cells stably transfected with TALDO1 shRNA or control vector were injected subcutaneously into nude mice and BALB/c mice (Fig. 6N). Transwell and wound healing assays revealed that TALDO1 knockdown significantly reduced the invasion ability of the 4T1 and PY8119 cell lines in vitro (Supplementary Fig. S6A–G). Compared with the control group, the TALDO1 knockdown group had significantly reduced tumor volume and weight (Fig. 6O–R; Supplementary Fig. S6H–K). Hematoxylin and eosin staining of the lung after xenograft tumor growth of 4T1 cells in BALB/c mice revealed that TALDO1 knockdown significantly reduced lung injury (Fig. 6S and T). These results demonstrated that TALDO1 knockdown efficiently inhibits breast cancer growth and progression.
To further investigate the role of EV-TALDO1 in promoting breast cancer progression in vivo, EVs from 4T1 cells that were transfected with TALDO1 overexpression (TALDO1-EV group) and 4T1 cells (control-EV group) were isolated and DiR-labeled. 4T1 cells expressing luciferase-GFP were injected subcutaneously into BALB/c mice to develop a breast cancer xenograft model (Fig. 6N). Seven days later, we treated BALB/c mice with DiR-labeled TALDO1-EVs or DiR-labeled control-EVs by tail vein injection (Fig. 6N). DiR-labeled TALDO1-EVs or DiR-labeled control-EVs were administered every other day once the tumor volume reached 20 mm3 (Materials and Methods). Fifteen days later, the fluorescence intensity of DiR in the tumor region was evaluated using the IVIS Lumina live animal biophotonic imaging system (Fig. 6U). First, we detected the fluorescence intensity of the luciferase, indicating that 4T1 cells expressing luciferase-GFP were successfully injected into BALB/c mice and breast cancer xenograft models were established (the first row of mouse images in Fig. 6U). Second, we detected the fluorescence intensity of DiR, indicating that DiR-labeled TALDO1-EVs or DiR-labeled control-EVs were successfully injected into breast cancer xenograft models (the second row of mouse images in Fig. 6U). Finally, we merged the above images (the third row of mouse images in Fig. 6U) and quantified the fluorescence intensity of DiR in the tumor region (Fig. 6V), indicating that EVs significantly increased in the tumor region in the TALDO1-EV group compared with the control-EV group. Additionally, the TALDO1-EV group demonstrated a significant increase in tumor volume compared with the control-EV group (Fig. 6W and X). Furthermore, hematoxylin and eosin staining of the lung revealed a significant increase in lung injury in the TALDO1-EV group compared with the control-EV group (Fig. 6Y; Supplementary Fig. S6L). 
These data further suggested that EV-TALDO1 plays a key role in breast cancer progression.
Discovery of a potent TALDO1 inhibitor AO-022 for treating breast cancer
Based on the metastasis-promoting role of TALDO1 in breast cancer, we wondered whether targeting TALDO1 would be therapeutically beneficial. To this end, we applied high-throughput molecular docking technology and a library consisting of 271,380 small molecules (Specs library) to identify small-molecule compounds that could functionally inhibit TALDO1 (Fig. 7A). Based on the docking results, we selected the top 10 compounds for further characterization. Small-molecule binding to a protein is known to increase its thermal stability, which is reflected by a detectable increase (or “shift”) in the melting temperature of the protein (39). Therefore, the interactions of TALDO1 and the compounds were evaluated in a thermal shift assay using recombinant TALDO1. Among the 10 compounds, we noticed that AO-022 (Fig. 7B and C) increased the melting temperature of TALDO1 from 42°C to 47°C (Fig. 7D), indicating that AO-022 binds to and stabilizes TALDO1 in vitro. The direct binding of AO-022 and TALDO1 was further evaluated using a cellular thermal shift assay. Indeed, incubation of MDA-MB-231 cells with 100-μmol/L AO-022 for 2 hours significantly increased the thermal stability of TALDO1 at 54°C and 56°C (Fig. 7E), confirming the interaction of TALDO1 and AO-022 in cells. Molecular docking results indicated that two hydrogen bonds were formed between AO-022 and TALDO1 at Glu77 and Asp108 (Fig. 7B and C).
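To make the thermal shift readout concrete, the sketch below estimates a melting temperature (Tm) as the temperature at which a normalized unfolding curve crosses 0.5 and reports the ligand-induced shift. It is an illustration only: the curves are synthetic Boltzmann sigmoids centered on the reported values of 42°C and 47°C, the slope factor and sampling grid are assumptions, and the curve-fitting procedure actually used in the study is not specified here.

```python
import math


def boltzmann_curve(temps, tm, slope=1.5):
    """Synthetic normalized unfolding curve (0 = folded, 1 = unfolded)."""
    return [1.0 / (1.0 + math.exp((tm - t) / slope)) for t in temps]


def melting_temperature(temps, fraction_unfolded):
    """Temperature at which the curve crosses 0.5, by linear interpolation."""
    points = list(zip(temps, fraction_unfolded))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 < 0.5 <= f1:
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("curve does not cross the half-unfolded point")


# Assumed sampling grid: 30.5°C to 59.5°C in 1°C steps.
temps = [30.5 + i for i in range(30)]

# Synthetic curves centered on the Tm values reported in the text.
apo_curve = boltzmann_curve(temps, tm=42.0)       # TALDO1 alone
bound_curve = boltzmann_curve(temps, tm=47.0)     # TALDO1 + AO-022

tm_apo = melting_temperature(temps, apo_curve)
tm_bound = melting_temperature(temps, bound_curve)
delta_tm = tm_bound - tm_apo
print(round(tm_apo, 2), round(tm_bound, 2), round(delta_tm, 2))
```

A positive shift in Tm indicates that the protein unfolds at a higher temperature in the presence of the compound, which is the stabilization signature interpreted as binding.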
To assess the antitumor effect of AO-022 in breast cancer, MDA-MB-231 cells were treated with different concentrations of AO-022. CCK8 assay demonstrated that AO-022 could reduce the proliferation ability of MDA-MB-231 in a dose-dependent manner (Fig. 7F). Considering that overexpression of TALDO1 in breast cancer is frequently correlated with tumor metastasis, we next inquired whether AO-022 could inhibit the invasion and migration ability of tumor cells. Western blot results revealed that AO-022 inhibited the protein expression of vimentin and snail (Fig. 7G). Wound healing assay and transwell assay demonstrated that AO-022 impaired the cell migration (Fig. 7H and I) and invasion ability (Fig. 7J and K) of MDA-MB-231 cells. Collectively, these results suggested that AO-022 efficiently inhibits breast cancer progression and migration.
TALDO1 is a rate-limiting enzyme in the nonoxidative pentose phosphate pathway (PPP) and plays an important role in NADPH generation and glycolysis (40). Proliferative conditions in malignant tumor tissue are accompanied by increased glucose metabolism. Glycolysis is the central pathway of glucose metabolism and is closely interconnected with the PPP (41). Therefore, we further explored whether AO-022 had an inhibitory effect on the extracellular acidification rate, which indicates glycolytic activity (42). The results showed that AO-022 reduced the extracellular acidification rate in MDA-MB-231 cells in a dose-dependent manner (Fig. 7L).
To clarify whether TALDO1 primarily fuels breast cancer metastasis through positive modulation of redox metabolism (NADPH production), we measured the NADP+/NADPH ratio in TALDO1-overexpressing MCF7 cells and TALDO1-knockdown MDA-MB-231 cells after AO-022 treatment. AO-022 treatment resulted in a significant increase in the NADP+/NADPH ratio in wild-type MDA-MB-231 cells and TALDO1-overexpressing MCF7 cells (Fig. 7M and N). Conversely, AO-022 treatment did not affect the NADP+/NADPH ratio in TALDO1-knockdown MDA-MB-231 cells or wild-type MCF7 cells (Fig. 7M and N). Collectively, these data suggested that the TALDO1 inhibitor AO-022 impaired the production of NADPH.
In addition, Patra and colleagues (43) reported that cancer cells utilize the reprogramming of the PPP to replenish the pool of R5P, the precursor for the biosynthesis of all types of nucleotides. To assess whether TALDO1 has an effect on energy metabolism (nucleotide production) in breast cancer, we measured the levels of ribose-5-phosphate, ATP, GTP, UTP, CTP, and TTP in cells treated with AO-022 by targeted metabolomic analysis. Targeted metabolomic analysis confirmed that the nucleotide pool of MDA-MB-231 cells (Fig. 7O) and MCF7 cells overexpressing TALDO1 (Fig. 7P) was reduced after AO-022 treatment. Collectively, these data suggested that TALDO1 inhibitor, AO-022, suppressed the PPP, impairing the production of R5P, which is essential for nucleotide synthesis. Together, these results indicated that TALDO1 may primarily fuel breast cancer metastasis through positive modulation of both redox metabolism (NADPH production) and energy metabolism (nucleotide production).
Further, to determine whether AO-022 might suppress breast cancer cells in vivo, we implanted MDA-MB-231 cells into nude mice and then treated nude mice with an intraperitoneal injection of AO-022 (50 mg/kg; Materials and Methods). The results showed that mice treated with AO-022 (50 mg/kg) did not show a difference in body weight loss from the PBS group (Fig. 7Q). Importantly, mice treated with AO-022 (50 mg/kg) showed significant reductions in tumor volume and tumor weight of MDA-MB-231 xenografts in a murine subcutaneous model compared with the PBS group (Fig. 7R–T). These results demonstrated that AO-022 efficiently inhibits the growth of breast cancer in vivo.
In addition, to further support this finding, we constructed three metastatic breast cancer PDOs. After 5 days of cultivation, we treated the PDOs with AO-022 (40 μmol/L). We observed the growth of metastatic breast cancer PDOs was inhibited by AO-022 (40 μmol/L; Fig. 7U and V). These results further demonstrated that AO-022 efficiently inhibits the growth of breast cancer.
Discussion
Nowadays, ultrasonography and mammography combined with histopathological confirmation are used as the gold standard for breast cancer diagnosis and assessment of metastatic status. However, the above-mentioned modalities still have several disadvantages, and it is necessary to continue research and develop alternative methods for diagnosing breast cancer and assessing the metastatic status. Blood tests are the most readily accessible source for the early detection, classification, and treatment guidance of breast cancer patients. The billions of EVs circulating in the blood could represent an essential component of liquid biopsy (44). Because EV-derived proteins are readily available and perform vital activities, EV-derived protein biomarkers introduce significant potential in breast cancer diagnosis as a complementary and adjunctive modality to the current clinical gold standard. Here, we performed a large-scale comprehensive analysis of EV proteomes from 196 serum samples obtained from patients with breast cancer and HDs. We constructed two sets of specific EV protein classifiers for diagnosing breast cancer and distinguishing patients with LN metastatic disease. This study may provide a reference value for differentiating breast cancer and non-breast cancer using serum in the future and also indicates that tumor-associated EV proteins can serve as biomarkers for early-stage cancer detection of LN metastasis.
TALDO1 is an enzyme of the PPP, which has been implicated in cancer (45). For example, Basta and colleagues (46) reported that certain genetic polymorphisms in TALDO1 are associated with squamous cell carcinoma of the head and neck. Heinrich and colleagues (47) found that in liver cancer, TALDO1 activity was increased by 1.5 to 3.4 times compared with normal liver tissue regardless of tumor stage. In this study, we found that serum EV-TALDO1 could promote the migration of breast cancer cells and was a novel therapeutic target for distant metastasis of breast cancer. Based on the potential role of TALDO1 in promoting breast cancer metastasis, the development of drugs that modulate TALDO1 activity may provide a new therapeutic strategy for metastatic breast cancer. Allosteric regulation is a widely effective way to regulate protein function. After five decades of development, research on allosteric mechanisms has deepened and gradually translated into drug development and other applications (48). Because of the advantages of higher target selectivity, lower toxicity, and fewer side effects, academia and the pharmaceutical industry are increasingly committed to developing allosteric drugs (49). We used the computational chemical biology tool developed in the AlloSteric Database to analyze the TALDO1 structure and identify potential allosteric regulatory sites. Based on the identified site, we combined computer-assisted virtual screening and in vitro TALDO1 catalytic activity assays to obtain the small-molecule inhibitor AO-022, which can modulate TALDO1 catalytic activity, and we preliminarily demonstrated its function in inhibiting breast cancer metastasis in vitro and in vivo.
An increasing number of studies have demonstrated the association of EVs with specific molecules from blood (50, 51). In line with this, Buzás and colleagues (52) found that protein lists of EVs in EV databases such as EVpedia (5) or Vesiclepedia (14) show a high number of plasma proteins in EV preparations. To investigate whether these proteins are pollutants or intrinsic components of EVs, Buzás and colleagues (52) analyzed EVs in plasma and found that EVs adsorbed plasma proteins and formed a protein corona on the surface of them, which may be necessary for EVs to fully fulfill their potential function. These protein corona proteins (such as ApoA1, ApoB, CO3, CO4B, FIBA, and IGHG2) may be an important component of EVs, rather than acting as a contaminant that is not sufficiently removed during separation and purification (52). Thus, our EV proteome and blood proteome catalogs partially overlap. To further illustrate that our EV proteome dataset was different from the serum proteome dataset, we randomly selected 30 serum samples (including 15 HDs and 15 breast cancers) from the 196 serum samples in the discovery cohort in this study and conducted proteomic analysis. We identified 388 proteins that were significantly differentially enriched between breast cancer and HD serum samples (BCmean/HDmean > 2-fold or <0.5-fold, Student t test, P < 0.05). Only 27 proteins were upregulated in both serum and EV proteome datasets, and 11 proteins were downregulated in both serum and EV proteome datasets (Supplementary Fig. S7A). To further illustrate that the serum proteome dataset and serum-derived EV proteome dataset were two distinct datasets, we compared the quantitative results of markers in the two datasets. Notably, the seven-protein classifier that we identified in EV proteome (for diagnosing breast cancer) did not differ significantly between HDs and breast cancers in serum proteome (Supplementary Fig. S7B). 
Furthermore, we tested these seven proteins in serum proteome for distinguishing breast cancers from HDs, resulting in poor efficacy with 73.3% sensitivity and 60.0% specificity (Supplementary Fig. S7C). These results suggested that although there was an overlap of protein lists between EV proteins and serum proteins, the abundance of these proteins in serum and EV proteome was different, and thus, the ability of these proteins in different sample sources to diagnose breast cancer was different.
In conclusion, our findings show that proteins carried by breast cancer–derived EVs could be used as a novel, minimally invasive liquid biopsy tool for the early detection of breast cancer, as well as for discriminating LN involvement status and distant metastasis. TALDO1 is a novel serum EV biomarker for metastatic breast cancer. Finally, we screened and obtained the allosteric inhibitor AO-022 for TALDO1, demonstrating that pharmacological inactivation of TALDO1 is a potential therapeutic approach for metastatic breast cancer. These findings could advance the implementation of routine serum EV-based screening in the clinic.
Authors’ Disclosures
No disclosures were reported.
Authors’ Contributions
G. Xu: Data curation, software, formal analysis, investigation, visualization, methodology, writing–original draft. R. Huang: Data curation, validation, investigation. R. Wumaier: Data curation, validation, investigation. J. Lyu: Data curation, software, formal analysis. M. Huang: Data curation, formal analysis. Y. Zhang: Validation, investigation, methodology. Q. Chen: Resources, validation, investigation. W. Liu: Resources, validation, investigation. M. Tao: Resources, validation, investigation. J. Li: Resources, validation, investigation. Z. Tao: Resources, validation, investigation. B. Yu: Resources, validation, investigation. E. Xu: Resources, validation. L. Wang: Resources, validation. G. Yu: Methodology. O. Gires: Methodology. L. Zhou: Methodology. W. Zhu: Supervision, writing–original draft, project administration, writing–review and editing. C. Ding: Conceptualization, data curation, supervision, funding acquisition, investigation, methodology, writing–original draft, project administration, writing–review and editing. H. Wang: Conceptualization, resources, supervision, funding acquisition, validation, investigation, writing–original draft, project administration, writing–review and editing.
Acknowledgments
This work was supported by the National Key R&D Program of China [2023YFC2506400 (H. Wang)], National Natural Science Funds [82225038 (H. Wang), 82073269 (H. Wang), 82272828 (J. Li), and M-0349 (H. Wang)], Shanghai Science and Technology Innovation Action Plan [23J21900900 (H. Wang)], Innovative research team of high-level local universities in Shanghai [SHSMU-ZLCX20211600 (H. Wang)], National Key Research and Development Program of China [2022YFA1303200 (C. Ding), and 2022YFA1303201 (C. Ding)], National Natural Science Foundation of China [32330062 (C. Ding) and 31972933 (C. Ding)], Program of Shanghai Academic/Technology Research Leader [22XD1420100 (C. Ding)], the Major Project of Special Development Funds of Zhangjiang National Independent Innovation Demonstration Zone [ZJ2019-ZD-004 (C. Ding)], Shanghai Municipal Science and Technology Major Project [2023SHZDZX02 (C. Ding)], and the Fudan Original Research Personalized Support Project (C. Ding). This work is supported by the Human Phenome Data Center of Fudan University.
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).