Dysregulation of miRNA expression may influence breast cancer progression, and experimental evidence suggests that miRNA silencing might suppress breast cancer metastasis. However, the relationship between miRNA and metastasis must be confirmed before this approach can be applied in the clinic. To this end, we conducted a two-stage study in a cohort of 3,760 patients with breast cancer to first identify and then validate the association between miRNA expression and risk of distant metastasis. The first stage (discovery) entailed miRNA sequencing of 126 case–control pairs; qPCR was used to validate the findings in a separate set of 80 case–control pairs. The 13 miRNAs most differentially expressed between cases and controls were combined into an miRNA score that was significantly associated with risk of distant metastasis in a logistic regression model that also included clinical variables (tumor size and number of positive lymph nodes) (OR per unit increase in score = 1.30; 95% confidence interval, 1.03–1.66). The results of this study suggest that in women with invasive breast cancer, an miRNA score that incorporates both clinical variables and miRNA expression levels in breast tumor tissue is moderately predictive of risk of subsequent distant metastasis.

Significance:

A novel predictive scoring system for patients with breast cancer includes clinical variables and the expression levels of 13 miRNAs and may help to identify those at increased risk of distant metastasis.

Breast cancer mortality is largely attributable to the development of systemic, hematogenously disseminated metastatic disease (1, 2). To decrease the risk of metastasis, approximately 80% of patients with breast cancer are treated with adjuvant chemotherapy. However, because far fewer than 80% of patients eventually relapse and develop metastatic disease, many patients are unnecessarily subjected to the acute and long-term side effects of chemotherapeutic regimens (2). Clinical prognostic criteria such as histopathologic grade and tumor size do not successfully predict systemic metastatic potential, and angiolymphatic invasion and regional lymph node metastasis do not always correlate with subsequent distant spread, perhaps because the mechanisms of hematogenous spread are different from those for lymphatic spread. Hence, new methods to identify tumors likely to metastasize are needed.

miRNAs are short noncoding RNAs, generally 19–25 nucleotides in length, that are found in all eukaryotic cells, and of which more than 2,000 have been identified in humans (3, 4). miRNAs have been shown to control several cancer-relevant processes such as proliferation, apoptosis, migration and invasion, stem cell differentiation, cancer stem cell formation, and acquisition of the epithelial–mesenchymal transition phenotype (3, 4). Not surprisingly, therefore, aberrant miRNA expression can influence cancer onset and progression, and there is some evidence that dysregulation of miRNA expression may influence breast cancer prognosis (5). Indeed, recent reports have related individual miRNAs to breast tumor invasion and metastasis (6, 7). For example, overexpression of hsa-miR-10b-5p (8), hsa-miR-373-5p, hsa-miR-520-5p (9), and hsa-miR-520h (10) has been shown to be positively associated with tumor invasion and metastasis, whereas overexpression of hsa-miR-31-5p, hsa-miR-126-5p, hsa-miR-206-5p, and hsa-miR-335-5p has been shown to suppress metastasis (11, 12). However, many of these findings have not been validated (5).

There is experimental evidence to suggest that miRNA silencing might suppress breast cancer metastasis (8, 13). However, before clinical application is attempted, it is important that miRNA–metastasis relationships be shown reproducibly. To this end, we conducted a population-based two-stage study to first identify and then validate the association between miRNA expression and risk of distant metastasis of breast cancer.

Study population

The study was conducted within a cohort of 3,760 women in the Kaiser Permanente Northwest (KPNW) health care system, who received a first diagnosis of invasive ductal carcinoma of the breast between January 1, 1980 and December 31, 2000, were aged ≥21 years at initial diagnosis, were treated surgically, and did not have evidence of metastasis at initial diagnosis (14). The KPNW health care system is a prepaid health plan that provides comprehensive medical care for >500,000 members in facilities located in Southwest Washington and Northwest Oregon. Members receive essentially all preventive and therapeutic care from KPNW physicians. Breast cancer cases were ascertained through the KPNW Tumor Registry. During the study period, the Registry maintained a follow-up rate of 98%, even for those patients who were no longer health plan members. The subjects with breast cancer were followed from the date of initial diagnosis until the date of: distant metastasis; termination of plan membership; death; or termination of follow-up of the cohort (December 31, 2010), whichever came first. A total of 530 women in the cohort developed a distant metastasis during follow-up.

The study reported here was approved under a waiver of written informed consent by the Institutional Review Boards of the participating institutions.

Study design/sample size

The study utilized data and tissue acquired as part of a previous case–control study focused on the relationship between tumor microenvironment of metastasis and risk of distant metastasis, which was nested within the breast cancer cohort (14). Cases were women who developed a distant metastasis subsequent to their initial breast cancer diagnosis, whereas controls were women who were alive and had not developed a distant metastasis by the date of metastasis of the corresponding case. Controls were individually matched to cases (1:1) on age at and calendar year of the diagnosis of invasive breast cancer (matching on both variables was generally within ±1 year) and were selected randomly from risk sets, with replacement. For this study, 126 case–control pairs, randomly selected from the original study population, were included in stage I for identification of miRNAs related to risk, and a separate set of 80 case–control pairs was used in stage II for validation of candidate miRNAs identified in stage I.
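The risk-set sampling described above can be sketched as follows. This is a minimal illustration and not the sampling program used in the study: the Subject fields, the use of time since diagnosis as the time scale, the ±1-year tolerance applied as a hard cutoff, and the four-subject cohort are all assumptions made for the example.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Subject:
    sid: str            # hypothetical subject identifier
    dx_year: int        # calendar year of diagnosis
    age_at_dx: int      # age at diagnosis (years)
    exit_time: float    # years from diagnosis to metastasis or end of follow-up
    metastasis: bool    # True if follow-up ended with a distant metastasis


def sample_control(case: Subject, cohort: List[Subject],
                   rng: random.Random, tol: int = 1) -> Optional[Subject]:
    """Draw one control at random from the case's matched risk set."""
    risk_set = [
        s for s in cohort
        if s.sid != case.sid
        # Still under follow-up and metastasis-free at the case's event time.
        and s.exit_time > case.exit_time
        # Matching on age at diagnosis and calendar year of diagnosis.
        and abs(s.age_at_dx - case.age_at_dx) <= tol
        and abs(s.dx_year - case.dx_year) <= tol
    ]
    return rng.choice(risk_set) if risk_set else None


# Hypothetical cohort for illustration only.
cohort = [
    Subject("A", 1990, 55, 3.0, True),
    Subject("B", 1990, 56, 7.5, True),
    Subject("C", 1991, 55, 12.0, False),
    Subject("D", 1985, 70, 15.0, False),
]
rng = random.Random(0)
# Each case gets its own independent draw, so sampling is with replacement.
pairs = {c.sid: sample_control(c, cohort, rng) for c in cohort if c.metastasis}
```

In this toy cohort, subject B is eligible as a control for case A (B was still metastasis-free at A's event time) and is also a case in her own right, which illustrates how one woman can appear as a case in one stratum and a control in another under this design.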

Clinical data

Information on tumor characteristics, treatment, and outcome was obtained from the KPNW Tumor Registry. Estrogen receptor (ER), progesterone receptor (PR), and HER2 were assessed in the previous study based on this cohort (14).

Tissue acquisition

Hematoxylin and eosin (H&E)-stained sections from the study subjects were reviewed to identify appropriate formalin-fixed, paraffin-embedded (FFPE) tissue blocks. Blocks were required to contain viable invasive tumor tissue that was representative of the tumor's growth pattern, nuclear grade, and mitotic activity. Blocks received from the KPNW cohort had an H&E slide prepared for confirmation of their appropriateness for inclusion. The tumors were reviewed histologically and graded using the modified Bloom–Richardson criteria (15, 16). All reviews were performed without knowledge of patient data, treatment, or outcome.

RNA extraction

All assays were performed without knowledge of case–control status. Total RNA extraction from FFPE clinical specimens was performed using a simultaneous RNA/DNA extraction protocol described previously (17, 18). In brief, the tumors were localized on the corresponding H&E sections and marked by the study pathologist. The marked H&E sections were then used as guides to select tumor tissue on 12 consecutive unstained 10-μm sections for macrodissection, as described elsewhere (19); the relevant tissue was scraped off the slides and transferred to siliconized Eppendorf tubes. The tissues were deparaffinized using CitriSolv (Thermo Fisher Scientific) at room temperature on a Thermomixer (Eppendorf), followed by ethanol washes on ice, a 1 × PBS wash, and rehydration in the presence of RNase inhibitors. The macrodissected tumor tissues were digested with proteinase K (3 mg/mL) at 59°C for 1 hour. The digested tissues underwent 1-butanol extraction to reach a final volume of 100 μL, which was homogenized in 1 mL of TRIzol (Invitrogen) following the manufacturer's instructions. The RNA was recovered from the upper phase of the TRIzol solution, transferred to a new siliconized Eppendorf tube, and precipitated with 0.1 mg/mL linear acrylamide (1 μL), 3 mol/L sodium acetate (18 μL), and 600 μL of isopropanol. The tubes were stored at −20°C overnight, and centrifuged the next morning at 14,000 rpm for 30 minutes at 4°C. The RNA pellets were washed with 200 μL of 70% RNase-free ethanol, dried and resuspended in 12 μL of RNase-free 1 × Tris/EDTA solution, and incubated for 30 minutes at 70°C. The RNA was quantified on a total RNA chip on a Bioanalyzer (Agilent).

Small RNA sequencing

Multiple studies have demonstrated that miRNA expression can be measured reliably in RNA extracted from FFPE specimens, because the small size of miRNAs protects them from fixation and degradation (20). For this study, 100 ng of total RNA from each sample was used to perform small RNA sequencing according to the protocol described previously (21). In total, 20 libraries, each containing 18 specimens (including experimental controls), were prepared over a period of 3 months (Supplementary Fig. S1). Two of the libraries were duplicates, of which one was prepared at the beginning and one at the end of the 3-month period. Total RNA from a 4-year-old FFPE MCF10A cell line block was obtained and 100 ng was analyzed in duplicate, in each library, as a control. The small-RNA libraries were prepared by setting up 18 individual ligation reactions between 18 FFPE RNA samples and 18 different 3′ adenylated barcoded adapters. The reactions were stored at 4°C for 16 hours. The next day, the 18 reactions were heat-deactivated, combined and precipitated, and the ligated small RNAs were size selected on a 15% polyacrylamide gel. The ligated products were excised from the gel, purified, and subjected to a 5′ adapter ligation. After this ligation, the 5′ and 3′ adapter-ligated small RNAs were purified and reverse transcribed using the SuperScript III reverse transcription kit. The cDNA templates were subjected to PCR amplification and the libraries were size selected on a 2% agarose gel and purified with the Qiaquick DNA Purification Kit from Qiagen. The libraries were sequenced on an Illumina HiSeq 2500 sequencer. Adapter trimming and alignment to the human genome were performed to identify and quantify individual miRNA expression, using the RNAworld pipeline of Dr. Thomas Tuschl at Rockefeller University (New York, NY; ref. 22).

RNA samples from 269 study subjects underwent small RNA sequencing. Samples from one batch (eight case–control pairs; n = 16) that demonstrated poor sequencing quality were excluded. This left 253 samples, which included samples from 118 matched case–control pairs upon which the statistical analysis was based.

Quantitative PCR validation

The 13 miRNAs that were most strongly differentially expressed between cases and controls based on sequencing were validated using TaqMan reagents and primers on a StepOnePlus Instrument (Applied Biosystems). In addition, one miRNA that was not significantly associated with risk of distant metastasis (negative control), and two endogenous controls (RNU48 and RNU6b), were assayed. The 13 miRNAs selected for quantitative PCR (qPCR) validation were hsa-miR-30a-5p (Assay ID#000417), hsa-miR-30a-3p (Assay ID#000416), hsa-miR-30c-2-3p (Assay ID#002110), hsa-miR-503 (Assay ID#001048), hsa-miR-301b (Assay ID#002392), hsa-miR-93-5p (Assay ID#000432), hsa-miR-30c-5p (Assay ID#000419), hsa-miR-196a-3p (Assay ID#002336), hsa-miR-340-5p (Assay ID#002258), hsa-miR-196a-5p (Assay ID#241070_mat), hsa-miR-125b-1-3p (Assay ID#002378), hsa-miR-130b-5p (Assay ID#000456), and hsa-miR-205-5p (Assay ID#000509). The negative control was hsa-miR-183-5p (Assay ID#002269), and the two endogenous controls were RNU48 (Assay ID#001006) and RNU6b (Assay ID#001093). The reverse transcription reactions were performed using 10 ng of FFPE RNA, following the manufacturer's instructions and using the TaqMan MicroRNA Reverse Transcription Kit (catalog no. 4366597). The quantitative reactions were set up in triplicate for each miRNA for each individual RNA sample using the TaqMan Universal PCR Master Mix Kit, no AmpErase UNG (catalog no. 4326614), following the manufacturer's instructions and as described elsewhere (21). Selected specimens were analyzed on the same MicroAmp optical 96-well reaction plates (catalog no. 4306737) and sealed with optical adhesive films (catalog no. 1434320) for optimal quantification measures. qPCR measurements were performed on a StepOnePlus instrument following the manufacturer's instructions for amplification cycles. Quantitative data were transferred to Excel spreadsheets for statistical analysis.
RNU48 and RNU6b were used as endogenous controls for data normalization. Relative quantification of the gene transcripts (ΔCt) was obtained by subtracting the mean Ct (comparative threshold) of the endogenous controls (RNU48 and RNU6b) from the mean Ct value of the miRNA of interest. The ΔΔCt formula was used for quantification of individual miRNAs.
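As an illustration of the normalization just described, the sketch below computes ΔCt from triplicate Ct readings and then applies the standard ΔΔCt (2^−ΔΔCt) relative quantification. It is a minimal sketch under stated assumptions: the function names and the Ct values are hypothetical, the "mean Ct of the endogenous controls" is taken to be the mean of the two control means, and amplification efficiency is assumed to be close to 100%.

```python
from statistics import mean
from typing import Dict, List


def delta_ct(target_cts: List[float],
             control_cts: Dict[str, List[float]]) -> float:
    """ΔCt = mean Ct of the miRNA of interest minus mean Ct of the controls."""
    # Assumption: average each control's replicates, then average the controls.
    control_mean = mean(mean(cts) for cts in control_cts.values())
    return mean(target_cts) - control_mean


def delta_delta_ct(dct_sample: float, dct_reference: float) -> float:
    """ΔΔCt = ΔCt of the sample minus ΔCt of the reference (calibrator)."""
    return dct_sample - dct_reference


def fold_change(ddct: float) -> float:
    """Relative expression 2^(-ΔΔCt), assuming ~100% PCR efficiency."""
    return 2.0 ** (-ddct)


# Hypothetical triplicate Ct readings for one tumor sample.
target = [28.1, 28.3, 28.2]
controls = {"RNU48": [24.0, 24.2, 24.1], "RNU6b": [25.9, 26.1, 26.0]}

dct = delta_ct(target, controls)                      # about 3.15
ddct = delta_delta_ct(dct, dct_reference=dct - 1.0)   # exactly one cycle later
fc = fold_change(ddct)                                # half the reference level
```

A higher ΔCt under this convention means lower expression, so a sample whose ΔCt is one cycle above the reference has roughly half the reference expression.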

The qPCR was performed on 176 unique individuals. After subjects that appeared in more than one stratum were removed (because they were recorded as a case and a control in different strata, given the risk set sampling with replacement), there were 167 subjects that included 80 case–control pairs upon which the statistical analysis was based.

Statistical analyses

Stage 1: miRNA-sequencing analysis.

After quality control, the sva R package was used to correct for potential batch effects. Differential expression of the miRNAs was assessed using the DESeq2 R package. P values were corrected for multiple testing using the method of Benjamini and Hochberg (23). The candidate miRNAs were selected on the basis of two criteria: their statistical significance (Padj < 0.05) and their expression level (the baseMean >100).
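The two selection criteria can be illustrated with a short sketch. The study used DESeq2, which reports adjusted P values itself; the code below only reimplements the Benjamini–Hochberg step-up adjustment and the two thresholds for illustration. The function names, the record layout, and the example P values are assumptions, not output from the study.

```python
from typing import Dict, List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_candidates(results: List[Dict], alpha: float = 0.05,
                      min_base_mean: float = 100.0) -> List[str]:
    """Keep miRNAs with Padj < alpha and mean normalized count > threshold."""
    padj = benjamini_hochberg([r["pvalue"] for r in results])
    return [r["name"] for r, q in zip(results, padj)
            if q < alpha and r["base_mean"] > min_base_mean]


# Hypothetical differential expression results.
results = [
    {"name": "miR-A", "pvalue": 0.001, "base_mean": 500.0},
    {"name": "miR-B", "pvalue": 0.008, "base_mean": 40.0},
    {"name": "miR-C", "pvalue": 0.039, "base_mean": 900.0},
    {"name": "miR-D", "pvalue": 0.041, "base_mean": 300.0},
    {"name": "miR-E", "pvalue": 0.600, "base_mean": 700.0},
]
padj = benjamini_hochberg([r["pvalue"] for r in results])
candidates = select_candidates(results)
```

In this toy example miR-B passes the significance criterion but is dropped for low expression, whereas miR-C and miR-D are nominally significant but do not survive the multiple-testing correction.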

To evaluate whether the miRNAs identified above were associated with risk of distant metastasis beyond the contribution of important clinical variables, each miRNA was added into a base logistic regression model. The base model included clinical variables selected from tumor size, number of positive lymph nodes, lymphovascular invasion, chemotherapy, hormone therapy, radiotherapy, and ER/PR/HER2 status, by a step-wise variable selection procedure.

Stage 2: qPCR validation.

Differentially expressed miRNAs identified by sequencing were further confirmed by qPCR in an independent validation study population. Relative miRNA expression was calculated using the ΔΔCt method. The normalization controls, RNU6b and RNU48, were used as the reference for comparative measures. For quality control of the qPCR, hsa-miR-30a-3p was evaluated using two different total RNA input amounts at different times, which provided an estimate of technical replicability. The agreement between the technical replicates was evaluated using Spearman correlation.
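The agreement measure used for the technical replicates can be sketched in a few lines. This is a generic Spearman rank correlation (average ranks for ties, then a Pearson correlation of the ranks) rather than the study's analysis code, and the replicate values are hypothetical.

```python
from typing import List


def average_ranks(values: List[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical ΔCt values for the same five samples in two qPCR runs.
run1 = [5.1, 6.3, 4.8, 7.0, 5.9]
run2 = [5.4, 6.0, 4.9, 7.2, 6.1]
rho = spearman(run1, run2)
```

Because the statistic depends only on ranks, it is unaffected by a systematic shift in Ct between runs, such as the shift expected when the total RNA input amount differs.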

We first used paired Wilcoxon tests to validate the association of each candidate miRNA with risk of distant metastasis. Then, to test whether the candidate miRNAs collectively predicted risk of distant metastasis, we constructed an miRNA score as $\text{miRNA score} = w_1 \times \Delta\mathrm{Ct}_1 + \dots + w_n \times \Delta\mathrm{Ct}_n$, where $\Delta\mathrm{Ct}_i$ was calculated as the reference Ct value minus the Ct value of targeted miRNA $i$, and $w_i$ is the logistic regression coefficient of miRNA $i$ estimated from the logistic regression model based on sequencing data from stage I, while accounting for important clinical variables. The clinical variables were evaluated by constructing a clinical score using the same approach as in stage I. The areas under the curve [AUC; 95% confidence interval (CI)] from logistic regression models with the clinical score only and with both the clinical score and the miRNA score were estimated using the pROC R package. Finally, in exploratory analyses we estimated the AUCs of the miRNA scores for breast cancer subtypes classified according to the St. Gallen Consensus (24).
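The score construction and its discrimination can be sketched as below. This is a minimal illustration, not the study's code: the weights, miRNA names, and ΔCt values are hypothetical, ΔCt follows the sign convention of this paragraph (reference minus target), and the AUC is computed directly as the proportion of case–control score comparisons ranked correctly. For a single score with a positive coefficient this equals the AUC of a one-variable logistic model, but it ignores the matched design and does not reproduce the confidence intervals obtained from pROC.

```python
from typing import Dict, List


def mirna_score(weights: Dict[str, float], delta_ct: Dict[str, float]) -> float:
    """Weighted sum of ΔCt values: score = w_1*ΔCt_1 + ... + w_n*ΔCt_n."""
    return sum(w * delta_ct[name] for name, w in weights.items())


def auc(case_scores: List[float], control_scores: List[float]) -> float:
    """Probability that a random case scores higher than a random control."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5   # ties count as half
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical stage I coefficients and one subject's ΔCt values.
weights = {"miR-x": -0.4, "miR-y": 0.2}
subject_dct = {"miR-x": 2.0, "miR-y": 1.0}
score = mirna_score(weights, subject_dct)

# Hypothetical scores for three cases and three controls.
example_auc = auc([0.9, 0.4, 0.7], [0.3, 0.5, 0.1])
```

Fixing the weights from stage I and applying them unchanged to the stage II qPCR measurements is what makes the second stage a validation rather than a refit.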

External validation.

As an external validation, the R package “TCGA2STAT” was used to download an miRNA sequencing dataset for invasive breast carcinoma from The Cancer Genome Atlas (TCGA; ref. 25). In this dataset, 1,046 miRNAs of 755 patients with breast carcinoma were available for analysis. Cox proportional hazard regression models were used to examine the association of the individual miRNAs identified in stage I and of the combined score (created using the same weights as for the score created from the study data) with risk of death.

Pathway enrichment analysis.

DIANA miRPath pathway enrichment analysis was used to gain insight into pathways related to differentially expressed miRNAs (http://diana.imis.athena-innovation.gr/DianaTools/index.php?r=mirpath/index). An enrichment analysis of multiple miRNA target genes compared miRNA targets with all known KEGG pathways by Fisher's exact test.
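The enrichment test can be illustrated with a one-sided Fisher's exact test on the overlap between miRNA target genes and a pathway's genes. This is a generic sketch, not the DIANA miRPath implementation, whose internal options are not described here; the function name, the one-sided alternative, and the gene counts are assumptions.

```python
from math import comb


def fisher_enrichment_p(overlap: int, n_targets: int,
                        n_pathway: int, n_universe: int) -> float:
    """One-sided Fisher's exact P value for over-representation.

    Probability of seeing at least `overlap` pathway genes among
    `n_targets` genes drawn without replacement from `n_universe` genes,
    of which `n_pathway` belong to the pathway (hypergeometric tail).
    """
    total = comb(n_universe, n_targets)
    upper = min(n_targets, n_pathway)
    tail = sum(
        comb(n_pathway, i) * comb(n_universe - n_pathway, n_targets - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes in the universe, 5 in the pathway,
# 6 predicted targets, 4 of which fall in the pathway.
p_value = fisher_enrichment_p(overlap=4, n_targets=6, n_pathway=5, n_universe=20)
```

With these toy counts the tail contains 540 of the 38,760 possible target sets, so the overlap would be unlikely to arise by chance.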

Stage 1: discovery

Supplementary Table S1 shows the clinical characteristics of the study subjects included in stage I according to case–control status. Risk of distant metastasis was increased in association with tumor size, number of positive lymph nodes, lymphovascular invasion, and HER2/neu receptor positivity, and was decreased in association with PR positivity. A stepwise variable selection procedure with a selection cutoff P of 0.05 was used to select clinical variables most predictive of risk of distant metastasis. The final clinical model included two variables: tumor size (OR, 1.04; 95% CI, 1.01–1.07; P = 0.003) and number of positive lymph nodes (OR, 1.13; 95% CI, 1.03–1.24; P = 0.009; the full model with all clinical variables is shown in Supplementary Table S2).

A total of 118 case–control pairs were used for detection of differential expression of miRNAs by sequencing. We first evaluated the reliability of the sequencing data by sequencing samples from 46 subjects twice, and calculating the Spearman correlation of the miRNA expression levels between the two sequencing runs. The results demonstrated excellent agreement, with the average of the Spearman correlation coefficients for the 500 miRNAs with the highest expression levels being 0.94 (interquartile range: 0.036, P < 10e−10).

Of the 1,160 miRNAs that were detected by sequencing, 19 were significantly differentially expressed between the cases and controls (Padj < 0.05; Fig. 1; Table 1). For 12 of the miRNAs, expression levels were higher in the cases than in the controls, while for seven miRNAs the reverse was observed. From the 19 miRNAs, we selected the 13 miRNAs (of which six were underexpressed in the cases relative to the controls) that were most strongly differentially expressed between cases and controls as candidate miRNAs for validation. Of these 13 miRNAs, five (hsa-miR-30a-5p, hsa-miR-30a-3p, hsa-miR-30c-2-3p, hsa-miR-125b-1-3p, and hsa-miR-205-5p) remained statistically significant (P < 0.05) after adjustment for the important clinical variables identified in stage I.

Figure 1.

Volcano plot illustrates miRNAs differentially expressed between patients with distant metastasis and patients without distant metastasis based on miRNA sequencing data.

Table 1.

The 19 miRNAs with Padj < 0.05 in miRNA sequencing analysis


The top 10 KEGG pathways involving at least 12 of the identified miRNAs are shown in Supplementary Table S3. Among the deregulated pathways were proteoglycans in cancer, cell-cycle pathways, ubiquitin-mediated proteolysis, and the p53 signaling pathway.

Stage 2: validation

Supplementary Table S4 shows characteristics of the cases and controls included in stage II, which involved 80 case–control pairs. Similar to stage I, univariate analysis indicated risk of distant metastasis was increased in association with the number of positive lymph nodes and was decreased in association with PR positivity. In addition, hormone therapy was associated with increased risk. In a multivariate analysis, two clinical variables, number of positive lymph nodes (OR, 1.29; 95% CI, 1.06–1.56; P = 0.011) and PR positivity (OR, 0.16; 95% CI, 0.04–0.67; P = 0.013) were statistically significant (Supplementary Table S4).

The 13 candidate miRNAs from stage I were validated by qPCR in an independent nested case–control study with 80 case–control pairs. For technical validation, we replicated the qPCR assay for one candidate miRNA (hsa-miR-30a-3p) in the 127 samples that had adequate RNA amounts. The correlation between two qPCR assays was high, with a Spearman correlation coefficient of 0.91 (P < 0.001; Supplementary Fig. S2).

Each of the 13 candidate miRNAs was first evaluated without accounting for clinical variables. The results indicated that 12 of the 13 miRNAs had the same direction of association in both stage I and stage II, the only exception being hsa-miR-340-5p (P = 0.837; Table 2). However, only the expression levels of hsa-miR-30a-3p, hsa-miR-30c-2-3p, hsa-miR-30c-5p, and hsa-miR-205-5p were statistically significantly different between subjects who did and did not develop distant metastasis (P < 0.05, paired Wilcoxon test). The 13 miRNAs were further validated by fitting each separately in a logistic regression model with adjustment for tumor size and number of positive lymph nodes. After adjustment for the clinical variables identified in stage I, only two individual miRNAs (hsa-miR-30c-2-3p and hsa-miR-30c-5p) were significantly associated with risk of distant metastasis (P < 0.05; Table 2).

Table 2.

Validation of the associations of 13 candidate miRNAs with metastasis in stage II by RT-PCR and survival in TCGA miRNA-sequencing data

Column groups: Wilcoxon test (FC, P); logistic regression results with adjustment for clinical variables (a) (Log(OR), P); Cox regression, death as outcome with adjustment for age (Log(HR), P).

miRNA               FC     P (Wilcoxon)  Log(OR)   P (logistic)  Log(HR)   P (Cox)
hsa-miR-30a-3p      0.73   0.029         −0.227    0.180         N.A.      N.A.
hsa-miR-30a-5p      0.81   0.073         −0.280    0.100         −0.276    0.022
hsa-miR-301b-3p     1.22   0.118         0.062     0.726         0.024     0.782
hsa-miR-196a-3p     1.43   0.082         0.157     0.350         0.004     0.969
hsa-miR-196a-5p     1.22   0.149         −0.018    0.913         0.019     0.836
hsa-miR-130b-3p     1.09   0.316         0.044     0.796         0.497     0.497
hsa-miR-93-5p       1.09   0.534         0.066     0.701         0.039     0.674
hsa-miR-30c-2-3p    0.70   0.015         −0.404    0.022         −0.123    0.341
hsa-miR-503-5p      0.95   0.789         −0.174    0.309         0.120     0.103
hsa-miR-30c-5p      0.80   0.046         −0.347    0.044         −0.075    0.490
hsa-miR-340-5p      0.94   0.837         −0.100    0.554         0.172     0.0005
hsa-miR-125b-1-3p   0.86   0.189         −0.327    0.058         0.088     0.408
hsa-miR-205-5p      0.73   0.032         −0.200    0.241         −0.103    0.354

Abbreviation: FC, fold change.

(a) On the basis of miRNA standardized values.
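The Log(OR) and Log(HR) columns of Table 2 can be converted back to odds ratios and hazard ratios by exponentiation. The short sketch below does this for three entries, assuming that the logarithms are natural logarithms; that assumption is consistent with hsa-miR-30a-5p, whose Log(HR) of −0.276 corresponds to the HR of 0.76 reported in the text. The helper name is hypothetical.

```python
from math import exp


def ratio_from_log(log_estimate: float) -> float:
    """Convert a natural-log odds ratio or hazard ratio back to a ratio."""
    return exp(log_estimate)


# Log(OR) values taken from Table 2 (adjusted logistic regression).
table2_log_or = {"hsa-miR-30c-2-3p": -0.404, "hsa-miR-30c-5p": -0.347}
odds_ratios = {name: round(ratio_from_log(v), 2)
               for name, v in table2_log_or.items()}

# Log(HR) for hsa-miR-30a-5p from Table 2 (Cox regression).
hr_30a_5p = round(ratio_from_log(-0.276), 2)
```

Because the logistic regression estimates are based on standardized miRNA values, each odds ratio is interpreted per one standard deviation increase in expression; values below 1 indicate lower odds of distant metastasis with higher expression.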

We then constructed an miRNA score to examine whether the 13 candidate miRNAs collectively predicted risk of distant metastasis. The miRNA score was significantly associated with risk of distant metastasis in models both without (OR per unit increase in score = 1.40; 95% CI, 1.11–1.77; P = 0.0048) and with (OR per unit increase in score = 1.30; 95% CI, 1.03–1.66; P = 0.029) the clinical score (based on tumor size and number of positive lymph nodes). The AUCs for the miRNA score alone, the clinical score alone, and for the miRNA score and the clinical score combined were 0.64 (95% CI, 0.55–0.72), 0.67 (95% CI, 0.58–0.75), and 0.69 (95% CI, 0.61–0.77), respectively (Fig. 2A, B, C, respectively). When these analyses were repeated after exclusion of the one miRNA that was not validated by qPCR (hsa-miR-340-5p), the results were almost identical [the AUCs of the miRNA scores without and with the clinical scores were 0.63 (95% CI, 0.55–0.72) and 0.69 (95% CI, 0.61–0.77), respectively].

Figure 2.

ROC curves for the miRNA score for predicting the risk of distant metastasis. ROC analysis was performed for the miRNA score only (A), for the clinical score only (B), and for the miRNA and the clinical scores combined (C).


In exploratory analyses, we estimated the AUCs of the miRNA scores for the molecular subtypes of breast cancer assigned according to the St. Gallen Consensus (Supplementary Table S5). Although the subgroup analysis was limited by sample size, the results suggest that the miRNA score performed well in all except the Luminal B (HER2+) subgroup.

In the external validation, a total of 12 of the 13 miRNAs identified in our study were found in the TCGA miRNA dataset (the exception being hsa-miR-30a-3p). Among them, two miRNAs were significantly associated with risk of death after adjustment for age (hsa-miR-30a-5p, HR per unit increase in expression level = 0.76, 95% CI = 0.60–0.96, P = 0.022; hsa-miR-340-5p, HR per unit increase in expression level = 1.18, 95% CI = 1.08–1.31, P = 0.0005; see Table 2). The miRNA score was also associated with risk of death (HR per unit increase in score = 1.14; 95% CI, 1.00–1.31; P = 0.042). The AUC of the model at 5 years with age only was 0.63 (95% CI, 0.55–0.71). After adding the miRNA score to the model, the AUC increased to 0.66 (95% CI, 0.58–0.74).

The results of this population-based study suggest that in women with invasive breast cancer, an miRNA score that incorporates both clinical variables (tumor size, number of positive lymph nodes) and expression levels in breast tumor tissue of the 13 miRNAs that we identified initially by sequencing and subsequently further evaluated using qPCR (with additional external validation for 12 of the 13 miRNAs) is moderately predictive of risk of subsequent distant metastasis. When the 13 miRNAs were examined individually (each in conjunction with the clinical variables), only two miRNAs (hsa-miR-30c-2-3p and hsa-miR-30c-5p) were significantly associated with risk.

Although our DIANA-miRPath pathway enrichment analysis identified several cancer-relevant miRNA-targeted signaling pathways, evidence regarding the mechanism of action of the individual miRNAs in relation to metastasis is somewhat lacking. Specifically, of the 13 miRNAs included in our signature, there is limited evidence regarding the role of hsa-miR-125-5p in breast cancer progression, while there is some evidence to suggest that miR-301-5p stimulates proliferation and invasion (26) and that hsa-miR-196-5p and hsa-miR-340-5p inhibit progression (27, 28). For hsa-miR-30-5p, hsa-miR-130-5p, hsa-miR-93-5p, hsa-miR-503-5p, and hsa-miR-205-5p, there is conflicting evidence regarding their roles in breast cancer progression, with some reports suggesting functions that operate to increase the risk of progression and others suggesting the reverse (29–35). The apparently conflicting evidence regarding the roles of specific miRNAs in breast cancer prognosis is consistent with the observation that some miRNAs exert both oncogenic and tumor suppressive effects (36, 37). This reflects the fact that individual miRNAs, which regulate gene expression by suppressing the translation and reducing the stability of mRNA, can influence the expression of a large number of genes (on the order of hundreds to thousands; refs. 36, 38). The overall net oncogenic or tumor suppressive effect of an miRNA depends in part upon the balance between miRNA-mediated upregulation or downregulation of oncogenic and tumor suppressive pathways, as well as tumor–immune system interactions and tumor-modifying extrinsic factors (36). Of note, in this study, all four miRNAs (hsa-miR-30a-3p, hsa-miR-30c-2-3p, hsa-miR-30c-5p, and hsa-miR-205-5p) that were significantly differentially expressed between cases and controls when reevaluated using qPCR were downregulated in those who developed distant metastasis. 
Nevertheless, there is a need for further work, perhaps entailing in vitro or in vivo experiments, to confirm the relevance of the miRNAs identified here to breast cancer progression.

Several previous studies (summarized in Supplementary Table S6) involving discovery and validation stages have identified miRNA signatures related to breast cancer prognosis (39–47). These signatures have involved anywhere from two (40) to 22 miRNAs (44), with little overlap in the miRNAs included in the various signatures. The reasons for the lack of overlap are unclear, but may include between-study differences in the composition of the study populations by breast cancer type and ethnicity, and differences in the assays used for miRNA detection. Of note, however, when we performed DIANA-miRPath pathway enrichment analysis using the miRNAs in the signatures identified in these studies, the top ranked pathway in our study [proteoglycans in cancer (hsa05205)] was also predicted in all but the study of Du and colleagues (see ref. 6 in footnote of Supplementary Table S6), in which a signature with only two miRNAs was generated. This is of interest because altered expression of proteoglycans affects cell signaling, growth and survival, cell adhesion, migration and angiogenesis, and has been associated with breast cancer metastasis (48).

Strengths of this study include the large, well-characterized study population; use in the discovery stage of a state-of-the-art assay for miRNA sequencing and use in the validation stage of qPCR, the current gold standard for determining miRNA expression levels; intensive training of the laboratory technicians in the study methods prior to each stage of the project to minimize assay variability; strict quality control for RNA extraction, preparation, and quantification, and built-in controls to monitor the accurate performance of our assays; and demonstration that the sequencing and the qPCR results were highly repeatable. Study limitations include the observation that RNA obtained from FFPE tissue is often degraded. Although this may have resulted in some individuals being misclassified with respect to their miRNA expression profile, we have demonstrated that it is possible to identify and characterize miRNA expression patterns using RNA extracted from FFPE tissue, and we have shown excellent agreement between miRNA expression levels in fresh and matched FFPE tissue (18); furthermore, as indicated above, both the miRNA sequencing and the qPCR assays used in this study were repeatable. Finally, although the sample size was substantial, there is a need for larger studies using a similar approach to that described here.

In conclusion, although the results of this study did not provide strong discrimination between those who developed distant metastasis and those who did not, replication of the approach described here (including further validation of the current findings), focused on breast cancer cases for whom archived FFPE invasive breast cancer tissue plus long-term clinical follow-up is available, may both lead to the development of a prognostic miRNA signature that can be used in the clinical management of breast cancer (e.g., by helping to identify patients who need enhanced surveillance and early aggressive treatment) and foster the development of novel therapeutic agents (37).

The interpretation and reporting of these data are the sole responsibility of the authors.

No potential conflicts of interest were disclosed.

Conception and design: T.E. Rohan, T. Wang, O. Loudig

Development of methodology: T.E. Rohan, T. Wang, O. Loudig

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): T.E. Rohan, S. Weinmann, O. Loudig

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): T.E. Rohan, T. Wang, Y. Wang, J. Lin, M. Ginsberg, O. Loudig

Writing, review, and/or revision of the manuscript: T.E. Rohan, T. Wang, S. Weinmann, Y. Wang, M. Ginsberg, O. Loudig

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): T.E. Rohan, S. Weinmann, M. Ginsberg, O. Loudig

Study supervision: T.E. Rohan, O. Loudig

This study was supported by a grant to T.E. Rohan from the Breast Cancer Research Foundation (BCRF-17-138). We thank Minerva Manickchand for her dedicated work as the project coordinator for this study. We would also like to thank the following staff at the Kaiser Center for Health Research who worked on this project for several years: Nicole Bennett, Kristine Bennett, Donna Gleason, Kathy Pearson, Tracy Dodge, Stacy Harsh, and Kevin Winn. Finally, we also gratefully acknowledge the contribution of Dr. Andrew G. Glass (deceased).

1. Ries LAG, Melbert D, Krapcho M, Mariotto A, Miller BA, Feuer EJ, et al. SEER Cancer Statistics Review. 1975-2004. Bethesda, MD: National Cancer Institute. Available from: http://seer.cancer.gov/csr/1975_2004/, based on November 2006 SEER data submission, posted to the SEER web site, 2007.
2. Weigelt B, Peterse JL, van 't Veer LJ. Breast cancer metastasis: markers and models. Nat Rev Cancer 2005;5:591–602.
3. Lin S, Gregory RI. MicroRNA biogenesis pathways in cancer. Nat Rev Cancer 2015;15:321–33.
4. Acunzo M, Romano G, Wernicke D, Croce CM. MicroRNA and cancer–a brief overview. Adv Biol Regul 2015;57:1–9.
5. van Schooneveld E, Wildiers H, Vergote I, Vermeulen PB, Dirix LY, Van Laere SJ. Dysregulation of microRNAs in breast cancer and their potential role as prognostic and predictive biomarkers in patient management. Breast Cancer Res 2015;17:21.
6. Pencheva N, Tavazoie SF. Control of metastatic progression by microRNA regulatory networks. Nat Cell Biol 2013;15:546–54.
7. Takahashi RU, Miyazaki H, Ochiya T. The roles of microRNAs in breast cancer. Cancers 2015;7:598–616.
8. Ma L, Reinhardt F, Pan E, Soutschek J, Bhat B, Marcusson EG, et al. Therapeutic silencing of miR-10b inhibits metastasis in a mouse mammary tumor model. Nat Biotechnol 2010;28:341–7.
9. Huang Q, Gumireddy K, Schrier M, le Sage C, Nagel R, Nair S, et al. The microRNAs miR-373 and miR-520c promote tumour invasion and metastasis. Nat Cell Biol 2008;10:202–10.
10. Su CM, Wang MY, Hong CC, Chen HA, Su YH, Wu CH, et al. miR-520h is crucial for DAPK2 regulation and breast cancer progression. Oncogene 2016;35:1134–42.
11. Tavazoie SF, Alarcon C, Oskarsson T, Padua D, Wang Q, Bos PD, et al. Endogenous human microRNAs that suppress breast cancer metastasis. Nature 2008;451:147–52.
12. Valastyan S, Reinhardt F, Benaich N, Calogrias D, Szasz AM, Wang ZC, et al. A pleiotropically acting microRNA, miR-31, inhibits breast cancer metastasis. Cell 2009;137:1032–46.
13. Cheng CJ, Bahal R, Babar IA, Pincus Z, Barrera F, Liu C, et al. MicroRNA silencing for cancer therapy targeted to the tumour microenvironment. Nature 2015;518:107–10.
14. Rohan TE, Xue X, Lin HM, D'Alfonso TM, Ginter PS, Oktay MH, et al. Tumor microenvironment of metastasis and risk of distant metastasis of breast cancer. J Natl Cancer Inst 2014;106:dju136.
15. Elston CW, Ellis IO. Pathological prognostic factors in breast cancer. I. The value of histological grade in breast cancer: experience from a large study with long-term follow-up. Histopathology 1991;19:403–10.
16. Genestie C, Zafrani B, Asselain B, Fourquet A, Rozan S, Validire P, et al. Comparison of the prognostic value of Scarff-Bloom-Richardson and Nottingham histological grades in a series of 825 cases of breast cancer: major importance of the mitotic count as a component of both grading systems. Anticancer Res 1998;18:571–6.
17. Loudig O, Milova E, Brandwein-Gensler M, Massimi A, Belbin TJ, Childs G, et al. Molecular restoration of archived transcriptional profiles by complementary-template reverse-transcription (CT-RT). Nucleic Acids Res 2007;35:e94.
18. Kotorashvili A, Ramnauth A, Liu C, Lin J, Ye K, Kim R, et al. Effective DNA/RNA co-extraction for analysis of microRNAs, mRNAs, and genomic DNA from formalin-fixed paraffin-embedded specimens. PLoS One 2012;7:e34683.
19. Giricz O, Reynolds PA, Ramnauth A, Liu C, Wang T, Stead L, et al. Hsa-miR-375 is differentially expressed during breast lobular neoplasia and promotes loss of mammary acinar polarity. J Pathol 2012;226:108–19.
20. Streichert T, Otto B, Lehmann U. MicroRNA expression profiling in archival tissue specimens: methods and data processing. Mol Biotechnol 2012;50:159–69.
21. Loudig O, Wang T, Ye K, Lin J, Wang Y, Ramnauth A, et al. Evaluation and Adaptation of a Laboratory-Based cDNA library preparation protocol for retrospective sequencing of archived MicroRNAs from up to 35-Year-Old Clinical FFPE Specimens. Int J Mol Sci 2017;18:627.
22. Farazi TA, Brown M, Morozov P, Ten Hoeve JJ, Ben-Dov IZ, Hovestadt V, et al. Bioinformatic analysis of barcoded cDNA libraries for small RNA profiling by next-generation sequencing. Methods 2012;58:171–87.
23. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Roy Stat Soc Ser B Stat Methodol 1995;57:289–300.
24. Goldhirsch A, Winer EP, Coates AS, Gelber RD, Piccart-Gebhart M, Thurlimann B, et al. Personalizing the treatment of women with early breast cancer: highlights of the St Gallen international expert consensus on the primary therapy of early breast cancer 2013. Ann Oncol 2013;24:2206–23.
25. Wan YW, Allen GI, Liu Z. TCGA2STAT: simple TCGA data access for integrated statistical analysis in R. Bioinformatics 2016;32:952–4.
26. Shi W, Gerster K, Alajez NM, Tsang J, Waldron L, Pintilie M, et al. MicroRNA-301 mediates proliferation and invasion in human breast cancer. Cancer Res 2011;71:2926–37.
27. Li Y, Zhang M, Chen H, Dong Z, Ganapathy V, Thangaraju M, et al. Ratio of miR-196s to HOXC8 messenger RNA correlates with breast cancer cell migration and metastasis. Cancer Res 2010;70:7894–904.
28. Maskey N, Li D, Xu H, Song H, Wu C, Hua K, et al. MicroRNA-340 inhibits invasion and metastasis by downregulating ROCK1 in breast cancer cells. Oncol Lett 2017;14:2261–7.
29. Greene SB, Herschkowitz JI, Rosen JM. The ups and downs of miR-205: identifying the roles of miR-205 in mammary gland development and breast cancer. RNA Biol 2010;7:300–4.
30. Li N, Miao Y, Shan Y, Liu B, Li Y, Zhao L, et al. MiR-106b and miR-93 regulate cell progression by suppression of PTEN via PI3K/Akt pathway in breast cancer. Cell Death Dis 2017;8:e2796.
31. Long J, Ou C, Xia H, Zhu Y, Liu D. MiR-503 inhibited cell proliferation of human breast cancer cells by suppressing CCND1 expression. Tumour Biol 2015;36:8697–702.
32. Shyamasundar S, Lim JP, Bay BH. miR-93 inhibits the invasive potential of triple-negative breast cancer cells in vitro via protein kinase WNK1. Int J Oncol 2016;49:2629–36.
33. Yang SJ, Yang SY, Wang DD, Chen X, Shen HY, Zhang XH, et al. The miR-30 family: versatile players in breast cancer. Tumour Biol 2017;39:1010428317692204.
34. Zhang HD, Jiang LH, Sun DW, Li J, Ji ZL. The role of miR-130a in cancer. Breast Cancer 2017;24:521–7.
35. Zhao Z, Fan X, Jiang L, Xu Z, Xue L, Zhan Q, et al. miR-503–3p promotes epithelial-mesenchymal transition in breast cancer by directly targeting SMAD2 and E-cadherin. J Genet Genom 2017;44:75–84.
36. Svoronos AA, Engelman DM, Slack FJ. OncomiR or tumor suppressor? The duplicity of microRNAs in cancer. Cancer Res 2016;76:3666–70.
37. Weidle UH, Dickopf S, Hintermair C, Kollmorgen G, Birzele F, Brinkmann U. The role of micro RNAs in breast cancer metastasis: preclinical validation and potential therapeutic targets. Cancer Genomics Proteomics 2018;15:17–39.
38. Ritchie W, Rasko JE, Flamant S. MicroRNA target prediction and validation. Adv Exp Med Biol 2013;774:39–53.
39. Chen X, Wang YW, Zhu WJ, Li Y, Liu L, Yin G, et al. A 4-microRNA signature predicts lymph node metastasis and prognosis in breast cancer. Hum Pathol 2018;76:122–32.
40. Du F, Yuan P, Zhao ZT, Yang Z, Wang T, Zhao JD, et al. A miRNA-based signature predicts development of disease recurrence in HER2 positive breast cancer after adjuvant trastuzumab-based treatment. Sci Rep 2016;6:33825.
41. Gasparini P, Cascione L, Fassan M, Lovat F, Guler G, Balci S, et al. microRNA expression profiling identifies a four microRNA signature as a novel diagnostic and prognostic biomarker in triple negative breast cancers. Oncotarget 2014;5:1174–84.
42. Gong C, Tan W, Chen K, You N, Zhu S, Liang G, et al. Prognostic value of a BCSC-associated microRNA signature in hormone receptor-positive HER2-negative breast cancer. EBioMedicine 2016;11:199–209.
43. Hironaka-Mitsuhashi A, Matsuzaki J, Takahashi RU, Yoshida M, Nezu Y, Yamamoto Y, et al. A tissue microRNA signature that predicts the prognosis of breast cancer in young women. PLoS One 2017;12:e0187638.
44. Miller PC, Clarke J, Koru-Sengul T, Brinkman J, El-Ashry D. A novel MAPK-microRNA signature is predictive of hormone-therapy resistance and poor outcome in ER-positive breast cancer. Clin Cancer Res 2015;21:373–85.
45. Perez-Rivas LG, Jerez JM, Carmona R, de Luque V, Vicioso L, Claros MG, et al. A microRNA signature associated with early recurrence in breast cancer. PLoS One 2014;9:e91884.
46. Volinia S, Croce CM. Prognostic microRNA/mRNA signature from the integrated analysis of patients with invasive breast cancer. Proc Natl Acad Sci U S A 2013;110:7413–7.
47. Zhou X, Wang X, Huang Z, Xu L, Zhu W, Liu P. An ER-associated miRNA signature predicts prognosis in ER-positive breast cancer. J Exp Clin Cancer Res 2014;33:94.
48. Theocharis AD, Skandalis SS, Neill T, Multhaupt HA, Hubo M, Frey H, et al. Insights into the key roles of proteoglycans in breast cancer biology and translational medicine. Biochim Biophys Acta 2015;1855:276–300.