Purpose:

Multigene assays provide useful prognostic information regarding hormone receptor (HR)-positive breast cancer. Next-generation sequencing (NGS)-based platforms have numerous advantages including reproducibility and adaptability in local laboratories. This study aimed to develop and validate an NGS-based multigene assay to predict the distant recurrence risk.

Experimental Design:

In total, 179 genes were selected from public databases: 149 genes that were highly correlated with the 21-gene recurrence score (RS) or were part of the 21-gene assay, and 30 reference genes. Targeted RNA-sequencing was performed using 250 and 93 archived breast cancer samples with a known RS in the training and verification sets, respectively, to develop the algorithm and NGS–Prognostic Score (NGS-PS). The assay was validated in 413 independent samples with long-term follow-up data on distant metastasis.

Results:

In the verification set, the NGS-PS and 21-gene RS displayed 91.4% concurrence (85/93 samples). In the validation cohort of 413 samples, the area under the receiver operating characteristic curve of the NGS-PS for distant recurrence was 0.76. The best NGS-PS cut-off value for predicting distant metastasis was 20; accordingly, 269 and 144 patients were classified as low- and high-risk, respectively. Five- and 10-year estimates of distant metastasis–free survival (DMFS) for low- versus high-risk groups were 97.0% versus 77.8% and 93.2% versus 64.4%, respectively. Among patients who did not receive chemotherapy, the hazard ratio for distant recurrence (high- vs. low-risk group) was 9.73 (95% CI, 3.59–26.40) and 3.19 (95% CI, 1.40–7.29) for patients aged ≤50 and >50 years, respectively.

Conclusions:

The newly developed and validated NGS-based multigene assay can predict the distant recurrence risk in ER-positive, HER2-negative breast cancer.

Translational Relevance

Multigene assays provide useful prognostic information regarding hormone receptor–positive breast cancer. Currently available genomic tests use data derived from reverse transcription PCR- or microarray-based gene expression analyses, and some studies have reported inconsistent results among young patients. Herein, we developed a multigene assay using a next-generation sequencing (NGS) platform and validated its potential to assess the risk of distant recurrence using samples obtained through long-term follow-up data. Specific pipelines were generated to normalize RNA-sequencing data to reflect the gene expression profiles of individual tumors. This multigene assay has independent prognostic potential to categorize patients into groups of low or high distant recurrence risk, especially among patients aged ≤50 years. The NGS technology used herein has advantages including reproducibility and adaptability in local laboratories, thus potentially revolutionizing clinically used genomic testing platforms.

Breast cancer is a heterogeneous disease with different biological and clinical characteristics among individual tumors. Along with standard clinicopathologic variables including patient age, tumor size, grade, and number of metastatic lymph nodes (LN), the surrogate markers used to determine molecular subtypes of breast cancer, including estrogen receptor (ER), progesterone receptor (PR), and HER2 status, are considered prognostic and predictive biomarkers (1–5). In particular, for ER-positive and HER2-negative breast cancers, various multigene expression assays have been developed and validated to have prognostic and/or predictive value (6–14). Such assays are recognized as prognostic factors in the American Joint Committee on Cancer (AJCC) Cancer Staging Manual, 8th Edition (15), and are widely used clinically per the treatment guidelines for early-stage breast cancer (16–18).

However, these genomic tests were developed in the United States or Europe and may not completely reflect the characteristics of breast cancers in young women. The peak age at diagnosis in Western countries is much older and only about 15%–30% of patients are premenopausal, whereas approximately half of the patients in Asia are premenopausal (19, 20). This is potentially important, considering that young age is associated with increased relapse and mortality rates among breast cancers of the Luminal subtype (21, 22). A comparative genomic analysis between predominantly premenopausal versus postmenopausal populations revealed higher proportions of Luminal B subtypes and ER downregulation in ER-positive subtypes among younger women (23). To better delineate the relapse risk in these populations, it is necessary to develop a multigene assay and validate it in a cohort comprising a higher proportion of young patients.

In the era of high-throughput next-generation sequencing (NGS) technologies, RNA-sequencing (RNA-seq)-based gene expression analysis would be suitable in developing a new assay. RNA-seq has advantages including higher flexibility, sensitivity, and accuracy for gene expression analysis than those of quantitative reverse transcription PCR (qRT-PCR) and microarray used in previously developed assays (24). RNA-seq helps analyze the transcriptome in an unbiased manner with a very low background signal, encompassing a large dynamic range of expression levels through which transcripts can be detected (25, 26). Furthermore, RNA-seq analysis is highly reproducible and currently easier to implement than microarray in individual laboratories, allowing the decentralization of this assay (27–29).

This study aimed to develop and validate an NGS-based multigene assay to predict the distant recurrence risk in ER-positive, HER2-negative early breast cancer.

Ethical approval

All procedures performed in this study were in accordance with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards, and with the ethical standards approved by the Institutional Review Board of Seoul National University Hospital (SNUH; H1401–150–623), Asan Medical Center (AMC; S2014–1828–0011), and Korea University Guro Hospital (16010–001). The investigators obtained written consent from patients included in this study according to requirements judged by the Institutional Review Board of each institution.

Selection of genes included in the multigene assay

The newly developed assay required a unique composition of genes potentially reflecting the gene expression profiles of the individual tumors and those that are stable in terms of RNA-seq data generation. In total, 179 genes were selected in this multigene assay (Supplementary Appendix). In brief, gene expression data from microarray data of 1,754 ER-positive and LN-negative breast cancer samples in seven Gene Expression Omnibus (GEO) datasets (30) and RNA-seq data of 1,025 ER-positive breast cancer samples in The Cancer Genome Atlas (TCGA; ref. 31) and a previous study performed at SNUH (32) were used herein. The 21-gene recurrence score (RS) was determined from these data, as reported previously (7). In total, 149 genes including those with a Pearson correlation coefficient of >0.5 with the calculated RS in individual samples and 16 non-reference genes in the 21-gene assay (33) were selected. Furthermore, 30 reference genes including 10 most upregulated, intermediately expressed, and downregulated genes with low variation, selected from RNA-seq data in TCGA, were included herein to constitute the final panel of 179 genes.
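The correlation-based selection step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the gene names and values are toy placeholders, the function names are hypothetical, and the signed Pearson coefficient is used exactly as stated in the text (whether absolute values were also considered is not specified).

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no correlation signal
    return sxy / sqrt(sxx * syy)

def select_correlated_genes(expression, rs, threshold=0.5):
    """Return the genes whose expression correlates with the RS above threshold.

    expression: dict mapping gene name -> per-sample expression values
    rs: per-sample recurrence scores, in the same sample order
    """
    return sorted(gene for gene, values in expression.items()
                  if pearson(values, rs) > threshold)

# Toy data (hypothetical values, not taken from GEO, TCGA, or this study)
toy_rs = [5, 12, 18, 27, 40]
toy_expr = {
    "GENE_A": [1.0, 2.1, 2.9, 4.2, 6.0],  # rises with the RS
    "GENE_B": [3.0, 3.1, 2.9, 3.0, 3.1],  # roughly flat
    "GENE_C": [6.0, 4.5, 3.8, 2.0, 1.0],  # falls with the RS
}
selected = select_correlated_genes(toy_expr, toy_rs)
print(selected)
```

In this toy example only the gene that rises with the RS passes the >0.5 threshold; the negatively correlated gene is excluded because the signed coefficient is used.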

Segregation of patients in training and verification cohorts

Formalin-fixed paraffin-embedded (FFPE) samples from 377 patients with available 21-gene RSs were collected at SNUH and AMC. The patients had ER-positive and HER2-negative early-stage breast cancer and underwent surgery between March 2010 and June 2016. After RNA extraction, targeted RNA-seq was successfully performed on samples from 343 (91.0%) patients. The interval from surgery to RNA extraction and RNA-seq ranged from 3 to 78 months. These patients were divided into independent training and verification sets of 250 and 93 patients, respectively, to be used for model development and verification. The RS distribution did not differ significantly between the two cohorts, as determined using the Wilcoxon rank sum test at a significance level of 0.05. Clinicopathologic characteristics including age, tumor size, nuclear and histologic grade, PR status, chemotherapy administration, and type of endocrine therapy used were reviewed. Clinical risk was categorized based on the Adjuvant! algorithm as used in the MINDACT (Microarray in Node-Negative Disease May Avoid Chemotherapy) trial (13). Integrated RS and clinical risk was categorized on the basis of an analysis performed with the TAILORx (Trial Assigning Individualized Options for Treatment) trial data (34). Women aged ≤50 years with an RS of 0–10 and any clinical risk or an RS of 11–20 and low clinical risk were categorized as low integrated risk, whereas those with an RS of 16–20 and high clinical risk or an RS of 21–100 and any clinical risk were categorized as high integrated risk. Women aged >50 years were categorized according to RS 0–25 (low risk) versus RS 26–100 (high risk) regardless of clinical risk.

Sample preparation and targeted RNA-seq

Detailed protocols for RNA extraction, ribosomal RNA depletion, cDNA library construction, hybridization capture, amplification of target-captured libraries, and NGS are described in the Supplementary Appendix. In brief, total RNA from 10 (10-μm thick) FFPE tissue sections with a minimum 30% tumor fraction was extracted using the RNeasy FFPE Kit (QIAGEN GmbH) in accordance with the manufacturer's instructions. After quantitative and qualitative analysis of extracted RNA, using the Qubit 4 Fluorometer (Thermo Fisher Scientific) and 4200 TapeStation (Agilent Technologies), rRNA was depleted and cDNA libraries were constructed using the KAPA Stranded RNA-Seq Kit with RiboErase (HMR; Kapa Biosystems) in accordance with the manufacturer's instructions. DNA fragments in the library were then hybridized with 179 gene probes, and target capture and amplification of the captured library were carried out with the Target Capture Solution Box #1, #2, and #3 (Celemics Inc.) and target capture probes synthesized by Celemics Inc. Each captured library was then validated prior to pooling. After the pooling ratios were calculated from the target size, the amount of data required, and the concentration of each sample, the pooled libraries were denatured and diluted in accordance with the Illumina sequencing instrument's User Guide. Paired-end targeted RNA-seq was carried out on the Illumina NextSeq 500 platform (Illumina). After the sequencing run, raw FASTQ files were analyzed.

Normalization of expression data and the conserved-exon panel

edgeR v3.26.8 (35), a widely used transcriptome analysis tool, was used to process the RNA-seq data generated herein, to quantify the expression levels of the 179 genes and to normalize them across samples. A computational framework was developed using (i) the Trimmed Mean of M values (TMM) method (36) and (ii) the conserved-exon panel (CEP) approach to generate a more robust gene expression profile. TMM, as implemented in edgeR, is one of the most effective normalization methods (37). Furthermore, we devised a read selection method named CEP, which restricts quantification to the exons conserved among the splicing isoforms of each gene. The detailed protocol for the development of the computational framework is described in the Supplementary Appendix.

Training and verification of the RS prediction model

The RS prediction model was developed using normalized gene expression values of the 179 genes derived from 250 samples in the training set. Least absolute shrinkage and selection operator (LASSO) regression was used to select variables (genes) and determine their coefficients so that the model output best matched the RS of each sample (38, 39). Twenty-one genes significantly influenced the prediction of the RS and were incorporated in the algorithm to obtain an NGS–Prognostic Score (NGS-PS) ranging from 0 to 100:

NGS-PS = 0.62 × KIF23 + 0.60 × SLC7A5 + 0.59 × KPNA2 + 0.53 × AURKA + 0.34 × E2F8 + 0.34 × MMP11 + 0.24 × SHCBP1 + 0.20 × CTSL2 + 0.16 × CENPE + 0.16 × TFRC + 0.15 × KIF18A + 0.14 × CCNE2 + 0.04 × KIF14 + 0.04 × RRM2 − 0.04 × CX3CR1 − 0.05 × JMJD5 − 0.33 × CACNA1D − 0.36 × ESR1 − 0.48 × GSTM1 − 1.45 × PGR − 2.04 × SCUBE2 + 41.16

The remaining 158 genes served as reference genes. Verification analyses were carried out using the 93 independent samples. For classification analysis, the concordance of high- and low-risk categorization between the RS and NGS-PS was assessed using cut-off values of ≤25 (low risk) versus >25 (high risk) for each sample. The Pearson correlation coefficient between the RS and NGS-PS values was also determined.
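As an illustration, the NGS-PS equation above can be written as a short Python function. This is a sketch under stated assumptions rather than the validated assay software: the input is assumed to be a dictionary of already normalized expression values (the normalization itself is described only in the Supplementary Appendix), the function and variable names are hypothetical, and clamping to the 0 to 100 range is an assumption, because the text gives the range but not how it is enforced.

```python
# Coefficients and intercept copied from the NGS-PS equation in the text.
NGS_PS_COEFFICIENTS = {
    "KIF23": 0.62, "SLC7A5": 0.60, "KPNA2": 0.59, "AURKA": 0.53,
    "E2F8": 0.34, "MMP11": 0.34, "SHCBP1": 0.24, "CTSL2": 0.20,
    "CENPE": 0.16, "TFRC": 0.16, "KIF18A": 0.15, "CCNE2": 0.14,
    "KIF14": 0.04, "RRM2": 0.04, "CX3CR1": -0.04, "JMJD5": -0.05,
    "CACNA1D": -0.33, "ESR1": -0.36, "GSTM1": -0.48, "PGR": -1.45,
    "SCUBE2": -2.04,
}
NGS_PS_INTERCEPT = 41.16

def ngs_ps(normalized_expression, clamp=True):
    """Compute the NGS-PS from normalized expression values of the 21 genes.

    clamp=True restricts the result to 0-100 (an assumption of this sketch).
    """
    missing = sorted(set(NGS_PS_COEFFICIENTS) - set(normalized_expression))
    if missing:
        raise KeyError(f"missing expression values for: {', '.join(missing)}")
    score = NGS_PS_INTERCEPT + sum(
        coef * normalized_expression[gene]
        for gene, coef in NGS_PS_COEFFICIENTS.items()
    )
    if clamp:
        score = min(100.0, max(0.0, score))
    return score

def risk_group(score, cutoff=25.0):
    """Dichotomize a score: values above the cut-off are high risk."""
    return "high" if score > cutoff else "low"

# Toy profile (hypothetical): every gene at the same normalized value of 0.
flat_profile = {gene: 0.0 for gene in NGS_PS_COEFFICIENTS}
flat_score = ngs_ps(flat_profile)
print(flat_score, risk_group(flat_score))
```

The default cut-off of 25 mirrors the verification analysis against the RS; the cut-off of 20 derived later in the validation cohort can be passed as `cutoff=20.0`.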

Model validation with clinical samples and long-term follow-up

To validate the prognostic potential of the NGS-PS, FFPE samples of patients with ER-positive and HER2-negative invasive breast cancer with at least 60 months of follow-up at SNUH and AMC were obtained for analysis. These patients underwent surgery between 1998 and 2012 for LN-negative breast cancer with a tumor size of >0.5 cm. Patients who did not receive hormonal therapy or displayed only locoregional recurrence were excluded. Patients who did not receive adjuvant chemotherapy were included primarily. However, patients with distant metastasis despite adjuvant chemotherapy were included for the purpose of analyzing high-risk samples. Clinicopathologic characteristics including distant metastasis–free survival (DMFS), age, tumor size, nuclear and histologic grade, PR status, chemotherapy administration, and type of endocrine therapy used were reviewed. Clinical risk was categorized according to the MINDACT trial criteria (13).

Feasibility study of the NGS-based multigene assay

To assess the feasibility of the NGS-based multigene assay for clinical application, 100 FFPE samples of patients with ER-positive and HER2-negative invasive breast cancer who had undergone breast surgery within 2 months were obtained from SNUH for testing. RNA was extracted from 10-μm–thick FFPE tissue sections and the targeted RNA-seq data were used to generate the NGS-PS. Clinicopathologic characteristics including age, tumor size, number of tissue sections used, and NGS-PS were analyzed.

Statistical analyses

Statistical analysis was performed at the Medical Research Collaborating Center of SNUH Biomedical Research Institute. The receiver operating characteristic (ROC) curve was plotted using NGS-PS values classified as distant recurrence versus no recurrence. To designate a cut-off NGS-PS value that best differentiated patients into low- and high-risk groups, unbiased recursive partitioning (40) and the Contal & O'Quigley minimum P-value approach were used (41). The value with the highest HR was selected as the cut-off value. The validity of the optimal cut-off value was verified using the 2-fold cross-validation approach (Supplementary Appendix).
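For readers unfamiliar with the ROC analysis, the area under the curve for a binary outcome equals the probability that a randomly chosen patient with recurrence has a higher score than a randomly chosen patient without recurrence. The following Python sketch computes it that way; it is an illustration on toy values, not the SAS procedure used in this study, and like the analysis described above it treats recurrence as a binary label without accounting for follow-up time.

```python
def roc_auc(scores, events):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: per-patient score (e.g., NGS-PS)
    events: per-patient outcome, truthy for distant recurrence
    """
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical scores and outcomes, not from the validation cohort)
toy_scores = [12, 15, 18, 22, 25, 30, 35]
toy_events = [0, 0, 1, 0, 1, 1, 1]
toy_auc = roc_auc(toy_scores, toy_events)
print(round(toy_auc, 3))
```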

DMFS rates were estimated using the Kaplan–Meier method, and the log-rank test was performed to assess differences in DMFS rates between risk groups. Harrell C-indices based on NGS-PS, age, tumor size, histologic grade, PR status, and clinical risk were derived to assess the classification performance (42). A P value of <0.05 (two-sided) was considered statistically significant. DMFS rates were plotted as a continuous function of NGS-PS using Cox regression estimates. All statistical analyses were conducted using SAS version 9.4 (SAS Institute Inc.).
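The Kaplan–Meier estimate referred to above multiplies, at each observed event time, the fraction of at-risk patients who remain event-free. A minimal Python sketch on toy data is given below; the analyses in this study were run in SAS, and the function names and follow-up values here are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] steps at each distinct event time.

    times: follow-up time per patient (e.g., months)
    events: truthy if distant metastasis was observed, falsy if censored
    Patients censored at time t are still counted as at risk at t.
    """
    data = sorted(zip(times, events), key=lambda pair: pair[0])
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve

def survival_at(curve, t):
    """Step-function lookup: estimated survival probability at time t."""
    s = 1.0
    for time, surv in curve:
        if time <= t:
            s = surv
        else:
            break
    return s

# Toy follow-up data in months (hypothetical, not from the study cohorts)
toy_times = [10, 20, 20, 35, 50, 60, 72, 90, 120, 130]
toy_events = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
print(round(survival_at(toy_curve, 60), 3), round(survival_at(toy_curve, 120), 3))
```

Applying `survival_at` at 60 and 120 months corresponds to the 5- and 10-year DMFS estimates reported for each risk group; confidence intervals and the log-rank test are not covered by this sketch.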

Verification of the model to determine NGS-PSs to predict the 21-gene RS

The clinicopathologic characteristics of individuals in the training and verification cohorts are summarized in Table 1. The characteristics of the two groups were largely similar, although the proportion of patients aged ≤50 years was lower in the verification set than in the training set (61.3% vs. 72.8%). In the verification set comprising 93 samples, the mean NGS-PS and RS were 18.2 (range, 1.9–47.0) and 18.0 (range, 2–69), respectively. On setting the dichotomization cut-off as >25 (high) versus ≤25 (low) for both NGS-PS and RS, the two tests were concordant in 91.4% (85/93) of samples. The NGS-PS was high in 7.1% (6/84) of low-RS samples, and NGS-PS was low in 22.2% (2/9) of high-RS samples. On regression analysis, the Pearson correlation coefficient between the NGS-PS and RS was 0.84 (Supplementary Fig. S1).
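The concordance figures above follow directly from the 2 × 2 cross-classification at the cut-off of 25. The Python sketch below reproduces that arithmetic; because individual patient scores are not given in the text, placeholder scores (10 for any low value, 30 for any high value) stand in for the real ones, and only the category counts reported above are used.

```python
def dichotomize(score, cutoff=25.0):
    """Scores above the cut-off are high risk; all others are low risk."""
    return "high" if score > cutoff else "low"

def concordance(rs_values, ngs_ps_values, cutoff=25.0):
    """Cross-tabulate RS and NGS-PS categories and return (table, rate).

    table is keyed by (RS category, NGS-PS category).
    """
    if len(rs_values) != len(ngs_ps_values) or not rs_values:
        raise ValueError("need two non-empty, equal-length score lists")
    table = {(a, b): 0 for a in ("low", "high") for b in ("low", "high")}
    for rs, ps in zip(rs_values, ngs_ps_values):
        table[(dichotomize(rs, cutoff), dichotomize(ps, cutoff))] += 1
    agree = table[("low", "low")] + table[("high", "high")]
    return table, agree / len(rs_values)

# Placeholder scores encoding only the reported category counts:
# 84 low-RS samples (6 with high NGS-PS) and 9 high-RS samples (2 with low NGS-PS).
LOW, HIGH = 10.0, 30.0
rs_values = [LOW] * 78 + [LOW] * 6 + [HIGH] * 2 + [HIGH] * 7
ps_values = [LOW] * 78 + [HIGH] * 6 + [LOW] * 2 + [HIGH] * 7
table, rate = concordance(rs_values, ps_values)
print(table, round(100 * rate, 1))
```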

Table 1.

Clinicopathologic characteristics of the training and verification sets.

| Characteristic | Category | Training set (N = 250), n (%) | Verification set (N = 93), n (%) | P |
|---|---|---|---|---|
| Age (years) | Mean ± SD | 46.53 ± 7.79 | 49.04 ± 9.20 | 0.021 |
| | Median (range) | 46 (29–73) | 46 (28–72) | |
| | ≤50 | 182 (72.8) | 57 (61.3) | 0.039 |
| | >50 | 68 (27.2) | 36 (38.7) | |
| RS | Mean ± SD | 17.93 ± 9.03 | 17.98 ± 10.30 | 0.965 |
| | Low risk (RS ≤ 25) | 217 (86.8) | 84 (90.3) | 0.376 |
| | High risk (RS > 25) | 33 (13.2) | 9 (9.7) | |
| Clinical risk | Low | 118 (47.2) | 37 (39.8) | 0.220 |
| | High | 132 (52.8) | 56 (60.2) | |
| Integrated RS & CR | Low | 135 (65.5) | 51 (68.9) | 0.597 |
| | High | 71 (34.5) | 23 (31.1) | |
| Tumor size (cm) | Mean | 2.04 | 1.97 | 0.475 |
| T Stage | T1 | 150 (60.0) | 51 (54.8) | 0.350 |
| | T2 | 97 (38.8) | 42 (45.2) | |
| | T3 | 3 (1.2) | 0 (0) | |
| N Stage | N0 | 206 (82.4) | 74 (79.6) | 0.818 |
| | N1mi | 29 (11.6) | 13 (14.0) | |
| | N1 (except N1mi) | 15 (6.0) | 6 (6.5) | |
| Nuclear grade | 1 or 2 | 212 (85.1) | 77 (82.8) | 0.594 |
| | 3 | 37 (14.9) | 16 (17.2) | |
| | Unknown | 1 (0.0) | 0 (0.0) | |
| Histologic grade | 1 or 2 | 211 (84.7) | 81 (87.1) | 0.583 |
| | 3 | 38 (15.3) | 12 (12.9) | |
| | Unknown | 1 (0.0) | 0 (0.0) | |
| PR status | Positive | 235 (94.0) | 86 (92.5) | 0.608 |
| | Negative | 15 (6.0) | 7 (7.5) | |
| Chemotherapy | No | 197 (78.8) | 76 (81.7) | 0.551 |
| | Yes | 53 (21.2) | 17 (18.3) | |
| Endocrine therapy | SERM(a) | 210 (84.0) | 63 (67.7) | 0.003 |
| | AI(b) | 36 (14.4) | 28 (30.1) | |
| | None | 4 (1.6) | 2 (2.2) | |
| | SERM+OFS (% of SERM) | 38 (15.2) | 12 (12.9) | 0.592 |

Abbreviations: AI, aromatase inhibitor; CR, clinical risk; OFS, ovarian function suppression; PR, progesterone receptor; RS, recurrence score; SERM, selective estrogen receptor modulator.

(a) 272 patients received tamoxifen and 1 patient received toremifene.

(b) 53 patients received letrozole and 11 patients received anastrozole.

Characteristics of the patients in the independent validation cohort

FFPE tissues of 542 patients with long-term follow-up met our inclusion criteria; of these, 413 (76.2%) had adequate RNA quality and quantity and underwent successful RNA-seq and data analysis. The clinicopathologic characteristics and NGS-PS of the 413 samples are summarized in Table 2. The median follow-up duration was 141 months, and patients aged ≤50 years constituted 63.0% (260/413) of the cohort. Among the 82 patients with distant recurrence, 36 had received adjuvant chemotherapy after surgery.

Table 2.

Clinicopathologic characteristics of the validation set.

| Characteristic | Category | Total (n = 413) | NGS-PS ≤20 (n = 269), n (%) | NGS-PS >20 (n = 144), n (%) | P |
|---|---|---|---|---|---|
| Age (years) | Mean ± SD | 49.13 ± 9.60 | 49.15 ± 9.22 | 49.08 ± 10.30 | 0.946 |
| | Median (range) | 47 (26–82) | 47 (29–82) | 48 (26–77) | |
| | ≤50 (%) | 260 (63.0) | 176 (65.4) | 84 (58.3) | 0.155 |
| | >50 (%) | 153 (37.0) | 93 (34.6) | 60 (41.7) | |
| NGS-PS | Mean ± SD | 18.87 ± 7.53 | 14.52 ± 3.63 | 27.00 ± 6.03 | |
| | ≤20 (%) | 269 (65.1) | | | |
| | >20 (%) | 144 (34.9) | | | |
| Clinical risk | Low | 293 (70.9) | 216 (73.7) | 77 (53.5) | <0.001 |
| | High | 114 (27.6) | 48 (17.8) | 66 (45.8) | |
| | Unknown | 6 (1.5) | 5 (1.9) | 1 (0.7) | |
| Tumor size (cm) | Mean ± SD | 1.78 ± 0.95 | 1.68 ± 0.92 | 1.97 ± 0.96 | 0.003 |
| T Stage | T1 | 321 (77.7) | 223 (82.9) | 98 (68.1) | 0.001 |
| | T2 | 85 (20.6) | 41 (15.2) | 44 (30.6) | |
| | T3 | 7 (1.7) | 5 (1.9) | 2 (1.4) | |
| Nuclear grade | 1 or 2 | 324 (78.5) | 227 (84.4) | 97 (67.4) | <0.001 |
| | 3 | 58 (14.0) | 23 (8.6) | 35 (24.3) | |
| | Unknown | 31 (7.5) | 19 (7.1) | 12 (8.3) | |
| Histologic grade | 1 or 2 | 329 (79.7) | 230 (85.5) | 99 (68.8) | <0.001 |
| | 3 | 56 (13.6) | 16 (5.9) | 40 (27.8) | |
| | Unknown | 28 (6.8) | 23 (8.6) | 5 (3.5) | |
| PR status | Positive | 369 (89.3) | 256 (95.2) | 113 (78.5) | <0.001 |
| | Negative | 44 (10.7) | 13 (4.8) | 31 (21.5) | |
| Chemotherapy | No | 377 (91.3) | 260 (96.7) | 117 (81.3) | |
| | Yes | 36 (8.7) | 9 (3.3) | 27 (18.8) | |
| Endocrine therapy | SERM(a) | 367 (88.9) | 236 (87.8) | 131 (91.0) | 0.319 |
| | AI(b) | 46 (11.1) | 33 (12.3) | 13 (9.0) | |
| | SERM+OFS (% of SERM) | 196 (53.4) | 129 (54.7) | 67 (51.1) | 0.782 |
| Distant recurrence | No | 331 (80.1) | 246 (91.4) | 85 (59.0) | <0.001 |
| | Yes | 82 (19.9) | 23 (8.6) | 59 (41.0) | |
| DMFS (months) | Median | 141 | 144 | 132 | |

Abbreviations: AI, aromatase inhibitor; CR, clinical risk; DMFS, distant metastasis–free survival; OFS, ovarian function suppression; PR, progesterone receptor; RS, recurrence score; SERM, selective estrogen receptor modulator.

(a) 343 patients received tamoxifen and 24 patients received toremifene.

(b) 13 patients received letrozole and 33 patients received anastrozole.

Prognostic ability of NGS-PS and role as a continuous indicator of distant recurrence

The median NGS-PS was 17.69 (range 0–55.4) and the area under the ROC curve was 0.760 (Fig. 1). The probability of 5- and 10-year distant recurrence increased continuously with an increase in the NGS-PS (Fig. 2). The data from patients who did not receive adjuvant chemotherapy were included for analysis in the probability plot.

Figure 1.

Receiver operating characteristic curve of NGS–Prognostic Score classified for distant recurrence.

Figure 2.

Probability of distant recurrence at 5 and 10 years based on NGS–Prognostic Scores. DMFS, distant metastasis–free survival.

Determination of the NGS-PS cut-off value

The optimal cut-off value for distant metastasis on unbiased recursive partitioning was 20.295. Using the Contal & O'Quigley minimum P-value approach, the P value was <0.0001 for NGS-PS cut-offs of 13–31, and the HR was the highest at an NGS-PS cut-off of 20. Accordingly, we classified samples with an NGS-PS of ≤20 as low risk and those with an NGS-PS of >20 as high risk in subsequent analyses.

Risk of distant recurrence in the low– and high–NGS-PS groups

The 413 samples obtained herein were classified into low-risk (n = 269) and high-risk (n = 144) groups in accordance with an NGS-PS cut-off value of 20. The Kaplan–Meier curve of all samples is shown in Fig. 3A. Five- and 10-year DMFS estimates for low- versus high-risk groups were 97.0% (95% CI, 94.1%–98.5%) versus 77.8% (95% CI, 70.1%–83.7%) and 93.2% (95% CI, 89.4%–95.6%) versus 64.4% (95% CI, 56.0%–71.7%), respectively. Kaplan–Meier subgroup analyses of patients aged ≤50 and >50 years are shown in Fig. 3B and C, respectively.

Figure 3.

Association between risk category and DMFS. Kaplan–Meier estimates of survival rates of low- versus high-risk groups based on NGS–Prognostic Score are shown for all patients (A–C) and those who did not receive chemotherapy (D–F). Separate curves are shown for patients aged ≤50 years (B and E) and >50 years (C and F). HRs are for high-risk group versus low-risk group.

Further analysis excluded the 36 patients who received adjuvant chemotherapy, all of whom had distant recurrence (Fig. 3D–F). The remaining 377 samples were classified into low-risk (n = 260) and high-risk (n = 117) groups. Five- and 10-year DMFS estimates for low- versus high-risk groups were 98.5% (95% CI, 96.0%–99.4%) versus 89.7% (95% CI, 82.7%–94.0%) and 96.1% (95% CI, 92.8%–97.9%) versus 77.7% (95% CI, 69.0%–84.2%), respectively. Only 5.4% (14/260) of patients in the low-risk group developed distant recurrence, whereas 27.3% (32/117) of high-risk patients developed distant recurrence. The HR for distant recurrence (high- vs. low-risk group) was 9.73 (95% CI, 3.59–26.40) and 3.19 (95% CI, 1.40–7.29) for patients aged ≤50 and >50 years, respectively.

C-Statistic analysis and multivariate Cox proportional hazards model

Harrell C-indices were derived for NGS-PS, age, tumor size, histologic grade, PR status, and clinical risk (Table 3). The optimism-corrected C-index of NGS-PS in the validation cohort was 0.7497 and 0.7196 for all patients and those without chemotherapy, respectively, suggesting that the model performs well for risk categorization. A multivariate Cox proportional hazards model was also developed and is described in the Supplementary Appendix.
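Harrell's C-index is the proportion of comparable patient pairs in which the patient with the earlier observed event also has the higher score. The Python sketch below shows the uncorrected index on toy data; it is illustrative only, the function name and values are hypothetical, and the optimism correction reported in Table 3 (typically obtained by resampling) is not implemented here.

```python
def harrell_c(times, events, scores):
    """Uncorrected Harrell C-index for right-censored data.

    A pair (i, j) is comparable when patient i has an observed event and a
    strictly shorter follow-up time than patient j. It is concordant when
    patient i also has the higher score; score ties count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data (hypothetical follow-up in months, outcomes, and scores)
toy_times = [12, 30, 45, 60, 80]
toy_events = [1, 1, 0, 1, 0]
toy_scores = [35, 20, 15, 22, 10]
toy_c = harrell_c(toy_times, toy_events, toy_scores)
print(toy_c)
```

A value of 0.5 corresponds to chance-level ordering and 1.0 to perfect ordering, which is the scale on which the indices in Table 3 should be read.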

Table 3.

C-Statistic analysis of NGS-PS, age, tumor size, histologic grade, PR status, and clinical risk in predicting distant recurrence in the validation cohort.

| Clinical variable | C-index (95% CI) | Optimism-corrected C-index (95% CI) |
|---|---|---|
| All patients (n = 413) | | |
| NGS-PS | 0.7493 (0.69–0.80) | 0.7497 (0.70–0.80) |
| Age | 0.5132 (0.44–0.58) | |
| Tumor size | 0.6994 (0.65–0.75) | |
| Histologic grade | 0.6086 (0.55–0.66) | |
| PR status | 0.5341 (0.49–0.57) | |
| Clinical risk | 0.7048 (0.65–0.76) | |
| No chemotherapy (n = 377) | | |
| NGS-PS | 0.7171 (0.64–0.80) | 0.7196 (0.64–0.80) |
| Age | 0.598 (0.51–0.69) | |
| T stage | 0.6256 (0.55–0.70) | |
| Histologic grade | 0.5600 (0.49–0.63) | |
| PR status | 0.5319 (0.48–0.58) | |
| Clinical risk | 0.6170 (0.54–0.69) | |

Abbreviation: NGS-PS, NGS–Prognostic Score.

Risk of late distant recurrence in the low and high NGS-PS groups

Among the 361 patients who did not receive adjuvant chemotherapy and were free of distant metastasis at 5 years, 30 developed distant recurrence more than 5 years after primary surgery. Ten-year DMFS estimates for the low–NGS-PS versus high–NGS-PS groups in this cohort were 97.6% (95% CI, 95.95–99.42) versus 88.89% (95% CI, 81.64–93.39), respectively (P < 0.0001; Supplementary Fig. S2).

Feasibility of the NGS-based multigene assay for clinical application

NGS-PS was successfully generated for all 100 FFPE samples from 98 patients. The median patient age and tumor size of the samples were 50.5 years (range 26–78) and 1.9 cm (range 0.8–4.6), respectively. A median of three (range 2–9) 10-μm FFPE tissue sections were used for the assay, yielding NGS-PS values ranging from 4 to 40, with a median of 16.8. The 100 samples were categorized into low–NGS-PS (n = 71) and high–NGS-PS (n = 29) groups.

In this study, we developed an NGS-based multigene assay that derives an NGS-PS reflecting the distant recurrence risk in ER-positive and HER2-negative breast cancers. A cut-off NGS-PS value of 20 was able to differentiate tumors into low- and high-risk categories. We used RNA-seq data from samples with known 21-gene RS values to initially construct a model best correlated with the 21-gene RS. We subsequently evaluated the model using samples with actual long-term clinical follow-up data.

To our knowledge, this study is the first to report a multigene assay for early breast cancer, originally developed using the NGS platform. Mittempergher and colleagues (27) reported an NGS version of MammaPrint and BluePrint, which are equivalent to their original counterparts; however, their indices are determined from the same gene composition and algorithm used in microarray-based tests. The assay developed herein contains several distinct components unique to an NGS-based test, including rRNA depletion during RNA extraction and using only the conserved exons of the splice variants during normalization to quantify gene expression. The whole process of RNA-seq and normalization is highly reproducible and robust, permitting decentralization of the assay for use in individual laboratories. Furthermore, RNA-seq facilitates high-throughput sequencing in a cost-effective manner to generate more data for this algorithm. The majority of the panel of 179 genes markedly contributed as reference genes in the normalization of gene expression values, which otherwise would have yielded too wide a range to be effectively used in the algorithm. This new NGS-based gene expression assay can potentially revolutionize clinically used genomic testing platforms.

We demonstrated the independent prognostic ability of the NGS-PS in predicting distant recurrence, especially among patients aged ≤50 years, in whom an NGS-PS of >20 indicated a 9.7-fold risk of distant recurrence relative to an NGS-PS of ≤20. We attribute this to the high proportion of breast cancer samples obtained from young women in this study. The most widely used genomic test, the 21-gene assay, was developed using samples predominantly from older patients and has been clinically validated in trials conducted in the United States and Europe, where over 70% of the women were postmenopausal (19). Consequently, younger women were underrepresented in these trials and treatment decisions were based on data obtained from older women. This is reflected in the TAILORx trial, which reported that women aged ≤50 years with an RS of 16 to 25 derived some benefit from chemotherapy, in contrast to no benefit in the entire population including older women (12, 14). For patients aged ≤50 years, clinical-risk stratification has been suggested in addition to the 21-gene RS (34), which ironically was originally developed to provide prognostic information without clinicopathologic variables. A similar trend was also observed in the MINDACT trial of the 70-gene assay in an analysis based on age groups, suggesting that chemotherapy benefits women aged 40–50 years (43). The assay developed herein can potentially play an important role in guiding treatment decisions, particularly among patients in the Asia-Pacific region, where the peak age at diagnosis is younger and breast cancer incidence is increasing rapidly.

Among patients who had not received chemotherapy, the 5- and 10-year DMFS of the low-risk group (NGS-PS ≤20) were 98.5% and 96.1%, respectively, similar to those reported in the TAILORx trial, wherein an RS of 11–25 without chemotherapy yielded estimated 5- and 9-year DMFS of 98.0% and 94.5%, respectively. In the high-risk group (NGS-PS >20) not receiving chemotherapy, the 5- and 10-year DMFS were 88.9% and 75.7%, respectively. By comparison, patients in the TAILORx trial with an RS of ≥26 who received chemotherapy had higher estimated 5- and 9-year DMFS of 93.0% and 86.8%, respectively, suggesting an effect of chemotherapy in the high-risk group. Further prospective clinical trials are required to investigate the predictive value of our assay regarding the benefits of chemotherapy.
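
For readers less familiar with how DMFS estimates at fixed time points are obtained for each risk group, the following is a minimal sketch of a Kaplan-Meier calculation after dichotomizing patients at a score cut-off of 20 (high risk when the score is >20). It is an illustration under stated assumptions, not the statistical code used in this study; the function names and the toy follow-up data are made up.

```python
def kaplan_meier(times, events, horizon):
    """Kaplan-Meier estimate of event-free survival at `horizon`.

    times: follow-up time per patient (years).
    events: 1 if distant metastasis was observed at that time, 0 if censored.
    """
    survival = 1.0
    # Distinct event times up to the horizon, in increasing order.
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        # Patients censored exactly at t are still counted as at risk at t.
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - failed / at_risk
    return survival


def dmfs_by_risk_group(scores, times, events, horizon, cutoff=20):
    """Split patients at `cutoff` (high risk if score > cutoff) and estimate DMFS."""
    groups = {"low": ([], []), "high": ([], [])}
    for s, t, e in zip(scores, times, events):
        key = "high" if s > cutoff else "low"
        groups[key][0].append(t)
        groups[key][1].append(e)
    return {k: kaplan_meier(ts, es, horizon) for k, (ts, es) in groups.items() if ts}


# Toy data (made up, not from the study cohort).
scores = [5, 12, 18, 20, 25, 30, 41, 22]
times = [10, 8, 12, 6, 3, 11, 4, 9]
events = [0, 0, 0, 1, 1, 0, 1, 0]
print(dmfs_by_risk_group(scores, times, events, horizon=5))
print(dmfs_by_risk_group(scores, times, events, horizon=10))
```

The estimator multiplies, over each observed event time, the fraction of at-risk patients who remain metastasis-free, so censored patients contribute to the denominator only while they are under follow-up.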

Several studies have compared the prognostic potential of various commercially available multigene assays. Compared with the 21-gene RS, other genomic tests have provided additional prognostic information in different analyses, such as those for late distant recurrence or LN-positive disease (8, 10, 44–46). Late distant recurrence is especially important in ER-positive breast cancers, where the recurrence risk remains high beyond 5 years of treatment (3, 47). Herein, the NGS-PS differentiated between low- and high-risk patients in a cohort free of distant metastasis at 5 years after surgery, suggesting its ability to also stratify the late recurrence risk. In general, while the prognostic information provided by current risk stratification tools is broadly equivalent for all ER-positive breast cancers, risk categorization may differ at the level of individual tumors (44).

Currently, the use of multigene assays is recommended in international clinical practice guidelines, including the National Comprehensive Cancer Network (NCCN) Guidelines for Breast Cancer. The 21-gene assay (Oncotype DX) for pN0 and the 70-gene assay (MammaPrint) for negative and 1–3 positive nodes are classified under evidence category 1, and the 21-gene assay for pN+, the 50-gene assay (PAM50), the 12-gene assay (EndoPredict), and the Breast Cancer Index are classified under evidence category 2A (48). However, not all of these assays are available worldwide. The experimental and analytical procedures for most of these assays are carried out at the central laboratory of the company providing the service. The high prices of the assays relative to otherwise low medical costs, together with long turnaround times, are the most important barriers to their general use, especially in developing countries. The cost of NGS has decreased markedly, such that the present NGS-based assay is more economically viable than current commercial assays. Furthermore, in the future, the present assay can be performed in individual laboratories with a standard protocol and NGS capture probes.

RNA-seq was unsuccessful in 9.0% and 23.8% of the FFPE tissues selected for the training/verification and validation analyses, respectively. Previous studies have established that tissue storage time and conditions, in addition to fixation time and specimen size, affect the integrity and usability of RNA obtained from FFPE samples (49). In this study, the archival times of 3–78 months and 5–19 years for the training/verification and validation samples, respectively, are the likely reason for the high failure rate. However, FFPE samples with less than 2 months of storage time are typically used when multigene assays are applied in the clinic. We were able to demonstrate the feasibility of the assay for clinical application by observing a 100% RNA-seq success rate with recent samples, while using a median of only three 10-μm FFPE tissue sections.

A limitation of this study is that the FFPE samples used for RNA-seq were retrospectively obtained from a prospectively maintained database. Ideally, samples collected in prospective clinical trials that included patients receiving only hormonal therapy would have minimized the potential bias. However, owing to the low prevalence of distant recurrence in the luminal subtype and the small number of patients with long-term follow-up data, it was difficult to obtain an adequate number of patients who had received hormonal therapy without adjuvant chemotherapy. To compensate, we deliberately added patients who had received adjuvant chemotherapy and developed distant recurrence to the validation set, to constitute approximately 20% of the set, representing high-risk patients. This resulted in an imbalance in the proportion of low- and high-risk patients aged ≤50 and >50 years. The relatively lower HRs among patients aged >50 years suggest that additional analyses with further cases may lead to a different cut-off PS value for this age group. Furthermore, the risk categories defined by the NGS-PS cut-off value of 20, which was determined with the validation cohort in this study, could be supplemented with an additional analysis using a different cohort, such as one from a large prospective trial.
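
To make concrete why a cut-off derived from one cohort may shift when the case mix changes, the sketch below selects a cut-off by maximizing Youden's J (sensitivity + specificity − 1) over the observed scores. This is a generic ROC-based criterion chosen for illustration and is not necessarily the procedure used to derive the cut-off in this study; the function name and the toy data are made up.

```python
def youden_cutoff(scores, recurred):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    A patient is called high risk when score > cutoff.
    recurred: 1 if distant recurrence occurred, 0 otherwise.
    """
    positives = sum(recurred)
    negatives = len(recurred) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both recurrent and non-recurrent cases")
    best = None
    for cutoff in sorted(set(scores)):
        # True positives: recurrences correctly called high risk.
        tp = sum(1 for s, r in zip(scores, recurred) if r and s > cutoff)
        # True negatives: non-recurrences correctly called low risk.
        tn = sum(1 for s, r in zip(scores, recurred) if not r and s <= cutoff)
        j = tp / positives + tn / negatives - 1.0
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Toy data (made up, not from the study cohort).
scores = [8, 12, 15, 20, 22, 27, 31, 40]
recurred = [0, 0, 0, 0, 1, 0, 1, 1]
print(youden_cutoff(scores, recurred))
```

Because the selected value depends on the score distributions of recurrent and non-recurrent cases in the cohort at hand, enriching or depleting either group, or analyzing an age subgroup separately, can move the optimum.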

In summary, this study describes a new NGS-based multigene assay that derives an NGS-PS representing the distant recurrence risk in ER-positive, HER2-negative breast cancers and differentiates patients into low- and high-risk groups for developing distant recurrence. The assay uses NGS technology, which has emerged as an essential tool for clinically applicable genomic analysis. The Korean (Asian) samples used in the development and validation sets revealed the exceptional prognostic value of this assay among patients aged ≤50 years, better reflecting the characteristics of Asian patients. Further validation studies are required to yield supporting evidence for the prognostic and predictive value of this assay.

H.-B. Lee reports being a member on the board of directors of and holding stock and ownership interests at DCGen, Co., Ltd., and is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. S.B. Lee is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. M. Kim is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. S. Kwon is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. J. Jo is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. J. Kim is currently an employee of DCGen, Co., Ltd. H.J. Lee is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. H.-S. Ryu is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. J.W. Lee is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. C. Kim is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. J. Jeong is an employee of Celemics, Inc. H. Kim reports being a member on the board of directors of and holding stock and ownership interests at Celemics, Inc., and is listed as a co-inventor on a patent for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. I.-A. Park is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. S.-H. Ahn is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. S. Kim is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. S. Yoon is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. A. Kim is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. W. Han reports being a member on the board of directors of and holding stock and ownership interests at DCGen, Co., Ltd., and is listed as a co-inventor on patents for the NGS-based assay developed in this study, owned by and royalties paid from DCGen, Co. Ltd. No potential conflicts of interest were disclosed by the other author.

The funding sources had no role in the design and conduct of the study; the collection, management, analysis, and interpretation of the data; the preparation, review, or approval of the manuscript; or the decision to submit the manuscript for publication.

H.-B. Lee: Conceptualization, resources, data curation, software, formal analysis, validation, investigation, visualization, methodology, writing-original draft, writing-review and editing. S.B. Lee: Resources, data curation, formal analysis, investigation, writing-review and editing. M. Kim: Data curation, software, formal analysis, validation, investigation, visualization, methodology, writing-original draft, writing-review and editing. S. Kwon: Conceptualization, data curation, formal analysis, validation, investigation, methodology, writing-original draft, writing-review and editing. J. Jo: Data curation, validation, investigation, writing-review and editing. J. Kim: Data curation, formal analysis, investigation, methodology, writing-original draft, writing-review and editing. H.J. Lee: Resources, data curation, formal analysis, investigation, methodology, writing-review and editing. H.-S. Ryu: Resources, data curation. J.W. Lee: Resources, data curation. C. Kim: Resources, formal analysis, investigation. J. Jeong: Data curation, formal analysis, investigation, methodology. H. Kim: Data curation, formal analysis, investigation, methodology. D.-Y. Noh: Resources, data curation, supervision, project administration. I.-A. Park: Resources, data curation. S.-H. Ahn: Resources, data curation, supervision. S. Kim: Conceptualization, supervision, methodology, writing-review and editing. S. Yoon: Conceptualization, software, formal analysis, supervision, methodology, writing-review and editing. A. Kim: Resources, data curation, supervision, methodology. W. Han: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, methodology, writing-original draft, project administration, writing-review and editing.

The authors would like to sincerely thank Dr. Hyeong-Gon Moon (Department of Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea) for critical advice and the Medical Research Collaborating Center at Seoul National University Hospital Biomedical Research Institute for statistical analysis and consultation. This study was supported by grants of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant numbers: HI14C3405 and HI14C1277).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Perou CM, Sørlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, et al. Molecular portraits of human breast tumours. Nature 2000;406:747–52.
2. Sørlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A 2001;98:10869–74.
3. Early Breast Cancer Trialists' Collaborative Group. Effects of chemotherapy and hormonal therapy for early breast cancer on recurrence and 15-year survival: an overview of the randomised trials. Lancet 2005;365:1687–717.
4. Early Breast Cancer Trialists' Collaborative Group, Davies C, Godwin J, Gray R, Clarke M, Cutter D, et al. Relevance of breast cancer hormone receptors and other factors to the efficacy of adjuvant tamoxifen: patient-level meta-analysis of randomised trials. Lancet 2011;378:771–84.
5. Early Breast Cancer Trialists' Collaborative Group, Peto R, Davies C, Godwin J, Gray R, Pan HC, et al. Comparisons between different polychemotherapy regimens for early breast cancer: meta-analyses of long-term outcome among 100,000 women in 123 randomised trials. Lancet 2012;379:432–44.
6. van't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AAM, Mao M, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature 2002;415:530–6.
7. Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 2004;351:2817–26.
8. Nielsen TO, Parker JS, Leung S, Voduc D, Ebbert M, Vickery T, et al. A comparison of PAM50 intrinsic subtyping with immunohistochemistry and clinical prognostic factors in tamoxifen-treated estrogen receptor-positive breast cancer. Clin Cancer Res 2010;16:5222–32.
9. Filipits M, Rudas M, Jakesz R, Dubsky P, Fitzal F, Singer CF, et al. A new molecular predictor of distant recurrence in ER-positive, HER2-negative breast cancer adds independent information to conventional clinical risk factors. Clin Cancer Res 2011;17:6012–20.
10. Cuzick J, Dowsett M, Pineda S, Wale C, Salter J, Quinn E, et al. Prognostic value of a combined estrogen receptor, progesterone receptor, Ki-67, and human epidermal growth factor receptor 2 immunohistochemical score and comparison with the genomic health recurrence score in early breast cancer. J Clin Oncol 2011;29:4273–8.
11. Jerevall PL, Ma XJ, Li H, Salunga R, Kesty NC, Erlander MG, et al. Prognostic utility of HOXB13:IL17BR and molecular grade index in early-stage breast cancer patients from the Stockholm trial. Br J Cancer 2011;104:1762–9.
12. Sparano JA, Gray RJ, Makower DF, Pritchard KI, Albain KS, Hayes DF, et al. Prospective validation of a 21-gene expression assay in breast cancer. N Engl J Med 2015;373:2005–14.
13. Cardoso F, van't Veer LJ, Bogaerts J, Slaets L, Viale G, Delaloge S, et al. 70-gene signature as an aid to treatment decisions in early-stage breast cancer. N Engl J Med 2016;375:717–29.
14. Sparano JA, Gray RJ, Makower DF, Pritchard KI, Albain KS, Hayes DF, et al. Adjuvant chemotherapy guided by a 21-gene expression assay in breast cancer. N Engl J Med 2018;379:111–21.
15. Edge SB, American Joint Committee on Cancer. AJCC cancer staging manual 8th ed. New York, NY: Springer; 2017.
16. Harris LN, Ismaila N, McShane LM, Andre F, Collyar DE, Gonzalez-Angulo AM, et al. Use of biomarkers to guide decisions on adjuvant systemic therapy for women with early-stage invasive breast cancer: American Society of Clinical Oncology Clinical practice guideline. J Clin Oncol 2016;34:1134–50.
17. Senkus E, Kyriakides S, Ohno S, Penault-Llorca F, Poortmans P, Rutgers E, et al. Primary breast cancer: ESMO clinical practice guidelines for diagnosis, treatment and follow-up. Ann Oncol 2015;26:v8–30.
18. Burstein HJ, Curigliano G, Loibl S, Dubsky P, Gnant M, Poortmans P, et al. Estimating the benefits of therapy for early-stage breast cancer: the St. Gallen international consensus guidelines for the primary therapy of early breast cancer 2019. Ann Oncol 2019;30:1541–57.
19. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2018;68:394–424.
20. Kang SY, Kim YS, Kim Z, Kim H-Y, Lee SK, Jung K-W, et al. Basic findings regarding breast cancer in Korea in 2015: data from a breast cancer registry. J Breast Cancer 2018;21:1–10.
21. Partridge AH, Hughes ME, Warner ET, Ottesen RA, Wong Y-N, Edge SB, et al. Subtype-dependent relationship between young age at diagnosis and breast cancer survival. J Clin Oncol 2016;34:3308–14.
22. Ahn SH, Son BH, Kim SW, Kim SI, Jeong J, Ko S-S, et al. Poor outcome of hormone receptor-positive breast cancer at very young age is due to tamoxifen resistance: nationwide survival data in Korea–a report from the Korean Breast Cancer Society. J Clin Oncol 2007;25:2360–8.
23. Kan Z, Ding Y, Kim J, Jung HH, Chung W, Lal S, et al. Multi-omics profiling of younger Asian breast cancers reveals distinctive molecular signatures. Nat Commun 2018;9:1725.
24. Byron SA, Van Keuren-Jensen KR, Engelthaler DM, Carpten JD, Craig DW. Translating RNA sequencing into clinical diagnostics: opportunities and challenges. Nat Rev Genet 2016;17:257–71.
25. Han Y, Gao S, Muegge K, Zhang W, Zhou B. Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 2015;9:29–46.
26. Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 2009;10:57–63.
27. Mittempergher L, Delahaye L, Witteveen AT, Spangler JB, Hassenmahomed F, Mee S, et al. MammaPrint and BluePrint molecular diagnostics using targeted RNA next-generation sequencing technology. J Mol Diagn 2019;21:808–23.
28. Slembrouck L, Darrigues L, Laurent C, Mittempergher L, Delahaye LJ, Vanden Bempt I, et al. Decentralization of next-generation RNA sequencing-based MammaPrint(R) and BluePrint(R) kit at University Hospitals Leuven and Curie Institute Paris. Transl Oncol 2019;12:1557–65.
29. 't Hoen PA, Friedländer MR, Almlöf J, Sammeth M, Pulyakhina I, Anvar SY, et al. Reproducibility of high-throughput mRNA and small RNA sequencing across laboratories. Nat Biotechnol 2013;31:1015–22.
30. Györffy B, Lanczky A, Eklund AC, Denkert C, Budczies J, Li Q, et al. An online survival analysis tool to rapidly assess the effect of 22,277 genes on breast cancer prognosis using microarray data of 1,809 patients. Breast Cancer Res Treat 2010;123:725–31.
31. Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature 2012;490:61–70.
32. Lee J-H, Zhao X-M, Yoon I, Lee JY, Kwon NH, Wang Y-Y, et al. Integrative analysis of mutational and transcriptional profiles reveals driver mutations of metastatic breast cancers. Cell Discov 2016;2:1–14.
33. Kim J, Kim A, Kim C. Examination of the biomark assay as an alternative to oncotype DX for defining chemotherapy benefit. Oncol Lett 2019;17:1812–8.
34. Sparano JA, Gray RJ, Ravdin PM, Makower DF, Pritchard KI, Albain KS, et al. Clinical and genomic risk to guide the use of adjuvant therapy for breast cancer. N Engl J Med 2019;380:2395–405.
35. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2010;26:139–40.
36. Robinson MD, Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol 2010;11:R25.
37. Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, et al. A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 2013;14:671–83.
38. Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Series B 1996;58:267–88.
39. Angermueller C, Pärnamaa T, Parts L, Stegle O. Deep learning for computational biology. Mol Syst Biol 2016;12:878.
40. Hothorn T, Hornik K, Zeileis A. Unbiased recursive partitioning: a conditional inference framework. J Comput Graph Stat 2006;15:651–74.
41. Contal C, O'Quigley J. An application of changepoint methods in studying the effect of age on survival in breast cancer. Comp Stat Data An 1999;30:253–70.
42. Harrell FE Jr, Califf RM, Pryor DB, Lee KL, Rosati RA. Evaluating the yield of medical tests. JAMA 1982;247:2543–6.
43. Piccart MJ, Poncet C, Cardoso F, van't Veer L, Delaloge S, et al. Abstract GS4–05: should age be integrated together with clinical and genomic risk for adjuvant chemotherapy decision in early luminal breast cancer? MINDACT results compared to those of TAILOR-X. Cancer Res 2020;80:GS4–05.
44. Bartlett JM, Bayani J, Marshall A, Dunn JA, Campbell A, Cunningham C, et al. Comparing breast cancer multiparameter tests in the OPTIMA prelim trial: no test is more equal than the others. J Natl Cancer Inst 2016;108:djw050.
45. Dowsett M, Sestak I, Lopez-Knowles E, Sidhu K, Dunbier AK, Cowens JW, et al. Comparison of PAM50 risk of recurrence score with oncotype DX and IHC4 for predicting risk of distant recurrence after endocrine therapy. J Clin Oncol 2013;31:2783–90.
46. Sestak I, Buus R, Cuzick J, Dubsky P, Kronenwett R, Denkert C, et al. Comparison of the performance of 6 prognostic signatures for estrogen receptor-positive breast cancer: a secondary analysis of a randomized clinical trial. JAMA Oncol 2018;4:545–53.
47. Saphner T, Tormey DC, Gray R. Annual hazard rates of recurrence for breast cancer after primary therapy. J Clin Oncol 1996;14:2738–46.
48. National Comprehensive Cancer Network. Breast cancer (version 3.2020); 2020. Available from: https://www.nccn.org/professionals/physician_gls/pdf/breast.pdf.
49. Choi Y, Kim A, Kim J, Lee J, Lee SY, Kim C. Optimization of RNA extraction from formalin-fixed paraffin-embedded blocks for targeted next-generation sequencing. J Breast Cancer 2017;20:393–9.