Purpose:

The current stratification system for acute promyelocytic leukemia (APL) is based on the white blood cell (WBC) and the platelet counts (i.e., Sanz score) over the past two decades. However, the borderlines among different risk groups are sometimes ambiguous, and for some patients, early death and relapse remained challenges. Besides, with the evolving of the treatment strategy from all-trans-retinoic acid (ATRA) and chemotherapy to ATRA–arsenic trioxide-based synergistic targeted therapy, the precise risk stratification with molecular markers is needed.

Experimental Design:

This study performed a systematic analysis of APL genomics and transcriptomics to identify genetic abnormalities in 348 patients mainly from the APL2012 trial (NCT01987297) to illustrate the potential molecular background of Sanz score and further optimize it. The least absolute shrinkage and selection operator algorithm was used to analyze the gene expression in 323 cases to establish a scoring system (i.e., APL9 score).

Results:

Through combining NRAS mutations, APL9 score, and WBC, 321 cases can be stratified into two groups with significantly different outcomes. The estimated 5-year overall (P = 0.00031), event-free (P < 0.0001), and disease-free (P = 0.001) survival rates in the revised standard-risk group (95.6%, 93.8%, and 98.1%, respectively) were significantly better than those in the revised high-risk group (82.9%, 77.4%, and 88.4%, respectively), which could be validated using The Cancer Genome Atlas dataset.

Conclusions:

We have proposed a two-category system for improving prognosis in patients with APL. Molecular markers identified in this study may also provide genomic insights into the disease mechanism for improved therapy.

Translational Relevance

Acute promyelocytic leukemia (APL), which used to be one of the deadliest forms of cancer, has gradually become highly curable. However, in a subset of patients with APL, early death and relapse remained challenges for the treatment. Moreover, with the treatment strategy evolved from all-trans-retinoic acid (ATRA) chemotherapy–based to ATRA–arsenic trioxide–based synergistic targeted therapy, the classic risk stratification needs further improvement. In this study, we identified a feasible and reliable way and established an integrated system to discriminate patients with APL into high-risk (HR) and standard-risk (SR) groups. The revised HR group displayed significantly worse outcomes than the revised SR one. In summary, we could better define the risks of APL, add predictive value to the choice of currently available therapeutic tools, and make the clinical decision more convenient and clearer.

Acute promyelocytic leukemia (APL) is a unique subtype of acute myeloid leukemia (AML), characterized by the expansion and accumulation of leukemic cells blocked at the promyelocytic stage of granulocytic differentiation and the presence of a specific disease driver fusion gene encoding the PML–RARA oncoprotein. In the last two decades, the all-trans-retinoic acid (ATRA) and arsenic trioxide (ATO)-based combination (ATRA–ATO) showed a considerable advantage over traditional ATRA and chemotherapy combination (ATRA–chemotherapy)-based protocol and has become the standard APL core treatment regimen nowadays (1–3). Meanwhile, the Sanz score was widely used to classify patients with APL into three risk categories according to white blood cell (WBC) and platelet (PLT) counts [low risk (LR): WBC count ≤ 10 × 109/L and PLT count > 40 × 109/L; intermediate risk (MR): WBC count ≤ 10 × 109/L and PLT count ≤ 40 × 109/L; and HR: WBC count >10 × 109/L; ref. 4].

Although the Sanz score represents a very useful first-generation tool to stratify APL, we propose to further enrich the current APL risk evaluation system taking into consideration several factors. First, although the hemogram criteria of the risks of APL seem to be easily applicable, such criteria appear not so straightforward for some patients, especially due to the fluctuation in WBC and PLT levels caused by infection and drugs. Notably, now ATRA is routinely used to treat APL, often causing leukocytosis syndrome (5). The inclusion of reliable molecular markers can be of great value in this aspect. Second, the data to generate the Sanz risk score were mostly from the clinical trials of ATRA–chemotherapy, which showed significant inferior results as compared with current ATRA–ATO treatment in LR and MR categories (6). Third, the identification of biological markers related to distinct disease outcomes may allow not only a better understanding of the molecular mechanisms underlying different clinical phenotypes as those in the Sanz score but also a precision use of novel therapeutic methods, such as targeted drugs or immune therapies.

Therefore, we generated gene mutations and expression profiles of 348 patients with APL enrolled in the APL2012 trial (NCT01987297), the largest cohort ever reported (7). Integration of genomics and transcriptomics allowed us to panoramically clarify the importance of gene mutations and expression profiles on the risk assessment and the prognosis of APL.

Patients

A total of 348 patients with APL with available bone marrow (BM) samples were included in this study, including a majority part of the patients (n = 304) coming from the APL2012 clinical trial (NCT01987297) from December 6, 2012, to December 31, 2017, and a small group of the patients (n = 44) from the historical cases between June 1, 2012, and November 31, 2012, who received the similar treatment (Table 1). It is noteworthy that the key part in the APL2012 trial for randomization into two groups with or without ATO was at the phase of consolidation while all patients received ATRA–ATO treatment for remission induction and maintenance. As a result, there were no statistically significant differences in terms of 3-year disease-free survival (DFS) as well as estimated 7-year DFS and overall survival (OS) between the two groups (7). When the 348 cases investigated in the present work were scrutinized for their treatment and outcome data, there were 22 cases of early death during remission induction and 2 cases who were lost during follow-up. Among the remaining 324 cases, no significant differences were observed in terms of the 5-year OS, event-free survival (EFS), and DFS between the two groups receiving consolidation with (n = 172) or without (n = 152) ATO (Supplementary Table S1; Supplementary Fig. S1). The collection and the preservation of the samples were approved by the Institutional Review Boards of all participating centers, and written informed consent for sample collection and research was obtained following the Declaration of Helsinki (8).

Table 1.

Clinical and molecular characteristics of 348 patients with APL in this study.

CharacteristicsAll patients (n = 348)WES + WGS (n = 204)RNA-seq (n = 324)aTraining cohort (n = 158)Validation cohort (n = 104)Training vs. validation comparison, Pb
Sex      0.408 
 Women 159 (45.7%) 92 (45.1%) 151 (46.6%) 68 (43.0%) 51 (49.0%)  
 Men 189 (54.3%) 112 (54.9%) 173 (53.4%) 90 (57.0%) 53 (51.0%)  
Age (at diagnosis, year)      0.066 
 <40 193 (55.5%) 113 (55.4%) 180 (55.6%) 88 (55.7%) 57 (54.8%)  
 40–60 138 (39.7%) 80 (39.2%) 127 (39.2%) 62 (39.2%) 44 (42.3%)  
 ≥60 17 (4.9%) 11 (5.4%) 17 (5.2%) 8 (5.1%) 3 (2.9%)  
WBC count (×109/L) 3.3 (1.3–16.6) 3.2 (1.4–18.7) 3.4 (1.4–16.9) 3.4 (1.3–12.9) 2.3 (1.4–13.5) 0.664 
PLT count (×109/L) 28.0 (16.0–41.2) 28.0 (17.0–43.5) 28.0 (16.0–43.0) 28.0 (17.0–38.0) 26.0 (13.0–37.8) 0.507 
Hemoglobin level (g/L) 87.0 (68.0–106.2) 89.0 (70.5–110.0) 87.0 (68.0–107.2) 84.0 (66.0–107.0) 84.0 (69.0–105.0) 0.975 
BM blast count (%) 87.0 (80.5–91.5) 87.3 (81.1–91.3) 87.5 (81.2–91.5) 88.5 (82.5–91.8) 85.5 (79.2–90.0) 0.012 
Activated partial thromboplastin time (s) 30.2 (27.2–36.4) 28.9 (26.5–34.5) 30.3 (27.3–36.4) 30.3 (27.0–36.4) 29.7 (27.3–38.0) 0.903 
Prothrombin time (s) 14.5 (12.6–16.5) 13.9 (12.4–16.0) 14.5 (12.6–16.5) 14.6 (12.6–16.3) 14.2 (12.4–16.7) 0.687 
Fibrinogen (g/L) 1.4 (1.0–2.0) 1.3 (1.0–2.0) 1.4 (1.0–2.0) 1.4 (1.0–2.0) 1.3 (0.9–2.0) 0.787 
Sanz risk group      0.697 
 LR group 65 (18.7%) 42 (20.6%) 63 (19.4%) 19 (12.0%) 16 (15.4%)  
 MR group 173 (49.7%) 92 (45.1%) 157 (48.5%) 95 (60.1%) 62 (59.6%)  
 HR group 110 (31.6%) 70 (34.3%) 104 (32.1%) 44 (27.8%) 26 (25.0%)  
Consolidation strategy      0.288 
 Without ATO 152 (43.7%) 95 (46.6%) 142 (43.8%) 71 (44.9%) 41 (39.4%)  
 With ATO 172 (49.4%) 103 (50.5%) 160 (49.4%) 72 (45.6%) 57 (54.8%)  
PMLRARA transcriptsc      0.451 
 L-type 209 (60.1%) 107 (52.5%) 209 (64.5%) 100 (63.3%) 70 (67.3%)  
 S-type 89 (25.6%) 60 (29.4%) 89 (27.5%) 41 (25.9%) 28 (26.9%)  
 V-type 24 (6.9%) 11 (5.4%) 24 (7.4%) 16 (10.1%) 6 (5.8%)  
 Missing data 26 (7.5%) 26 (12.7%) 2 (0.6%) 1 (0.6%)  
Early death 22 (6.3%) 6 (2.9%) 22 (6.8%) 15 (9.5%) 6 (5.8%) 0.355 
Relapse 13 (3.7%) 13 (6.4%) 11 (3.4%) 5 (3.2%) 4 (3.8%) 0.744 
CharacteristicsAll patients (n = 348)WES + WGS (n = 204)RNA-seq (n = 324)aTraining cohort (n = 158)Validation cohort (n = 104)Training vs. validation comparison, Pb
Sex      0.408 
 Women 159 (45.7%) 92 (45.1%) 151 (46.6%) 68 (43.0%) 51 (49.0%)  
 Men 189 (54.3%) 112 (54.9%) 173 (53.4%) 90 (57.0%) 53 (51.0%)  
Age (at diagnosis, year)      0.066 
 <40 193 (55.5%) 113 (55.4%) 180 (55.6%) 88 (55.7%) 57 (54.8%)  
 40–60 138 (39.7%) 80 (39.2%) 127 (39.2%) 62 (39.2%) 44 (42.3%)  
 ≥60 17 (4.9%) 11 (5.4%) 17 (5.2%) 8 (5.1%) 3 (2.9%)  
WBC count (×109/L) 3.3 (1.3–16.6) 3.2 (1.4–18.7) 3.4 (1.4–16.9) 3.4 (1.3–12.9) 2.3 (1.4–13.5) 0.664 
PLT count (×109/L) 28.0 (16.0–41.2) 28.0 (17.0–43.5) 28.0 (16.0–43.0) 28.0 (17.0–38.0) 26.0 (13.0–37.8) 0.507 
Hemoglobin level (g/L) 87.0 (68.0–106.2) 89.0 (70.5–110.0) 87.0 (68.0–107.2) 84.0 (66.0–107.0) 84.0 (69.0–105.0) 0.975 
BM blast count (%) 87.0 (80.5–91.5) 87.3 (81.1–91.3) 87.5 (81.2–91.5) 88.5 (82.5–91.8) 85.5 (79.2–90.0) 0.012 
Activated partial thromboplastin time (s) 30.2 (27.2–36.4) 28.9 (26.5–34.5) 30.3 (27.3–36.4) 30.3 (27.0–36.4) 29.7 (27.3–38.0) 0.903 
Prothrombin time (s) 14.5 (12.6–16.5) 13.9 (12.4–16.0) 14.5 (12.6–16.5) 14.6 (12.6–16.3) 14.2 (12.4–16.7) 0.687 
Fibrinogen (g/L) 1.4 (1.0–2.0) 1.3 (1.0–2.0) 1.4 (1.0–2.0) 1.4 (1.0–2.0) 1.3 (0.9–2.0) 0.787 
Sanz risk group      0.697 
 LR group 65 (18.7%) 42 (20.6%) 63 (19.4%) 19 (12.0%) 16 (15.4%)  
 MR group 173 (49.7%) 92 (45.1%) 157 (48.5%) 95 (60.1%) 62 (59.6%)  
 HR group 110 (31.6%) 70 (34.3%) 104 (32.1%) 44 (27.8%) 26 (25.0%)  
Consolidation strategy      0.288 
 Without ATO 152 (43.7%) 95 (46.6%) 142 (43.8%) 71 (44.9%) 41 (39.4%)  
 With ATO 172 (49.4%) 103 (50.5%) 160 (49.4%) 72 (45.6%) 57 (54.8%)  
PMLRARA transcriptsc      0.451 
 L-type 209 (60.1%) 107 (52.5%) 209 (64.5%) 100 (63.3%) 70 (67.3%)  
 S-type 89 (25.6%) 60 (29.4%) 89 (27.5%) 41 (25.9%) 28 (26.9%)  
 V-type 24 (6.9%) 11 (5.4%) 24 (7.4%) 16 (10.1%) 6 (5.8%)  
 Missing data 26 (7.5%) 26 (12.7%) 2 (0.6%) 1 (0.6%)  
Early death 22 (6.3%) 6 (2.9%) 22 (6.8%) 15 (9.5%) 6 (5.8%) 0.355 
Relapse 13 (3.7%) 13 (6.4%) 11 (3.4%) 5 (3.2%) 4 (3.8%) 0.744 

Note: Data are median (IQR) or n (%).

aOne patient was identified with PML–RARG fusion, and one patient was detected with PML–RARA and BCR–ABL1 fusions and excluded in the subsequent analysis.

bχ2 or Fisher exact tests were performed to compare categorical variables, and Wilcoxon tests were performed to compare continuous variables between cohorts.

cThe PML–RARA transcript isoforms were identified in accordance with the breakpoint of the PML gene. The isoform information of 24 patients was missing due to lack of RNA-seq data.

Whole-exome sequencing, whole-genome sequencing, and RNA sequencing

The pretreatments of BM samples, genomic DNA extraction, library preparation, and detailed analysis for whole-exome sequencing (WES), whole-genome sequencing (WGS), and RNA sequencing (RNA-seq) are listed in the Supplementary Materials and Methods.

Establishment of a scoring system with penalty LASSO method to linearize the gene expression profiles

After exclusion of the patients (n = 61) from whom we performed differential analysis identifying a list of 389 differentially expressed genes (DEG), there were 262 patients with RNA-seq data in our cohort. Using the createDataPartition function in the caret (version 6.0-84; ref. 9) package, we divided these 262 patients with APL randomly into the training set (n = 158; three fifths) and the validation set (n = 104; two fifths) set. No significant difference was observed in such division in terms of Sanz risk group distribution (P = 0.697) or other clinical characteristics (Table 1).

To identify a core subset of DEGs that were relevant to distinct patient groups, we used a linear regression technique based on the least absolute shrinkage and selection operator (LASSO) algorithm as implemented in the glmnet package (version 2.0-16; ref. 10), choosing 10-fold cross-validation to fit a binomial regression model with optimized parameters. We repeated the process 10 times to test the stability of this method. A scoring model was built using the training set and tested in the validation set. The ROC curve was used to identify the best threshold of the score model, separating patients into two subgroups by the pROC package (version 1.13.0; ref. 11).

The definition of clinical outcome endpoints

We considered three endpoints including OS, DFS, and EFS. OS was defined for all patients in a trial and measured from the date of entry into a study until death from any cause. Morphologic relapse was defined as the reappearance of blasts after complete response (CR) in peripheral blood or BM, and molecular or cytogenetic relapse was defined as the reappearance of molecular or cytogenetic abnormality (12). DFS was defined as the time from the CR to any relapse, or persistent positivity of RT-PCR of PML–RARA transcripts after consolidation therapy, or secondary AML, or death of any reason. EFS was defined as the time from diagnosis until an event (i.e., induction failure, relapse, or death from any cause) or last follow-up. Early death was defined as death occurring either before treatment initiation or during induction. After the systematic introduction of ATRA in modern regimens, most early deaths have been recorded within the first 2 to 3 weeks. Patients who were lost to follow-up were censored at the time of last contact. OS, EFS, and DFS were evaluated using the log-rank test and the Cox proportional hazards (CPH) model. The violation of the CPH assumption was detected by examining the Schoenfeld residuals.

Statistical analysis

Analysis workflow was charted in Fig. 1. The multivariate Cox analysis was performed using all significant univariate predictors following the suggestion of TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) (13). The risk-revised categories were generated using the X-tile (version 3.6.1; ref. 14), a tool that assisted the marker cutoff point according to the minimal P value. Bias-corrected internal calibration was performed at 1, 2, and 5 years using 1,000-bootstrap resamples and fitting the relationship between the observed and the estimated survival probabilities by the rms package (version 5.1-4; ref. 15). Bias-corrected internal validation was also performed by evaluating the Harrell's concordance index (C-index) under 1,000-bootstrap resamples under the boot package (version 1.3-24; ref. 16). All survival analyses were performed using the survival package (version 2.42-6; ref. 17), with survival curves visualized using the survminer package (version 0.4.3; ref. 18). Net reclassification improvement (NRI; ref. 19) calculation was performed using the nricens package (version 1.6; ref. 20).

Figure 1.

The multistep workflow chart of the integrated prognosis model establishment. *In the 323 with RNA-seq data, 2 cases without survival data were excluded from the Cox regression.

Figure 1.

The multistep workflow chart of the integrated prognosis model establishment. *In the 323 with RNA-seq data, 2 cases without survival data were excluded from the Cox regression.

Close modal

Supplementary Materials and Methods

Detailed materials, methods, statistical analysis, and additional extended data display items are presented in the Supplementary Materials and Methods and Supplementary Tables S2 to S8.

Availability of data and materials

RNA-seq data have been deposited to the Gene Expression Omnibus database with accession number GSE172057. The detailed information of gene mutations from WES data has been summarized in Supplementary Table S9. All data can also be viewed in The National Omics Data Encyclopedia (http://www.biosino.org/node) by pasting the accession (OEP001919) or through the URL http://www.biosino.org/node/project/detail/OEP001919.

Clinical characteristics and genome-wide profilings of patients with APL

In our cohort involving 348 patients with APL, the median age at diagnosis was 38 years, and the median follow-up time was 49.0 (interquartile range, IQR, 32.7–61.0) months. The occurrence of early death and relapse was recorded in 22 and 13 patients, respectively. According to the Sanz risk model, 110, 173, and 65 patients were classified as HR, MR, and LR, respectively (Table 1).

Among the 348 patients, 202 and 2 were subjected to WES and WGS, respectively, to detect genomic abnormalities, and 189 out of these 204 patients had their own matched control samples (complete remission BM samples). A total of 324 patients were subjected to RNA-seq for transcriptome analysis. However, one case was excluded in further analysis because of the co-occurrence of PML–RARA and BCR–ABL1 fusion genes (Table 1).

Gene mutation landscape of de novo APL

From the above-mentioned patients, we identified 1,481 nonsilent somatic mutations involving 940 genes, including 1,295 single-nucleotide variations and 186 insertions/deletions, with a median of 4 mutations (IQR, 2–7) found per patient (Supplementary Fig. S2; Supplementary Table S9). Recurrent mutations were observed in 81.03% (282/348) of patients. As illustrated in Fig. 2A, the most common APL-related variations were found in FLT3 [33%, including internal tandem duplication (ITD: 13%) and point mutations in the tyrosine kinase domain (TKD: 20%)] and WT1 (19%), followed by NRAS (9%), ARID1A (7%), and RREB1 (6%). The full list of recurrent gene mutations is detailed in Supplementary Table S10. In addition to these well-known genes, we also detected previously unidentified mutated genes in APL, including RREB1, EBF3, and UPF3B; they were detected with a mutation frequency of > 2%, likely being loss of function in nature (frameshift insertions/deletions and nonsense mutations; see Supplementary Fig. S3A). As comparison, the frequency of mutations of these genes was seldomly seen in non-APL AML or other diseases (Supplementary Table S11).

Figure 2.

Nonsilent recurrent gene mutations in primary APL. A, Genes mutated recurrently in three or more patients or altered twice in the same site are shown and color coded for different types of variant classification. Genes in rows are displayed in descending order in accordance with the mutational frequency except for FLT3-ITD. Other recurrent genes with two mutational events in different sites are compressed into one row for improved graphic presentation. Sanger sequencing is applied for validation, and the detailed information is listed in Supplementary Table S9. Samples are arranged in columns to accentuate gene co-occurrence and mutual exclusivity. Bars on the right plot indicate the exact number of mutations in each gene. B, Kaplan–Meier estimates of OS, EFS, and DFS in accordance with the NRAS mutation status. P value is calculated using the log-rank test.

Figure 2.

Nonsilent recurrent gene mutations in primary APL. A, Genes mutated recurrently in three or more patients or altered twice in the same site are shown and color coded for different types of variant classification. Genes in rows are displayed in descending order in accordance with the mutational frequency except for FLT3-ITD. Other recurrent genes with two mutational events in different sites are compressed into one row for improved graphic presentation. Sanger sequencing is applied for validation, and the detailed information is listed in Supplementary Table S9. Samples are arranged in columns to accentuate gene co-occurrence and mutual exclusivity. Bars on the right plot indicate the exact number of mutations in each gene. B, Kaplan–Meier estimates of OS, EFS, and DFS in accordance with the NRAS mutation status. P value is calculated using the log-rank test.

Close modal

Notably, FLT3-ITD was significantly associated with HR patients as compared with the two other groups (HR vs. MR: P = 0.0036; HR vs. LR: P < 0.0001). Although NRAS mutations were equally distributed in Sanz LR, MR, and HR groups, KRAS mutations showed a significant difference between HR and LR patients (P = 0.027). ARID1A, USP9X, and ARID1B were mutated more frequently in LR patients than in HR patients (P = 0.040, 0.013, and 0.011, respectively; Supplementary Fig. S3B).

The analysis of prognostic relevance of NRAS subclonal status

Besides, genomic mutations mentioned above were all assessed in univariate analysis for OS, EFS, and DFS. Of note, only NRAS mutations (n = 32; Fig. 2B) resulted in poor prognosis, as reflected by OS [HR = 4.01; 95% confidence interval (CI), 1.675–9.602; P = 0.0018], EFS (HR = 3.049; 95% CI, 1.389–6.692; P = 0.0055), and DFS (HR = 4.519; 95% CI, 1.416–14.420; P = 0.011).

The variant allele fraction (VAF) of NRAS was analyzed to identify the impact of subclonal status on the prognosis. First, the VAF of NRAS was treated as a numeric variable and added into the univariate Cox analysis. As shown in Supplementary Table S12, it was not related to OS, EFS, and DFS, respectively. Second, we divided patients with APL into three groups: NRAS wild-type, NRAS mutation with VAF less than the median, NRAS mutation with VAF greater than the median. To evaluate the impact of VAF, we treated NRAS mutation with VAF less than the median as a reference and performed the univariate Cox analysis. As shown in Supplementary Table S12, the VAF of NRAS mutation was not related to OS, EFS, and DFS.

Analysis of functional gene categories among different Sanz risk groups

The different distributions of gene mutations among the LR, MR, and HR patients were assessed by comparing the known functional gene groups (Supplementary Table S13). Notably, the mutations of activated signaling-related genes were convincingly enriched in HR patients (HR vs. MR: P < 0.0001; HR vs. LR: P < 0.0001), whereas those of epigenetic modifiers tended to be associated with LR patients (LR vs. MR: P = 0.052; LR vs. HR: P = 0.036; Supplementary Fig. S4A).

In univariate Cox analysis, only the mutations of activated signaling pathway were related to the clinical prognosis in DFS (HR = 6.362; 95% CI, 1.424–28.431; P = 0.015) in this study (Supplementary Table S14).

Pairwise co-occurrence and mutual exclusivity calculation

The genes with mutation frequency more than 2% were applied to pairwise co-occurrence and mutual exclusivity calculations (Supplementary Fig. S4B). The mutations in SF3B1 (n = 7) and CALR (n = 6), which were commonly detected in myelodysplastic syndrome and myeloproliferative neoplasms (21), showed a pattern of co-occurrence (P < 0.0001). Meanwhile, SMARCB1 and FLT3-ITD as well as EBF3 and SF3B1 tended to display co-occurrence of gene mutation (P = 0.033 and P = 0.024, respectively).

Identification of distinct gene expression patterns

To gain a panoramic view of gene expression profiles, we first performed the unsupervised clustering analysis focusing on a set of 1,090 genes with the highest variance across 323 patients (Supplementary Table S15). As illustrated in the heatmap (Supplementary Fig. S5), we observed a group of genes displaying distinct expression patterns associated with Sanz LR (left dotted box) and HR patients (right dotted box). This observation motivated us to further identify the gene expression pattern of different APL phenotypic features. DEGs were obtained from BM samples of 34 HR and 27 LR (Sanz score) patients who had no or only short exposure to ATRA (less than 48 hours). Heatmap was generated by the 389 DEGs that met the criteria (|Log2FC| > 2, adjP < 0.01, and baseMean > 100; Fig. 3A; Supplementary Table S16).

Figure 3.

Semisupervised hierarchical clustering identified the two distinct subcohorts (SC1 and SC2) of patients with APL. A, A total of 389 DEGs are clustered using the ward.D method in 262 patients with APL. Distances are calculated on the basis of the Pearson correlation coefficient. Samples are clustered into two subcohorts (SC1 and SC2). Columns indicate patients with APL, and rows indicate genes. The first bottom plot shows the clinical features and outcomes of patients with APL. The second bottom plot depicts patients' WBC count, PLT level, and hemoglobin (Hb) level at diagnosis. The third bottom plot displays the Sanz risk groups and the GATA1 expression level, and the last bottom plot presents the distribution of the top three most frequently altered genes. B, Gene Ontology (GO) enrichment analyses on 389 DEGs of gene cluster A. C, GO enrichment analyses on 389 DEGs of gene cluster B. D,GATA1 expression in accordance with different Sanz risk groups. E,GATA1 expression in accordance with different quantiles of WBC count. F,GATA1 expression in accordance with PLT count.

Figure 3.

Semisupervised hierarchical clustering identified the two distinct subcohorts (SC1 and SC2) of patients with APL. A, A total of 389 DEGs are clustered using the ward.D method in 262 patients with APL. Distances are calculated on the basis of the Pearson correlation coefficient. Samples are clustered into two subcohorts (SC1 and SC2). Columns indicate patients with APL, and rows indicate genes. The first bottom plot shows the clinical features and outcomes of patients with APL. The second bottom plot depicts patients' WBC count, PLT level, and hemoglobin (Hb) level at diagnosis. The third bottom plot displays the Sanz risk groups and the GATA1 expression level, and the last bottom plot presents the distribution of the top three most frequently altered genes. B, Gene Ontology (GO) enrichment analyses on 389 DEGs of gene cluster A. C, GO enrichment analyses on 389 DEGs of gene cluster B. D,GATA1 expression in accordance with different Sanz risk groups. E,GATA1 expression in accordance with different quantiles of WBC count. F,GATA1 expression in accordance with PLT count.

Close modal

Though identified from the 61 patients mentioned above, we observed that these 389 DEGs were mainly grouped into two gene clusters according to the expression pattern seen in the remaining 262 patients (Fig. 3A). Genes in cluster A were largely involved in erythrocytic functions, differentiation, and homeostasis (Fig. 3B; Supplementary Table S17). Genes in cluster B were mostly of functional relevance to T-cell activation, leukocyte differentiation, and lymphocyte activation (Fig. 3C; Supplementary Table S17). Gene set enrichment analysis also showed a number of gene sets or pathways at significantly lowered expression levels in HR patients compared with LR patients (Supplementary Fig. S6).

Two distinct subcohorts, namely, subcohorts 1 (SC1) and 2 (SC2), could be defined in accordance with the gene expression pattern without artificial intervention. SC1 (n = 128; 48.9%) comprised 21 Sanz LR, 90 MR, and 17 HR patients. Most genes in cluster A were highly expressed only in SC1. SC2 (n = 134; 51.1%) consisted of 14 Sanz LR, 67 MR, and 53 HR patients. The patients with WBC count > 50 × 109/L were primarily observed in SC2 (P = 0.0018). In terms of gene mutations, FLT3-ITD was significantly enriched in SC2 (P = 0.0024), whereas WT1 and NRAS were distributed equally between these two subcohorts (P = 0.085 and 0.539, respectively).

To investigate whether these DEGs were modulated by the same upstream regulator or the same pathway, we used Ingenuity Pathway Analysis which identified GATA-binding protein 1 (GATA1, P < 0.0001; Supplementary Table S18) as the most significant upstream regulator that might explain the different gene expression patterns observed between LR and HR patients. Low GATA1 expression was significantly associated with the HR group (Fig. 3D), and also accompanied with the tendency of increased WBC count, decreased PLT count (Fig. 3E and F). We then analyzed the correlation of GATA1 expression and APL blasts in 323 patients. Similarly, GATA1 was conversely associated with the blast percentage (P < 0.0001; R = −0.386; Supplementary Fig. S7A). We further classified the patients according to the quartile interval of GATA1 expression (< Q1, Q1–Q2, Q2–Q3, ≥ Q3), and a similar result was obtained (Supplementary Fig. S7B). Besides, we counted the number of all forms of megakaryocytes in BM. The median number per smear in patient samples was 1, with a 25% to 75% CI of 0 to 6. As shown in Supplementary Fig. S7C, in patients with elevated GATA1 expression, the number of megakaryocytes was increased (P < 0.0001; R = 0.333). It became much clearer in the quartile analysis showing that the elevation of GATA1 expression was correlated with the number of megakaryocytes per smear (Supplementary Fig. S7D). In support of our finding, negative correlations were also observed between the DNA methylation and the gene expression level of GATA1 in patients with APL from The Cancer Genome Atlas (TCGA) dataset (Supplementary Fig. S8).

Potential of APL9 score in determining risks in APL

Starting with 389 DEGs, we proceeded to perform a sparse regression analysis stratifying patients into molecularly LR or HR by a unified score (see workflow chart in Fig. 1). In brief, the remaining cohort (n = 262, excluding the 61 patients defining DEGs) was randomly divided into the training set (three fifths, n = 158) and the validation set (two fifths, n = 104). In the training phase, the LASSO algorithm was applied to 389 DEGs with SC2 as endpoint, which yielded a signature of 9 genes. The linear combination of these 9 genes weighted by regression coefficients was then calculated for each patient (i.e., the APL9 score; Fig. 4A).

Figure 4.

Establishment of APL9 score and the systemic stratification model. A, APL9 score model in the entire cohort (n = 323). The left plot shows the regression coefficients of 9 signature genes in the LASSO model. The top right plot shows the gene expression patterns of 9 relative genes across samples. The bottom plot shows the bar plot of the APL9 score in ascending order. These 9 genes weighted by regression coefficients are summed as follows: APL9 score = (GATA1 × −1.066) + (CLCN4 × −1.234) + (CACNA2D2 × −0.544) + (OPTN × −0.308) + (KRT1 × −0.245) + (LTF × −0.205) + (S100A12 × −0.192) + (LEF1 × −0.169) + (LGALS1 × 0.111) + 27.163. The GI and the GII subgroups are divided using the cutoff of 0.58 by the ROC curve (dotted line). B, Sankey plot for reclassification from the Sanz risk to the APL9 score groups in the entire cohort. C, Kaplan–Meier estimates of OS, EFS, and DFS in accordance with the APL9 score groups of the entire cohort. P value is calculated using the log-rank test.

Figure 4.

Establishment of APL9 score and the systemic stratification model. A, APL9 score model in the entire cohort (n = 323). The left plot shows the regression coefficients of 9 signature genes in the LASSO model. The top right plot shows the gene expression patterns of 9 relative genes across samples. The bottom plot shows the bar plot of the APL9 score in ascending order. These 9 genes weighted by regression coefficients are summed as follows: APL9 score = (GATA1 × −1.066) + (CLCN4 × −1.234) + (CACNA2D2 × −0.544) + (OPTN × −0.308) + (KRT1 × −0.245) + (LTF × −0.205) + (S100A12 × −0.192) + (LEF1 × −0.169) + (LGALS1 × 0.111) + 27.163. The GI and the GII subgroups are divided using the cutoff of 0.58 by the ROC curve (dotted line). B, Sankey plot for reclassification from the Sanz risk to the APL9 score groups in the entire cohort. C, Kaplan–Meier estimates of OS, EFS, and DFS in accordance with the APL9 score groups of the entire cohort. P value is calculated using the log-rank test.

Close modal

To test the stability of the APL9 score, we repeated the above division for 10 times to establish 10 models based on each of the 10 training sets. The ROCs of these 10 models in each training set and validation set are illustrated in Supplementary Fig. S9, showing the robustness of our proposed method.

The cutoff point of the APL9 score (0.58) was determined by the ROC curve maximizing the separation of two subcohorts (i.e., separating SC1 and SC2 only in the training set). We classified patients with APL9 score higher than 0.58 into the molecular HR group (i.e., GII), and otherwise into the molecular LR group (i.e., GI; Fig. 4B).

The GII patients had 5-year OS, EFS, and DFS rates of 87.1% (95% CI, 81.9%–92.6%), 82.7% (95% CI, 77.0%–88.9%), and 87.7% (95% CI, 88.2%–97.0%), respectively (Fig. 4C). Their outcome was significantly worse than that observed in GI patients who had 5-year OS, EFS, and DFS rates of 96.9% (95% CI, 94.2%–99.6%), 95.6% (95% CI, 92.4%–98.8%), and 98.7% (95% CI, 96.8%–100.0%), respectively (Fig. 4C). We also found that higher APL9 score was remarkably associated with worse prognosis (P = 0.0023, 0.00035, and 0.012 for OS, EFS, and DFS, respectively). The related clinical characteristics and the survival information are listed in Supplementary Tables S19 and S20; and Supplementary Fig. S10A–S10D. The AUCs of the APL9 score were 0.986 and 0.980 in the training and validation sets, respectively (Supplementary Fig. S10E).

Although the APL9 score was established by LASSO regression without considering its biological function, it was interesting to find that some of these genes still have functions related to hematologic or immune features or diseases (Supplementary Table S21).

Besides, in an attempt to explore simpler and less time-consuming and less costing method for gene expression score in the future, we have randomly selected 30 GI and 30 GII patients and, for each patient, used real-time RT-PCR to quantify the expression of all 9 genes in the APL9 score (Supplementary Fig. S11). The expression levels of these genes show the concordance between the real-time RT-PCR and RNA-seq results.

Establishment of a new prognosis model combining clinical characteristics, APL9 score, and NRAS mutations

We assessed clinical characteristics, genomic mutations, and APL9 score in univariate Cox analysis for OS, EFS, and DFS (Fig. 5A). Characteristics used for the univariate Cox analysis were summarized in Supplementary Table S22. High WBC level (> 10 × 109/L) remained significant as a classical outcome predictor (OS: HR = 2.419, 95% CI, 1.104–5.302, P = 0.027; EFS: HR = 2.548, 95% CI, 1.324–4.902, P = 0.0051; DFS: HR = 4.396, 95% CI, 1.473–13.117, P = 0.0080). In addition to NRAS mutation mentioned above, high APL9 score also predicted a poor outcome (OS: HR = 4.087, 95% CI, 1.534–10.890, P = 0.005; EFS: HR = 4.056, 95% CI, 1.766–9.315, P = 0.001; DFS: HR = 5.629, 95% CI, 1.233–25.693, P = 0.026).

Figure 5.

Forest plot exhibiting the univariate and multivariate Cox regression of OS, EFS, and DFS. A, Forest plots of multivariable CPH models showing WBC, APL9 score, and NRAS as independent prognostic factors of OS, EFS, and DFS. HRs and 95% CIs are listed next to each variable. Within the forest plot, HR for each variable is depicted as a box, and 95% CIs are shown as horizontal lines. The vertical line crossing the value of 1 represents nonstatistically significant effect, and odds greater than one indicate worse effects. 95% CI. P value was calculated using the Wald test. §n = 321, for RNA-seq patients with available survival data. B, The chart is presented to show the cumulative score of the revised risk model. In each cell of the chart, the risk for a person is estimated with the values of each predictor in accordance with the multivariate Cox model. Then, the cells of the chart are colored in accordance with the risk status: 0–2 and 3–5. C, ROC curves of the revised risk score to predict the OS, EFS, and DFS status in the entire cohort. D, Sankey plot for reclassification from Sanz risk to the revised risk model. E, Kaplan–Meier estimates of OS, EFS, and DFS in accordance with the revised risk groups in the entire cohort. P value is calculated using the log-rank test.

Figure 5.

Forest plot exhibiting the univariate and multivariate Cox regression of OS, EFS, and DFS. A, Forest plots of multivariable CPH models showing WBC, APL9 score, and NRAS as independent prognostic factors of OS, EFS, and DFS. HRs and 95% CIs are listed next to each variable. Within the forest plot, HR for each variable is depicted as a box, and 95% CIs are shown as horizontal lines. The vertical line crossing the value of 1 represents nonstatistically significant effect, and odds greater than one indicate worse effects. 95% CI. P value was calculated using the Wald test. §n = 321, for RNA-seq patients with available survival data. B, The chart is presented to show the cumulative score of the revised risk model. In each cell of the chart, the risk for a person is estimated with the values of each predictor in accordance with the multivariate Cox model. Then, the cells of the chart are colored in accordance with the risk status: 0–2 and 3–5. C, ROC curves of the revised risk score to predict the OS, EFS, and DFS status in the entire cohort. D, Sankey plot for reclassification from Sanz risk to the revised risk model. E, Kaplan–Meier estimates of OS, EFS, and DFS in accordance with the revised risk groups in the entire cohort. P value is calculated using the log-rank test.

Close modal

In univariate Cox analysis, WBC remained an important predictive factor, whereas PLT did not show significant value in outcome analysis. Hence, the WBC count instead of the Sanz score was used in the multivariate Cox regression to predict the EFS (β coefficient = 0.406; HR = 1.501; 95% CI, 0.723–3.116; P = 0.276). The other significant univariate predictors of EFS were all included in the multivariate Cox model (APL9 score GII: β coefficient = 1.173, HR = 3.230, 95% CI, 1.311–7.959, P = 0.011; NRAS mutation: β coefficient = 1.056, HR = 2.875, 95% CI, 1.297–6.375, P = 0.009). The β coefficients of the model were multiplied by two and rounded off to the nearest integer to form the scores for each predictor. The revised risk score was formulated as such: WBC > 10 (× 109/L) × 1 + APL9 score (GII) × 2 + NRAS (mutation) × 2 (Fig. 5B).

We quantified the degree of the agreement between the predicted and the actual EFS of the new score system using 1,000-sample bootstrapped calibration plots at 1, 2, and 5 years (Supplementary Fig. S12). The C-index was 0.700 (95% CI, 0.613–0.787), and the optimism-corrected C statistic was 0.704 (95% CI, 0.659–0.753), indicative of good discrimination. The AUCs of the revised risk score predicting the OS, EFS, and DFS status were 0.721, 0.712, and 0.743, respectively (Fig. 5C). Patients with APL possessing sufficient follow-up data were separated into three groups similar to the Sanz score by X-tile (14) based on the revised risk score (Supplementary Fig. S13A): revised HR (n = 93, score = 3–5), revised MR (n = 81, score = 2), and revised LR (n = 147, score = 0–1) groups (Supplementary Fig. S13B). The revised HR group displayed significantly poor outcomes when compared with the revised LR and MR groups, but the difference between MR and LR was not significant (Supplementary Fig. S13C); this finding was similar to the Sanz stratification (Supplementary Fig. S13D). For this reason, we combined the revised LR and MR groups into a single group, called the revised SR group (n = 228, score = 0–2). As a result, two and five patients of Sanz LR and MR groups, respectively, who had high APL9 score and NRAS mutation were reclassified into the revised HR group (Fig. 5D). These seven patients showed much inferior OS (P = 0.0049) and EFS (P = 0.02) to that in the other 210 patients (Supplementary Fig. S14A). Eighteen patients of Sanz HR group with neither high APL9 score nor NRAS mutation were assigned into the revised SR group (Fig. 5D). The 18 crossover patients had superior OS (P = 0.085) and EFS (P = 0.038) over that in the other patients (Supplementary Fig. S14B).

NRI is commonly used to quantify whether adding a new predictor to an existing model is of benefit (22–25). We combined the Sanz low and intermediate groups as the non-HR group (Supplementary Fig. S15A–S15C) and compared the NRI between the revised model and the combined Sanz model. The 1,000-sample bootstrapped NRI values in OS, EFS, and DFS still showed 12.16% (95% CI, 0.00%–35.00%), 10.27% (95% CI, 1.41%–30.47%), and 4.27% (95% CI, 0.00%–25.31%) higher, respectively, compared with those in the combined Sanz groups. Besides, the 2,000-sample bootstrap resampling strategy was also used to compare ROCs of the Sanz score and the revised score (Supplementary Fig. S15D–S15F).

Using our newly introduced system, we next evaluated 321 patients in accordance with the three key prognostic parameters, namely, NRAS mutation, APL9 score, and WBC count. The revised SR category comprising 228 patients had a 5-year OS, EFS, and DFS rates of 95.6% (95% CI, 93.0%–98.3%), 93.8% (95% CI, 90.7%–97.0%), and 98.1% (95% CI, 96.3%–100.0%), respectively (Supplementary Table S20). The revised HR category comprising 91 patients had a 5-year OS, EFS, and DFS rates of 82.9% (95% CI, 75.3%–91.3%), 77.4% (95% CI, 69.1%–86.8%), and 88.4% (95% CI, 81.1%–96.4%), respectively, indicating worst outcomes. The revised SR category had superior OS (P = 0.00031), EFS (P < 0.0001), and DFS (P = 0.001) expectation over the revised HR group (Fig. 5E). The characteristics of the revised groups are shown in Supplementary Table S23.

The correlation of minimal residual disease and other forms of relapse to the revised score

In APL, the most important minimal residual disease endpoint is the achievement of PCR negativity for PML–RARA at the end of consolidation treatment. During the period of treatment, any detection of PML–RARA (the lower limit of quantification for the quantitative PML–RARA assay is 1:10,000) would be considered as relapse (26, 27). In this study, relapse was defined as occurrence of either molecular, hematologic, or central nervous system (CNS) relapse. As shown in Supplementary Table S24, in the 11 relapsed patients, 3 showed hematologic relapse, 5 had CNS relapse, and the remaining 3 patients only had a detectable PML–RARA, which was considered as molecular relapse.

In our final revised score, relapse was lower in patients of revised SR group (n = 4, 1.754%) than those of revised HR ones (n = 7, 7.527%; HR = 5.117; 95% CI, 1.498–17.48; P = 0.009). Since we noticed that too small number of relapses occurred in our study, we call for much larger validation sets to be generated in the future to confirm our findings.

External validation using publicly available datasets

Detailed searching process of available external validation datasets was demonstrated in Supplementary Appendix. APL is a rare disease, and only two datasets were available for external validation: one from TCGA dataset (28) and the other from GSE6891 (29). In the TCGA dataset, the gene expression profiles were available for 14 patients with APL, together with the information on WBC count, NRAS mutation status, and survival information. This dataset was then used for validation (Supplementary Table S25). The APL9 score was utilized in TCGA data, which generated GI (n = 9) and GII (n = 5) groups (Supplementary Fig. S16A). Patients in the GII group had significantly inferior OS to patients in the GI group (P = 0.025; Supplementary Fig. S16B). Furthermore, when the revised risk model applied to TCGA patients, the AUC of OS was 0.850 (Supplementary Fig. S16C). Patients were divided into the revised SR (n = 11) and revised HR (n = 3) groups. The revised HR group showed a remarkably inferior OS to the revised SR group (P < 0.0001; Supplementary Fig. S16D), thus providing the support to our revised risk model. Similarly, using the GSE6891 dataset, we demonstrated the predictive ability of the APL9 score (n = 8; Supplementary Table S26; Supplementary Fig. S17A). These eight patients were stratified into GII (n = 3) and GI (n = 5) groups, and a higher APL9 score (GII) was associated with shorter OS (P = 0.022; Supplementary Fig. S17B).

From the beginning of the APL trial, investigators have noticed that APL should not be regarded as the same whole, but a group of heterogeneous diseases including distinct clinical behaviors and risk burdens. Various models were used to identify the risk factors of the APL, and the most famous model was the Sanz risk stratification (4). More recent efforts tried to examine the value of gene mutations and expression additional to PML–RARA, including FLT3, epigenetic modifier genes (30), and the ISAPL (Integrative Score in APL) model established by Lucena-Araujo and colleagues (31). These systems aimed at effectively discriminating the dangerous cases and cases with a good outcome at disease onset. The advantage of Sanz risk stratification is its convenience, which needs only the information of WBC and PLT count; however, its molecular mechanism and background are far from being known. Establishing molecular models to illustrate the genomic and transcriptomic difference between different groups of patients with APL with various prognoses, and more ambitiously, to optimize the current system is of research interest.

Therefore, this genomic and transcriptomic analysis was performed to identify new biomarkers for a better classification of risk groups. Our study served as the largest integrated analysis in APL at present. In the panoramic view of somatic mutations, some of them were associated with the Sanz risk groups. Of note, NRAS mutations were highly prevalent in hematologic malignancies, which function to promote self-renewal and invasiveness of leukemia cells (32) as well as to induce a spectrum of fatal hematologic disorders (33). Previous studies have demonstrated that NRAS mutations bear potential prognostic value in AML (34) and acute lymphoblastic leukemia (35). In this work, despite its relatively low number (n = 32; 9%), NRAS showed a strong correlation with early death and relapse events, resulting in an extremely poor prognosis. Notably, NRAS mutations were nearly equally distributed in the Sanz groups and should be considered an independent molecular marker without any relationship with clinical phenotypes.

Evidence suggested that the phenotypic features of leukemia are strongly associated with the gene expression signature of malignant cells (36, 37). Through analyzing the gene expression of HR and LR APL, we have generated an APL9 score to predict the outcome of the disease. The functions of the 9 genes involved in this score are shown in Supplementary Table S21. Interestingly, high expression levels of erythroid- and megakaryocyte-specific function genes tended to be enriched in the LR APL patients, including relatively high GATA1 expression levels, a well-known master erythroid transcription factor that regulates almost all erythroid-specific genes through a dual zinc finger domain (38). Previous studies had proven that a part of erythroid elements was derived from the PML–RARA-expressing progenitor cells (39). As shown in our data, the high expression of GATA1 might promote the proliferation of megakaryocyte, while causing a decrease of common myeloid progenitor toward the differentiation to the granulocyte/monocyte lineage (40).

We used the LASSO algorithm translating transcriptome features into a useful tool for prognosis. This algorithm was chosen for its ability to prevent overfitting as well as to create a simple but practical model. It has been successfully used in the development of prognostic biomarkers for AML (41) and T-lymphoblastic lymphoma (42).

We demonstrated the prognostic value of APL9 score reliably predicting outcomes in terms of OS and EFS, but for DFS, the NRAS mutation was the only independent predictor. Combining the APL9 score, NRAS, and WBC into a final model allowed us to predict all three clinical outcomes (OS, EFS, and DFS as well), and could enrich the current stratification. First, it provided the understanding of the molecular mechanisms of Sanz score. Second, NRAS, which was identified as an important biomarker through this study, may serve as a novel potential biological target for those refractory and/or relapse APL patients.

We simplified the system into a two-category classification to accommodate with survival distribution and clinical meaning: a larger revised SR category (n = 228), in which the ATRA–ATO without chemotherapy consolidation seems to be the ideal and safe treatment to minimize the toxicity of chemotherapy, and a smaller revised HR category (n = 93), in which the combination of ATRA, traditional chemotherapy, prolonged ATO, or other novel therapeutic options should be used to reduce relapse rates, and more efficient supportive care should be provided to prevent early death.

We recognize that it is challenging nowadays to widely employ techniques such as RNA-seq in daily clinical practice. In comparison with the information on gene mutations and fusions that could be easily reproduced using different genetic assays, gene expression levels may vary across platforms. For instance, as compared with RNA-seq (43, 44), microarray is unable to precisely detect genes with very low expression level. However, we believe that, with the further reduction of cost and the emerging of next-generation technologies, gene expression patterns may be found robustly associated with the treatment outcome of the disease. A composite multigene signature will strongly supplement the current guideline that mainly considers cytogenetic and gene mutational factors. We would also like to emphasize the importance of the Sanz score regarding its feasibility and clinical value. On the other hand, we are on the agenda to simplify the technique in the future by using real-time RT-PCR for gene expression score which would be more convenient, less time-consuming, and less costing.

Patients (n = 304) with available BM in the APL2012 clinical trial (NCT01987297) and a small group of 44 historical cases treated with similar protocol were all enrolled in this study to make the study representative by including as many samples as possible. Even though the distribution of patients in this study showed no difference with the patients in the trial, caution should still be taken for potential selection bias. However, the number of patients with APL in the publicly available datasets is very small, and we report the largest cohort. We also feel our findings can be further confirmed/extended by scientific community, when larger external validation sets are available. The present work has made a trial of integrating the molecular markers (WES/WGS for gene mutation detection and RNA-seq for transcriptomic analysis) and clinical parameters in a large series of APL for a better prognosis prediction. The intriguing correlation between the molecular and clinical heterogeneities of APL may also provide genomic insights into this disease model for improved therapy.

No disclosures were reported.

X. Lin: Conceptualization, data curation, software, formal analysis, validation, visualization, methodology, writing–original draft. N. Qiao: Conceptualization, data curation, software, formal analysis, validation, visualization, methodology, writing–original draft. Y. Shen: Conceptualization, formal analysis, supervision, funding acquisition, investigation, writing–original draft, project administration. H. Fang: Writing–review and editing. Q. Xue: Data curation, validation. B. Cui: Software. L. Chen: Resources. H. Zhu: Resources. S. Zhang: Resources. Y. Chen: Resources. L. Jiang: Resources. S. Wang: Data curation. J. Li: Resources. B. Wang: Software, supervision, methodology. B. Chen: Conceptualization, formal analysis, supervision, funding acquisition, investigation, writing–original draft, project administration. Z. Chen: Conceptualization, resources, data curation, supervision, funding acquisition, investigation, methodology, writing–original draft, project administration. S. Chen: Conceptualization, resources, data curation, supervision, funding acquisition, investigation, methodology, writing–original draft, project administration.

This work was supported by the Center for High-Performance Computing at Shanghai Jiao Tong University. The authors thank all members of the Shanghai Institute of Hematology and National Research Center for Translational Medicine at Shanghai. This study was funded by the National High-tech Research and Development (863) Program of China (2012AA02A505), the Overseas Expertise Introduction Project for Discipline Innovation (111 Project, B17029), the Double First-Class Project (WF510162602) of Shanghai Jiao Tong University and Shanghai Collaborative Innovation Program on Regenerative Medicine and Stem Cell Research (2019CXJQ01), the National Natural Science Foundation of China (81670137, 81770141, and 81890994), the National Key R&D Program of China (2016YFE0202800), Shanghai Municipal Education Commission-Gaofeng Clinical Medicine Grant Support (20152501 and 20161406), and innovative research team of high-level local universities in Shanghai and Shanghai Guangci Translational Medical Research Development Foundation.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1.
Shen
ZX
,
Shi
ZZ
,
Fang
J
,
Gu
BW
,
Li
JM
,
Zhu
YM
, et al
All-trans retinoic acid/As2O3 combination yields a high quality remission and survival in newly diagnosed acute promyelocytic leukemia
.
Proc Natl Acad Sci U S A
2004
;
101
:
5328
35
.
2.
Hu
J
,
Liu
YF
,
Wu
CF
,
Xu
F
,
Shen
ZX
,
Zhu
YM
, et al
Long-term efficacy and safety of all-trans retinoic acid/arsenic trioxide-based therapy in newly diagnosed acute promyelocytic leukemia
.
Proc Natl Acad Sci U S A
2009
;
106
:
3342
7
.
3.
Lo-Coco
F
,
Avvisati
G
,
Vignetti
M
,
Thiede
C
,
Orlando
SM
,
Iacobelli
S
, et al
Retinoic acid and arsenic trioxide for acute promyelocytic leukemia
.
N Engl J Med
2013
;
369
:
111
21
.
4.
Sanz
MA
,
Coco
FL
,
Martın
G
,
Avvisati
G
,
Rayón
C
,
Barbui
T
, et al
Definition of relapse risk and role of nonanthracycline drugs for consolidation in patients with acute promyelocytic leukemia: a joint study of the PETHEMA and GIMEMA cooperative groups
.
Blood
2000
;
96
:
1247
.
5.
Sanz
MA
,
Fenaux
P
,
Tallman
MS
,
Estey
EH
,
Lowenberg
B
,
Naoe
T
, et al
Management of acute promyelocytic leukemia: updated recommendations from an expert panel of the European LeukemiaNet
.
Blood
2019
;
133
:
1630
43
.
6.
Burnett
AK
,
Russell
NH
,
Hills
RK
,
Bowen
D
,
Kell
J
,
Knapper
S
, et al
Arsenic trioxide and all-trans retinoic acid treatment for acute promyelocytic leukaemia in all risk groups (AML17): results of a randomised, controlled, phase 3 trial
.
Lancet Oncol
2015
;
16
:
1295
305
.
7.
Chen
L
,
Zhu
HM
,
Li
Y
,
Liu
QF
,
Hu
Y
,
Zhou
JF
, et al
Arsenic trioxide replacing or reducing chemotherapy in consolidation therapy for acute promyelocytic leukemia (APL2012 trial)
.
Proc Natl Acad Sci U S A
2021
;
118
:
e2020382118
.
8.
World Medical Association
. 
World Medical Association declaration of Helsinki: ethical principles for medical research involving human subjects
.
JAMA
2013
;
310
:
2191
4
.
9.
Kuhn
M
. 
Building predictive models in R using the caret package
.
J Stat Softw
2008
;
28
.
10.
Friedman
J
,
Hastie
T
,
Tibshirani
R
. 
Regularization paths for generalized linear models via coordinate descent
.
J Stat Softw
2010
;
33
:
1
22
.
11.
Robin
X
,
Turck
N
,
Hainard
A
,
Tiberti
N
,
Lisacek
F
,
Sanchez
JC
, et al
pROC: an open-source package for R and S+ to analyze and compare ROC curves
.
BMC Bioinformatics
2011
;
12
:
77
.
12.
Cheson
BD
,
Bennett
JM
,
Kopecky
KJ
,
Buchner
T
,
Willman
CL
,
Estey
EH
, et al
Revised recommendations of the International Working Group for Diagnosis, standardization of response criteria, treatment outcomes, and reporting standards for therapeutic trials in acute myeloid leukemia
.
J Clin Oncol
2003
;
21
:
4642
9
.
13.
Moons
KG
,
Altman
DG
,
Reitsma
JB
,
Ioannidis
JP
,
Macaskill
P
,
Steyerberg
EW
, et al
Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): explanation and elaboration
.
Ann Intern Med
2015
;
162
:
W1
73
.
14.
Camp
RL
,
Dolled-Filhart
M
,
Rimm
DL
. 
X-tile: a new bio-informatics tool for biomarker assessment and outcome-based cut-point optimization
.
Clin Cancer Res
2004
;
10
:
7252
9
.
15.
Harrell
F
. 
The rms package
.
In: Regression modeling strategies
.
New York
:
Springer
; 
2015
. p.
130
41
.
16.
Canty
A
,
Ripley
B
.
Boot: Bootstrap R (S-Plus) functions
.
R package version 1
.
3
28
.
17.
Therneau
TM
,
Grambsch
PM
. 
The Cox model
.
In: Modeling survival data: extending the Cox model
.
New York
:
Springer
; 
2000
. p.
39
77
.
18.
Kassambara
A
,
Kosinski
M
,
Biecek
P
. 
Survminer: drawing survival curves using ‘ggplot2’
.
Available from
: https://rpkgs.datanovia.com/survminer/index.html.
19.
Alba
AC
,
Agoritsas
T
,
Walsh
M
,
Hanna
S
,
Iorio
A
,
Devereaux
PJ
, et al
Discrimination and calibration of clinical prediction models: users' guides to the medical literature
.
JAMA
2017
;
318
:
1377
84
.
20.
Inoue E. Nricens. 
NRI for risk prediction models with time to event and binary response data
.
Available from
: https://cran.r-project.org/web/packages/nricens/nricens.pdf.
21.
Deininger
MWN
,
Tyner
JW
,
Solary
E
. 
Turning the tide in myelodysplastic/myeloproliferative neoplasms
.
Nat Rev Cancer
2017
;
17
:
425
40
.
22.
Pencina
MJ
,
D'Agostino
RB
 Sr
,
Steyerberg
EW
. 
Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers
.
Stat Med
2011
;
30
:
11
21
.
23.
Pencina
MJ
,
D'Agostino
RB
,
Vasan
RS
. 
Statistical methods for assessment of added usefulness of new biomarkers
.
Clin Chem Lab Med
2010
;
48
:
1703
11
.
24.
Pencina
MJ
,
D'Agostino
RB
,
Pencina
KM
,
Janssens
AC
,
Greenland
P
. 
Interpreting incremental value of markers added to risk prediction models
.
Am J Epidemiol
2012
;
176
:
473
81
.
25.
Pencina
MJ
,
D'Agostino
RB
 Sr
,
D'Agostino
RB
 Jr
,
Vasan
RS
. 
Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond
.
Stat Med
2008
;
27
:
157
72
;
discussion 207–12
.
26.
Grimwade
D
,
Jovanovic
JV
,
Hills
RK
,
Nugent
EA
,
Patel
Y
,
Flora
R
, et al
Prospective minimal residual disease monitoring to predict relapse of acute promyelocytic leukemia and to direct pre-emptive arsenic trioxide therapy
.
J Clin Oncol
2009
;
27
:
3650
8
.
27.
Dohner
H
,
Estey
E
,
Grimwade
D
,
Amadori
S
,
Appelbaum
FR
,
Buchner
T
, et al
Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel
.
Blood
2017
;
129
:
424
47
.
28.
Cancer Genome Atlas Research
N
,
Ley
TJ
,
Miller
C
,
Ding
L
,
Raphael
BJ
,
Mungall
AJ
, et al
Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia
.
N Engl J Med
2013
;
368
:
2059
74
.
29.
Verhaak
RG
,
Wouters
BJ
,
Erpelinck
CA
,
Abbas
S
,
Beverloo
HB
,
Lugthart
S
, et al
Prediction of molecular subtypes in acute myeloid leukemia based on gene expression profiling
.
Haematologica
2009
;
94
:
131
4
.
30.
Shen
Y
,
Fu
YK
,
Zhu
YM
,
Lou
YJ
,
Gu
ZH
,
Shi
JY
, et al
Mutations of epigenetic modifier genes as a poor prognostic factor in acute promyelocytic leukemia under treatment with all-trans retinoic acid and arsenic trioxide
.
EBioMedicine
2015
;
2
:
563
71
.
31.
Lucena-Araujo
AR
,
Coelho-Silva
JL
,
Pereira-Martins
DA
,
Silveira
DM
,
Koury
LCA
,
Melo
RAM
, et al
Combining gene mutation with gene expression analysis improves outcomes prediction in acute promyelocytic leukemia
.
Blood
2019
;
134
:
951
9
.
32.
Sachs
Z
,
LaRue
RS
,
Nguyen
HT
,
Sachs
K
,
Noble
KE
,
M Hassan
NA
, et al
NRASG12V oncogene facilitates self-renewal in a murine model of acute myelogenous leukemia
.
Blood
2014
;
124
:
3274
83
.
33.
Li
Q
,
Haigis
KM
,
McDaniel
A
,
Harding-Theobald
E
,
Kogan
SC
,
Akagi
K
, et al
Hematopoiesis and leukemogenesis in mice expressing oncogenic NrasG12D from the endogenous locus
.
Blood
2011
;
117
:
2022
32
.
34.
Schlenk
RF
,
Dohner
K
,
Krauter
J
,
Frohling
S
,
Corbacioglu
A
,
Bullinger
L
, et al
Mutations and treatment outcome in cytogenetically normal acute myeloid leukemia
.
N Engl J Med
2008
;
358
:
1909
18
.
35.
Perentesis
JP
,
Bhatia
S
,
Boyle
E
,
Shao
Y
,
Shu
XO
,
Steinbuch
M
, et al
RAS oncogene mutations and outcome of therapy for childhood acute lymphoblastic leukemia
.
Leukemia
2004
;
18
:
685
92
.
36.
Li
JF
,
Dai
YT
,
Lilljebjorn
H
,
Shen
SH
,
Cui
BW
,
Bai
L
, et al
Transcriptional landscape of B cell precursor acute lymphoblastic leukemia based on an international study of 1,223 cases
.
Proc Natl Acad Sci U S A
2018
;
115
:
E11711
E20
.
37.
Liu
YF
,
Wang
BY
,
Zhang
WN
,
Huang
JY
,
Li
BS
,
Zhang
M
, et al
Genomic profiling of adult and pediatric B-cell acute lymphoblastic leukemia
.
EBioMedicine
2016
;
8
:
173
83
.
38.
Ferreira
R
,
Ohneda
K
,
Yamamoto
M
,
Philipsen
S
. 
GATA1 function, a paradigm for transcription factors in hematopoiesis
.
Mol Cell Biol
2005
;
25
:
1215
27
.
39.
Grignani
F
,
Valtieri
M
,
Gabbianelli
M
,
Gelmetti
V
,
Botta
R
,
Luchetti
L
, et al
PML/RARα fusion protein expression in normal human hematopoietic progenitors dictates myeloid commitment and the promyelocytic phenotype
.
Blood
2000
;
96
:
1531
7
.
40.
Graf
T
,
Enver
T
. 
Forcing cells to change lineages
.
Nature
2009
;
462
:
587
94
.
41.
Ng
SW
,
Mitchell
A
,
Kennedy
JA
,
Chen
WC
,
McLeod
J
,
Ibrahimova
N
, et al
A 17-gene stemness score for rapid determination of risk in acute leukaemia
.
Nature
2016
;
540
:
433
7
.
42.
Tian
XP
,
Huang
WJ
,
Huang
HQ
,
Liu
YH
,
Wang
L
,
Zhang
X
, et al
Prognostic and predictive value of a microRNA signature in adults with T-cell lymphoblastic lymphoma
.
Leukemia
2019
;
33
:
2454
65
.
43.
Wang
C
,
Gong
B
,
Bushel
PR
,
Thierry-Mieg
J
,
Thierry-Mieg
D
,
Xu
J
, et al
The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance
.
Nat Biotechnol
2014
;
32
:
926
32
.
44.
Peters
TJ
,
French
HJ
,
Bradford
ST
,
Pidsley
R
,
Stirzaker
C
,
Varinli
H
, et al
Evaluation of cross-platform and interlaboratory concordance via consensus modelling of genomic measurements
.
Bioinformatics
2019
;
35
:
560
70
.

Supplementary data