Purpose:

Undetectable measurable residual disease (MRD) is a surrogate of prolonged survival in multiple myeloma. Thus, treatment individualization based on the probability of a patient achieving undetectable MRD with a singular regimen could represent a new concept toward personalized treatment, with fast assessment of its success. This has never been investigated; therefore, we sought to define a machine learning model to predict undetectable MRD at the onset of multiple myeloma.

Experimental Design:

This study included 487 newly diagnosed patients with multiple myeloma. The training (n = 152) and internal validation cohorts (n = 149) consisted of 301 transplant-eligible patients with active multiple myeloma enrolled in the GEM2012MENOS65 trial. Two external validation cohorts were defined by 76 high-risk transplant-eligible patients with smoldering multiple myeloma enrolled in the Grupo Español de Mieloma(GEM)-CESAR trial, and 110 transplant-ineligible elderly patients enrolled in the GEM-CLARIDEX trial.

Results:

The most effective model to predict MRD status resulted from integrating cytogenetic [t(4;14) and/or del(17p13)], tumor burden (bone marrow plasma cell clonality and circulating tumor cells), and immune-related biomarkers. Accurate predictions of MRD outcomes were achieved in 71% of cases in the GEM2012MENOS65 trial (n = 214/301) and 72% in the external validation cohorts (n = 134/186). The model also predicted sustained MRD negativity from consolidation onto 2 years maintenance (GEM2014MAIN). High-confidence prediction of undetectable MRD at diagnosis identified a subgroup of patients with active multiple myeloma with 80% and 93% progression-free and overall survival rates at 5 years.

Conclusions:

It is possible to accurately predict MRD outcomes using an integrative, weighted model defined by machine learning algorithms. This is a new concept toward individualized treatment in multiple myeloma.

See related commentary by Pawlyn and Davies, p. 2482

Translational Relevance

There is great expectation of using diagnostic biomarkers to personalize treatment and improve outcomes in multiple myeloma. Because undetectable measurable residual disease (MRD) is a new endpoint of multiple myeloma therapy and an intermediate surrogate of prolonged survival, treatment individualization based on the probability of an individual patient to attain undetectable MRD with a singular regimen, and confirmation of predicted MRD outcomes in a relatively short time, could represent a new concept toward personalized treatment with faster assessment of its success. Here, we showed that classical prognostic factors have limited ability to predict MRD outcomes when used individually. By contrast, a machine learning model resulting from the integration of tumor burden, cytogenetic and immune-related biomarkers predicts MRD outcomes in up to 72% of newly diagnosed patients with multiple myeloma treated with different regimens. It also predicts survival with significant accuracy. We made the model available to facilitate its use in clinical practice at www.MRDpredictor.com.

Multiple myeloma is generally considered an incurable disease, but the therapeutic improvement observed in the last two decades made it possible for many patients to experience long-term survival (1, 2). Indeed, more than 10 new drugs were approved during this period and 15% of patients achieve or exceed expected survival compared with the matched general population (1, 3). Furthermore, several new agents are showing promising efficacy in relapsed multiple myeloma and may soon be approved (4). Whenever the new drugs reach the first-line setting, there will be numerous possible combinations to treat patients with newly diagnosed multiple myeloma, and these will be poised to deliver prolonged disease-free survival.

The choice of therapy for patients with newly diagnosed multiple myeloma has relied on clinical factors such as age and comorbidities (5). However, there is great expectation of using diagnostic biomarkers to personalize treatment and improve outcomes (5, 6). As such, patients would receive the most tolerable, cost-effective regimen that could, with high probability, offer a prolonged life expectancy. Yet, this scenario does not allow the confirmation of a successful treatment selection before 5 or 10 years of disease-free survival, nor to identify as soon as possible and before patients' progression, an unsuccessful regimen.

After decades of intensive research, there is growing consensus that undetectable measurable residual disease (MRD) is a new endpoint of multiple myeloma therapy and an intermediate surrogate of prolonged survival (2, 7–10). Intermediate, because it is assessed soon after treatment initiation and anticipates with reasonable accuracy the patient's outcome in the short and long-term (11–15). Thus, treatment individualization based on the probability of an individual patient to attain undetectable MRD with a singular regimen, and confirmation of predicted MRD outcomes in a relatively short time, could represent a new concept towards personalized treatment with faster assessment of its success. This idea has not been investigated previously (Fig. 1A).

Figure 1.

Workflow of the study. A, Hypothetical schematic scenarios for treatment selection and subsequent evaluation of the success of such selection. Treatment selection based on predicted PFS is depicted in scenario 1, whereas the proposed scenario 2 shows a selection of treatment based on predicted MRD status. Treatments A, B, C, D, Y, and Z denote hypothetical therapy choices for clinicians, which is expected to expand with increasing number of personalized biology-based drugs. B and C, Pipeline of the study and treatment regimens of GEM2012MENOS65, GEM-CESAR, and GEM-CLARIDEX trials. Data were gathered at diagnosis with the objective of predicting MRD status after the end of the treatment regimen. The dataset from GEM2012MENOS65 (n = 301) was randomly split in a 7:3 ratio among patients with complete annotations (n = 212), and a machine learning model was trained with 152 patients. The model was further validated in an internal test set of 149 patients (n = 60 with complete and n = 89 with partial annotations), and in external validation cohorts of GEM-CESAR (n = 76) and GEM-CLARIDEX (n = 110) trials.

Figure 1.

Workflow of the study. A, Hypothetical schematic scenarios for treatment selection and subsequent evaluation of the success of such selection. Treatment selection based on predicted PFS is depicted in scenario 1, whereas the proposed scenario 2 shows a selection of treatment based on predicted MRD status. Treatments A, B, C, D, Y, and Z denote hypothetical therapy choices for clinicians, which is expected to expand with increasing number of personalized biology-based drugs. B and C, Pipeline of the study and treatment regimens of GEM2012MENOS65, GEM-CESAR, and GEM-CLARIDEX trials. Data were gathered at diagnosis with the objective of predicting MRD status after the end of the treatment regimen. The dataset from GEM2012MENOS65 (n = 301) was randomly split in a 7:3 ratio among patients with complete annotations (n = 212), and a machine learning model was trained with 152 patients. The model was further validated in an internal test set of 149 patients (n = 60 with complete and n = 89 with partial annotations), and in external validation cohorts of GEM-CESAR (n = 76) and GEM-CLARIDEX (n = 110) trials.

Close modal

Patients, treatment, and MRD assessment

The Programa para el Estudio de la Terapéutica en Hemopatías Malignas(PETHEMA)/Grupo Español de Mieloma(GEM)2012MENOS65 is an open-label phase III clinical trial that included 458 newly diagnosed patients with active multiple myeloma. Patients received six induction cycles of bortezomib, lenalidomide, and dexamethasone (VRD), followed by an autologous stem cell transplant (ASCT) conditioned with Bu-Mel or Mel-200 high-dose therapy (HDT), and two consolidation cycles of VRD (Fig. 1B; ref. 16). Afterwards, patients were enrolled in the PETHEMA/GEM2014MAIN clinical trial that randomized maintenance with lenalidomide and dexamethasone (RD) or RD plus ixazomib for 2 years, after which patients continued with RD for 3 additional years if MRD-positive or stopped therapy if MRD-negative.

The PETHEMA/GEM-CESAR trial is an open-label phase II clinical trial that included 90 patients with high-risk smoldering multiple myeloma, defined by the presence of both ≥10% bone marrow (BM) plasma cells (PC) and MC ≥3 g/dL or, one of the two criterion plus >95% clonal PCs within the BM PC compartment by immunophenotyping and immunoparesis (17). Induction therapy consisted of six 4-week cycles of carfilzomib, lenalidomide, and dexamethasone (KRD). Mel-200 followed by ASCT was given as intensification therapy followed by two KRD consolidation cycles and maintenance with lenalidomide for 2 years (Fig. 1B; ref. 17).

The PETHEMA/GEM-CLARIDEX trial is an open-label phase III clinical trial that enrolled 286 unfit newly diagnosed, ≥65 years old previously untreated patients with multiple myeloma (18). Patients were randomly assigned (1:1 ratio) to receive clarithromycin plus lenalidomide and dexamethasone (C-Rd) or lenalidomide and dexamethasone alone (Rd) weekly on days 1, 8, 15, and 22 until disease progression or unacceptable toxicity (Fig. 1B).

Each study site's Independent Ethics Committee approved the protocol and the written informed consent forms required prior to patient enrollment. Studies were conducted as per the ethical principles of the Declaration of Helsinki and were registered at www.clinicaltrials.gov (Identifiers: NCT01916252 and NCT02406144, NCT02415413, and NCT02575144, respectively).

Patients without MRD assessment (generally due to disease progression; ref. 12) were imputed as MRD-positive if complete remission was not achieved, and as missing MRD if found in complete remission.

Next-generation flow cytometry

The EuroFlow next-generation flow (NGF) standard operation protocols (SOP) were used to assess the percentage of clonal PCs within the BM PC compartment at diagnosis (i.e., PC clonality), absolute number of circulating tumor cells (CTC) in peripheral blood (PB) at diagnosis, and MRD status in BM aspirates after treatment (19). The method relies on a modified SOP and an optimized antibody panel for accurate identification of clonal PCs in PB and BM with a sensitivity (limit of detection) of 2 × 10–6 (19, 20). The two-tube strategy allows detection of clonality with specific confirmation of light-chain restriction on phenotypically aberrant PCs, identified by either antigen under-expression (CD19, CD27, CD38, CD45, CD81) and/or antigen overexpression (CD56, CD117, CD138). Data acquisition was performed in a FACSCanto II flow cytometer [Becton Dickinson Biosciences (BDB)] using the FACSDiva 8.0.1 software (BDB; RRID:SCR_001456). Data analysis was performed using the Infinicyt software (Cytognos SL).

Immune profiling

We leveraged in the multidimensional combination of monoclonal antibodies used to assess clonality at diagnosis (CD138-BV421, CD27-BV510, CD38-FITC, CD56-PE, CD45-PerCPCy5.5, CD19-PECy7, CD117-APC, and CD81-APCH7), to analyze the BM immune composition as described previously (21, 22). Data was analyzed using FlowCT, a semiautomated pipeline that performs batch-analyses of flow cytometry data to avoid variability intrinsic to manual analysis, and unveils full cellular diversity based on unbiased clustering (23). After batch effect was removed with CCA, quality control was performed using flow to remove outliers, margin events, and doublets. Automated clustering was performed with FlowSOM (RRID:SCR_016899) and cluster annotation was executed by visualizing expression levels of each marker on two-dimensional biaxial plots, using the Infinicyt software (Cytognos). Lymphocytes were selected for downstream subclustering after supervised annotation. FlowCT yielded the identification of 21 immune cell types (Supplementary Fig. S1).

FISH

FISH was performed at diagnosis in CD138+ immunomagnetically enriched BM PCs. These were tested for del(1p32), 1q21 gain, IgH translocations, and del(17p13). Patients were considered positive for any of these abnormalities if present in ≥10% of CD138+ PCs.

Statistical analyses

Data preprocessing and feature selection

Filtering of relevant variables associated with patients' MRD status was performed in the complete GEM2012MENOS65 dataset (n = 458). Cut-offs (R package cutpointr) were calculated for each variable to maximize the area under the receiving operating characteristics curve (AUROC). ORs [R package Functions for Medical Statistics Book (fmsb)] for univariate analyses were calculated using optimal cut-offs and a total of 17 variables were associated with MRD (P < 0.05; Supplementary Table S1). A reduced set (n = 212) had data available for all 17 variables (Supplementary Fig. S2). This set was randomly divided into a train (n = 152) and an internal test (n = 60) set following a 7:3 ratio (Supplementary Fig. S2B).

The remaining 246 patients had incomplete data, with 89 patients showing partial annotations for cytogenetic abnormalities and/or tumor burden. Missing values were imputed in the latter, using a nearest-neighbor algorithm (scikit-learn KNNImputer; n neighbors = 6) for tumor burden numerical variables. Using the KNN imputer, missing values are replaced using the mean value from n neighbors found in the set cohort containing only information from variables included in the final model. Cytogenetic abnormalities were imputed using a simple most-frequent imputer (scikit-learn SimpleImputer). The 89 patients from the GEM2012MENOS65 trial with imputed data were excluded from the training set cohort to avoid information leakage (Supplementary Fig. S2), and were included in the internal test cohort, together with the 60 cases described above who had data available for all 17 variables (Fig. 1C). For external validation cohorts, imputations were performed individually in each cohort for 10 patients with partial annotations in the GEM-CESAR trial, and 8 patients in the GEM-CLARIDEX trial (Fig. 1C; Supplementary Fig. S2). No imputations were performed for patients missing immune microenvironment data, and no statistically significant differences were observed in baseline and outcome characteristics of patients enrolled in the GEM2012MENOS65 trial that were included (n = 301) and excluded (n = 157) from the study due to missing variables (Supplementary Table S2).

To avoid data leakage from the training to the test and validation sets, variables were selected based on optimal cut-offs that were recalculated after train-test split using data from the train set. Additionally, the model performed best internally when including numerical values for the percentage of PC clonality within the total PC compartment. Since machine learning algorithms, especially those that apply a penalty and weighted input, perform better when centering and scaling numerical data, the percentage of PC clonality was preprocessed with Scikit-learn standard scaler by fitting the train set for the same purpose (24). The standard scaler function calculates the standard score of a sample (z) by subtracting the mean (u) and dividing by the SD (s) of the training samples (Equation 1). Thus, a standard score was calculated by fitting the train set, where 11.01 was the mean and 13.71 the SD.

Equation 1:Scikit-learn standard scaler function employed to center and scale numerical variables. z = standard score; x = input variable; u = mean; and s = SD of training set.

Feature selection for variables included in the final model was performed by removing collinear variables and performing an OR test in the complete GEM2012MENOS65 set; selecting those variables that exhibited similar ORs (>1 or <1) in the train and test set and yielded greater accuracy using cross-validation (CV).

Machine learning algorithm

A logistic regression algorithm with ridge (l2) regularization was used to generate MRD predictions based on nine variables (eight categorized by cut-offs and one numerical) obtained at baseline. The model was trained with a slightly oversampled training set (n = 164, +12) using Synthetic Minority Oversampling Technique (SMOTE), where the minority class (in this case undetectable MRD samples) were duplicated from the minority data population. This technique balances the training set without adding new information or variation to the algorithm and is only used for training purposes.

A grid search with five-fold stratified CV was performed for hyperparameter-tuning, allowing for the selection of parameters that best fit the model to new data unseen by the training set without filtering information to the test and validation sets. The hyperparameters (in this case ridge l2 penalty of 0.5 and liblinear solver) that returned the best precision and recall cross-validated scores were selected. The model was retrained using the entire training set rendering weighted coefficients for each variable and an intercept for the logistic sigmoid function. The algorithm was evaluated on the test and validation sets obtaining final classification metrics. Machine learning analyses were performed using Python (RRID:SCR_008394; v 3.8.6) and scikit-learn (RRID:SCR_002577; v 0.23.2; ref. 25). ROC curves statistical analyses [p value and confidence intervals (CI)] were generated using R package pROC.

χ2 contingency test and model evaluation

To assess further model performance, a χ2 contingency test was performed from confusion matrices data. Briefly, the test evaluates if the model is a random guesser (null hypothesis; H0) or not (alternative hypothesis; H1) from the model's number of actual and predicted positives and negatives. Statistical analyses were performed using Python (v 3.8.6) and SciPy statistical functions (RRID:SCR_008058; v 1.5.4).

OR test

R package fmsb was used to calculate ORs, 95% CIs, and significancy for baseline variables. MRD positivity was considered as “disease” and MRD negativity as “healthy” for initial univariate feature selection. Values greater than the cut-off demonstrated exposure. Null-hypothesis testing was defined as an OR equal to 1, where exposure had no significant impact in the disease. An OR >1 showed increased odds of disease or persistent MRD, while <1 increased odds of undetectable MRD. Similarly, ORs were calculated to evaluate sustained MRD negativity in patients enrolled in the GEM2014MAIN trial that followed the GEM2012MENOS65 trial. In this case, sustained MRD negativity was defined as the targeted outcome, with ORs >1 showing increased odds of sustained MRD negativity.

Survival analyses

Survival probabilities were estimated using the Kaplan–Meier method. Briefly, differences were tested for statistical significance with the (two-sided) log-rank test and HRs (with its two-sided 95% CI) were estimated with a Cox regression model. Progression-free survival (PFS) was defined as the time from diagnosis until disease progression or death from any cause. Overall survival (OS) was defined as the time from diagnosis until death from any cause. Survival curves and c-index calculations were generated with R package survival (v 3.2–13).

Other statistical analyses

Unless otherwise specified, χ2 test and the Mann–Whitney U nonparametric test were used to evaluate statistical significance of differences observed between categorical and numerical variables, respectively, and MRD outcomes.

Code and data availability

Python code used in this study consisted in standard Scikit-learn published functions. Individual patient level data is available to researchers who provide a methodologically sound proposal to the corresponding author.

Variables associated with MRD status after treatment

Clinical and biological parameters from a total of 458 eligible patients were filtered using cut-offs to maximize the AUROC. Identification of potential parameters associated with MRD outcomes was performed using univariate analyses, represented as ORs with 95% CIs (Supplementary Table S1).

Of 37 clinical and biological parameters evaluated, 17 were associated with MRD outcomes in univariate analyses (Supplementary Table S1). Surprisingly, neither the International staging system (ISS; ref. 26) nor the revised-ISS (R-ISS; ref. 27) predicted significantly different MRD outcomes (Fig. 2A and B). Indeed, among clinical and cytogenetic parameters routinely assessed at diagnosis, only the presence of high lactate dehydrogenase (LDH) levels and del(17p13) was associated with lower rates of undetectable MRD (Fig. 2C and D). However, the accuracy of elevated LDH levels and del(17p13) of predicting MRD status was 52% and 50%, which was similar to that of parameters with no predictive value (Fig. 2E and F). Because of this and the fact that these two features are relative infrequent at diagnosis (≤15% of patients), an alternative approach had to be developed to predict MRD outcomes with greater accuracy, and in a broader population of patients with multiple myeloma.

Figure 2.

Assessment of MRD outcomes according to ISS and R-ISS, LDH levels, and cytogenetic alterations in patients from the GEM2012MENOS65 trial. Percentage of patients according to A, ISS; B, R-ISS; C, LDH levels; and D, cytogenetic profile [i.e., del(17p13), t(4;14), t(14;16), and gain(1q)]. E, Accuracy of predictions of MRD status according to the ISS and R-ISS Stage I versus Stage II & III, and Stage III versus Stage I & II; and F, normal versus elevated LDH levels and standard- versus high-risk cytogenetic alterations.

Figure 2.

Assessment of MRD outcomes according to ISS and R-ISS, LDH levels, and cytogenetic alterations in patients from the GEM2012MENOS65 trial. Percentage of patients according to A, ISS; B, R-ISS; C, LDH levels; and D, cytogenetic profile [i.e., del(17p13), t(4;14), t(14;16), and gain(1q)]. E, Accuracy of predictions of MRD status according to the ISS and R-ISS Stage I versus Stage II & III, and Stage III versus Stage I & II; and F, normal versus elevated LDH levels and standard- versus high-risk cytogenetic alterations.

Close modal

Developing a machine learning model to predict undetectable MRD

A final series of 487 patients were evaluated based on data availability of immune microenvironment data and MRD status (Fig. 1C; Supplementary Fig. S2). The training (n = 152) and internal validation cohort (n = 149) consisted of 301 patients with active multiple myeloma enrolled in the GEM2012MENOS65 clinical trial and were randomly divided following a 7:3 ratio of patients with complete set of annotations (n = 212). In 89 out of 149 patients in the internal validation cohort, missing values for tumor burden numerical values and cytogenetic abnormalities were imputed as described above. After adjusting for potential confounders (17 in total, P ≤ 0.05), cut-off values were recalculated after train-test split to avoid data leakage and were subsequently validated.

While logistic regression may be less powerful compared with other black box algorithms, it has the advantage of being highly interpretable, providing insights into the underlying relationship between input variables and outcome (28). Thus, a logistic regression with ridge (l2) regularization for machine learning classification was performed, where the sum of the weighted coefficients multiplied by its input variable, is transformed into a probability outcome that ranges from 0 to 1 using a logit sigmoid function (Fig. 3A). Data obtained for an individual patient can be substituted into the algorithm's formula, which results in a numerical probability of achieving undetectable (>0.5) versus persistent (<0.5) MRD after treatment (Fig. 3A). Probability outcomes of more than 0.687 or less than 0.370 for undetectable or persistent MRD respectively, were considered as high-confidence due to a significant decrease in false positive and negative predictions (Fig. 3B).

Figure 3.

A machine learning logistic regression model to predict MRD outcomes after therapy in newly diagnosed patients with multiple myeloma. A, Variables obtained at baseline are categorized as cytogenetic abnormalities [i.e., t(4;14) and/or del(17p13)], tumor burden (i.e., PC clonality in BM defined as the percentage of clonal PCs within the BM PC compartment, CTCs in PB), and immune microenvironment (i.e., myeloid precursors, mature B cells, intermediate neutrophils, eosinophils, CD27negCD38pos T cells, and CD56brightCD27neg NK cells). Cytogenetic abnormalities were dichotomized as positive or negative, whereas cut-offs were defined for remaining variables except PC clonality, since the numerical input performed best. The scikit-learn standard scaler function is used to obtain PC clonality standard score input, where the mean (11.01) is subtracted from the percentage of clonal PCs (X) and the result is divided by the SD (13.71) [function: (X – 11.01)/13.71], since machine learning algorithms perform better when centering and scaling numerical data. The sum of each input multiplied by its weighted coefficient plus the logistic regression intercept is then inserted into the logit sigmoid function to predict the conditional probability of an MRD outcome, given the inputs and coefficients for a selected patient. A probability value of more than 0.5 predicts undetectable MRD, and less than 0.5 persistent MRD. B, Distribution of prediction probabilities in patients enrolled in the GEM2012MENOS65 clinical trial with available immune microenvironment data (n = 301). High-confidence predictions have a probability of more than 0.687 for undetectable MRD, and less than 0.370 for persistent MRD. C, ROC curve in the standard-confidence GEM2012MENOS65 cohort (n = 301), GEM-CESAR external validation cohort for high-risk smoldering multiple myeloma (n = 76), and GEM-CLARIDEX external validation cohort for transplant-ineligible patients (n = 110).

Figure 3.

A machine learning logistic regression model to predict MRD outcomes after therapy in newly diagnosed patients with multiple myeloma. A, Variables obtained at baseline are categorized as cytogenetic abnormalities [i.e., t(4;14) and/or del(17p13)], tumor burden (i.e., PC clonality in BM defined as the percentage of clonal PCs within the BM PC compartment, CTCs in PB), and immune microenvironment (i.e., myeloid precursors, mature B cells, intermediate neutrophils, eosinophils, CD27negCD38pos T cells, and CD56brightCD27neg NK cells). Cytogenetic abnormalities were dichotomized as positive or negative, whereas cut-offs were defined for remaining variables except PC clonality, since the numerical input performed best. The scikit-learn standard scaler function is used to obtain PC clonality standard score input, where the mean (11.01) is subtracted from the percentage of clonal PCs (X) and the result is divided by the SD (13.71) [function: (X – 11.01)/13.71], since machine learning algorithms perform better when centering and scaling numerical data. The sum of each input multiplied by its weighted coefficient plus the logistic regression intercept is then inserted into the logit sigmoid function to predict the conditional probability of an MRD outcome, given the inputs and coefficients for a selected patient. A probability value of more than 0.5 predicts undetectable MRD, and less than 0.5 persistent MRD. B, Distribution of prediction probabilities in patients enrolled in the GEM2012MENOS65 clinical trial with available immune microenvironment data (n = 301). High-confidence predictions have a probability of more than 0.687 for undetectable MRD, and less than 0.370 for persistent MRD. C, ROC curve in the standard-confidence GEM2012MENOS65 cohort (n = 301), GEM-CESAR external validation cohort for high-risk smoldering multiple myeloma (n = 76), and GEM-CLARIDEX external validation cohort for transplant-ineligible patients (n = 110).

Close modal

The most effective model to predict MRD status after VRD induction, HDT/ASCT, and VRD consolidation, resulted from integrating cytogenetic, tumor burden, and immune-related biomarkers (Fig. 3A; Supplementary Table S1). Namely, the model included del(17p13) and/or concomitant t(4;14), the percentage of clonal PCs in BM assessed by NGF and the absolute count of CTCs in PB, and the frequency of six immune cell types: myeloid precursors, mature B cells, intermediate neutrophils, eosinophils, CD27negCD38pos T cells and CD56brightCD27neg NK cells. Of further interest, immune biomarkers displayed the highest (positive or negative) coefficient weights and, therefore, were determinant to predict patients' MRD status in this model (Fig. 3A). Any attempt to simplify the model by removing or adding one or more of the parameters described above, significantly impaired its accuracy (Supplementary Table S3).

Evaluating the model's sensitivity and specificity

Using standard-confidence values (i.e., >0.5 and <0.5), MRD status were accurately predicted in 214 of 301 (71%) patients enrolled in the GEM2012MENOS65 clinical trial (χ2 = 51.8, P < 0.001; Fig. 3B). Using high-confidence values (i.e., >0.687 and <0.370, which were observed in 138 of 301 patients), true prediction of undetectable versus persistent MRD after consolidation was achieved in 107 of 138 (78%) patients (χ2 = 36.11, P < 0.001; Fig. 3B).

The model performed similarly in the external validation sets, where accurate prediction of MRD outcomes using standard-confidence values was achieved in 55 of 76 (72%) patients in the GEM-CESAR trial (χ2 = 11.9, P < 0.001), and 79 of 110 (72%) patients in GEM-CLARIDEX trial (χ2 = 7.39, P = 0.007; Fig. 3C). Furthermore, ROC curves in the training-test cohort (n = 301 patients enrolled in GEM2012MENOS65), external validation GEM-CESAR cohort (n = 76), and external validation GEM-CLARIDEX set (n = 110), yielded significant and nearly identical prediction of MRD outcomes (Fig. 3C). Similar results were observed upon excluding patients with imputed data (Supplementary Fig. S3). Overall, an integrative and weighted model based on the cytogenetic, tumor burden, and immune-related biomarkers described above, accurately predicted MRD outcomes in nearly 3 of 4 (348/487) newly diagnosed patients with multiple myeloma with complete set of annotations for immune microenvironment and MRD outcome variables.

Predictors of sustained MRD negativity

Given the emerging role of sustained MRD negativity as a potential biomarker of exceptionally low rates of disease progression (15, 29), we next investigated if the machine learning model had the capacity to predict long-term MRD outcomes. For this purpose, we analyzed 254 patients who had MRD assessments at the end of consolidation (GEM2012MENOS65) and after 2 years of maintenance (GEM2014MAIN). Sustained MRD negativity was defined if undetectable MRD was observed after consolidation and during both years of maintenance.

As expected, the actual MRD-negative status after consolidation was the strongest predictor of sustained MRD negativity. Only 21 of 111 patients with undetectable MRD at consolidation showed nonsustained MRD negativity, while 143 patients remained with persistent MRD (P < 0.001). ORs were calculated for remaining variables, and the prediction of an MRD-negative status using high [log odds 2.3 (CI, 1.4–3.1), P < 0.0001] and standard-confidence [log odds 1.4 (CI, 0.9–2.0), P < 0.001] values obtained with the model described above remained the most accurate predictor at diagnosis of sustained MRD negativity (Fig. 4). Once more, immune related biomarkers at diagnosis were amongst the individual variables that were significantly associated with sustained MRD negativity, and PC clonality was the highest predictor [log odds −1.2 (CI, −1.9 to −0.5), P < 0.001]. By contrast, common cytogenetic abnormalities were not significantly associated with sustained MRD negativity (Fig. 4). The same applied for the ISS, LDH levels, and R-ISS. These findings reinforce the need of improved risk models at diagnosis to predict long-term outcomes, including sustained MRD negativity and survival.

Figure 4.

Log OR (logOR) with 95% CI for sustained undetectable MRD outcomes 2 years after MRD assessments postconsolidation for patients with predicted MRD status. Of 301 patients included in the study, 254 were enrolled in the GEM2014MAIN trial, where patients were randomized to receive maintenance with RD or RD plus ixazomib for 2 years. Sustained undetectable MRD until this time point was defined as the outcome, while nonsustained MRD comprised those patients with persistent MRD (median limit of detection, 2 × 10–6) or those who had previously progressed or died. Und, undetectable.

Figure 4.

Log OR (logOR) with 95% CI for sustained undetectable MRD outcomes 2 years after MRD assessments postconsolidation for patients with predicted MRD status. Of 301 patients included in the study, 254 were enrolled in the GEM2014MAIN trial, where patients were randomized to receive maintenance with RD or RD plus ixazomib for 2 years. Sustained undetectable MRD until this time point was defined as the outcome, while nonsustained MRD comprised those patients with persistent MRD (median limit of detection, 2 × 10–6) or those who had previously progressed or died. Und, undetectable.

Close modal

Prognostic value of the machine learning model

Being our aim the prediction of MRD status after treatment, and being MRD significantly associated with patients' outcome, the model we developed should per se be prognostic. Accordingly, patients predicted to achieve undetectable MRD using standard-confidence values (i.e., >0.5), showed longer PFS and OS than those with probability of persistent MRD (Fig. 5A and B). Afterwards, the same analysis was performed using high-confidence values. Patients with more than 0.687 probability of achieving undetectable MRD showed 80% PFS and 93% OS at 5 years, whereas those in whom persistent MRD was predicted (<0.370), the median PFS was 48 months and OS estimated at 5 years was 70% (Fig. 5C and D). Similar results were observed upon excluding patients with imputed data (Supplementary Fig. S4). Of note, risk stratification according to standard and (particularly) high-confidence values was similar to that observed based on patients' actual MRD status after consolidation (Table 1; Fig. 5E and F).

Figure 5.

PFS and OS according to predicted MRD status after consolidation in transplant-eligible patients with multiple myeloma compared with actual MRD risk stratification. Patients were stratified according to PFS (A and C) and OS (B and D) based on the probability of achieving undetectable versus persistent MRD using standard (A–B; >0.5 and <0.5, respectively) and high-confidence (C–D; >0.687 and <0.370, respectively) values. E and F, Risk stratification of patients according to actual MRD status after consolidation therapy. n/r, not reached; m, months.

Figure 5.

PFS and OS according to predicted MRD status after consolidation in transplant-eligible patients with multiple myeloma compared with actual MRD risk stratification. Patients were stratified according to PFS (A and C) and OS (B and D) based on the probability of achieving undetectable versus persistent MRD using standard (A–B; >0.5 and <0.5, respectively) and high-confidence (C–D; >0.687 and <0.370, respectively) values. E and F, Risk stratification of patients according to actual MRD status after consolidation therapy. n/r, not reached; m, months.

Close modal
Table 1.

Harrell C-index with SE of risk-stratification models for standard-confidence (n = 301) and high-confidence values (n = 138).

Standard confidenceHigh confidence
(n = 301)(n = 138)
Risk-stratification modelPFSOSPFSOS
ISS 0.578 (0.024) 0.607 (0.033) 0.600 (0.032) 0.644 (0.042) 
R-ISS 0.558 (0.025) 0.593 (0.032) 0.588 (0.035) 0.628 (0.041) 
LDH + del(17p13) 0.594 (0.023) 0.636 (0.035) 0.639 (0.035) 0.700 (0.049) 
Predicted MRD status 0.613 (0.022) 0.602 (0.025) 0.634 (0.026) 0.646 (0.029) 
Predicted MRD + ISS 0.644 (0.025) 0.655 (0.034) 0.666 (0.033) 0.700 (0.038) 
Predicted MRD + R-ISS 0.634 (0.028) 0.636 (0.038) 0.697 (0.036) 0.700 (0.040) 
Predicted MRD + LDH 0.642 (0.024) 0.636 (0.033) 0.668 (0.032) 0.703 (0.049) 
Actual MRD status postconsolidation therapy 0.693 (0.021) 0.709 (0.021) 0.672 (0.036) 0.685 (0.029) 
Standard confidenceHigh confidence
(n = 301)(n = 138)
Risk-stratification modelPFSOSPFSOS
ISS 0.578 (0.024) 0.607 (0.033) 0.600 (0.032) 0.644 (0.042) 
R-ISS 0.558 (0.025) 0.593 (0.032) 0.588 (0.035) 0.628 (0.041) 
LDH + del(17p13) 0.594 (0.023) 0.636 (0.035) 0.639 (0.035) 0.700 (0.049) 
Predicted MRD status 0.613 (0.022) 0.602 (0.025) 0.634 (0.026) 0.646 (0.029) 
Predicted MRD + ISS 0.644 (0.025) 0.655 (0.034) 0.666 (0.033) 0.700 (0.038) 
Predicted MRD + R-ISS 0.634 (0.028) 0.636 (0.038) 0.697 (0.036) 0.700 (0.040) 
Predicted MRD + LDH 0.642 (0.024) 0.636 (0.033) 0.668 (0.032) 0.703 (0.049) 
Actual MRD status postconsolidation therapy 0.693 (0.021) 0.709 (0.021) 0.672 (0.036) 0.685 (0.029) 

Note: The C-index accounts for outcome occurrence and timing, and a result of 1 denotes perfect concordance, while a result of 0.5 describes completely random assignments.

C-index [standard-error].

The performance of the machine learning model in predicting PFS according to Harrell concordance-index (C-index) was superior to that observed using the R-ISS (Table 1; Supplementary Fig. S5 with overlayed PFS curves). Similar results were observed upon excluding patients with imputed data (Supplementary Table S4). Interestingly, the combination of the logistic regression model's prediction of MRD status with the ISS, R-ISS, and LDH levels at diagnosis showed even superior survival predictions than each model by its own (Table 1; Supplementary Table S4), suggesting that biomarkers that are useful for MRD predictions could also complement and improve current prognostic models.

The ability to predict patients' MRD status with a given therapy based on disease features at diagnosis remains, to the best of our knowledge, unexplored. Here, we uncovered that except for LDH and del(17p13), other clinical and genetic parameters routinely assessed at diagnosis were not associated with MRD outcomes. We further demonstrated that the ability to predict patients' MRD status with significant accuracy could only be achieved using an integrative, weighted model based on machine learning algorithms. Therefore, we identified a new application to diagnostic biomarkers that, based on the progressive accumulation of data in these and other treatment scenarios, together with the growing interest in adopting machine learning for risk stratification, could help tailoring treatment in multiple myeloma.

The logistic regression model developed for the purpose of this study, required data on cytogenetics, tumor burden in BM and PB, as well as immune profiling of the tumor microenvironment. Indeed, any attempt to simplify the model by removing one or more of the parameters significantly impaired its performance. The role of genetic abnormalities in multiple myeloma is indisputable, and it can only improve with the implementation of higher-throughput methods in routine diagnostics and the assessment of a greater number of alterations (30–32). Accordingly, lack of data on additional copy-number abnormalities and mutations is a limitation of this study. The prognostic value of flow cytometric analysis of PC clonality is also established (33, 34), and there is growing interest in the quantification of CTCs as an optimal surrogate of active disease dissemination (20, 35). Not surprisingly, the combination of tumor burden and genetics is poised to improve outcome prediction in multiple myeloma.

It was surprising to observe that six of the nine disease features in the logistic regression model were immune biomarkers, and the relative distribution of distinct T- and natural killer (NK)-cell subsets achieved the highest coefficient weights to predict patients' MRD status. Although these parameters are not commonly used to stratify risk in multiple myeloma, the raw data from which these can be developed is generally obtained in diagnostic laboratories using flow cytometry to screen for PC clonality. Furthermore, the development of newer immunotherapies for the treatment of multiple myeloma urges investigating the full potential of immune profiling to identify determinants of response and resistance (36–40). Our results suggest complementarity between tumor and immune characterization to predict MRD outcomes and survival.

The link between patients' MRD status and risk of progression is well established (7, 11–15). Therefore, the model developed in this study showed utility beyond MRD prediction, since it yielded significant risk-stratification of patients with multiple myeloma. Of note, high-confidence prediction of undetectable MRD identified a subset of patients that, based on the nine disease features embedded in the model, may have a singular multiple myeloma biology, and showed unprecedented rates of PFS and OS (nearly 80–90% at 5 years). Interestingly, among patients with undetectable MRD after consolidation, a high-confidence (false-positive) prediction of persistent MRD was associated with inferior PFS (P = 0.03; Supplementary Fig. S6). Collectively, these data indicates that whereas assessment of MRD is mandatory to confirm the prediction at diagnosis, the identification of singular disease subgroups using machine learning could help identifying patients in whom undetectable MRD may not translate into long-term survival.

The seminal ISS developed in 2005 (26) to stratify patients with multiple myeloma remained in force until 2015, when the International Myeloma Working Group introduced the R-ISS. While its value is unquestionable, it should be noted that the R-ISS identified three patient subgroups whose median PFS differed by only 37 months (27). Thus, new prognostic models are warranted to improve risk stratification, which should leverage on novel cytogenetic prognostic indexes (32, 41), and biomarkers of tumor burden and immune microenvironment (39, 40). Our results demonstrated that a machine learning model weighing different disease features according to their complementary value, outperformed the ISS and the R-ISS in predicting sustained MRD negativity and patients' survival.

In conclusion, selecting a regimen based on probable MRD outcomes, and confirming soon after if that probability was accurate, is a possible new concept towards individualized treatment in multiple myeloma. For example, a patient with high-confidence probability of persistent MRD treated with a three-drug regimen such as those investigated here, could be candidate to receive upfront a quadruplet (1). Conversely, a fourth drug could be reserved to first relapse in patients with high-confidence probability of undetectable MRD, or in cases of persistent MRD after treatment (i.e., false-negative prediction). While smart use of big data and machine learning algorithms for precision medicine would have been in the realm of science fiction only a few years ago, it is now a realistic and achievable goal (42). Accordingly, we have made the model publicly available as an interactive webpage to facilitate its use in clinical practice (www.MRDpredictor.com).

N. Puig reports grants, personal fees, and other support from Amgen, Bristol-Myers Squibb, Janssen, and Takeda and personal fees from The Binding Site outside the submitted work. M.-T. Cedena reports personal fees from Bristol-Myers Squibb-Celgene and Janssen outside the submitted work. A. Oriol reports personal fees from Sanofi-Aventis, Amgen, Bristol-Myers Squibb, and GSK outside the submitted work. F. de Arriba reports personal fees from Celgene, Janssen, Takeda, Amgen, and GlaxoSmithKline outside the submitted work. L. Palomera reports other support from Amgen, Sanofi-Aventis, and Bristol-Myers Squibb-Celgene and personal fees and other support from Janssen outside the submitted work. A. Mosquera-Orgueira reports grants from Roche and Bristol-Myers Squibb; grants and other support from Amgen and Janssen; and other support from Gilead, Takeda, Incyte, and AbbVie outside the submitted work. L. Rosiñol reports other support from Janssen, Bristol-Myers Squibb-Celgene, Amgen, Takeda, Sanofi-Aventis, and GSK outside the submitted work. J. Blade reports personal fees from Janssen, Celgene, Amgen, Takeda, and Oncopeptides outside the submitted work. M.-V. Mateos reports personal fees from Janssen, Bristol-Myers Squibb-Celgene, Amgen, Takeda, GSK, AbbVie, Oncopeptides, Sanofi-Aventis, Pfizer, Regeneron, Roche, Sea-Gen, Bluebird bio, and Adaptive outside the submitted work. J.F. San-Miguel reports other support from Takeda, Sanofi-Aventis, Roche, GSK, Karyopharm Therapeutics, AbbVie, Haemalogix, Regeneron, and SecuraBio outside the submitted work. B. Paiva reports other support from Adaptive, Amgen, Becton Dickinson, Creative Biolabs, Janssen, and Takeda and grants and other support from Bristol-Myers Squibb-Celgene, Sanofi, Roche, and GSK during the conduct of the study. No disclosures were reported by the other authors.

C. Guerrero: Conceptualization, data curation, software, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. N. Puig: Resources. M.-T. Cedena: Resources. I. Goicoechea: Resources. C. Perez: Resources. J.-J. Garcés: Resources. C. Botta: Resources. M.-J. Calasanz: Resources. N.C. Gutierrez: Resources. M.-L. Martin-Ramos: Resources. A. Oriol: Resources. R. Rios: Resources. M.-T. Hernandez: Resources. R. Martinez-Martinez: Resources. J. Bargay: Resources. F. de Arriba: Resources. L. Palomera: Resources. A.P. Gonzalez-Rodriguez: Resources. A. Mosquera-Orgueira: Resources. M.-S. Gonzalez-Perez: Resources. J. Martinez-Lopez: Resources. J.-J. Lahuerta: Resources. L. Rosiñol: Resources. J. Blade: Resources. M.-V. Mateos: Resources. J.F. San-Miguel: Resources, supervision, funding acquisition. B. Paiva: Conceptualization, resources, formal analysis, supervision, funding acquisition, methodology, writing–original draft, writing–review and editing.

We thank all investigators and participating centers of the GEM/PETHEMA cooperative study group (list of investigators in the Supplementary Appendix). The authors would also like to acknowledge Alfonso Santiago and Carmen Carrero for the supportive administration by PETHEMA, as well as Roberto Maldonado and Arturo Touchard for data management. This study was supported by grants from the Centro de Investigación Biomédica en Red – Área de Oncología - from Instituto de Salud Carlos III (CIBERONC; CB16/12/00369, CB16/12/00400, and CB16/12/00284 to J.F. San-Miguel), Instituto de Salud Carlos III/Subdirección General de Investigación Sanitaria and cofinanced by the European Regional Development Fund-FEDER “A way to make Europe” (FIS No. PI15/01956, PI15/02049, FIS PI15/02062, PI18/01709, and PI19/01451 to J.F. San-Miguel), Instituto de Salud Carlos III/Subdirección General de Investigación Sanitaria and cofinanced by the European Social Fund Plus (FSE+) and the European Union (PFIS No. FI21/00293 to C. Guerrero), the Cancer Research UK (C355/A26819), FCAECC and AIRC under the Accelerator Award Programme (EDITOR; J.F. San-Miguel), the Black Swan Research Initiative of the International Myeloma Foundation and the European Research Council (ERC) 2015 Starting Grant (MYELOMANEXT 680200 to B. Paiva), and the CRIS Cancer Foundation (PR_EX_2020-02 to B. Paiva). This study was supported by the Riney Family Multiple Myeloma Research Program Fund (to B. Paiva, J.F. San-Miguel).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1.
Dimopoulos
MA
,
Moreau
P
,
Terpos
E
,
Mateos
MV
,
Zweegman
S
,
Cook
G
, et al
.
Multiple myeloma: EHA-ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up
.
Ann Oncol
2021
;
32
:
309
22
.
2.
Mohty
M
,
Avet-Loiseau
H
,
Harousseau
JL
.
Requirements for operational cure in multiple myeloma
.
Blood
2021
;
138
:
1406
11
.
3.
Usmani
SZ
,
Hoering
A
,
Cavo
M
,
San Miguel
J
,
Goldschimdt
H
,
Hajek
R
, et al
.
Clinical predictors of long-term survival in newly diagnosed transplant eligible multiple myeloma — an IMWG Research Project
.
Blood Cancer J
2018
;
8
:
123
.
4.
Gulla
A
,
Anderson
KC
.
Multiple myeloma: The (r)evolution of current therapy and a glance into the future
.
Haematologica
2020
;
105
:
2358
67
.
5.
Pawlyn
C
,
Davies
FE
.
Toward personalized treatment in multiple myeloma based on molecular characteristics
.
Blood
2019
;
133
:
660
75
.
6.
Rajkumar
SV
,
Harousseau
JL
.
Next-generation multiple myeloma treatment: a pharmacoeconomic perspective
.
Blood
2016
;
128
:
2757
64
.
7.
Munshi
NC
,
Avet-Loiseau
H
,
Anderson
KC
,
Neri
P
,
Paiva
B
,
Samur
M
, et al
.
A large meta-analysis establishes the role of MRD negativity in long-term survival outcomes in patients with multiple myeloma
.
Blood Adv
2020
;
4
:
5988
99
.
8.
Harousseau
JL
,
Avet-Loiseau
H
.
Minimal residual disease negativity is a new end point of myeloma therapy
.
J Clin Oncol
2017
;
35
:
2863
5
.
9.
Costa
LJ
,
Derman
BA
,
Bal
S
,
Sidana
S
,
Chhabra
S
,
Silbermann
S
, et al
.
International harmonization in performing and reporting minimal residual disease assessment in multiple myeloma trials
.
Leukemia
2021
;
35
:
18
30
.
10.
Anderson
KC
,
Auclair
D
,
Adam
SJ
,
Agarwal
A
,
Anderson
M
,
Avet-Loiseau
H
, et al
.
Minimal residual disease in myeloma: application for clinical care and new drug registration
.
Clin Cancer Res
2021
;
27
:
5195
212
.
11.
Perrot
A
,
Lauwers-Cances
V
,
Corre
J
,
Robillard
N
,
Hulin
C
,
Chretien
ML
, et al
.
Minimal residual disease negativity using deep sequencing is a major prognostic factor in multiple myeloma
.
Blood
2018
;
132
:
2456
64
.
12.
Paiva
B
,
Puig
N
,
Cedena
MT
,
Rosiñol
L
,
Cordón
L
,
Vidriales
MB
, et al
.
Measurable residual disease by next-generation flow cytometry in multiple myeloma
.
J Clin Oncol
2020
;
38
:
784
92
.
13.
San-Miguel
JF
,
Avet-Loiseau
H
,
Paiva
B
,
Kumar
SK
,
Dimopoulos
MA
,
Facon
T
, et al
.
Sustained minimal residual disease negativity with daratumumab in newly diagnosed multiple myeloma: MAIA and ALCYONE
.
Blood
2021
Jul 16. [Epub ahead of print]
.
14.
Cavo
M
,
San-Miguel
JF
,
Usmani
SZ
,
Weisel
KC
,
Dimopoulos
MA
,
Avet-Loiseau
H
, et al
.
Prognostic value of minimal residual disease negativity in myeloma: combined analysis of POLLUX, CASTOR, ALCYONE, MAIA
.
Blood
2021
Jul 21. [Epub ahead of print]
.
15.
Avet-Loiseau
H
,
San-Miguel
J
,
Casneuf
T
,
Iida
S
,
Lonial
S
,
Usmani
SZ
, et al
.
Evaluation of sustained minimal residual disease negativity with daratumumab-combination regimens in relapsed and/or refractory multiple myeloma: Analysis of POLLUX and CASTOR
.
J Clin Oncol
2021
;
39
:
1139
49
.
16.
Rosiñol
L
,
Oriol
A
,
Rios
R
,
Sureda
A
,
Blanchard
MJ
,
Hernández
MT
, et al
.
Bortezomib, lenalidomide, and dexamethasone as induction therapy prior to autologous transplant in multiple myeloma
.
Blood
2019
;
134
:
1337
45
.
17.
Mateos
MV
,
Martinez-Lopez
J
,
Rodriguez Otero
P
,
Gonzalez-Calle
V
,
Gonzalez
MS
,
Oriol
A
, et al
.
Curative strategy (GEM-CESAR) for high-risk smoldering myeloma (SMM): Carfilzomib, Lenalidomide and Dexamethasone (KRd) as induction followed by HDT-ASCT, consolidation with Krd and maintenance with Rd
.
Blood
2019
;
134
:
781
.
18.
Puig
N
,
Hernández
MT
,
Rosiñol
L
,
González
E
,
de Arriba
F
,
Oriol
A
, et al
.
Lenalidomide and dexamethasone with or without clarithromycin in patients with multiple myeloma ineligible for autologous transplant: a randomized trial
.
Blood Cancer J
2021
;
11
:
101
.
19.
Flores-Montero
J
,
Sanoja-Flores
L
,
Paiva
B
,
Puig
N
,
García-Sánchez
O
,
Böttcher
S
, et al
.
Next generation flow for highly sensitive and standardized detection of minimal residual disease in multiple myeloma
.
Leukemia
2017
;
31
:
2094
103
.
20.
Sanoja-Flores
L
,
Flores-Montero
J
,
Garces
JJ
,
Paiva
B
,
Puig
N
,
García-Mateo
A
, et al
.
Next generation flow for minimally-invasive blood characterization of MGUS and multiple myeloma at diagnosis based on circulating tumor plasma cells (CTPC)
.
Blood Cancer J
2018
;
8
:
117
.
21.
Puig
N
,
Paiva
B
,
Lasa
M
,
Burgos
L
,
Perez
JJ
,
Merino
J
, et al
.
Flow cytometry for fast screening and automated risk assessment in systemic light-chain amyloidosis
.
Leukemia
2019
;
33
:
1256
67
.
22.
Kalina
T
,
Flores-Montero
J
,
van der Velden
VH
,
Martin-Ayuso
M
,
Böttcher
S
,
Ritgen
M
, et al
.
EuroFlow standardization of flow cytometer instrument settings and immunophenotyping protocols
.
Leukemia
2012
;
26
:
1986
2010
.
23.
Botta
C
,
Maia
C
,
Pérez Ruiz
C
,
Manrique
I
,
Garcés
JJ
,
Rodríguez
S
, et al
.
Flowct: a semi-automated workflow for deconvolution of immunophenotypic data and objective reporting on large datasets
.
Blood
2019
;
134
:
4355
.
24.
Kuhn
M
,
Johnson
K
.
Feature engineering and selection: A practical approach for predictive models
.
New York
:
Chapman and Hall/CRC Data Science Series
;
2019
.
25.
Pedregosa
F
,
Varoquaux
G
,
Gramfort
A
,
Michel
V
,
Thirion
B
,
Grisel
O
, et al
.
Scikit-learn: Machine learning in Python
.
J Machine Learning Res
2011
;
12
:
2825
30
.
26.
Greipp
PR
,
San-Miguel
JF
,
Durie
BGM
,
Crowley
JJ
,
Barlogie
B
,
Bladé
J
, et al
.
International staging system for multiple myeloma
.
J Clin Oncol
2005
;
23
:
3412
20
.
27.
Palumbo
A
,
Avet-Loiseau
H
,
Oliva
S
,
Lokhorst
HM
,
Goldschmidt
H
,
Rosiñol
L
, et al
.
Revised international staging system for multiple myeloma: a report from international myeloma working group
.
J Clin Oncol
2015
;
33
:
2863
9
.
28.
Molnar
C
.
Interpretable machine learning. A guide for making black box models explainable
.
2019
.
Available from
: https://christophm.github.io/interpretable-ml-book/
29.
Diamond
B
,
Korde
N
,
Lesokhin
AM
,
Smith
EL
,
Shah
U
,
Mailankody
S
, et al
.
Dynamics of minimal residual disease in patients with multiple myeloma on continuous lenalidomide maintenance: a single-arm, single-centre, phase 2 trial
.
Lancet Haematol
2021
;
8
:
e422
32
.
30.
Samur
MK
,
Aktas Samur
A
,
Fulciniti
M
,
Szalat
R
,
Han
T
,
Shammas
M
, et al
.
Genome-wide somatic alterations in multiple myeloma reveal a superior outcome group
.
J Clin Oncol
2020
;
38
:
3107
18
.
31.
Corre
J
,
Perrot
A
,
Caillot
D
,
Belhadj
K
,
Hulin
C
,
Leleu
X
, et al
.
del(17p) without TP53 mutation confers a poor prognosis in intensively treated newly diagnosed patients with multiple myeloma
.
Blood
2021
;
137
:
1192
5
.
32.
Perrot
A
,
Lauwers-Cances
V
,
Tournay
E
,
Hulin
C
,
Chretien
ML
,
Royer
B
, et al
.
Development and validation of a cytogenetic prognostic index predicting survival in multiple myeloma
.
J Clin Oncol
2019
;
37
:
1657
65
.
33.
Paiva
B
,
Vídriales
MB
,
Rosiñol
L
,
Martínez-López
J
,
Mateos
MV
,
Ocio
EM
, et al
.
A multiparameter flow cytometry immunophenotypic algorithm for the identification of newly diagnosed symptomatic myeloma with an MGUS-like signature and long-term disease control
.
Leukemia
2013
;
27
:
2056
61
.
34.
Paiva
B
,
Vídriales
MB
,
Mateo
G
,
Pérez
JJ
,
Montalbán
MA
,
Sureda
A
, et al
.
The persistence of immunophenotypically normal residual bone marrow plasma cells at diagnosis identifies a good prognostic subgroup of symptomatic multiple myeloma patients
.
Blood
2009
;
114
:
4369
72
.
35.
Gonsalves
WI
,
Jevremovic
D
,
Nandakumar
B
,
Dispenzieri
A
,
Buadi
FK
,
Dingli
D
, et al
.
Enhancing the R-ISS classification of newly diagnosed multiple myeloma by quantifying circulating clonal plasma cells
.
Am J Hematol
2020
;
95
:
310
5
.
36.
Perez
C
,
Botta
C
,
Zabaleta
A
,
Puig
N
,
Cedena
MT
,
Goicoechea
I
, et al
.
Immunogenomic identification and characterization of granulocytic myeloid-derived suppressor cells in multiple myeloma
.
Blood
2020
;
136
:
199
209
.
37.
Parmar
H
,
Gertz
M
,
Anderson
EI
,
Kumar
S
,
Kourelis
TV
.
Microenvironment immune reconstitution patterns correlate with outcomes after autologous transplant in multiple myeloma
.
Blood Adv
2021
;
5
:
1797
804
.
38.
Visram
A
,
Dasari
S
,
Anderson
E
,
Kumar
S
,
Kourelis
TV
.
Relapsed multiple myeloma demonstrates distinct patterns of immune microenvironment and malignant cell-mediated immunosuppression
.
Blood Cancer J
2021
;
11
:
45
.
39.
Coffey
DG
,
Maura
F
,
Gonzalez-Kozlova
E
,
Diaz-Mejia
J
,
Luo
P
,
Zhang
Y
, et al
.
Normalization of the immune microenvironment during lenalidomide maintenance is associated with sustained MRD negativity in patients with multiple myeloma
.
Blood
2021
;
138
:
329
.
40.
Maura
F
,
Boyle
EM
,
Diamond
B
,
Blaney
P
,
Ghamlouch
H
,
Ziccheddu
B
, et al
.
Genomic and immune signatures predict sustained MRD negativity in newly diagnosed multiple myeloma patients treated with Daratumumab, Carfilzomib, Lenalidomide, and Dexamethasone (D-KRd)
.
Blood
2021
;
138
:
325
.
41.
Walker
BA
,
Mavrommatis
K
,
Wardell
CP
,
Ashby
TC
,
Bauer
M
,
Davies
F
, et al
.
A high-risk, Double-Hit, group of newly diagnosed myeloma identified by genomic analysis
.
Leukemia
2019
;
33
:
159
70
.
42.
Fitzgerald
RC
.
Big data is crucial to the early detection of cancer
.
Nat Med
2020
;
26
:
19
20
.

Supplementary data