Introduction: Bayesian Belief Networks have been used in medicine to evaluate clinical data and develop predictive and prognostic models. As classification models, they allow us to represent pattern complexity beyond what can be accomplished with traditional Kaplan-Meier or regression models. We sought to evaluate the use of machine-learned Bayesian Belief Networks (ml-BBNs) to develop mortality models in breast cancer and to evaluate classification performance for this method.

Methods: A set of 2,300 breast cancer cases from a tumor registry at Thomas Jefferson University was used to train ml-BBNs. The registry set was divided into cohorts for modeling by follow-up times of 1 (n=2,202), 2 (n=2,183), 3 (n=2,157), and 5 (n=2,027) years. Each cohort was then used to train an ml-BBN, and each model was evaluated for structure. Variables were recoded into categories: biomarkers (ER, PR, Ki-67, HER2, p53) as positive or negative; grade, stage, and tumor size were binned into categories; and race was recoded as Caucasian or African-American. Income and poverty level by census tract were also included. Models were evaluated for their ability to classify mortality (yes/no) within the follow-up period using 10-fold cross-validation and Receiver Operating Characteristic curves.
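The evaluation metrics named above can be sketched in a few lines of Python. This is a minimal illustration, not the study's software: the function names, the 0.5 probability cut-off, and the normal-approximation confidence interval are all assumptions, since the abstract does not state how the cut-off or the intervals were chosen.

```python
import math
import random
from statistics import mean, stdev


def kfold_indices(n, k=10, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint held-out folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def auc(labels, scores):
    """Rank-based AUC: chance a random death is scored above a random survivor."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return float("nan")  # undefined when a fold lacks one class
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ppv_npv(labels, scores, threshold=0.5):
    """PPV = TP/(TP+FP), NPV = TN/(TN+FN) at an assumed probability cut-off."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = s >= threshold
        if predicted and y == 1:
            tp += 1
        elif predicted:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    npv = tn / (tn + fn) if tn + fn else float("nan")
    return ppv, npv


def mean_ci95(values):
    """Mean and normal-approximation 95% interval over per-fold values.

    Needs at least two non-NaN values; the interval method is an assumption.
    """
    vals = [v for v in values if not math.isnan(v)]
    m = mean(vals)
    half = 1.96 * stdev(vals) / math.sqrt(len(vals))
    return m, m - half, m + half
```

In use, one would train a model on nine folds from `kfold_indices`, score the held-out fold, compute `auc` and `ppv_npv` on it, and pass the ten per-fold values of each metric to `mean_ci95`.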

Results: Area Under the Curve (AUC), Positive Predictive Value (PPV), and Negative Predictive Value (NPV) for classifying mortality (yes/no) within the follow-up period were calculated for each cohort's cross-validation models and summarized as mean values with 95% confidence intervals (CIs). AUCs (and CIs) for 1, 2, 3, and 5 years were: 0.81 (0.70 to 0.91), 0.74 (0.69 to 0.79), 0.81 (0.77 to 0.86), 0.77 (0.74 to 0.80). PPVs for 1, 2, 3, and 5 years were: 12.3% (7.5% to 17.1%), 18.8% (15.4% to 22.1%), 18.0% (15.1% to 20.9%), 28.2% (24.7% to 31.7%). NPVs for 1, 2, 3, and 5 years were: 99.2% (98.8% to 99.7%), 97.4% (96.9% to 97.8%), 96.4% (95.1% to 97.7%), 91.7% (89.1% to 94.3%). The predictor of mortality at 1 year was Tumor Stage; at 2 years, the predictors were Estrogen Receptor status and Tumor Stage; and at 3 and 5 years, they were Diagnosis Age, Tumor Stage, Estrogen Receptor status, and Ki-67 status.

Discussion / Conclusion: We were able to successfully train ml-BBNs to estimate mortality using breast cancer registry cohorts. Cross-validation showed the models to be robust. The structure of the models can inform us how different data elements contribute to the estimate of mortality. These models can be used to calculate individual probabilities for prognostic guidance given age, staging criteria, and biomarkers. Overall 5-year mortality in the study set is 15.2%; however, we can derive subject-specific mortality estimates. For example, a 43-year-old Stage 3, ER-negative, Ki-67-negative subject has a 19.9% probability of 5-year mortality, while the same subject with positive Ki-67 has a 37.8% probability of 5-year mortality. Meanwhile, the same probabilities for a 70-year-old woman are 67.0% and 59.0%, respectively.
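To put the subject-specific estimates in context, each reported probability can be compared against the overall 5-year mortality of 15.2%. The short sketch below uses only the figures quoted in this abstract; the dictionary layout and the `ratio_to_overall` helper are illustrative assumptions, and the 70-year-old values are assigned to Ki-67 negative and positive in the order implied by "respectively".

```python
# Overall 5-year mortality in the study set, as reported in the abstract.
OVERALL_5Y = 0.152

# Reported 5-year mortality probabilities for Stage 3, ER-negative subjects,
# keyed by (age, Ki-67 status). Values are copied from the abstract.
reported = {
    (43, "Ki-67 negative"): 0.199,
    (43, "Ki-67 positive"): 0.378,
    (70, "Ki-67 negative"): 0.670,
    (70, "Ki-67 positive"): 0.590,
}


def ratio_to_overall(probability, overall=OVERALL_5Y):
    """Ratio of a subject-specific probability to the cohort-wide rate."""
    return probability / overall


for (age, ki67), p in reported.items():
    print(f"age {age}, {ki67}: {p:.1%} ({ratio_to_overall(p):.2f}x overall)")
```

The ratio is a descriptive comparison only; it does not reproduce the network's inference, which conditions on all variables in the trained model.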

Citation Information: Cancer Res 2011;71(24 Suppl):Abstract nr P5-14-12.