Machine Learning/Network Science Late Breaking Abstracts

LB1282 - Machine learning of deep grey matter volumes on MRI for predicting new disease activity after a first clinical demyelinating event (ID 2182)

Speakers
  • M. Tayyab
Authors
  • M. Tayyab
  • L. Metz
  • A. Dvorak
  • S. Kolind
  • S. Au
  • R. Carruthers
  • A. Traboulsee
  • R. Tam
Presentation Number
LB1282
Presentation Topic
Machine Learning/Network Science

Abstract

Background

Deep grey matter (DGM) atrophy is a feature of all multiple sclerosis (MS) phenotypes. Studies have shown a strong relationship between DGM atrophy and clinical worsening, but the utility of DGM volumes for predicting disease activity is largely unexplored, especially in early disease. Machine learning (ML) is a computational approach that can identify patterns that predict disease outcomes. In ML, the study dataset is divided into training and test subsets. The training set contains known outcomes, which the ML algorithm uses to build a prediction model; the model is then evaluated on the test set.
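
The training/test workflow described above can be illustrated with a minimal sketch. The toy data, the function names (`train_test_split`, `fit_majority`, `accuracy`), and the trivial majority-class "model" are all hypothetical and chosen only to show the split, fit, and evaluate steps; they are not the method used in this study.

```python
import random
from collections import Counter

def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_majority(train):
    """'Train' a trivial model: remember the most common known outcome."""
    return Counter(label for _, label in train).most_common(1)[0][0]

def accuracy(predicted_label, test):
    """Fraction of test samples whose outcome matches the constant prediction."""
    return sum(label == predicted_label for _, label in test) / len(test)

# Hypothetical toy data: (feature, known outcome) pairs.
data = [(i, i % 2) for i in range(10)]
train, test = train_test_split(data)

model = fit_majority(train)          # uses training outcomes only
test_accuracy = accuracy(model, test)  # evaluated on held-out samples
print(len(train), len(test), test_accuracy)
```

The key point is that the test samples play no part in fitting, so the reported accuracy reflects performance on unseen cases.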

Objectives

To develop an ML model for predicting new disease activity (clinical or MRI) within 2 years of a first clinical demyelinating event, using baseline DGM volumes. The motivation is to identify individuals at higher risk of new disease activity.

Methods

3D T1-weighted MRIs acquired within 90 days of a first clinical event in 140 subjects from a completed placebo-controlled trial of minocycline were used. Eighty subjects had new disease activity within 2 years, 28 were stable, and 32 withdrew early (unknown outcome). The stable and unknown groups were combined into one group for ML training. Advanced Normalization Tools and FMRIB Software Library were used to segment the thalami, putamina, globi pallidi, and caudate nuclei. A random forest ML model was trained to predict new disease activity with feature vectors composed of individual DGM nuclei volumes and several other variables (e.g., minocycline vs. placebo, mono-focal vs. multi-focal clinically isolated syndrome (CIS), normalized brain volume, and sex). Model performance was evaluated using 3-fold cross-validation, with 80% of the data used for training, and the rest for testing.
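
As a rough illustration of the kind of pipeline described above, the following sketch trains a random forest with scikit-learn and scores it by cross-validation. Only the class sizes (80 vs. 28 + 32) come from the abstract. The feature values are synthetic random numbers standing in for the four DGM volumes, and the hyperparameters and the use of `StratifiedKFold` are assumptions. The abstract does not specify the exact splitting scheme, and plain 3-fold splitting, as used here, trains on roughly two thirds of the subjects per fold.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)

# Labels follow the grouping in the abstract: 80 subjects with new disease
# activity (1); 28 stable + 32 unknown combined into one group (0).
y = np.array([1] * 80 + [0] * 60)

# Synthetic stand-ins for four DGM volumes (thalamus, putamen, globus
# pallidus, caudate). These are NOT real measurements.
X = rng.normal(size=(140, 4))
X[y == 1] -= 0.5  # arbitrary shift so the toy classes are partly separable

# Hyperparameters are assumptions; the abstract does not report them.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

# One accuracy value per held-out fold.
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
print(scores.mean())
```

Stratified folds keep the proportion of the two groups similar in every fold, which is a common choice when classes are imbalanced.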

Results

Sequential elimination of the variables ranked least important by the trained model improved classification accuracy. Therefore, the less predictive variables were pruned from the feature vector. The best model used DGM volumes alone and achieved 82.1% accuracy, 87% precision, 81% recall, and an F1-score of 0.84, with an area under the curve (AUC) of 0.76.
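
The reported F1-score follows from the reported precision and recall, since F1 is their harmonic mean. A short check, with the helper name `f1_from_precision_recall` chosen here for illustration:

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the Results.
f1 = f1_from_precision_recall(0.87, 0.81)
print(round(f1, 2))  # 0.84
```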

Conclusions

ML can learn patterns predictive of new disease activity within 2 years after a first clinical demyelinating event from baseline DGM volumes. This approach could augment the many other clinical and demographic variables used in a typical MS clinical workup. Further investigation with larger datasets is warranted to determine the generalizability of the approach.
