Case Western Reserve University, School of Medicine
Population and Quantitative Health Sciences

Author Of 1 Presentation

Invited Presentations Invited Abstracts

PS16.01 - Incorporating Machine Learning Approaches to Assess Putative Risk Factors for MS

Presentation Number
PS16.01
Presentation Topic
Invited Presentations
Lecture Time
12:45 - 13:00

Abstract

Multiple sclerosis (MS) susceptibility is multifactorial, with prominent genetic and non-genetic risk components, and complex interactions within and among these components contribute additively and synergistically to MS risk. Efforts to characterize these risk components and to identify the specific relationships underlying MS risk have accelerated significantly in the era of big data. The challenge has been how best to analyze these rich and often unwieldy data, particularly when the number of predictors is likely to exceed the number of observations or when there are complex correlation patterns among predictors. Machine learning algorithms are well suited to interrogating such data because they rely on minimal assumptions. In general, a machine learning algorithm first parses the data, learns from it, and then assesses the predictions made from what it has learned. We have successfully used machine learning to identify promising metabolomic candidates and complex genetic patterns contributing to MS risk. In both studies, random forests (a supervised machine learning algorithm) were used to identify highly informative predictors of MS, and the relationships between these predictors and MS risk were then formally tested using standard statistical models. We will present findings from these two studies, in which machine learning served as a means of data reduction that conserved statistical power for association testing.

Presenter Of 1 Presentation

Invited Presentations Invited Abstracts

PS16.01 - Incorporating Machine Learning Approaches to Assess Putative Risk Factors for MS

Invited Speaker Of 1 Presentation

Invited Presentations Invited Abstracts

PS16.01 - Incorporating Machine Learning Approaches to Assess Putative Risk Factors for MS

Author Of 3 Presentations

Machine Learning/Network Science Poster Presentation

P0003 - Combinatorial Genetic Interaction Analysis of Multiple Sclerosis Risk Variants (ID 1040)

Presentation Number
P0003
Presentation Topic
Machine Learning/Network Science

Abstract

Background

Common genetic variation within the major histocompatibility complex (MHC), primarily HLA-DRB1*15:01 and HLA-A*02:01, and 200 non-MHC variants contribute to MS risk. It is unknown if specific combinations of these risk variants disproportionately confer elevated risk, as interactions between risk variants have not been extensively studied.

Objectives

To determine, using a novel machine learning approach, whether specific combinations of risk variants disproportionately confer increased MS risk.

Methods

We applied association rule mining (ARM), a combinatorial rule-based machine learning algorithm, to data from non-Hispanic white MS cases (N=207) and controls (N=179). The genetic data consisted of HLA-DRB1*15:01, HLA-A*02:01, and 200 non-MHC risk variants, coded assuming a dominant model. We identified patterns (rules) of 2 to 5 risk variants that were enriched in MS cases compared to controls. Probabilistic measures (confidence and support) were used to evaluate the strength of rules. Odds ratios (ORs), 95% bias-corrected confidence intervals (CIs), and permutation p-values were obtained from bootstrapped logistic regression models adjusted for genetic ancestry. A Bonferroni approach was used to adjust for multiple testing. Hahsler and Karpienko's grouped matrix method was used to identify rules with similar characteristics.
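As a rough illustration of the support and confidence measures mentioned above, the following sketch scores rules of the form "carries all of these variants, therefore MS case" on toy data. The subjects, the variant labels A, B, and C, and the function names `rule_metrics` and `candidate_rules` are hypothetical. The definitions used are common ARM conventions and are assumed here rather than taken from the abstract: support is the fraction of all subjects who carry the full combination and are cases, and confidence is the fraction of carriers of the full combination who are cases.

```python
from itertools import combinations


def rule_metrics(subjects, rule):
    """Score the rule {variants} -> MS case.

    subjects: list of (set_of_carried_variants, is_case) pairs.
    rule: iterable of variant names that must all be carried.
    Returns (support, confidence); confidence is None if no subject
    carries the full combination.
    """
    rule = frozenset(rule)
    n_match = 0       # subjects carrying every variant in the rule
    n_match_case = 0  # ...who are also MS cases
    for carried, is_case in subjects:
        if rule <= carried:
            n_match += 1
            if is_case:
                n_match_case += 1
    support = n_match_case / len(subjects)
    confidence = n_match_case / n_match if n_match else None
    return support, confidence


def candidate_rules(subjects, variants, min_size=2, max_size=5,
                    min_support=0.05, min_confidence=0.80):
    """Enumerate variant combinations and keep those passing both thresholds."""
    kept = []
    for size in range(min_size, max_size + 1):
        for combo in combinations(sorted(variants), size):
            support, confidence = rule_metrics(subjects, combo)
            if (confidence is not None and support >= min_support
                    and confidence >= min_confidence):
                kept.append(combo)
    return kept


# Hypothetical toy data: (carried variants under a dominant model, is MS case)
subjects = [
    ({"A", "B", "C"}, True),
    ({"A", "B"}, True),
    ({"A", "B"}, False),
    ({"A"}, False),
    ({"B", "C"}, True),
]

kept_rules = candidate_rules(subjects, {"A", "B", "C"})
```

Exhaustive enumeration like this is only practical for small variant sets; dedicated ARM implementations prune the search space, which this sketch does not attempt.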

Results

A total of 122 rules met the minimum requirements of 80% confidence and 5% support. Three rules met the Bonferroni threshold for significance, and all consisted of 3 variants. The top 3 rules were: 1. HLA-DRB1*15:01, SLC30A7-rs56678847, and rs6880909, whose carriers had 20.2-fold increased odds of MS (95% CI: (8.5, 37.5); p=4x10^-9); 2. HLA-DRB1*15:01, ADCY3-rs11125803, and rs13327021 (OR: 6.8, 95% CI: (3.1, 20.9); p=0.0001); and 3. HLA-DRB1*15:01, rs13327021, and LOC105375752-rs735542 (OR: 4.9, 95% CI: (2.4, 12.0); p=0.0002). Interestingly, several variants recurred across the 122 rules. In particular, INTS8-rs78727559 was present in 34% of top rules and TNIP3-rs17051321 was present in 32% of top rules. HLA-DRB1*15:01, rs35486093, and SLC30A7-rs56678847 were present in 21% of top rules.
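The Bonferroni step can be checked with simple arithmetic. The abstract states neither the family-wise alpha nor the number of tests corrected for, so the sketch below assumes alpha = 0.05 and 122 tests (one per retained rule); both values, and the short rule labels, are assumptions made only for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Assumed values: neither is stated in the abstract.
ALPHA = 0.05
N_TESTS = 122

# p-values as reported for the top 3 rules (labels are shorthand).
reported_p = {
    "rule_1": 4e-9,
    "rule_2": 0.0001,
    "rule_3": 0.0002,
}

threshold = bonferroni_threshold(ALPHA, N_TESTS)  # about 4.1e-4
passing = [name for name, p in reported_p.items() if p < threshold]
```

Under these assumed values, all three reported p-values fall below the threshold.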

Conclusions

In summary, we identified strong evidence suggesting specific combinations of MS risk variants confer elevated risk by applying a robust and novel analytical framework to a modestly sized study population. Replication analyses are underway. These results have the potential to significantly inform efforts aimed at developing risk prediction models for MS.

Epidemiology Poster Presentation

P0467 - Hypertension, cholesterol levels, and Type II Diabetes are not associated with multiple sclerosis risk: Mendelian randomization analyses (ID 1473)

Presentation Number
P0467
Presentation Topic
Epidemiology

Abstract

Background

Multiple sclerosis (MS) is a multifactorial neurodegenerative autoimmune disease. Higher body-mass index is an established risk factor for MS. The causal impact of other cardiometabolic conditions on MS risk, including hypertension (HTN), hyperlipidemia (CHOL), and Type II Diabetes (T2D), is not known.

Objectives

To examine the causal impact of HTN, CHOL, and T2D, as well as variation in continuous measures of blood pressure and cholesterol levels, on risk of MS.

Methods

Two-sample Mendelian randomization (2SMR) was performed to investigate the causal contributions of HTN, CHOL, and T2D to MS risk. 2SMR is a causal inference approach in which genetic variants associated with an exposure are used as instrumental variables for that exposure and are tested for association with the outcome of interest. The Wald ratios (outcome effect estimate divided by exposure effect estimate) for each variant are then combined via meta-analysis to determine the overall causal effect. We performed 2SMR using summary statistics from multiple genome-wide association studies (GWAS) for all exposure-outcome combinations, including 4 MS GWAS (n=463,010; n=38,589; n=27,098; n=2,739), 1 GWAS of diastolic blood pressure (DBP) (n=317,756), 1 GWAS of systolic blood pressure (SBP) (n=317,754), 2 GWAS of HTN (n=337,199; n=337,159), 2 GWAS of high-density lipoprotein (HDL) (n=187,176; n=21,555), 2 GWAS of low-density lipoprotein (LDL) (n=173,082; n=21,559), and 2 GWAS of T2D (n=149,821; n=337,159). All 2SMR analyses were adjusted for horizontal pleiotropy using the Egger regression approach. Clumping was performed for each exposure GWAS to prune variants in linkage disequilibrium (LD) within 10kb windows. LD proxies in the outcome GWAS were required to have an r^2 value of 0.8 or higher.
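The Wald ratio and meta-analysis steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the variant effect sizes are invented, the function names are hypothetical, the standard error uses the common first-order approximation (outcome standard error divided by the absolute exposure effect), and the combination is a fixed-effect inverse-variance weighted average. The Egger regression adjustment mentioned in the abstract is not implemented here.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-variant causal estimate: outcome effect / exposure effect.

    The standard error is a first-order approximation that ignores
    uncertainty in the exposure effect estimate.
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw_combine(ratios):
    """Fixed-effect inverse-variance weighted combination of (estimate, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in ratios]
    total = sum(weights)
    estimate = sum(w * b for w, (b, _) in zip(weights, ratios)) / total
    return estimate, math.sqrt(1.0 / total)


# Hypothetical summary statistics: (beta_exposure, beta_outcome, se_outcome)
variants = [
    (0.10, 0.02, 0.01),
    (0.20, 0.02, 0.02),
]

ratios = [wald_ratio(bx, by, se_y) for bx, by, se_y in variants]
combined_estimate, combined_se = ivw_combine(ratios)
```

With these toy inputs the two variants get equal weight, so the combined estimate is simply the average of the two Wald ratios.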

Results

Overall, there was no evidence of a causal effect of HTN, SBP, DBP, HDL, LDL, or T2D on MS risk. Neither SBP nor DBP was associated with MS risk (pmin=0.38 and 0.26; β=-0.001 and 0.60, respectively). HTN as a binary measure was similarly not associated with MS across the various studies (pmin=0.21, β=-6.77). HDL and LDL were not associated with MS (pmin=0.25 and 0.07; β=0.17 and 0.26, respectively). Lastly, T2D was also not associated with MS (pmin=0.51, β=0.06).

Conclusions

Blood pressure variation, HTN, lipid levels, and T2D do not appear to have a genetically-driven association with MS risk. Considering the relationships between BMI and MS, and BMI and these other cardiometabolic traits, further research is necessary to disentangle the mechanisms through which BMI confers risk for MS.

Comorbidities Poster Presentation

P0479 - Multiple sclerosis predisposes affected individuals for an earlier onset of hypertension. (ID 889)

Presentation Number
P0479
Presentation Topic
Comorbidities

Abstract

Background

Hypertension (HTN) is a common condition in multiple sclerosis (MS), and it is associated with poorer MS outcomes. Recently, a large study showed HTN was 25% more common in MS than non-MS cohorts. It is unknown if the elevated HTN prevalence is because vascular alterations play a primary role in MS pathogenesis or if they are secondary to MS disease processes.

Objectives

To add insight to the MS-HTN relationship, we sought to determine if HTN age at onset (AAO) is earlier in MS patients compared to matched controls.

Methods

Using electronic health records (EHRs) from the Cleveland Clinic Health System (CCF), we identified 141,696 incident HTN diagnoses among Ohio residents between 1/2000 and 1/2017 who were ≥18 years at 1st encounter. Incident HTN was defined as the 1st of ≥2 recorded HTN diagnoses at least 3 months after the 1st encounter. Similar criteria determined incident MS (N=546). We then matched MS cases to controls on birth year (+/- 3 years), age at 1st encounter (+/- 3 years), sex, race, and ZIP code, allowing for up to 10 matches per case. By matching in this retrospective cohort, where MS status is the exposure of interest, we remove potential confounding of the relationship of interest by the matched variables. The final data set consisted of 509 MS cases and 4,522 matched controls; 87% of MS cases were matched to ≥7 controls. Using HTN AAO as the dependent variable, we fit Cox Proportional Hazards (CPH) and linear regression (LR) models with standard errors adjusted for intragroup correlations due to matching. Based on quartiles of the distribution of birth year in MS cases (1920-1949, 1950-1957, 1958-1965, 1966-1990), we constructed a categorical variable to be included as a covariate along with age at 1st encounter, sex, race, and smoking status (ever/never).
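The matching criteria described above can be sketched as a simple greedy procedure. The abstract does not say how ties were broken or whether a control could be matched to more than one case, so this sketch assumes first-come selection without replacement. The `Person` record, its field names, and the toy values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    pid: str
    birth_year: int
    age_first_encounter: int
    sex: str
    race: str
    zip_code: str


def is_match(case, control, caliper=3):
    """Exact match on sex, race, and ZIP code; +/- caliper years on the rest."""
    return (abs(case.birth_year - control.birth_year) <= caliper
            and abs(case.age_first_encounter - control.age_first_encounter) <= caliper
            and case.sex == control.sex
            and case.race == control.race
            and case.zip_code == control.zip_code)


def match_cases(cases, controls, max_matches=10, caliper=3):
    """Greedily assign up to max_matches unused controls to each case."""
    used = set()  # assumption: each control is matched to at most one case
    matched = {}
    for case in cases:
        picks = []
        for control in controls:
            if len(picks) == max_matches:
                break
            if control.pid not in used and is_match(case, control, caliper):
                picks.append(control.pid)
                used.add(control.pid)
        matched[case.pid] = picks
    return matched


# Hypothetical toy records
cases = [Person("m1", 1970, 30, "F", "r1", "44106")]
controls = [
    Person("c1", 1972, 31, "F", "r1", "44106"),  # within both calipers
    Person("c2", 1975, 30, "F", "r1", "44106"),  # birth year 5 years off
    Person("c3", 1969, 28, "F", "r1", "44106"),  # within both calipers
    Person("c4", 1970, 30, "M", "r1", "44106"),  # sex differs
]

matches = match_cases(cases, controls)
```

Cases left with an empty list would be dropped from the analysis set, which is one plausible reason the matched sample (509 cases) is smaller than the 546 incident MS cases, though the abstract does not state this.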

Results

Birth year violated the proportional hazards (PH) assumption; therefore, CPH models stratified by birth year category were fit. MS and age at 1st encounter had time-varying effects and were treated as such. On average, MS cases had a 73% increased hazard (HR = 1.73, 95% CI: 1.17, 2.55; p=0.006) of HTN onset, which decreased by 1% per year increase in age. Because the effect of MS was time-varying, we fit separate models for each birth year category. Interestingly, MS was not associated with an increased hazard of HTN onset among those born before 1966. Among those born after 1965, MS was associated with a 37% increased hazard (HR = 1.37, 95% CI: 1.12, 1.68; p=0.0025), and this effect met the PH assumption.

In the LR model, there was an interaction between MS and birth year; therefore, similarly stratified models were fit. Among those born after 1965, HTN AAO was on average 0.7 years earlier (95% CI: 0.05, 1.4; p=0.04) in MS cases than in controls. There was no difference in the other birth year categories.

Conclusions

Among those born after 1965, persons with MS experience an earlier onset of HTN. Future research is needed to characterize these relationships by sex and race, as well as the timing of HTN onset with respect to MS onset.

Presenter Of 1 Presentation

Comorbidities Poster Presentation

P0479 - Multiple sclerosis predisposes affected individuals for an earlier onset of hypertension. (ID 889)
