Comparing three data mining algorithms for identifying associated risk factors of Type 2 Diabetes

Maryam Tayefi; Habibollah Esmaeily; Majid Ghayour-Mobarhan; Ali Reza Amirabadizadeh

doi:10.26415/2572-004X-vol1iss4p133-134

Published: Nov 29, 2017

Keywords:

Artificial neural network, Support vector machine, Logistic regression method, Type 2 diabetes

Maryam Tayefi

Department of Modern Sciences and Technologies, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran. Biochemistry of Nutrition Research Center, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.

Habibollah Esmaeily

Department of Biostatistics, School of Health, Mashhad University of Medical Sciences, Mashhad, Iran.

Majid Ghayour-Mobarhan

Department of Modern Sciences and Technologies, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran. Biochemistry of Nutrition Research Center, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.

Ali Reza Amirabadizadeh

Medical Toxicology and Drug Abuse Research Center (MTDRC), Birjand University of Medical Sciences

Abstract

Introduction: The prevalence and global health burden of Type 2 diabetes (T2DM) are increasing, which is a concern for health service providers and health administrators. The current study aims to develop and compare statistical models for identifying risk factors associated with T2DM. The three methods investigated are the artificial neural network (ANN), the support vector machine (SVM), and multivariate logistic regression (MLR), each applied to demographic, anthropometric, and biochemical characteristics of a sample of 9528 individuals from Mashhad city.

Methods: The statistical methods used in this study are also known as machine learning algorithms and require dividing the available data into training and testing datasets. The study randomly selected 70% of cases (6654 cases) for training and reserved the remaining 30% (2874 cases) for testing. The three methods are compared using the receiver operating characteristic (ROC) curve.
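The split and the ROC-based comparison described above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names `train_test_split` and `roc_auc`, the fixed seed, and the toy labels and scores are all assumptions made for the example.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of `rows` and split it into (train, test).

    `train_fraction` and `seed` are illustrative defaults, not values
    taken from the study.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy usage: split 9528 placeholder case IDs, then score a tiny example.
train, test = train_test_split(range(9528))
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An exact 70% cut of 9528 cases does not reproduce the reported 6654/2874 counts, so the study's split was presumably produced by a slightly different procedure. In practice each fitted model (ANN, SVM, MLR) would supply its own scores for the held-out cases, and the resulting AUC values would be compared.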

Results: The prevalence of T2DM in our population is 14%. The ANN model has 78.7% accuracy, 63.1% sensitivity, and 81.2% specificity. The corresponding values are 76.8%, 64.5%, and 78.9% for SVM and 77.7%, 60.1%, and 80.5% for MLR. The area under the ROC curve (AUC) is 0.71 for ANN, 0.73 for SVM, and 0.70 for MLR.
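The three reported metrics are linked by the identity accuracy = sensitivity × prevalence + specificity × (1 − prevalence). The sketch below uses it as a rough consistency check on the figures above. It assumes that the test set has the same 14% prevalence as the whole sample, which the abstract does not state; the names `REPORTED`, `PREVALENCE`, and `implied_accuracy` are illustrative.

```python
# Reported test metrics from the Results paragraph, in percent.
REPORTED = {
    "ANN": {"accuracy": 78.7, "sensitivity": 63.1, "specificity": 81.2},
    "SVM": {"accuracy": 76.8, "sensitivity": 64.5, "specificity": 78.9},
    "MLR": {"accuracy": 77.7, "sensitivity": 60.1, "specificity": 80.5},
}

# Assumption: prevalence in the test set equals the overall 14%.
PREVALENCE = 0.14


def implied_accuracy(sensitivity, specificity, prevalence):
    """Accuracy implied by sensitivity, specificity, and prevalence.

    Sensitivity and specificity may be given in percent or as
    fractions; the result is on the same scale.
    """
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


for name, m in REPORTED.items():
    implied = implied_accuracy(m["sensitivity"], m["specificity"], PREVALENCE)
    gap = abs(implied - m["accuracy"])
    print(f"{name}: reported {m['accuracy']:.1f}%, "
          f"implied {implied:.1f}%, gap {gap:.2f} points")
```

Under this assumption the implied accuracies are about 78.7% (ANN), 76.9% (SVM), and 77.6% (MLR), each within 0.1 percentage points of the reported values, so the three metrics are mutually consistent.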

Conclusion: The ANN performs better than the other two models and can be used effectively to identify risk factors associated with T2DM.

How to Cite

Comparing three data mining algorithms for identifying associated risk factors of Type 2 Diabetes. (2017). Medical Technologies Journal, 1(4), 133-134. https://doi.org/10.26415/2572-004X-vol1iss4p133-134

Issue

Vol. 1 No. 4 (2017): October-December 2017

Section

Conference abstracts
