Applying decision tree for detection of a low risk population for type 2 diabetes: A population based study
Introduction: The aim of the current study was to create a prediction model using a data mining approach, the decision tree technique, to identify individuals at low risk of incident type 2 diabetes mellitus (T2DM), using data from the Mashhad Stroke and Heart Atherosclerotic Disorders (MASHAD) Study program.
Methods: A prediction model was developed using decision tree classification on 9528 subjects recruited from the MASHAD database. The receiver operating characteristic (ROC) curve was also used to assess the model.
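To illustrate how a decision tree arrives at a cut-off such as an age threshold, the following is a minimal sketch of a single split search. It assumes a CART-style Gini impurity criterion, which the abstract does not specify; the function names `gini` and `best_threshold` and the toy data are hypothetical and are not taken from the MASHAD cohort.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 = no T2DM, 1 = T2DM)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_threshold(values, labels):
    """Return (threshold, weighted_impurity) for the split `value >= threshold`
    that minimizes the weighted Gini impurity of the two child nodes.
    Returns (None, parent_impurity) if no split improves on the parent node."""
    n = len(values)
    best = (None, gini(labels))  # baseline: leave the node unsplit
    # Skip the smallest value: splitting there would leave an empty left child.
    for t in sorted(set(values))[1:]:
        left = [y for x, y in zip(values, labels) if x < t]
        right = [y for x, y in zip(values, labels) if x >= t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data (assumption, for illustration only).
ages = [35, 40, 44, 47, 48, 52, 55, 60]
t2dm = [0, 0, 0, 0, 1, 0, 1, 1]
threshold, impurity = best_threshold(ages, t2dm)
```

A full tree repeats this search over every candidate variable and then recurses into each child node, which is how a path through several thresholds can arise.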
Results: The prevalence of T2DM in our population was approximately 14%. For the decision tree model, the accuracy, sensitivity, and specificity for identifying factors related to T2DM were 78.7%, 47.8%, and 83%, respectively, and the area under the ROC curve (AUC) was 0.64. Subjects with a family history of T2DM, age>=48, SBP>=130, DBP>=81, HDL>=29, LDL>=148, and occupation=other had a greater than 59% chance of having T2DM, while subjects without a family history and with TG>=184, age>=48, and hs-CRP>=2.2 had a chance of approximately 51%.
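As a reminder of how the reported performance measures are defined, the sketch below computes sensitivity, specificity, and accuracy from confusion-matrix counts, and AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function names and all input numbers are hypothetical illustrations, not the study's data or the authors' code.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "accuracy": (tp + tn) / total,
    }


def auc_from_scores(scores, labels):
    """AUC via pairwise comparison of positive and negative scores.
    Ties between a positive and a negative score count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores (assumption, for illustration only).
metrics = confusion_metrics(tp=40, fn=60, tn=720, fp=180)
auc = auc_from_scores([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

Because accuracy weights specificity by the share of non-diabetic subjects, a model can show high accuracy alongside modest sensitivity when the condition is relatively uncommon.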
Conclusion: Our findings demonstrated that decision tree analysis using routine demographic, clinical, anthropometric, and biochemical measurements, when combined with other risk score models, could provide a simple strategy for identifying individuals at low risk of type 2 diabetes, thereby substantially decreasing the number of subjects who need screening and aiding the recognition of subjects at high risk.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.