Machine Learning Model to Diagnose Diabetes Type 2 Based on Health Behavior

ALSHARİ, Haithm; ODABAŞ, ALPER

doi:10.35378/gujs.931760

ALSHARİ H., ODABAŞ A.

GAZI UNIVERSITY JOURNAL OF SCIENCE, vol.35, no.3, pp.834-852, 2022 (ESCI)

Publication Type: Article / Full Article
Volume: 35 Issue: 3
Publication Date: 2022
DOI: 10.35378/gujs.931760
Journal Name: GAZI UNIVERSITY JOURNAL OF SCIENCE
Journal Indexes: Emerging Sources Citation Index (ESCI), Scopus, Academic Search Premier, Aerospace Database, Aquatic Science & Fisheries Abstracts (ASFA), Communication Abstracts, Compendex, Metadex, Civil Engineering Abstracts, TR DİZİN (ULAKBİM)
Page Numbers: pp.834-852
Keywords: Artificial intelligence, Diabetes, Health behavior, Gradient boosting, ANN, LIFE-STYLE INTERVENTIONS, PREVENT
Affiliated with Eskişehir Osmangazi University: Yes

Abstract

In 2016, diabetes was the seventh leading cause of death worldwide and the direct cause of 1.6 million deaths. In 2019, approximately 463 million adults (20-79 years) were living with diabetes, and that number is expected to rise to 700 million by 2045. Early diagnosis helps treat diabetes and prevent its complications, so an easy and fast way to diagnose it is crucial. In this study, we propose a method that uses machine learning to build a model that predicts type 2 diabetes from the patient's health behavior, exploiting the relationship between a healthy lifestyle and diabetes. Our goal is a reliable predictive model that significantly eases and speeds up the diagnostic procedure. We used modern machine learning algorithms, namely XGBoost, LightGBM, CatBoost, and artificial neural networks, on a dataset obtained from the National Health and Nutrition Examination Survey (NHANES). XGBoost performed best, with a 10-fold cross-validation score of 0.864 and an overall accuracy of 87.7% on the validation dataset and 84.96% on the test dataset.
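To make the reported 10-fold cross-validation score concrete, the sketch below shows how such a score is typically computed: the data are split into ten folds, each fold is held out once, and the per-fold accuracies are averaged. It is only an illustration under stated assumptions, not the authors' pipeline. The data are synthetic rather than NHANES, the majority-class model is a trivial stand-in for a gradient boosting classifier such as XGBoost, and all function names (`k_fold_indices`, `cross_val_accuracy`, `fit_majority`, `predict_majority`) are hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices once and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(fit, predict, X, y, k=10, seed=0):
    """Return the accuracy on each held-out fold under k-fold cross-validation."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        # Fit only on the training folds, then score on the held-out fold.
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Stand-in model (assumption): always predicts the most frequent training label.
# A real study would plug a gradient boosting classifier in here instead.
def fit_majority(X_train, y_train):
    return max(sorted(set(y_train)), key=y_train.count)


def predict_majority(model, x):
    return model


# Synthetic data (assumption): two random features and a rule-based binary label.
rng = random.Random(42)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [1 if row[0] + row[1] > 1.2 else 0 for row in X]

scores = cross_val_accuracy(fit_majority, predict_majority, X, y, k=10)
cv_score = mean(scores)  # the single number usually reported as the CV score
print(f"10-fold CV accuracy: {cv_score:.3f}")
```

The cross-validation score is estimated from the training data only; accuracies on separate validation and test datasets, like those quoted in the abstract, are measured on samples the cross-validation loop never sees.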