A Classification Algorithm-Based Hybrid Diabetes Prediction Model

dc.contributor.author: Edeh, Michael Onyema
dc.contributor.author: Khalaf, Osamah Ibrahim
dc.contributor.author: Tavera, Carlos Andrés
dc.contributor.author: Tayeb, Sofiane
dc.contributor.author: Ghouali, Samir
dc.contributor.author: Abdulsahib, Ghaida Muttashar
dc.contributor.author: Richard Nnabu, Nneka Ernestina
dc.contributor.author: Louni, AbdRahmane
dc.date.accessioned: 2025-07-04T16:27:50Z
dc.date.available: 2025-07-04T16:27:50Z
dc.date.issued: 2022
dc.description.abstract: Diabetes is one of the leading causes of death globally. If it is not detected and treated early, it can lead to a variety of complications. The aim of this study was to develop a model that predicts the likelihood of a patient developing diabetes as accurately as possible. Classification algorithms are widely used in the medical field to sort data into categories according to criteria that are specific to each classifier. Therefore, four supervised learning algorithms (Random Forest, SVM, Naïve Bayes, and Decision Tree (DT)) and one unsupervised learning algorithm (k-means) were used in this investigation to identify diabetes in its early stages. The experiments were performed on two databases: one extracted from the Frankfurt Hospital in Germany, and the PIMA Indian Diabetes Database (PIDD) provided by the UCI Machine Learning Repository. On the Frankfurt Hospital database, the random forest algorithm achieved the highest accuracy, 97.6%; on the Pima Indian database, the SVM algorithm achieved the highest accuracy, 83.1%. These results were validated by splitting the data set into two parts: a training set, used to fit the model, and a test set, used to evaluate its accuracy.
dc.identifier.citation: Edeh, M. O., Khalaf, O. I., Tavera, C. A., Tayeb, S., Ghouali, S., Abdulsahib, G. M., Richard-Nnabu, N. E., & Louni, A. R. (2022). A Classification Algorithm-Based Hybrid Diabetes Prediction Model. Frontiers in Public Health, 10. https://doi.org/10.3389/fpubh.2022.829519
dc.identifier.issn: 22962565
dc.identifier.uri: https://repositorio.usc.edu.co/handle/20.500.12421/7173
dc.language.iso: en
dc.publisher: Frontiers Media S.A.
dc.subject: AI
dc.subject: Bayesian Naive
dc.subject: classification
dc.subject: decision tree
dc.subject: diabetes
dc.subject: ML
dc.subject: random forest
dc.subject: Support Vector Machine (SVM)
dc.title: A Classification Algorithm-Based Hybrid Diabetes Prediction Model
dc.type: Article
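
The hold-out validation mentioned in the abstract (fit on a training set, measure accuracy on a test set) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic and hypothetical rather than drawn from either study database, the function names are invented for illustration, the 80/20 split ratio is an assumption, and a from-scratch Gaussian Naïve Bayes stands in for the classifiers named in the abstract.

```python
import math
import random


def train_test_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle row indices and split them into training and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (
        [rows[i] for i in train_idx],
        [rows[i] for i in test_idx],
        [labels[i] for i in train_idx],
        [labels[i] for i in test_idx],
    )


def fit_gaussian_nb(rows, labels):
    """Estimate the prior, per-feature mean and variance for each class."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[cls] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one row."""
    def log_posterior(cls):
        prior, means, variances = model[cls]
        score = math.log(prior)
        for x, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def accuracy(model, rows, labels):
    """Fraction of rows whose predicted class matches the true label."""
    hits = sum(predict(model, r) == y for r, y in zip(rows, labels))
    return hits / len(labels)


# Synthetic two-feature data (hypothetical; NOT from the Frankfurt or PIMA sets).
rng = random.Random(42)
rows = [[rng.gauss(100, 10), rng.gauss(25, 3)] for _ in range(200)]
rows += [[rng.gauss(150, 10), rng.gauss(33, 3)] for _ in range(200)]
labels = [0] * 200 + [1] * 200

X_train, X_test, y_train, y_test = train_test_split(rows, labels)
model = fit_gaussian_nb(X_train, y_train)
test_accuracy = accuracy(model, X_test, y_test)
print(f"hold-out accuracy: {test_accuracy:.3f}")
```

Accuracy is computed only on rows the model never saw during fitting, which is what makes the reported figure an estimate of performance on new patients rather than a measure of memorization.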

Files

Original bundle
Name:
A Classification Algorithm-Based Hybrid Diabetes Prediction Model.pdf
Size:
443.44 KB
Format:
Adobe Portable Document Format