A Classification Algorithm-Based Hybrid Diabetes Prediction Model

Edeh, Michael Onyema; Khalaf, Osamah Ibrahim; Tavera, Carlos Andrés; Tayeb, Sofiane; Ghouali, Samir; Abdulsahib, Ghaida Muttashar; Richard Nnabu, Nneka Ernestina; Louni, AbdRahmane

dc.contributor.author	Edeh, Michael Onyema
dc.contributor.author	Khalaf, Osamah Ibrahim
dc.contributor.author	Tavera, Carlos Andrés
dc.contributor.author	Tayeb, Sofiane
dc.contributor.author	Ghouali, Samir
dc.contributor.author	Abdulsahib, Ghaida Muttashar
dc.contributor.author	Richard Nnabu, Nneka Ernestina
dc.contributor.author	Louni, AbdRahmane
dc.date.accessioned	2025-07-04T16:27:50Z
dc.date.available	2025-07-04T16:27:50Z
dc.date.issued	2022
dc.description.abstract	Diabetes is considered one of the leading causes of death globally. If it is not detected and treated early, it can lead to a variety of complications. The aim of this study was to develop a model that predicts the likelihood of a patient developing diabetes as accurately as possible. Classification algorithms are widely used in the medical field to sort data into categories according to criteria specific to the individual classifier. Therefore, four supervised machine learning classification algorithms (Random Forest, SVM, Naïve Bayes, and Decision Tree) and one unsupervised learning algorithm (k-means) were used in this investigation to identify diabetes in its early stages. The experiments were performed on two databases: one extracted from the Frankfurt Hospital in Germany, and the PIMA Indian Diabetes database (PIDD) provided by the UCI machine learning repository. On the Frankfurt Hospital database, the random forest algorithm achieved the highest accuracy, 97.6%; on the Pima Indian database, the SVM algorithm achieved the highest accuracy, 83.1%. These results were validated by splitting the data set into two parts, a training set and a test set: the training set is used to fit the model, and the test set is used to evaluate its accuracy.
dc.identifier.citation	Edeh, M. O., Khalaf, O. I., Tavera, C. A., Tayeb, S., Ghouali, S., Abdulsahib, G. M., Richard-Nnabu, N. E., & Louni, A. R. (2022). A Classification Algorithm-Based Hybrid Diabetes Prediction Model. Frontiers in Public Health, 10. https://doi.org/10.3389/fpubh.2022.829519
dc.identifier.issn	22962565
dc.identifier.uri	https://repositorio.usc.edu.co/handle/20.500.12421/7173
dc.language.iso	en
dc.publisher	Frontiers Media S.A.
dc.subject	AI
dc.subject	Bayesian Naive
dc.subject	classification
dc.subject	decision tree
dc.subject	diabetes
dc.subject	ML
dc.subject	random forest
dc.subject	Support Vector Machine (SVM)
dc.title	A Classification Algorithm-Based Hybrid Diabetes Prediction Model
dc.type	Article
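
The hold-out validation mentioned in the abstract (fit on a training set, measure accuracy on a test set) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the data are synthetic and hypothetical (neither the Frankfurt nor the PIMA data set), the 80/20 split ratio is an assumption, and a small hand-written Gaussian Naïve Bayes stands in for the classifiers the study compares. All function names are made up for this example.

```python
import math
import random


def train_test_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle indices, then split rows and labels into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    pick = lambda idx, seq: [seq[i] for i in idx]
    return (pick(train_idx, rows), pick(test_idx, rows),
            pick(train_idx, labels), pick(test_idx, labels))


def fit_gaussian_nb(rows, labels):
    """Estimate the prior, per-feature means and variances for each class."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        columns = list(zip(*members))
        means = [sum(c) / len(c) for c in columns]
        # A tiny constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-9
                     for c, m in zip(columns, means)]
        model[cls] = (len(members) / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(cls):
        prior, means, variances = model[cls]
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=log_posterior)


def accuracy(model, rows, labels):
    """Fraction of samples whose predicted class matches the true label."""
    correct = sum(predict(model, r) == y for r, y in zip(rows, labels))
    return correct / len(labels)


def make_synthetic_data(n_per_class=200, seed=0):
    """Hypothetical two-feature data; the class means are arbitrary choices."""
    rng = random.Random(seed)
    rows, labels = [], []
    for cls, (mu_a, mu_b) in enumerate([(100.0, 25.0), (150.0, 35.0)]):
        for _ in range(n_per_class):
            rows.append([rng.gauss(mu_a, 15.0), rng.gauss(mu_b, 4.0)])
            labels.append(cls)
    return rows, labels


rows, labels = make_synthetic_data()
X_train, X_test, y_train, y_test = train_test_split(rows, labels)
model = fit_gaussian_nb(X_train, y_train)      # fit on the training set only
test_accuracy = accuracy(model, X_test, y_test)  # score on held-out samples
print(f"held-out accuracy: {test_accuracy:.3f}")
```

Scoring on samples the model never saw during fitting is what makes the reported accuracy an estimate of performance on new patients rather than a measure of how well the model memorized its training data.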

Files

Original bundle

Name: A Classification Algorithm-Based Hybrid Diabetes Prediction Model.pdf
Size: 443.44 KB
Format: Adobe Portable Document Format

Collections

Artículos Científicos