Fifth International Undergraduate Research Conference (2021) of Military Technical College
Stroke and Diabetes Prediction using Machine Learning
Paper ID : 1025-IUGRC5-FULL (R3)
Authors:
Nehal Mostafa *1, Aya Ehab Yousef2, Radwa Abd Elhakeem Abd Elmageed3
1Computer Science, Computer Science and Information Technology, AAST, Aswan, Egypt
2AAST, Aswan, Computer Science, Egypt, Hurghada
3Computer Science, Arab Academy for Science and Technology
Abstract:
Diabetes is a disease that has no permanent cure;
hence, early detection is required. It is a serious disease
identified by elevated levels of glucose in the blood.
Machine learning algorithms help in the identification and
prediction of diabetes at an early stage.
The main objective of this study is to predict diabetes
mellitus with better accuracy using an ensemble of
machine learning (ML) algorithms. The ML algorithms are
evaluated with K-fold cross-validation and accuracy on
the Prediction Diabetes (PD) dataset, collected from
Kaggle.
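
As a rough illustration of K-fold cross-validation, the sketch below splits 768 record indices into k folds, each used once as the test set while the remaining folds form the training set. The abstract does not state the value of k, so k = 10, the shuffling seed, and the helper name `k_fold_indices` are assumptions made only for this example.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Shuffle once so folds are not tied to the original record order.
    random.Random(seed).shuffle(indices)
    # Spread the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Assumed setting: 768 records (as in the PD dataset) and k = 10.
folds = list(k_fold_indices(n_samples=768, k=10))
```

Each record appears in exactly one test fold, so averaging a metric over the folds uses every record for testing once.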
The dataset used for experimentation contains records of
768 patients, with and without diabetes, each described
by nine attributes.
The proposed ensemble soft voting classifier performs
binary classification using an ensemble of three
machine learning algorithms: random forest, K-Nearest
Neighbors (KNN), and Naïve Bayes. The proposed
methodology has been empirically compared with
state-of-the-art methodologies and base classifiers
such as KNN,
taking accuracy, precision, recall and specificity as
the evaluation criteria.
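
To make the soft voting rule concrete, the following minimal sketch averages the class probabilities produced by three classifiers and picks the class with the highest mean. The probability values are hypothetical and the function name `soft_vote` is an assumption for this example; it shows only the voting rule, not the authors' trained models. In scikit-learn, the same rule is available through `VotingClassifier` with `voting="soft"`.

```python
def soft_vote(probabilities):
    """Average per-class probabilities and return (predicted_label, averaged)."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    averaged = [
        sum(p[c] for p in probabilities) / n_models for c in range(n_classes)
    ]
    # The predicted label is the class with the highest mean probability.
    label = max(range(n_classes), key=lambda c: averaged[c])
    return label, averaged


# Hypothetical [P(class 0), P(class 1)] outputs for one patient.
rf_proba = [0.55, 0.45]   # random forest
knn_proba = [0.60, 0.40]  # K-Nearest Neighbors
nb_proba = [0.10, 0.90]   # Naive Bayes

label, averaged = soft_vote([rf_proba, knn_proba, nb_proba])
```

With these example values, soft voting selects class 1 even though two of the three classifiers individually favor class 0, because the confident Naïve Bayes estimate outweighs the two marginal ones. A hard (majority) vote would have chosen class 0.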
The proposed ensemble approach gives the highest
accuracy, precision, recall and specificity values,
77.922%, 83.006%, 83.552% and 67.088% respectively,
on the Prediction Diabetes (PD) dataset.
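
The four evaluation criteria can all be derived from a binary confusion matrix: accuracy = (TP + TN) / total, precision = TP / (TP + FP), recall = TP / (TP + FN), and specificity = TN / (TN + FP). The sketch below computes them; the counts are illustrative values chosen to be numerically consistent with the percentages reported above, and they are not taken from the paper.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and specificity from confusion counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),        # also called sensitivity
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Illustrative (assumed) counts, not the authors' confusion matrix.
metrics = binary_metrics(tp=127, fp=26, fn=25, tn=53)
for name, value in metrics.items():
    print(f"{name}: {value:.3%}")
```

Specificity is worth reporting alongside recall because a classifier can reach high recall simply by predicting the positive class often, which shows up as a low true negative rate.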
Further, the efficiency of the proposed methodology
has also been compared and analyzed on the Stroke
Prediction dataset.
The proposed ensemble soft voting classifier achieved
accuracy, precision, recall and specificity values of
93.83%, 92.59%, 96.12% and 91.91% on the Stroke
Prediction dataset using the Random Forest algorithm.
Keywords:
Diabetes, Prediction, Stroke, KNN, Random Forest
Status : Paper Accepted (Oral Presentation)