Abhishek Pal1, Sarada Mallik2, Rima Dutta3
[Vol. 05 (01), December, 2024, pp. 51-55]
Diabetes mellitus is a serious condition affecting a large number of people worldwide. Factors such as age, obesity, lack of exercise, genetic predisposition, lifestyle choices, poor diet, and high blood pressure contribute to its onset. Individuals with diabetes are at increased risk of complications such as heart disease, kidney disease, stroke, vision problems, and nerve damage. In hospitals, diabetes is typically diagnosed by conducting various tests, and treatment is provided based on the results. Big data analytics has become essential in the healthcare industry, which handles massive volumes of data. Big data techniques make it possible to analyze large datasets, uncover hidden patterns, and derive insights that predict outcomes more effectively. However, existing classification and prediction methods for diabetes diagnosis have limited accuracy. In this paper, we propose an improved diabetes prediction model that combines additional external factors with standard metrics such as glucose level, BMI, age, and insulin. The new model improves classification accuracy on an updated dataset, and we introduce a pipeline framework to improve prediction accuracy further.