Diagnosis of Diabetes Using Bayesian and Boosting Classifier
Abstract
In recent years, the diagnosis of diseases using artificial intelligence and machine learning algorithms has gained significant importance. Data from relevant medical studies can yield insights that help reduce preventable deaths. Diabetes is one of the fastest-growing chronic diseases, and its prevalence has risen with urbanization and reduced physical activity. Early detection of diabetes is therefore of great importance. This paper uses a dataset of individuals who underwent diabetes diagnostic tests and applies classification techniques to determine whether their test results were positive or negative for diabetes. The novelty of this work lies in its comparative analysis of Bayesian classifiers and boosting methods, a comparison that has not been extensively explored in the literature. The classification methods include Bayesian classifiers, namely the Bayesian support vector machine, Bayesian k-nearest neighbor, and Bayesian decision tree, and boosting methods, namely CatBoost, AdaBoost, and XGBoost. Performance evaluation metrics, including accuracy, precision, recall, F1-score, and ROC curve analysis, are used to compare the efficacy of these methods on the data. The findings of this study contribute to the advancement of accurate and efficient diabetes diagnosis using machine learning techniques, potentially supporting early intervention and management of the disease.
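As a rough illustration of the evaluation metrics named in the abstract, the sketch below computes accuracy, precision, recall, and F1-score from binary predictions. It is a minimal sketch, not the paper's implementation: the function name `binary_metrics`, the `BinaryMetrics` container, and the toy labels are all hypothetical, and the convention that 1 means a positive test result is an assumption. ROC curve analysis is not covered here.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred):
    """Compute standard metrics, assuming 1 = positive and 0 = negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = len(pairs)
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy labels for illustration only; these are not taken from the paper's dataset.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a diagnostic setting, recall is often watched most closely because a false negative corresponds to a missed case, while precision reflects how many flagged individuals actually test positive.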

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.