Prediction of Type 2 Diabetes using Support Vector Machine (SVM) with Enhanced Levy Flight based Fruitfly Optimization Algorithm (ELFFOA) and Feature Selection Approaches
Abstract
Researchers have applied a range of data analytics methods to the diagnosis, prognosis, and management of diabetes mellitus (DM). With the emergence of machine learning (ML) and deep learning (DL) algorithms, these methods have become more advanced and automated, and the prediction accuracy of ML models on many real-world problems has increased significantly. In our previous work, we introduced and investigated the Improved K-Means (IKM) with Adaptive Divergence Weight Binary Bat Algorithm to build a diagnosis system. Across several problem scenarios, that algorithm performed much better in terms of speed, but its classification accuracy fell below expectations. The objective of this study is therefore to identify methods and strategies that achieve high classification accuracy. This aim is pursued with a Support Vector Machine (SVM) combined with an Enhanced Levy Flight-based Fruitfly Optimization Algorithm (ELFFOA). The resulting model improves diabetes prediction accuracy and can be applied to regression, classification, and other tasks. An SVM seeks a decision boundary whose distance to the nearest training data points is as large as possible, since a larger margin can lower the classifier's generalization error. Missing values in the dataset are imputed using the Adaptive Neuro Fuzzy Inference System (ANFIS). A new algorithm, the Enhanced Inertia Weight Binary Bat Algorithm (EIWBBA), is introduced to optimize the feature space and eliminate unimportant features. In addition, a novel feature selection technique based on the Enhanced Generalized Lambda Distribution Independent Component Analysis (EGLD-ICA) is introduced. Classification is performed by the SVM with ELFFOA (SVM-ELFFOA), which is implemented in MATLAB.
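The abstract does not give the ELFFOA update rules, so the following Python sketch is only an illustration of the general idea under stated assumptions, not the authors' MATLAB implementation. It draws heavy-tailed Levy steps with Mantegna's algorithm and uses them in a simplified fruit-fly-style search: flies scatter around the swarm location, and the swarm moves to the best fly when that fly improves the objective. The function names, the parameters (`beta`, `scale`, `n_flies`, `n_iter`), and the objective `toy_error` are all hypothetical; `toy_error` is a stand-in for a cross-validated SVM error over two log-scaled hyperparameters.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step using Mantegna's algorithm."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero in the (very unlikely) degenerate draw
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def toy_error(log_c, log_gamma):
    """Hypothetical stand-in for a cross-validated SVM error surface."""
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


def levy_fruit_fly_search(objective, start, n_flies=20, n_iter=100,
                          scale=0.1, seed=0):
    """Simplified fruit-fly-style search driven by Levy steps (assumed variant)."""
    rng = random.Random(seed)
    swarm = tuple(start)
    best_val = objective(*swarm)
    for _ in range(n_iter):
        # Each fly takes a Levy-distributed step away from the swarm location.
        flies = [
            tuple(x + scale * levy_step(rng) for x in swarm)
            for _ in range(n_flies)
        ]
        # The fly with the lowest objective value is the candidate new location.
        best_fly = min(flies, key=lambda p: objective(*p))
        val = objective(*best_fly)
        if val < best_val:  # move the swarm only on improvement
            swarm, best_val = best_fly, val
    return swarm, best_val


if __name__ == "__main__":
    position, error = levy_fruit_fly_search(toy_error, (0.0, 0.0))
    print(position, error)
```

Occasional long Levy jumps let the swarm leave a poor region, while the many short steps refine the current location. In a real setting the objective would train and validate an SVM for each candidate hyperparameter pair.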
The proposed IKM-EIWBBA+SVM-ELFFOA classifier achieves an accuracy of 93.50%, compared with 91.87% for the existing IKM-EIWBBA+SVM, 90.50% for IKM-ADWFA+LR, and 85.00% for IKM+LR. In the simulation experiments, carried out in MATLAB, the proposed model thus attains a higher prediction accuracy than the existing classification methods.
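As a quick arithmetic check on the reported figures, the short Python sketch below computes the gap, in percentage points, between the proposed classifier and each baseline. The accuracy values are those stated above; the helper name `gains_over_baselines` is hypothetical.

```python
# Accuracies (%) as reported in the abstract.
reported_accuracy = {
    "IKM-EIWBBA+SVM-ELFFOA": 93.50,
    "IKM-EIWBBA+SVM": 91.87,
    "IKM-ADWFA+LR": 90.50,
    "IKM+LR": 85.00,
}


def gains_over_baselines(scores, proposed):
    """Return the accuracy gap (percentage points) of `proposed` over each other entry."""
    top = scores[proposed]
    return {
        name: round(top - acc, 2)
        for name, acc in scores.items()
        if name != proposed
    }


gains = gains_over_baselines(reported_accuracy, "IKM-EIWBBA+SVM-ELFFOA")
for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} percentage points")
```

The gaps come out to 1.63, 3.00, and 8.50 percentage points respectively.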

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.