Computer-aided decision-making for predicting liver disease using PSO-based optimized SVM with feature selection

Hamid Saadatfar, Abdollah Dehzangi, Shahaboddin Shamshirband

Authors	Hamid Saadatfar, Abdollah Dehzangi, Shahaboddin Shamshirband
Journal	Informatics in Medicine Unlocked
Page number	1-17
Serial number	17
Volume number	1
Paper Type	Full Paper
Published At	2019
Journal Grade	ISI
Journal Type	Typographic
Journal Country	Iran, Islamic Republic Of
Journal Index	Scopus

Abstract

In recent years, medical data mining models have become an important tool for predicting diseases. Healthcare generates large volumes of data, which is one of the challenges in predicting and analyzing a target disease. Data mining models can convert this data into valuable information, and analyzing that information logically and scientifically supports accurate decision-making and prediction. Another challenge in disease prediction is identifying the features that are more significant than the others. Feature subset selection is performed to improve model performance and reach the highest accuracy. The purpose of this study is to select significant features by comparing data mining models for predicting liver disease, based on an extraction, loading, transformation, analysis (ELTA) approach for correct diagnosis. The models compared under the ELTA approach are random forest, Multi-Layer Perceptron (MLP) neural network, Bayesian network, Support Vector Machine (SVM), and Particle Swarm Optimization (PSO)-SVM. Among these models, the PSO-SVM model performs best on the criteria of specificity, sensitivity, accuracy, Area Under the Curve (AUC), F-measure, precision, and False Positive Rate (FPR). The models were evaluated on a liver disease dataset using 10-fold cross-validation. The average estimated accuracy was 87.35%, 78.91%, 66.78%, 76.51%, and 95.17% for the random forest, MLP neural network, Bayesian network, SVM, and PSO-SVM models, respectively. Across the evaluation criteria listed above, the hybrid PSO-SVM-based optimized model achieved the highest accuracy with the smallest number of features.
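
Most of the evaluation criteria named in the abstract follow directly from the four counts of a binary confusion matrix. The sketch below is illustrative only and is not the authors' code: the names `ConfusionCounts` and `binary_metrics` are hypothetical, the example counts are made up, and AUC is left out because it needs ranked prediction scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary confusion matrix (hypothetical helper type)."""
    tp: int  # diseased patients predicted as diseased
    fp: int  # healthy patients predicted as diseased
    tn: int  # healthy patients predicted as healthy
    fn: int  # diseased patients predicted as healthy


def _ratio(numerator: float, denominator: float) -> float:
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(c: ConfusionCounts) -> dict:
    """Compute the count-based criteria listed in the abstract."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # true positive rate (recall)
    specificity = _ratio(c.tn, c.tn + c.fp)  # true negative rate
    precision = _ratio(c.tp, c.tp + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    # F-measure: harmonic mean of precision and sensitivity.
    f_measure = _ratio(2 * precision * sensitivity, precision + sensitivity)
    fpr = _ratio(c.fp, c.fp + c.tn)  # equals 1 - specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_measure": f_measure,
        "fpr": fpr,
    }


if __name__ == "__main__":
    # Made-up counts for illustration; not taken from the paper.
    example = ConfusionCounts(tp=40, fp=10, tn=45, fn=5)
    for name, value in binary_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Under 10-fold cross-validation, a function like this would be applied to each fold's confusion matrix and the results averaged, which is one common way to arrive at average accuracy figures such as those reported above.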

tags: Data mining, Liver disease, Classification models, Feature selection, Disease prediction