Abstract
1-Introduction
2-Kernelized Principal Component Analysis(KPCA)
3-Bat Algorithm(BA)
4-Least Square Support Vector Machine(LSSVM)
5-Summary and Prospect
References
Abstract
Data mining technology has important clinical significance for disease classification and prevention. To improve model performance and the accuracy of disease classification, this paper proposes a KPCA-IBA-LSSVM model. Because medical data are high-dimensional and nonlinear, kernelized principal component analysis (KPCA) is used for dimensionality reduction, and the bat algorithm (BA) is used to optimize the parameters of the least square support vector machine (LSSVM). However, BA is prone to becoming trapped in local extrema and to premature convergence, so this paper improves it in three respects, yielding the improved BA (IBA). To verify the validity of the algorithm, the model is evaluated on the Breast Cancer, Statlog (Heart) and Heart Disease datasets from the UCI machine learning repository. The simulation results show that the model achieves better classification accuracy and that it can also be applied to the classification and prediction of other diseases. The method is therefore feasible and suitable for wider application.
Introduction
Data mining is of particular practical significance for disease prediction, since it can greatly improve disease prevention and reduce the incidence of new diseases. Data mining technology is widely used in the medical field, but the models used are relatively simple, typically relying on a single method, and some parameters are set manually. In addition, medical data are high-dimensional, noisy, strongly coupled and nonlinear, which makes it difficult to obtain optimal model performance. Therefore, the KPCA-IBA-LSSVM model is proposed to classify diseases. Kernelized principal component analysis (KPCA) is a nonlinear dimensionality reduction method based on the kernel technique. The least square support vector machine (LSSVM) is an improved variant of the SVM: it replaces the quadratic programming problem of the SVM with a system of linear equations, which is easier to solve. The performance of the classifier, however, depends closely on the selection of its parameters.
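To make the "linear equations instead of quadratic programming" point concrete, the following is a minimal sketch of an LSSVM classifier in its standard textbook form, not the authors' implementation. It assumes an RBF kernel and solves the system [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1], where Omega_ij = y_i y_j K(x_i, x_j). All function names, the toy data, and the values of `gamma` and `sigma` are illustrative assumptions; in the proposed model these two parameters would be chosen by the bat algorithm rather than fixed by hand.

```python
import math


def rbf_kernel(a, b, sigma=1.0):
    """RBF kernel K(a, b) = exp(-||a - b||^2 / (2 * sigma^2)) (assumed kernel)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Build and solve the (n+1) x (n+1) LSSVM system; return (b, alpha)."""
    n = len(X)
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        A[0][i + 1] = y[i]  # first row:    [0, y^T]
        A[i + 1][0] = y[i]  # first column: [0, y]^T
        for j in range(n):
            A[i + 1][j + 1] = y[i] * y[j] * rbf_kernel(X[i], X[j], sigma)
        A[i + 1][i + 1] += 1.0 / gamma  # regularization term I / gamma
    solution = solve_linear(A, [0.0] + [1.0] * n)
    return solution[0], solution[1:]


def lssvm_predict(x, X, y, b, alpha, sigma=1.0):
    """Return sign(sum_i alpha_i * y_i * K(x, x_i) + b) as +1 or -1."""
    score = b + sum(
        a * yi * rbf_kernel(x, xi, sigma) for a, yi, xi in zip(alpha, y, X)
    )
    return 1 if score >= 0 else -1


# Toy, made-up data: two well-separated clusters with labels -1 and +1.
X_train = [[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]]
y_train = [-1, -1, -1, 1, 1, 1]

b, alpha = lssvm_train(X_train, y_train)
predictions = [lssvm_predict(x, X_train, y_train, b, alpha) for x in X_train]
print(predictions)
```

Training reduces to a single linear solve, with no iterative quadratic programming step. The two free parameters, the regularization constant `gamma` and the kernel width `sigma`, are the ones whose selection affects classifier performance.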