Heart disease is a life-threatening condition that affects a large number of people worldwide. Diagnosing it is also one of the most challenging tasks for practitioners. An effective classification model is therefore required to predict a patient's heart condition, so that practitioners can make accurate decisions at an early stage. In [1], ensemble classification techniques were used for predicting heart disease. An ensemble method improves the performance of weak classifiers by combining multiple classifiers. The experiment was carried out with four ensemble techniques: boosting, bagging, stacking and majority vote. The results show that boosting, stacking and bagging improved accuracy by around six percent over the accuracy obtained without ensemble techniques, while majority voting gave the largest improvement, around seven percent. The authors concluded that majority voting provides better classification accuracy than the other ensemble techniques [1]. In [2], a classification model with dimensionality reduction was developed for predicting heart disease. Principal component analysis was used for dimensionality reduction. The method was analyzed with six classifiers: decision tree, gradient boosted tree, logistic regression, multilayer perceptron, Naïve Bayes and random forest. The performance of the model was evaluated on three datasets obtained from the UCI repository: Cleveland, Hungarian, and a combination of Cleveland and Hungarian. The results show that the combination of principal component analysis, chi-square and random forest gave the best classification accuracy [2].
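To make the majority-vote scheme discussed for [1] concrete, the following is a minimal sketch in plain Python. It is not the implementation used in [1]. The three prediction lists, the ground-truth labels and the function names are hypothetical toy values chosen for illustration, and the tie-breaking rule is an assumption of this sketch.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of lists, one inner list of labels per classifier.
    Ties go to the label that appears first among the tied votes
    (an arbitrary choice made for this sketch).
    """
    if not predictions:
        raise ValueError("need at least one classifier")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")
    combined = []
    for sample_votes in zip(*predictions):
        # most_common(1) returns the label with the highest vote count
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical toy data: 1 = disease present, 0 = disease absent.
y_true = [1, 0, 1, 1, 0]
clf_a = [1, 0, 1, 0, 0]  # wrong on the fourth sample
clf_b = [1, 1, 1, 1, 0]  # wrong on the second sample
clf_c = [0, 0, 1, 1, 0]  # wrong on the first sample

ensemble = majority_vote([clf_a, clf_b, clf_c])
print("individual:", [accuracy(y_true, p) for p in (clf_a, clf_b, clf_c)])
print("ensemble:  ", accuracy(y_true, ensemble))
```

In this toy case each classifier errs on a different sample, so the vote corrects every individual mistake. Real classifiers usually make correlated errors, so the gain is smaller, and it disappears where the majority is wrong.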
Khourdifi and Bahaj (2019) developed an optimized classification model for predicting heart disease. Feature selection was carried out with the fast correlation-based feature selection method, and the features were optimized with two well-known approaches: ant colony optimization and particle swarm optimization. These hybrid approaches were analyzed with the following classification algorithms: K-Nearest Neighbor, Support Vector Machine, Naïve Bayes, Random Forest, Multilayer Perceptron and Artificial Neural Network. The experiments were conducted under three settings: (1) classifiers without any optimization, (2) classifiers optimized with fast correlation-based feature selection, and (3) classifiers optimized with fast correlation-based feature selection, ant colony optimization and particle swarm optimization. The results show that the proposed hybrid of fast correlation-based feature selection, ant colony optimization and particle swarm optimization gives better results in terms of precision, recall, F-measure and accuracy [3]. In [4], heart disease was analyzed with two well-known data mining tools, Weka and Orange. The experiment was conducted with four classifiers: Naïve Bayes, Support Vector Machine, Random Forest and K-Nearest Neighbor. Performance was evaluated with two measures, precision and recall. The results show that Weka provides better classification accuracy than Orange [4].
In [5], heart disease classification was carried out with several data mining tools and machine learning techniques. The experiment was conducted with six data mining tools (Weka, Orange, RapidMiner, KNIME, MATLAB and scikit-learn) and six well-known classifiers (Support Vector Machine, K-Nearest Neighbor, Random Forest, Logistic Regression, Artificial Neural Network and Naïve Bayes). The performance of the system was evaluated with three measures: accuracy, sensitivity and specificity. The authors concluded that MATLAB provides better accuracy, sensitivity and specificity for all classifiers than the other five data mining tools [5]. Huapaya et al. (2020) conducted an experimental study on heart disease detection with several supervised machine learning algorithms: KNN, SVM, Neural Network, Random Forest and Decision Tree. Classifier performance was evaluated with three measures: precision, recall and F1-score. The results show that the KNN classifier performs better than the other classifiers on precision, recall and F1-score [6].
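The evaluation measures named in the studies above (accuracy, sensitivity, specificity, precision, recall and F1-score) all derive from the same four confusion-matrix counts. The sketch below shows the standard definitions for a binary problem in plain Python. The function names and the example labels are hypothetical, and returning 0.0 for an undefined ratio is a convention assumed here, not one taken from the cited papers.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred, positive=1):
    """Compute the usual binary metrics from the confusion-matrix counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)  # recall is the same quantity as sensitivity
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# Hypothetical labels: 4 patients with disease (1), 6 without (0).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because sensitivity and recall are the same quantity, studies that report precision and recall can be compared with studies that report sensitivity and specificity only on that shared measure.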
In [7], experiments were conducted with various machine learning methods, with and without feature selection techniques, for predicting heart disease. The results show that, without any feature selection, the Support Vector Machine classifier provides a better prediction rate than the other classifiers. With correlation-based feature selection and fuzzy rough set as feature selection methods, the Naïve Bayes classifier provides a better prediction rate than the other classifiers. With fuzzy rough set and chi-square as feature selection methods, the Radial Basis Function Network classifier provides a better prediction rate [7]. Amin Ul Haq proposed an efficient hybrid intelligent system framework for predicting heart disease. The framework combines the feature selection methods Relief, mRMR and LASSO with the classifiers Logistic Regression, K-NN, ANN, SVM, NB and DT. The performance of the proposed system was evaluated under two settings, with and without feature reduction. The proposed system gives good results in terms of accuracy and execution time [8].
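As an illustration of the filter-style feature selection mentioned above, the sketch below scores categorical features with the Pearson chi-square statistic for independence and keeps the highest-scoring ones. It is a generic textbook formulation, not the procedure of [7] or [8], whose exact settings are not given here. The feature names, the toy values and the function names are hypothetical.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic between a categorical feature and the labels.

    A larger score means a stronger association with the class label.
    """
    n = len(feature)
    if n == 0 or n != len(labels):
        raise ValueError("feature and labels must be non-empty and equal length")
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            # Expected count if the feature and the label were independent
            expected = f_count * l_count / n
            score += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return score


def select_top_k(features, labels, k):
    """Return the names of the k highest-scoring features and all scores."""
    scores = {name: chi_square_score(col, labels) for name, col in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical binary features for 8 patients (1 = present, 0 = absent).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "noise": [1, 0, 1, 0, 1, 0, 1, 0],       # unrelated to the label
    "partial": [1, 1, 1, 0, 1, 0, 0, 0],     # weakly related
    "chest_pain": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly related in this toy set
}
selected, scores = select_top_k(features, labels, k=2)
print(selected, scores)
```

A filter of this kind scores each feature independently of the classifier, which is one reason the studies above report different best classifiers under different feature selection methods.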
Apurb Rajdhan et al. (2020) carried out an experimental study on heart disease prediction. The experiment was conducted with four classifiers: Decision Tree, Random Forest, Naïve Bayes and Logistic Regression. The results show that the Random Forest classifier produced the best classification result, 90.16%, among these classifiers [9]. Several other experimental studies compared various classifiers for predicting heart disease and reported which classifier provides the better prediction rate [10] [11] [12]. In our earlier work, we applied feature selection with different optimization algorithms and obtained promising results [14][15][16].
The remainder of the paper is organized as follows. Section 2 outlines the bees algorithm. Section 3 explains the modified bees algorithm. Section 4 presents the overall architecture of the proposed system. Section 5 explains feature selection using the modified bees algorithm. Section 6 presents implementation details and experimental results. Section 7 presents the conclusion and future work.