- Research Article
- Abstract
- Introduction
- Machine Learning-Based Approach for Diagnosing Cardiac Disease
- Provides the Maximum Accuracy, Several Parameters Relating to Various Algorithms
- Machine Learning vs Traditional Programming
- Methodology
- Overview of Conceptual and Theoretical Aspects of Machine Learning Application in Health Industry
- Results and Discussion
- References
A Study On a Machine Learning Based Classification Approach in Identifying Heart Disease Within E-Healthcare
Vamsi Krishna Eruvaram and Mohan Raja Pulicharla*
MCA, 2004, Madras University, India
Submission: November 01, 2023; Published: December 20, 2023
*Corresponding author: Mohan Raja Pulicharla, MCA, 2004, Madras University, India
How to cite this article: Vamsi Krishna Eruvaram and Mohan Raja Pulicharla*. A Study On a Machine Learning Based Classification Approach in Identifying Heart Disease Within E-Healthcare. J Cardiol & Cardiovasc Ther. 2023; 19(1): 556004. DOI: 10.19080/JOCCT.2023.19.556004
Abstract
The heart, as the second most vital organ after the brain, is integral to maintaining bodily equilibrium, and disruptions to its function have profound health consequences. Heart disease, a leading global cause of mortality, often arises from cumulative daily physiological changes, emphasizing the importance of timely illness prediction. In healthcare, the fusion of data mining and machine learning, explored in this study using Support Vector Machine, Decision Tree, and Random Forest algorithms, addresses the challenges of diagnosing prevalent conditions like heart disease, particularly crucial in the field of cardiology.
Our proposed machine learning-based approach for diagnosing cardiac disease employs a range of classification algorithms and advanced feature selection techniques, demonstrating superior accuracy in detecting heart diseases from extensive datasets of unprocessed medical images. This technological advancement holds the potential to significantly enhance patient care in various healthcare settings, showcasing the promising impact of artificial intelligence tools on improving the quality of life for billions worldwide.
Keywords: Machine learning; Heart disease; Algorithms; Cardiovascular disease; Regression analysis
Abbreviations: ML: Machine Learning; LCS: Learning Classifier Systems; ILP: Inductive Logic Programming; ANN: Artificial Neural Network; SVMs: Support Vector Machines; CVDs: Cardiovascular Diseases; TP: True Positive; TN: True Negative; FN: False Negative; FP: False Positive; LDL: Low-Density Lipoprotein; ECG: Electrocardiogram; CXR: Chest X-Ray; CVD: Cardiovascular Disease
Introduction
After the brain, the heart is regarded as the second most significant organ. Any disruption to the heart upsets the entire body. Heart disease is one of the top five causes of death in the world. Disorders, including heart disease, result from changes that occur in the body daily. Consequently, it is crucial to predict an illness at the appropriate time. Data mining is a fundamental process for defining and discovering relevant data and uncovering hidden patterns in massive databases. Data mining and machine learning techniques are used in the medical sciences to address genuine health-related challenges by predicting and diagnosing various diseases. This study compares the performance of three machine learning algorithms (support vector machine, decision tree, and random forest) for the prediction of heart disease.
Machine Learning-Based Approach for Diagnosing Cardiac Disease
The study emphasizes the critical need for swift and accurate heart disease identification, proposing a machine learning approach with Support Vector Machines, Logistic Regression, Artificial Neural Networks, K-Nearest Neighbors, Naive Bayes, and Decision Trees for classification (Figure 1). Efficiency is enhanced through feature selection algorithms and a conditional mutual information method, ensuring commendable accuracy, particularly with Support Vector Machines. This makes it a promising tool for rapid implementation in medical settings, crucial for early identification and interrupting cardiac disease progression. The analysis of diverse datasets identifies key features for heart disease prediction, utilizing seven machine learning methods. A hybrid dataset is created and analyzed with Python’s Scikit-learn module using a univariate feature selection technique, offering a comprehensive approach to discern crucial factors in predicting and preventing heart disease.
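The univariate feature selection step mentioned above can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions: the data are synthetic (generated with `make_classification`, not the study's datasets), the ANOVA F-test `f_classif` is one possible scoring function, and `k=8` is an arbitrary choice. The source does not state which scoring function or how many features were used.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in for a heart disease table: 300 patients, 13 attributes.
# These values are generated, not taken from the study.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=5, n_redundant=2, random_state=0
)

# Univariate selection: score each feature on its own against the label
# (ANOVA F-test here, an assumption) and keep the k best-scoring features.
selector = SelectKBest(score_func=f_classif, k=8)
X_selected = selector.fit_transform(X, y)

selected_idx = selector.get_support(indices=True)
print("selected feature indices:", selected_idx)
print("reduced shape:", X_selected.shape)
```

Because each feature is scored independently, this method is fast but cannot detect features that are only useful in combination.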

Provides the Maximum Accuracy, Several Parameters Relating to Various Algorithms
Datasets are split into training and testing sets using holdout and cross-validation techniques, and algorithm parameters are adjusted for maximum accuracy. Evaluation metrics, including a classification report and confusion matrix, gauge performance. Majority voting, combining logistic regression, SVM, and naive Bayes, achieves 88.89% accuracy on the first dataset, while the hybrid dataset lags behind the individual ones. Project outcomes are compared with prior methodologies. Machine learning, a family of algorithms within AI, learns without explicit programming, relying on statistics and data for outcome prediction. Linked to data mining and Bayesian modeling, it operates by taking data as input and generating answers through algorithms, as seen in applications like personalized recommendations, fraud detection, and predictive maintenance [1].
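A minimal sketch of the holdout split, cross-validation, and majority voting described above, using scikit-learn. The data are synthetic, and the 80/20 split, 5 folds, and default hyperparameters are assumptions made for illustration; the sketch does not reproduce the reported 88.89% figure.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic placeholder data (not the study's datasets).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)

# Holdout: keep 20% of the samples aside for the final test (assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hard voting: each model casts one vote and the majority class wins.
voter = VotingClassifier(
    estimators=[
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("svm", make_pipeline(StandardScaler(), SVC())),
        ("nb", GaussianNB()),
    ],
    voting="hard",
)

# 5-fold cross-validation on the training portion only.
cv_scores = cross_val_score(voter, X_train, y_train, cv=5)

# Final fit and evaluation on the untouched holdout set.
voter.fit(X_train, y_train)
holdout_accuracy = voter.score(X_test, y_test)

print("cross-validation accuracy per fold:", cv_scores)
print("holdout accuracy:", holdout_accuracy)
```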
Machine Learning vs Traditional Programming
Traditional Programming: In traditional programming, the programmer writes the rules by hand and must revise them whenever new cases appear. Machine learning is meant to solve this problem. The computer derives a rule after learning how the input and output data are related, so programmers do not need to design new rules every time fresh data arrive. The algorithms adapt to new information and experience, increasing their efficacy over time (Figure 2).
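The contrast can be illustrated with a toy example in plain Python. Everything here is hypothetical: the function names, the hand-picked cut-off of 140, and the made-up readings and labels serve only to show a fixed rule next to a rule derived from examples.

```python
def handwritten_rule(resting_bp):
    """Traditional programming: the programmer fixes the rule in advance."""
    return resting_bp >= 140  # hypothetical cut-off chosen by hand


def learn_threshold(values, labels):
    """Machine learning in miniature: pick the cut-off that best fits the data."""
    best_threshold, best_correct = None, -1
    for candidate in sorted(set(values)):
        # Count how many examples the rule "value >= candidate" gets right.
        correct = sum((v >= candidate) == label for v, label in zip(values, labels))
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


# Toy, made-up readings and labels (True = flagged as at risk).
readings = [110, 118, 125, 131, 138, 145, 152, 160]
at_risk = [False, False, False, True, True, True, True, True]

threshold = learn_threshold(readings, at_risk)
print("learned cut-off:", threshold)  # 131 for this toy data
print("hand-written rule on 131:", handwritten_rule(131))  # False
```

If the labels change, rerunning `learn_threshold` yields a new cut-off without anyone editing the code, whereas `handwritten_rule` would have to be rewritten.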

Machine Learning Approach
Machine learning mimics human learning through experience: predictions are easier in familiar scenarios. Like humans, machines train by observing examples and predict well on similar cases, but struggle with unfamiliar instances. Central to machine learning are learning and inference, primarily achieved by identifying patterns in data. A crucial skill for data scientists is selecting the data used to build a feature vector, a simplified representation of reality that the algorithms operate on. The learning step then condenses this feature vector into a model that describes the data (Figures 3-6).





Machine learning Algorithms
Machine learning can be grouped into two broad learning tasks: Supervised and Unsupervised. There are many other algorithms.
Supervised Learning: An algorithm learns the link between given inputs and a particular output using training data and feedback from humans. For instance, a practitioner can forecast sales using input data such as marketing expenses and weather predictions.
There are two categories of supervised learning:
Classification task
Regression task
Classification: To determine a customer’s gender for a targeted advertisement, information is extracted from the database, including height, weight, occupation, salary, and purchase history. The classifier’s goal is to assign a probability to each label (male or female) based on these features. Once the model learns to distinguish between genders, it can be used for predictions with new data. For example, if the classifier predicts a 70% probability of male and 30% of female, the algorithm assigns the customer the label male. Classifiers can also handle more than two classes, for example glass, table, and shoes, each representing a different class [2,3].
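The final step of the example, turning predicted probabilities into a label, can be sketched in plain Python. The function name `assign_label` and the multi-class probabilities are hypothetical; only the 70%/30% split comes from the text.

```python
def assign_label(class_probabilities):
    """Return the class with the highest predicted probability."""
    return max(class_probabilities, key=class_probabilities.get)


# Binary case from the text: 70% male, 30% female.
binary = {"male": 0.70, "female": 0.30}

# Multi-class case; these probabilities are made up for illustration.
multiclass = {"glass": 0.15, "table": 0.60, "shoes": 0.25}

print(assign_label(binary))      # male
print(assign_label(multiclass))  # table
```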
Types of Algorithms: (Table 2)

Unsupervised Learning: In unsupervised learning, an algorithm explores input data without being given an explicit output variable (e.g., explores customer demographic data to identify patterns) (Table 3).

Machine Learning (ML) Algorithm and Practical Application
Numerous machine learning algorithms exist, and the choice depends on the specific goal. In the following flower prediction example, ten algorithms predict flower types based on petal dimensions. The dataset is depicted in the top left image and is divided into red, light blue, and dark blue groups. In the second image, the upper-left region is classified as the red group, the middle region is ambiguous and falls into light blue, and the bottom region belongs to the dark blue group. The subsequent images illustrate how the various algorithms attempt to categorize the data (Figure 7 & 8).


Objectives of the study
To study the application of Machine Learning in heart disease identification.
To analyze the factors that influence heart disease using Machine Learning applications within Artificial Intelligence systems.
To study the factors that affect heart disease risk the most across three separate datasets.
To combine two separate datasets into a hybrid dataset, and to apply several machine learning algorithms to the three datasets and the one hybrid dataset for the prediction of heart disease.
To suggest efficient and effective measures for identifying heart disease through Machine Learning.
Methodology
This project utilizes three datasets from Kaggle and the UCI machine learning repository, employing Python for data analysis through the Anaconda distribution. Various supervised machine learning algorithms, such as support vector machines, decision trees, k-nearest neighbors, naive Bayes, random forest, and logistic regression, are employed, including an ensemble technique, the majority voting classifier. The study explores different parameters and methods, such as adjusting C and gamma values for support vector machines and tuning k values for k-nearest neighbors. Feature selection is conducted using the univariate method, and a hybrid dataset is created from two distinct datasets, standardized, and evaluated. The research compares outcomes with literature, focusing on achieving the best accuracy in diagnosing heart disease [4,5].
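The parameter adjustments described above (C and gamma for support vector machines, k for k-nearest neighbors) can be sketched with a cross-validated grid search in scikit-learn. The candidate values, the RBF kernel, the 5 folds, and the synthetic data are all assumptions; the source does not list the grids or the search procedure that were used.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic placeholder data (not the Kaggle or UCI datasets).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)

# SVM: search over C (regularization strength) and gamma (RBF kernel width).
svm_search = GridSearchCV(
    Pipeline([("scale", StandardScaler()), ("svm", SVC(kernel="rbf"))]),
    param_grid={"svm__C": [0.1, 1, 10], "svm__gamma": [0.01, 0.1, 1]},
    cv=5,
)
svm_search.fit(X, y)

# k-NN: search over the number of neighbors k.
knn_search = GridSearchCV(
    Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())]),
    param_grid={"knn__n_neighbors": [3, 5, 7, 9, 11]},
    cv=5,
)
knn_search.fit(X, y)

print("best SVM parameters:", svm_search.best_params_)
print("best k-NN parameters:", knn_search.best_params_)
```

Standardization is placed inside each pipeline so that the scaling statistics are recomputed on every training fold, which avoids leaking information from the validation fold.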
Anaconda Distribution Package Configuration
Anaconda, a free and open-source distribution for R and Python, simplifies large-scale data processing and package management. This study uses Anaconda’s graphical user interface, Anaconda Navigator, to launch programs and to manage packages, environments, and channels. The navigator integrates various tools, such as RStudio, Spyder, Orange, JupyterLab, and Jupyter Notebook. Specifically, Jupyter Notebook is used to run the data analysis code in an interactive, web-based computational environment.
Algorithm for Disease Prediction
Algorithm: Detection of heart disease using classifiers
Input: Heart disease dataset with several attributes
Output: Accuracy score/Confusion matrix/Classification report of predicted values (Figure 9).
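The input and output listed above can be sketched as follows with scikit-learn. A decision tree stands in for "classifiers" here, and the data are synthetic; the class names, tree depth, and split ratio are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Input: a table of attributes plus a binary label (synthetic placeholder).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Train one classifier; any of the study's classifiers could be swapped in.
clf = DecisionTreeClassifier(max_depth=4, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Output: accuracy score, confusion matrix, and classification report.
accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred)
report = classification_report(
    y_test, y_pred, target_names=["no disease", "disease"]
)

print("accuracy:", accuracy)
print(matrix)
print(report)
```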

Performance Metrics
This study evaluates heart disease identification using several metrics: training accuracy, testing accuracy, precision, recall, and F1-measure. These are defined in terms of True Positive (TP), True Negative (TN), False Negative (FN), and False Positive (FP) counts. TP indicates correctly identified heart disease cases, TN represents correctly identified healthy cases, FN refers to missed heart disease cases, and FP signifies healthy cases incorrectly identified as heart disease. Accuracy, denoted by ac, measures the percentage of correctly classified samples among all normal and abnormal samples (Figure 10).
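The metrics named above follow the standard definitions in terms of the four counts, sketched here in plain Python. The function name and the example counts are hypothetical and are not results from the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts for illustration: 100 patients in total.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a diagnostic setting, recall deserves particular attention because a false negative corresponds to a missed heart disease case.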

Overview of Conceptual and Theoretical Aspects of Machine Learning Application in Health Industry
Introduction
Machine learning in healthcare can enhance diagnostic tools for medical image analysis. ML algorithms applied to medical imaging, such as X-rays or MRI scans, utilize pattern recognition to identify specific conditions. These applications analyze vast datasets, aiding healthcare professionals in making more informed judgments [6,7].
Applications for machine learning (ML)
Machine Learning (ML) applications are pervasive and play a vital role in diverse practical domains, notably healthcare and patient data security. This study explores the utilization of ML to analyze medical records and predict diseases, addressing gaps in effective ML methods and applications within the healthcare industry.
Highlights of Machine Learning: (Table 4)

Machine Learning
Machine learning, a subset of artificial intelligence, creates predictive systems by learning from experiences and building models on datasets to uncover hidden patterns. In healthcare, machine learning applications optimize trial samples, increase data points, and play a pivotal role in early epidemic detection. The study emphasizes the transformative impact of machine learning on healthcare operations, allowing professionals to focus on patient care and addressing global healthcare challenges (Figure 11).

Results and Discussion
This section presents the results of data analysis for predicting heart disease, considering variables such as age, chest pain type, blood pressure, blood glucose level, ECG, heart rate, and exercise-induced angina. The heart disease dataset is preprocessed by eliminating unrelated records and handling missing values. The K-means algorithm is then applied to cluster the preprocessed dataset, and four types of chest pain are discussed: asymptomatic, atypical angina, non-anginal pain, and typical angina. Histogram analysis shows a higher risk of heart disease in the age range of 50 to 55, where the development of coronary fatty streaks begins (Figure 12) [8-11].
Figures 13 and 14 show the impact of blood pressure and blood sugar on heart disease. It is inferred that people with diabetes and high blood pressure are more likely to develop heart disease (Figure 13 & 14).




K-means Clustering: The K-means clustering algorithm is chosen for its efficiency, simplicity, tendency to produce clusters of similar size, and scalability in handling the dataset. It partitions the observations so as to minimize the within-cluster sum of squares. The dataset comprises 209 observations with 7 variables (Figure 15).
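A minimal K-means sketch with scikit-learn, on random placeholder data that only matches the stated shape of 209 observations and 7 variables. The number of clusters (4, mirroring the four chest pain types), the standardization step, and `n_init=10` are assumptions; the source does not state these settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Random placeholder values with the same shape as the described dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(209, 7))

# Put all variables on a comparable scale before computing distances.
X_scaled = StandardScaler().fit_transform(X)

# n_clusters=4 is an assumption; n_init=10 restarts from 10 initializations
# and keeps the run with the lowest within-cluster sum of squares.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

print("within-cluster sum of squares:", kmeans.inertia_)
print("cluster sizes:", np.bincount(labels, minlength=4))
```

K-means converges to a local minimum of the within-cluster sum of squares, which is why several initializations are usually compared.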

Chest Pain Type: Asymptomatic
The plot shows Age vs. Max Heart Rate, broken down by disease status; colour indicates disease status. Screenshots of the clustering results are shown below (Table 5 & 6), (Figure 16 & 17) & (Tables 7-12).










References
- M Gudadhe, K Wankhade, S Dongre (2010) Decision support system for heart disease based on support vector machine and artificial neural network. 2010 Int Conf Comput Commun Technol (ICCCT-2010) pp. 741-745.
- K Thenmozhi, P Deepika (2014) Heart Disease Prediction Using Classification with Different Decision Tree Techniques. Int J Eng Res Gen 2(6): 6-11.
- PPR Patil, PSA Kinariwala (2017) Automated Diagnosis of Heart Disease using Data Mining Techniques. Int J Adv Res Ideas Innov 3(2): 560-567.
- SK Mohan, C Thirumalai, G Srivastava (2019) Effective Heart Disease Prediction Using Hybrid Machine Learning Techniques. IEEE Access 7.
- SP Bingulac (1994) On the Compatibility of Adaptive Controllers. Proc Fourth Ann Allerton Conf Circuits and Systems Theory Pp: 8-16.
- S Nikhar, AM Karandikar (2016) Prediction of Heart Disease Using Machine Learning Algorithms. International Journal of Advanced Engineering, Management and Science (IJAEMS) 2(6): 617-627.
- IS Jacobs, CP Bean (1963) Fine particles, thin films and exchange anisotropy. In Magnetism III, GT Rado, H Suhl Eds. New York: Academic Pp: 271-350.
- Aditi G, Gouthami K, Isha P, Kailas D (2018) Prediction of Heart Disease Using Machine Learning. Proceedings of the 2nd International conference on Electronics, Communication and Aerospace Technology (ICECA 2018).
- Abhay K, Ajay K, Karan S, Maninder P, Yogita H (2018) Heart Attack Prediction Using Deep Learning. International Research Journal of Engineering and Technology (IRJET) 5(4): 4420-4423.
- A Lakshmana Rao, Y Swathi, PSS Sundareswar (2019) Machine Learning Techniques for Heart Disease Prediction. International Journal of Scientific & Technology Research 8(11): 374-377.