Comparative Analysis of Distance Measures for the K-Nearest Neighbor Algorithm Using Breast Cancer and Heart Disease Data
DOI:
https://doi.org/10.30989/ijds.v1i2.1200
Keywords:
Breast Cancer, Data Mining, Heart Disease, K-Nearest Neighbor
Abstract
Breast Cancer is a cancer that develops in the breast area. It most often affects women, and a characteristic sign is the appearance of unusual lumps in the breast. Heart Disease is a non-communicable disease (NCD; Indonesian: PTM, Penyakit Tidak Menular) with a fairly high mortality rate. Its risk factors include smoking, an unhealthy lifestyle, high cholesterol, hypertension, and diabetes.
Given these facts, an appropriate algorithm is needed to classify Breast Cancer and Heart Disease, as part of an effort to prevent rising mortality from both conditions. This study uses the K-Nearest Neighbor algorithm with three distance measures: Euclidean distance, Manhattan distance, and Minkowski distance.
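As an illustration of the approach described above, the following is a minimal sketch of the three distance measures and a majority-vote K-Nearest Neighbor classifier. It is not the authors' implementation. The function names, the toy data, and the Minkowski order p = 3 are assumptions made for this example; the abstract does not state which order p was used. By definition, Minkowski distance with p = 1 is the Manhattan distance and with p = 2 is the Euclidean distance.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan(a, b):
    """Manhattan distance: the Minkowski distance with p = 1."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Euclidean distance: the Minkowski distance with p = 2."""
    return minkowski(a, b, 2)


def knn_predict(train_X, train_y, query, k, distance):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data (not taken from the study's datasets).
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
           (6.0, 6.0), (7.0, 5.5), (6.5, 7.0)]
train_y = ["A", "A", "A", "B", "B", "B"]
query = (2.0, 2.0)

measures = {
    "euclidean": euclidean,
    "manhattan": manhattan,
    # p = 3 is an assumed order, chosen only for illustration.
    "minkowski (p=3)": lambda a, b: minkowski(a, b, 3),
}
for name, distance in measures.items():
    print(name, knn_predict(train_X, train_y, query, k=3, distance=distance))
```

Passing the distance as a function keeps the classifier itself unchanged when the measure is swapped, which is the comparison the study sets up.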
The Euclidean distance method achieved an accuracy of 80.88% on the Breast Cancer data at K = 11 and 78.69% on the Heart Disease data at K = 11. The Manhattan distance method achieved an accuracy of 89.71% on the Breast Cancer data at K = 11 and 78.69% on the Heart Disease data at K = 20. The Minkowski distance method achieved an accuracy of 98.53% on the Breast Cancer data at K = 11 and 79.41% on the Heart Disease data at K = 11. These results indicate that the Minkowski distance method performed better than the Euclidean distance and Manhattan distance methods on both datasets.
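A comparison of this kind, reporting the best accuracy and the K at which it occurs for each distance measure, could be organized as in the sketch below. This is a hedged illustration only: the helper names (`accuracy`, `best_k`), the candidate K values, the order p = 3, and the toy train/test split are all assumptions, and the code does not reproduce the study's data or figures.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p (p=1 Manhattan, p=2 Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_X, train_y, query, k, p):
    """Majority vote among the k nearest training points under order p."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: minkowski(pair[0], query, p))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k, p):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(knn_predict(train_X, train_y, x, k, p) == y
                  for x, y in zip(test_X, test_y))
    return correct / len(test_y)


def best_k(train_X, train_y, test_X, test_y, ks, p):
    """Return (k, accuracy) for the candidate k with the highest accuracy."""
    scores = {k: accuracy(train_X, train_y, test_X, test_y, k, p) for k in ks}
    k_star = max(scores, key=scores.get)
    return k_star, scores[k_star]


# Hypothetical, hand-made split (not the study's data).
train_X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
test_X = [(0.5, 0.5), (5.5, 5.5), (1, 2), (6, 4)]
test_y = [0, 1, 0, 1]

# Orders to compare; p = 3 stands in for a generic Minkowski order.
for name, p in [("manhattan", 1), ("euclidean", 2), ("minkowski p=3", 3)]:
    k_star, acc = best_k(train_X, train_y, test_X, test_y, ks=(1, 3, 5), p=p)
    print(f"{name}: best K = {k_star}, accuracy = {acc:.2%}")
```

Odd candidate values of K are used here to avoid tied votes in the two-class toy data; the K values reported in the abstract come from the authors' own experiments.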
License
Indonesian Journal On Data Science allows readers to read, download, copy, distribute, print, search, or link to the full texts of its articles and to use them for any other lawful purpose. The journal allows the author(s) to hold the copyright and to retain publishing rights without restrictions.
- Authors are allowed to archive their submitted articles in an open access repository
- Authors are allowed to archive the final published article in an open access repository with an acknowledgment of its initial publication in this journal
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.