Comparative Analysis of Distance Measures in the K-Nearest Neighbor Algorithm Using Breast Cancer and Heart Disease Data

Authors

  • Herdiesel Santoso STMIK El Rahma, Indonesia
  • Linda Pratiwi STMIK El Rahma, Indonesia

DOI:

https://doi.org/10.30989/ijds.v1i2.1200

Keywords:

Breast Cancer, Data Mining, Heart Disease, K-Nearest Neighbor

Abstract

Breast Cancer is a cancer that develops in the breast. It occurs most often in women, and a characteristic sign is the appearance of an unusual lump in the breast area. Heart Disease is a non-communicable disease (NCD; PTM in Indonesian) with a fairly high mortality rate. It is associated with several risk factors, including smoking, an unhealthy lifestyle, high cholesterol, hypertension, and diabetes.

Based on these facts, an appropriate algorithm is needed to classify Breast Cancer and Heart Disease cases, as part of the effort to prevent rising mortality from both diseases. The algorithm used in this study is K-Nearest Neighbor with three distance measures: Euclidean distance, Manhattan distance, and Minkowski distance.
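As an illustration of the three distance measures named above, here is a minimal sketch in plain Python. The function names and the sample vectors are hypothetical and are not taken from the paper. Note that Minkowski distance of order p reduces to Manhattan distance at p = 1 and to Euclidean distance at p = 2; the abstract does not state which order p the authors used.

```python
def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (Minkowski with p = 1)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Square root of the summed squared differences (Minkowski with p = 2)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def minkowski_distance(a, b, p):
    """General Minkowski distance of order p between equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


# Hypothetical feature vectors, chosen only for illustration.
a = [1.0, 2.0, 3.0]
b = [4.0, 6.0, 3.0]

print(manhattan_distance(a, b))
print(euclidean_distance(a, b))
print(minkowski_distance(a, b, 3))
```

Because the three measures differ only in the exponent, a study comparing them is in effect comparing different values of p in the same formula.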

The results are as follows. The Euclidean distance method achieved an accuracy of 80.88% on the Breast Cancer data at K = 11 and 78.69% on the Heart Disease data at K = 11. The Manhattan distance method achieved an accuracy of 89.71% on the Breast Cancer data at K = 11 and 78.69% on the Heart Disease data at K = 20. The Minkowski distance method achieved an accuracy of 98.53% on the Breast Cancer data at K = 11 and 79.41% on the Heart Disease data at K = 11. In this study, the Minkowski distance method therefore gave the highest accuracy of the three on both datasets.
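The kind of comparison reported above, accuracy per distance measure at a given K, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the toy data, the function names, and the choice of p = 3 for the Minkowski case are all hypothetical, and the paper's datasets and preprocessing are not reproduced here.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p (p=1 Manhattan, p=2 Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_X, train_y, query, k, p):
    """Majority vote among the k training points nearest to the query."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: minkowski(pair[0], query, p),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k, p):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k, p) == y
        for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)


# Hypothetical toy data: two well-separated classes in two dimensions.
train_X = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [6.0, 6.0], [7.0, 6.5], [6.5, 7.0]]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [[1.2, 1.4], [6.8, 6.2]]
test_y = [0, 1]

# p = 3 is an assumed order for the Minkowski case; the abstract gives none.
for name, p in [("Manhattan", 1), ("Euclidean", 2), ("Minkowski (p=3)", 3)]:
    print(name, accuracy(train_X, train_y, test_X, test_y, k=3, p=p))
```

On real data, the same loop would typically be repeated over a range of K values, with features scaled beforehand, since all three measures are sensitive to differences in feature magnitude.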

Published

2023-11-29
