https://ejournal.unjaya.ac.id/index.php/ijds/issue/feed INDONESIAN JOURNAL ON DATA SCIENCE 2024-10-30T15:28:52+07:00 Ulfi Saidata Aesyi ulfiaesyi@gmail.com Open Journal Systems <p><strong>Indonesian Journal of Data Science (IJDS) </strong><span class="Y2IQFc" lang="en">is a scientific journal that publishes research results in the field of data science. The scope of the journal includes:</span></p> <p><br>1. Big Data<br>2. Machine Learning<br>3. Data Mining<br>4. Deep Learning<br>5. Artificial Intelligence</p> https://ejournal.unjaya.ac.id/index.php/ijds/article/view/1302 Tree-based Machine Learning Ensembles and Feature Importance Approach for the Identification of Intrusions in UNR-IDD Dataset 2024-10-30T15:28:52+07:00 Akinyemi OYELAKIN moyelakin80@gmail.com <p>Detecting intrusions in network data with machine learning techniques has gained great attention over the past decades. One of the key problems in the network security domain is the availability of representative datasets for testing and evaluation purposes. Despite several efforts by researchers to release datasets that can be used for benchmarking attack detection models, some of the released datasets still suffer from one limitation or another. Thus, researchers at the University of Nevada released the UNR-IDD dataset, which was argued to be free from some of the limitations of past datasets. This study proposes tree-based ensemble approaches for building binary intrusion identification models from the UNR-IDD dataset. Decision Tree algorithms are used as base classifiers in the Extra Trees, Random Forest and AdaBoost-based intrusion detection models. The experimental results indicated that the three ensembles performed excellently when feature selection was used, compared with when all features were applied. For instance, the Extra Trees model achieved an accuracy of 0.97, precision of 0.98, recall of 0.98 and F1-score of 0.98. 
Similarly, the Random Forest model achieved an accuracy of 0.98, precision of 0.98, recall of 0.99 and F1-score of 0.98. The AdaBoost-based model had an accuracy of 0.96, precision of 0.96, recall of 0.99 and F1-score of 0.98. It was deduced that the Random Forest intrusion classification model achieved slightly better overall results than the other models built. It is concluded that the three homogeneous ensemble models achieved very promising results when feature importance was used as the attribute selection method.</p> 2024-05-29T09:23:51+07:00 Copyright (c) 2024 INDONESIAN JOURNAL ON DATA SCIENCE https://ejournal.unjaya.ac.id/index.php/ijds/article/view/1375 ANALISIS TRANSFER DATA PADA JARINGAN TERDAMPAK ARP SPOOFING MENGGUNAKAN METODE ARP POISONING DAN STATISTIK DESKRIPTIF 2024-10-30T15:28:36+07:00 sudaryanto sudaryanto@itda.ac.id Dwi Nugraheny henynug@gmail.com <table> <tbody> <tr> <td> <p><em>Computer network security issues are very important and must be considered in the development of computer networks. Networks connected to network devices are usually vulnerable to hacking. Hacking is an activity that allows a person or group to change or take data for personal gain. The aim of this research is to carry out testing and analysis to determine the condition and measure the level of security of the ITDA Yogyakarta intra-campus information system and computer network. It also describes security gaps and measures the level of security that needs immediate repair, to help correct failures in maintaining the security of ITDA Yogyakarta intra-campus information systems and networks. This research uses descriptive statistics with 20 PC units as samples. There were four tests in this study, with a total success of 16 out of 20 samples. 
From the results of ARP spoofing on the local network, it can be concluded that once an attacker infiltrates the local network using the ARP spoofing method, the target's traffic is redirected to the attacker's device. This allows attackers to monitor and read the contents of data traffic on the local network. Changing the attacker's MAC address is necessary because, if the MAC is not replaced, network traffic will not be redirected to the attacker's device.</em></p> </td> </tr> </tbody> </table> 2024-07-09T17:39:33+07:00 Copyright (c) 2024 INDONESIAN JOURNAL ON DATA SCIENCE https://ejournal.unjaya.ac.id/index.php/ijds/article/view/1345 Metode Latent Dirichlet Allocation Untuk Menentukan Topik Pada Review Drama Korea 2024-10-30T15:28:20+07:00 Alfun Roehatul Jannah alfunjannah25@gmail.com Ria Kristi riakristibasri@gmail.com Muhammad Habibi muhammadhabibi17@gmail.com <p>The Hallyu Wave, the spread of South Korean culture and popular media, has grown rapidly over the past two decades. In addition to entertainment industries such as K-pop and K-drama, this phenomenon has also extended into the food and K-beauty sectors. Korean dramas, as the core of Hallyu, have become a global phenomenon with a continuously expanding fan base worldwide. A global survey in 2022 indicated that 36 percent of respondents in 26 countries considered Korean dramas very popular in their respective countries. In Indonesia, Korean films and dramas remain favorites, with 72 percent of streaming audiences choosing them on OTT services throughout 2022. Viu dominates as the most popular Korean drama streaming platform with 57 percent usage, followed by Netflix, Telegram, and WeTV. This research analyzes Korean drama review data from 2015 to 2023 using the Latent Dirichlet Allocation (LDA) method. The goal is to provide a deep understanding of critical aspects such as acting, storyline, and cinematography. 
With LDA, this research aims to identify topics related to these elements, offering specific insights into audience preferences. Topic coherence was used to check topic consistency across 20 candidate topics, and 10 topics emerged as ideal. From the topic coherence results for these 20 topics, it can be concluded that the coherence score at 10 topics is 0.527, which gives ideal results for topic modeling in accordance with the rules.</p> 2024-08-07T14:12:31+07:00 Copyright (c) 2024 INDONESIAN JOURNAL ON DATA SCIENCE https://ejournal.unjaya.ac.id/index.php/ijds/article/view/1346 ANALISIS PROYEKSI KEBUTUHAN TENAGA KERJA BERDASARKAN SKILLS YANG DIBUTUHKAN MENGGUNAKAN ALGORITMA NAIVE BAYES CLASSIFIER 2024-10-30T15:28:03+07:00 Nur Azizah Firdausa azizahfirdaus88@gmail.com Ribka Rifanny Br Girsang ribkarifanny2002@gmail.com Dela Oktaviana laadelaa102@gmail.com Astr Wahyuningsiam astriwahyu29@gmail.com Muhammad Habibi muhammadhabibi17@gmail.com <p><strong><em>In August 2023, Indonesia had 7.86 million unemployed people, although the unemployment percentage had decreased from the previous year. The unemployed are categorized into four groups: those who are looking for work, those trying to set up a business, those having trouble landing a job, and those who have a job but have not yet started. The COVID-19 pandemic shifted the work paradigm toward remote work, but the need for job information remains key. Labor demand projections provide long-term insights into promising sectors and fields, guiding job seekers to develop skills according to labor market trends. This research was conducted using Naive Bayes classification, a text classification method that relies on the likelihood of keywords to compare training and testing documents. 
This classification method is expected to help reduce unemployment rates and align individual skills with industry needs, and to inform education and training policies that support smart career decisions in the digital era.</em></strong></p> 2024-08-09T09:22:48+07:00 Copyright (c) 2024 INDONESIAN JOURNAL ON DATA SCIENCE https://ejournal.unjaya.ac.id/index.php/ijds/article/view/1318 Pemetaan Opini Publik Menggunakan Data Mining: Studi Kasus Naturalisasi Pemain Sepak Bola dengan K-Means dan Naive Bayes Classifier 2024-10-30T15:27:47+07:00 Tegar Agustian agustiantegarr@gmail.com Emilia Fresia Nandela emeliafresianandela@gmail.com Stani A. Sinay rstapat@gmail.com Muhammad Habibi muhammadhabibi17@gmail.com <p>Naturalization is a process by which foreign citizens become legally recognized Indonesian citizens (Warga Negara Indonesia, WNI). The Indonesian national team (Timnas Indonesia) currently has several naturalized players. Some groups welcome their presence, seeing it as a strategic step to improve the team's quality and competitiveness. Others are skeptical and doubt that support for local players will be sustained. Data taken from 3584 YouTube comments through the YouTube Data API reflect a diversity of opinion that can give a deeper picture of the dynamics of public views. This research is important for understanding public views on the naturalization of national team football players. Using data mining techniques, chiefly K-Means Clustering and the Naive Bayes Classifier, this research provides insight into groups of people with similar or differing perspectives on the issue. The results of the K-Means Clustering process are used as initial labels to train the Naive Bayes Classifier model. Model performance was evaluated using a confusion matrix, which yielded an accuracy of 93.17% and an error rate of 6.83%. 
This process was carried out on the dataset of YouTube comments that had been labeled through K-Means Clustering. Classification with the Naive Bayes method shows that 3328 comments agree with the naturalization of players and 256 comments disagree.</p> 2024-08-10T00:00:00+07:00 Copyright (c) 2024 INDONESIAN JOURNAL ON DATA SCIENCE