
Klasifikasi Website Phishing Menggunakan Algoritma Random Forest dengan Teknik Random Oversampling / Classification of Phishing Websites Using Algorithms Random Forest with Random Oversampling Technique

YUSRINA, YUSRINA (2024) Klasifikasi Website Phishing Menggunakan Algoritma Random Forest dengan Teknik Random Oversampling / Classification of Phishing Websites Using Algorithms Random Forest with Random Oversampling Technique. Diploma thesis, UNIVERSITAS SULAWESI BARAT.

Text: SKRIPSI_YUSRINA_D0220367 (pdf.io).pdf (1MB)
Text: SKRIPSI_YUSRINA_D0220367.pdf (8MB, restricted to Repository staff only)

Abstract

Distinguishing phishing from non-phishing websites is a challenge in cybersecurity, particularly when the classes are imbalanced. This study evaluates a Random Forest model with and without oversampling, using two techniques: Random Oversampling and the Synthetic Minority Over-sampling Technique (SMOTE). Without oversampling, the model achieves an accuracy of 95.93%, precision of 93.65% (non-phishing) and 95.11% (phishing), recall of 97.03% (non-phishing) and 97.73% (phishing), and F1-scores of 95.31% and 96.41%, respectively. Random Oversampling raises accuracy to 96.16% and precision to 96.36% (non-phishing) and 96.00% (phishing), and improves the F1-scores. However, recall falls to 94.88% (non-phishing) and 97.17% (phishing), which the study reads as a sign of potential overfitting to the minority class. SMOTE yields an accuracy of 95.62% and a higher non-phishing precision of 96.89%, but recall falls to 94.41% (non-phishing) and 96.86% (phishing), and phishing precision falls to 94.37%; the resulting F1-scores are 95.63% (non-phishing) and 95.60% (phishing). The results indicate that Random Forest is already reliable without oversampling, and that oversampling techniques should be applied with care to keep classification performance balanced across classes.
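
The F1-scores quoted in the abstract can be related to the precision and recall figures through the standard definition of F1 as the harmonic mean of the two. The short Python sketch below is an illustration only, not code from the thesis: the function name `f1_score` and the dictionary `reported` are hypothetical, and the input numbers are copied from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) in percent, copied from the abstract.
# Illustrative container only; the thesis does not define this structure.
reported = {
    ("no oversampling", "non-phishing"): (93.65, 97.03, 95.31),
    ("no oversampling", "phishing"): (95.11, 97.73, 96.41),
    ("SMOTE", "non-phishing"): (96.89, 94.41, 95.63),
    ("SMOTE", "phishing"): (94.37, 96.86, 95.60),
}

for (setup, label), (p, r, f1_reported) in reported.items():
    f1 = f1_score(p, r)
    print(f"{setup:16s} {label:13s} computed F1 = {f1:.2f}%  reported = {f1_reported:.2f}%")
```

Under this definition the recomputed values should agree with the reported F1-scores to within about 0.01 percentage points, which is consistent with rounding of the published precision and recall figures. The abstract does not state the F1-scores for Random Oversampling, so they are left out of the comparison.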

Item Type: Thesis (Diploma)
Uncontrolled Keywords: Random Forest, Random Oversampling, SMOTE, class imbalance, phishing classification.
Subjects: FAKULTAS TEKNIK > Informatika
Divisions: Fakultas Teknik
Depositing User: Unnamed user with email aryatiunsulbar@gmail.com
Date Deposited: 09 May 2025 06:59
Last Modified: 09 May 2025 06:59
URI: https://repository.unsulbar.ac.id/id/eprint/1862
