Published: 2024-08-20
Optimization of K Value in KNN Algorithm for Spam and HAM Classification in SMS Texts
DOI: 10.35870/ijsecs.v4i2.2681
Ferryma Arba Apriansyah, Arief Hermawan, Donny Avianto
Abstract
Spam is the unsolicited, repetitive sending of messages to others via electronic devices. This activity, commonly known as spamming, is typically carried out by individuals referred to as spammers. SMS spam often originates from unknown sources and frequently contains advertisements, phishing attempts, scams, and even malware. Such messages are pervasive, reaching almost all mobile phone numbers, and they disrupt communication by delivering irrelevant content. Their persistence underscores the need for effective filtering mechanisms. This study investigates the application of the K-Nearest Neighbors (KNN) algorithm for classifying SMS messages as either spam or non-spam (ham). The findings show that KNN, when the value of K is optimized using various selection methods, can achieve an average accuracy of 99.16% in classifying SMS spam. This high accuracy indicates that KNN is a reliable method for spam detection.
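The abstract does not say which methods were used to choose K, so the following is only a minimal sketch of one common approach: score several candidate K values with leave-one-out validation and keep the best. The toy messages, the bag-of-words representation, the cosine distance, and all function names (`vectorize`, `knn_predict`, `loo_accuracy`, `best_k`) are assumptions made for illustration and are not taken from the study.

```python
import math
from collections import Counter

# Made-up toy messages for illustration only; this is NOT the study's dataset.
MESSAGES = [
    ("win a free prize now", "spam"),
    ("free cash prize claim now", "spam"),
    ("claim your free reward now", "spam"),
    ("urgent win cash reward", "spam"),
    ("are we meeting for lunch today", "ham"),
    ("see you at lunch today", "ham"),
    ("call me when you get home", "ham"),
    ("are you home for dinner today", "ham"),
]


def vectorize(text):
    """Turn a message into a bag-of-words count vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def knn_predict(train, query, k):
    """Majority vote among the k training vectors closest to the query."""
    neighbors = sorted(train, key=lambda item: cosine_distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy: classify each message using all the others."""
    vectors = [(vectorize(text), label) for text, label in data]
    correct = 0
    for i, (vec, label) in enumerate(vectors):
        train = vectors[:i] + vectors[i + 1:]
        if knn_predict(train, vec, k) == label:
            correct += 1
    return correct / len(vectors)


def best_k(data, candidates):
    """Score each candidate K and return (best K, all scores).

    Ties are broken in favour of the smaller K.
    """
    scores = {k: loo_accuracy(data, k) for k in candidates}
    best = max(scores, key=lambda k: (scores[k], -k))
    return best, scores


if __name__ == "__main__":
    # Odd candidates avoid tied votes in a two-class problem.
    k, scores = best_k(MESSAGES, [1, 3, 5])
    train = [(vectorize(text), label) for text, label in MESSAGES]
    print("scores per K:", scores)
    print("chosen K:", k)
    print(knn_predict(train, vectorize("free prize now"), k))
```

On a real SMS corpus the same loop would typically run over a wider range of K with k-fold cross-validation and a weighted representation such as TF-IDF; those choices are likewise assumptions here, not details confirmed by the abstract.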