Published: 2024-01-01

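Application of the TF-IDF and Cosine Similarity Algorithms for Search Queries on a Tourist Destination Dataset (original title: Penerapan Algoritma TF-IDF dan Cosine Similarity untuk Query Pencarian Pada Dataset Destinasi Wisata)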

DOI: 10.35870/jtik.v8i1.1416

Abstract

This research aims to improve tourist destination search over a dataset of 50 documents by using search queries to retrieve relevant documents. The goal is to produce an accurate, ranked list of tourist destinations for a given query. To achieve this, the researchers used the TF-IDF and Cosine Similarity algorithms to measure similarity scores between each search query and the tourist destination documents in the dataset, and then ranked the destinations by score. The 50 text documents were first normalized through pre-processing stages, namely Case Folding, Stopword Removal, and Tokenization. The normalized documents were then weighted with TF-IDF, and the same TF-IDF weighting was applied to the search queries. Cosine Similarity between each document's TF-IDF vector and the query's TF-IDF vector gives a similarity score for each document with respect to the query. Testing was carried out on 5 different queries, yielding an average precision of 83%.
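
The pipeline described in the abstract (pre-processing, TF-IDF weighting, cosine similarity, ranking, and precision) can be illustrated with a minimal sketch. This is not the authors' implementation: the stopword list, the smoothed IDF formula, the three sample documents, and all function names (`preprocess`, `build_idf`, `tfidf_vector`, `cosine`, `rank`, `precision_at_k`) are assumptions made for illustration, and the paper's exact weighting variant may differ.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; the abstract does not say which list was used.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "to", "with", "for", "on"}

# Hypothetical sample documents standing in for the 50-document dataset.
DOCS = [
    "Kuta Beach in Bali with white sand and surfing",
    "Borobudur temple, a Buddhist monument in Central Java",
    "Mount Bromo volcano sunrise hiking in East Java",
]


def preprocess(text):
    """Case folding, tokenization, then stopword removal."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_idf(tokenized_docs):
    """Smoothed IDF, log((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n = len(tokenized_docs)
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    """Raw term count times IDF; terms absent from the corpus are dropped."""
    tf = Counter(tokens)
    return {t: c * idf[t] for t, c in tf.items() if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (doc_index, score) pairs sorted by descending similarity."""
    tokenized = [preprocess(d) for d in docs]
    idf = build_idf(tokenized)
    doc_vecs = [tfidf_vector(t, idf) for t in tokenized]
    query_vec = tfidf_vector(preprocess(query), idf)
    scores = [(i, cosine(query_vec, dv)) for i, dv in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    top = [i for i, _ in ranked[:k]]
    return sum(1 for i in top if i in relevant) / k


ranked = rank("beach surfing", DOCS)
```

Here `ranked` lists the documents in descending order of similarity to the query, and `precision_at_k(ranked, relevant, k)` computes precision for one query given a hand-labelled set of relevant document indices. Averaging that value over several queries gives a figure comparable in kind to the average precision reported above, although the relevance judgments and cutoff used in the paper are not stated in the abstract.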

Keywords

TF-IDF; Cosine Similarity; Search Queries; Datasets; Tourist

Peer Review Process

This article has undergone a double-blind peer review process to ensure quality and impartiality.
