Published: 2024-01-01
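Application of the TF-IDF and Cosine Similarity Algorithms for Search Queries on a Tourist Destination Dataset (Penerapan Algoritma TF-IDF dan Cosine Similarity untuk Query Pencarian Pada Dataset Destinasi Wisata)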
DOI: 10.35870/jtik.v8i1.1416
Rio Al Rasyid, Dewi Handayani Untari Ningsih
Abstract
This research aims to improve the search for tourist destinations in a dataset of 50 documents by using search queries to retrieve relevant documents and produce an accurate, ranked list of destinations for a given query. The TF-IDF and Cosine Similarity algorithms are used to retrieve and compare information by measuring similarity scores between a search query and the tourist destinations in the dataset; the destinations are then ranked by similarity score. The 50 text documents were normalized through pre-processing stages, namely Case Folding, Stopword Removal, and Tokenization. The normalized documents are then weighted with TF-IDF, and the same weighting is applied to the search query. Cosine Similarity between each document's TF-IDF vector and the query's TF-IDF vector gives a similarity score for each document. Testing on 5 different queries yielded an average precision of 83%.
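The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which TF-IDF variant was used, so the smoothed IDF formula below is an assumption, and the stopword list, the three toy documents, and all function names are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical English stopword list; the study's own list is not given.
STOPWORDS = {"a", "an", "and", "the", "in", "of", "on", "for", "to", "is", "near"}


def preprocess(text):
    """Case folding, tokenization, then stopword removal."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_idf(tokenized_docs):
    """Smoothed IDF, log((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n_docs = len(tokenized_docs)
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    """Raw term frequency times IDF; terms absent from the corpus are dropped."""
    tf = Counter(tokens)
    return {t: f * idf[t] for t, f in tf.items() if t in idf}


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def search(query, docs):
    """Return (document index, score) pairs sorted by descending similarity."""
    tokenized = [preprocess(d) for d in docs]
    idf = build_idf(tokenized)
    doc_vecs = [tfidf_vector(t, idf) for t in tokenized]
    q_vec = tfidf_vector(preprocess(query), idf)
    scores = [(i, cosine_similarity(q_vec, dv)) for i, dv in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for d in retrieved if d in relevant) / len(retrieved)


# Toy corpus, not taken from the study's dataset.
docs = [
    "Borobudur is a Buddhist temple in Central Java",
    "Kuta beach in Bali is popular for surfing",
    "Prambanan is a Hindu temple near Yogyakarta",
]
ranking = search("temple in Java", docs)
for index, score in ranking:
    print(f"{score:.3f}  {docs[index]}")
```

In this toy run the document matching both query terms ranks first, the one matching only "temple" ranks second, and the unrelated document scores 0. Precision for a query is then the share of returned documents judged relevant, which is the quantity the abstract averages over its 5 test queries.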
Keywords
TF-IDF; Cosine Similarity; Search Queries; Datasets; Tourist
Article Metadata
Peer Review Process
This article has undergone a double-blind peer review process to ensure quality and impartiality.
Article Information
This article has been peer-reviewed and published in the Jurnal JTIK (Jurnal Teknologi Informasi dan Komunikasi). The content is available under the terms of the Creative Commons Attribution 4.0 International License.
- Issue: Vol. 8 No. 1 (2024)
- Section: Computer & Communication Science
- Published: 2024-01-01
- License: CC BY 4.0
- Copyright: © 2024 Authors
- DOI: 10.35870/jtik.v8i1.1416
Rio Al Rasyid
Informatics Engineering Study Program, Faculty of Information Technology and Industry, Universitas Stikubank, Semarang City, Central Java Province, Indonesia

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).