Implementation of the Mean Imputation and Single Center Imputation Chained Equation (SICE) Methods and Their Effect on Linear Regression Prediction Results for Numerical Data
Abstract
Data and information play an important role in every field of science, so data must be processed carefully through data mining. Patterns can be extracted from data with machine learning algorithms such as linear regression. However, extracting information becomes less effective when the data contain missing values. This research implements two imputation techniques, mean imputation and single center imputation chained equation (SICE), and examines their effect on the predictions of a linear regression algorithm. The data used are numerical. Measured by root mean squared error (RMSE), linear regression performed better with mean imputation than with SICE imputation.
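The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy values for `x1`, `x2`, and `y` are invented for illustration, all function names are hypothetical, and `chained_impute` is only a simplified, deterministic two-column chained-equation pass standing in for SICE (it does not combine multiple imputations into a single center value). The RMSE is computed on the same rows used for fitting, so the printed numbers say nothing about the study's results.

```python
import math
from statistics import fmean


def mean_impute(column):
    """Replace None entries with the mean of the observed entries."""
    fill = fmean([v for v in column if v is not None])
    return [fill if v is None else v for v in column]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(fmean([(a - p) ** 2 for a, p in zip(actual, predicted)]))


def regress_fill(target, other, missing):
    """Fit target ~ other on rows where target was observed, then
    overwrite only the originally missing target entries."""
    rows = [(o, t) for t, o, m in zip(target, other, missing) if not m]
    intercept, slope = fit_line([o for o, _ in rows], [t for _, t in rows])
    return [intercept + slope * o if m else t
            for t, o, m in zip(target, other, missing)]


def chained_impute(a, b, rounds=5):
    """Simplified chained-equation imputation for two numeric columns
    (an assumption for illustration, not the published SICE procedure)."""
    miss_a = [v is None for v in a]
    miss_b = [v is None for v in b]
    a, b = mean_impute(a), mean_impute(b)  # starting values
    for _ in range(rounds):
        a = regress_fill(a, b, miss_a)  # update a from the current b
        b = regress_fill(b, a, miss_b)  # update b from the updated a
    return a, b


# Hypothetical toy data: None marks a missing value.
x1 = [1.0, 2.0, None, 4.0, 5.0, None]
x2 = [1.9, 4.1, 6.0, None, 10.2, 11.8]
y = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1]

candidates = {"mean": mean_impute(x1), "chained": chained_impute(x1, x2)[0]}
for name, x in candidates.items():
    intercept, slope = fit_line(x, y)
    predictions = [intercept + slope * xi for xi in x]
    print(name, round(rmse(y, predictions), 3))
```

Which imputation gives the lower RMSE depends on the data and the missingness pattern, so a sketch like this is best read as a template for running the comparison rather than as evidence for either method.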
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.