Comparison of time series CV methods for deep learning
Journal of the Korean Data & Information Science Society 2024;35:397-410
Published online May 31, 2024;  https://doi.org/10.7465/jkdi.2024.35.3.397
© 2024 Korean Data and Information Science Society.

Youngwon Seo1 · Changryong Baek2

1,2Department of Statistics, Sungkyunkwan University, Seoul, Korea
This work was supported by the Basic Science Research Program from the National Research Foundation of Korea (NRF-2022R1F1A1066209).
1 Graduate student, Department of Statistics, Sungkyunkwan University, Seoul 03063, Korea.
2 Professor, Department of Statistics, Sungkyunkwan University, Seoul 03063, Korea. E-mail: crbaek@skku.edu
Received April 20, 2024; Revised May 13, 2024; Accepted May 14, 2024.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
In this paper, we compared cross-validation methods for time series prediction with deep learning models. These methods divide the data into blocks, preserving temporal dependence within each block, and then arrange or shuffle the blocks. Commonly used block division and shuffling schemes include TS (time series split), KF (KFold), and RW (rolling window). This paper assesses the predictive performance of deep learning models under these cross-validation techniques and identifies the strengths and weaknesses of each method. We evaluated predictive performance on international oil prices, Dow Jones realized volatility, electricity consumption, and PM2.5 data. The results showed that no single cross-validation method uniformly outperformed the others. However, the KF method performed well on stationary time series and was robust against anomalies, the TS method performed well on data with periodicity, and the RW method offered computational advantages owing to shorter training times.
Keywords : Attention, BLSTM, cross-validation, deep learning, LSTM, prediction, time-series data
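
The sketch below is a minimal, hypothetical illustration (not the authors' code) of how the three splitting schemes named in the abstract generate train/test indices: TS and KF use scikit-learn's TimeSeriesSplit and KFold, while RW is a hand-rolled rolling-window generator. All window sizes, split counts, and the series length are illustrative assumptions, not the settings used in the paper.

```python
# Illustrative comparison of the three cross-validation splitting schemes.
# TimeSeriesSplit and KFold come from scikit-learn; the rolling-window
# generator and all sizes below are assumptions chosen for demonstration.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit, KFold

X = np.arange(100).reshape(-1, 1)  # toy series of length 100

# TS (time series split): expanding training window, future block as test set
ts = TimeSeriesSplit(n_splits=5)
for train_idx, test_idx in ts.split(X):
    print("TS", train_idx[0], train_idx[-1], "->", test_idx[0], test_idx[-1])

# KF (KFold): blocks assigned to folds at random, ignoring temporal order
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(X):
    print("KF", len(train_idx), "train /", len(test_idx), "test")

# RW (rolling window): fixed-length training window slid forward in time
def rolling_window_split(n, train_size, test_size, step):
    start = 0
    while start + train_size + test_size <= n:
        train_idx = np.arange(start, start + train_size)
        test_idx = np.arange(start + train_size, start + train_size + test_size)
        yield train_idx, test_idx
        start += step

for train_idx, test_idx in rolling_window_split(len(X), train_size=40, test_size=10, step=10):
    print("RW", train_idx[0], train_idx[-1], "->", test_idx[0], test_idx[-1])
```

In this sketch, only RW keeps the training set a fixed length as it moves forward, which is consistent with the shorter training times noted in the abstract; TS lets the training window grow, and KF alone shuffles blocks across the whole series.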