Development of machine learning based prediction of particulate matter concentration in Seoul
Journal of the Korean Data & Information Science Society 2022;33:1095-111
Published online November 30, 2022;  https://doi.org/10.7465/jkdi.2022.33.6.1095
© 2022 Korean Data and Information Science Society.

Min Woo Kim1 · Hyeong-Se Jeong2

1Observation Research Department, National Institute of Meteorological Sciences
2Research Applications Department, National Institute of Meteorological Sciences
This work was funded by the Korea Meteorological Administration Research and Development Program under the grants “Developing Meteorological Observation Standards” (KMA2018-00221) and “Developing Technology for User-Specific Weather Information Production” (KMA2018-00622).
1 Researcher, Observation Research Department, National Institute of Meteorological Sciences, Seguipo 63568, Korea.
2 Senior researcher, Research Applications Department, National Institute of Meteorological Sciences, Seguipo 63568, Korea. E-mail: seya1019@naver.com
Received September 18, 2022; Revised October 31, 2022; Accepted November 7, 2022.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Particulate matter adversely affects society and the economy, so it is necessary to prepare for it. This study aims to predict the particulate matter concentration in Seoul in real time and provide it to the public. A deep neural network (DNN), random forest (RF), support vector regression (SVR), and long short-term memory (LSTM) were compared to select the optimal particulate matter prediction model. A sliding window method was also applied so that the models are retrained on the latest data in real time. Meteorological data and air pollutant data for Seoul from 2017 to 2018 were used. Grid search was used to select the optimal hyperparameters. The stability of the models was confirmed by repeating the experiment 100 times. Statistical analysis was then conducted to evaluate the performance of each model. The model finally selected was SVR, which showed excellent predictive performance. The prediction accuracy of the SVR model by particulate matter class was 87.14%.
Keywords: ASOS, machine learning, particulate matter, sliding window, SVR.
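The sliding window retraining and the class-based accuracy mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the made-up series, the PM10 class boundaries, and the mean-of-window stand-in model are all assumptions chosen for illustration. In practice, a regressor such as SVR would take the place of the stand-in model.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

# Hypothetical PM10 class boundaries in ug/m3 (upper bounds of the first three
# classes). These values are an assumption, not taken from the paper.
PM10_UPPER_BOUNDS: Tuple[float, ...] = (30.0, 80.0, 150.0)


def sliding_windows(
    n_samples: int, train_size: int, horizon: int = 1, step: int = 1
) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) pairs.

    The training window has a fixed length and moves forward by `step`,
    so each refit only sees the most recent `train_size` observations.
    """
    start = 0
    while start + train_size + horizon <= n_samples:
        split = start + train_size
        yield range(start, split), range(split, split + horizon)
        start += step


def pm_class(value: float, bounds: Sequence[float] = PM10_UPPER_BOUNDS) -> int:
    """Map a concentration to a class index in 0..len(bounds)."""
    for idx, upper in enumerate(bounds):
        if value <= upper:
            return idx
    return len(bounds)


def mean_model(history: List[float]) -> float:
    """Stand-in model (assumption): predict the mean of the training window."""
    return sum(history) / len(history)


def class_accuracy(
    series: Sequence[float],
    train_size: int,
    fit_predict: Callable[[List[float]], float] = mean_model,
) -> float:
    """Refit on each window and return the share of correct class predictions."""
    hits = 0
    total = 0
    for train_idx, test_idx in sliding_windows(len(series), train_size):
        prediction = fit_predict([series[i] for i in train_idx])
        for i in test_idx:
            hits += pm_class(prediction) == pm_class(series[i])
            total += 1
    return hits / total if total else float("nan")


# Made-up concentrations, for illustration only.
series = [20, 25, 28, 35, 60, 90, 85, 40, 22, 18]
accuracy = class_accuracy(series, train_size=3)
print(f"class accuracy: {accuracy:.2%}")
```

Keeping the training window at a fixed length means older observations drop out as new ones arrive, which is one common way to let a model follow recent conditions. Whether the study used a fixed-length or an expanding window is not stated in the abstract.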