Precipitation pattern classification model in Seoul
Journal of the Korean Data & Information Science Society 2019;30:1077-89
Published online September 30, 2019;  https://doi.org/10.7465/jkdi.2019.30.5.1077
© 2019 Korean Data and Information Science Society.

SangHun Cha1 · Kyeong Eun Lee2 · Gwangseob Kim3

1,2Department of Statistics, Kyungpook National University
3School of Architectural, Civil, Environment, and Energy Engineering, Kyungpook National University
Correspondence to: Associate professor, Department of Statistics, Kyungpook National University, 80 Daehakro, Bukgu, Daegu 41566, Korea. E-mail: artlee@knu.ac.kr


This work was supported by Korea Environment Industry & Technology Institute (KEITI) through Water Management Research Program, funded by Korea Ministry of Environment (MOE) (79606).
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
This study identified seasonal rainfall patterns in Seoul from hourly precipitation data for rainy days, and used the previous day's hourly precipitation together with average daily precipitation, temperature, wind speed, humidity, and dew point to classify these patterns. The seasonal precipitation patterns were identified by applying k-medoids clustering to the constructed data. This yielded 6 precipitation patterns in spring and summer, 5 in autumn, and 4 in winter. To classify these patterns, we built classification models using four methods (linear discriminant analysis, decision tree, bagging, and random forest) and evaluated them with 5-fold cross-validation. In spring, summer, and autumn, the classification rate was highest with random forest, whereas in winter it was highest with the decision tree.
Keywords : Bagging, cross-validation, decision tree, k-medoids clustering, linear discriminant analysis, precipitation patterns, random forest.
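
The clustering step named in the abstract can be illustrated with a minimal k-medoids sketch. The abstract does not say which k-medoids algorithm or dissimilarity measure was used, so the alternating assign/update scheme, the Euclidean distance, and the farthest-first initialisation below are all assumptions, and the toy profiles (shortened to four values per day) are invented for illustration rather than taken from the Seoul data.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles (assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids; returns (medoid_indices, labels)."""
    n = len(points)
    dist = [[euclidean(p, q) for q in points] for p in points]

    def assign(medoids):
        # Each point joins the cluster of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]

    # Deterministic farthest-first initialisation (an illustrative choice).
    medoids = [0]
    while len(medoids) < k:
        nxt = max(range(n), key=lambda i: min(dist[i][m] for m in medoids))
        medoids.append(nxt)

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Empty cluster (possible with duplicate points): keep old medoid.
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to the rest.
            best = min(members, key=lambda i: sum(dist[i][j] for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


# Hypothetical toy profiles: three "early-peaked" and three "late-peaked" days.
profiles = [
    [5, 1, 0, 0], [6, 2, 0, 0], [4, 1, 1, 0],
    [0, 0, 1, 5], [0, 0, 2, 6], [0, 1, 1, 4],
]
medoids, labels = k_medoids(profiles, k=2)
print(medoids, labels)
```

Unlike k-means, each cluster centre here is an observed day, so a medoid can be read directly as a representative precipitation pattern. The resulting labels would then serve as the class variable for the classifiers listed in the abstract.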