Relationship between GHG emission by industry field using K-means clustering and mean temperature change in Korea
Journal of the Korean Data & Information Science Society 2023;34:9-22
Published online January 31, 2023
© 2023 Korean Data and Information Science Society.

Youn Su Kim1 · Kwang Yoon Song2 · In Hong Chang3

1 Department of Computer Science and Statistics, Graduate School, Chosun University
2,3 Department of Computer Science and Statistics, Chosun University
Correspondence to: This study was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2019S1A6A3A01059888).
1 Ph.D. candidate, Department of Computer Science and Statistics, Graduate School, Chosun University, Gwangju 61452, Korea.
2 Research professor, Department of Computer Science and Statistics, Chosun University, Gwangju 61452, Korea.
3 Professor, Department of Computer Science and Statistics, Chosun University, Gwangju 61452, Korea. E-mail:
Received December 12, 2022; Revised December 27, 2022; Accepted December 28, 2022.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
The causes of global warming are divided into natural causes and human-driven causes. Recent global warming has been attributed to human-driven causes, in particular human development. Because the increase in greenhouse gas (GHG) contributes to global warming, research on GHG is being conducted in various industry fields. Many previous studies have developed models of GHG emission, examined its relationship with the mean temperature, or studied GHG emission in specific industry fields. In this study, we apply K-means clustering to the GHG generated in various industry fields, classify the fields into groups with statistically similar characteristics, and determine which industry fields affect the mean temperature change within the CO2, CH4, and N2O clusters. In addition, a comparative analysis shows that the results obtained from the entire data differ from those obtained from the data divided into groups. The analysis indicates that CO2, which has the highest GHG emission, should be managed first, followed by the manufacturing and construction, energy, household, and agriculture-forest-fishing fields, which affect the mean temperature change.
Keywords: global warming, greenhouse gas, industry field, K-means clustering.
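
The clustering step named in the abstract can be illustrated with a minimal sketch of K-means (Lloyd's algorithm) in plain Python. This is not the authors' implementation: the field names field_A to field_F, the three emission columns, the choice of k = 2, and every number are hypothetical values invented for illustration, not data from the study.

```python
import random

# Hypothetical emission profiles (CO2, CH4, N2O) per industry field.
# All names and numbers are invented for this sketch; they are NOT study data.
emissions = {
    "field_A": (9.0, 1.0, 0.5),
    "field_B": (8.5, 1.2, 0.6),
    "field_C": (8.8, 0.9, 0.4),
    "field_D": (2.0, 0.8, 0.3),
    "field_E": (1.5, 1.1, 0.5),
    "field_F": (1.8, 0.7, 0.2),
}


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as initial centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(pt, centroids[c]))
            for pt in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [pt for pt, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_labels == labels and new_centroids == centroids:
            break  # nothing changed, so the algorithm has converged
        labels, centroids = new_labels, new_centroids
    return centroids, labels


centroids, labels = kmeans(list(emissions.values()), k=2)

# Collect field names by cluster label.
groups = {}
for name, label in zip(emissions, labels):
    groups.setdefault(label, []).append(name)
print(groups)
```

In a real analysis the number of clusters would have to be chosen and justified, and the emission variables would usually be put on a comparable scale before clustering; neither choice is specified in the abstract, so both are left out of this sketch.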