A study on the methodology to express the main topics of text in time series using text mining
Journal of the Korean Data & Information Science Society 2019;30:1259-76
Published online November 30, 2019; https://doi.org/10.7465/jkdi.2019.30.6.1259
© 2019 Korean Data and Information Science Society.

Hansol Bang¹ · Hoseok Moon²

¹,²Department of Defense Science, Korea National Defense University
Correspondence to: Professor, Department of Defense Science, Korea National Defense University, Nonsan 33021, Korea. E-mail: hsmoon0329@kndu.ac.kr
Received August 28, 2019; Revised September 11, 2019; Accepted September 11, 2019.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
This study proposes a methodology for expressing topics as a time series by applying topic modeling to texts that carry a time dimension, such as news media coverage. Existing topic-modeling studies that seek the main topics of a text have focused on a kind of static analysis that finds several themes of the entire text at once, without taking time into account. To overcome this limitation, this study represents the topics of time-stamped texts such as news media reports as a time series and quantitatively analyzes how interest in each topic changes over time. In particular, we propose a quantitative method that excludes the researcher's qualitative judgment when determining the number of topics and selecting representative words for each topic. We also propose a method that makes it easier to visually identify the priorities of topics over time. The proposed methodology was applied to news coverage of the North Korean nuclear issue to confirm its applicability.
Keywords: Map of topic priority, news media coverage, NK's nuclear weapons, text mining, topic modeling.
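
The idea of expressing topics as a time series and ranking them by period can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes each document has already been assigned a single dominant topic by some topic model, and it measures "interest" simply as the share of documents per period. The function names, the placeholder topic labels and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def topic_share_by_period(docs):
    """Turn (period, topic) pairs into {period: {topic: share}}.

    Assumption: one dominant topic per document, assigned beforehand
    by a topic model. Shares within each period sum to 1.
    """
    counts = defaultdict(Counter)
    for period, topic in docs:
        counts[period][topic] += 1

    shares = {}
    for period in sorted(counts):
        total = sum(counts[period].values())
        shares[period] = {t: n / total for t, n in counts[period].items()}
    return shares


def topic_priority(shares):
    """Rank topics within each period by share (1 = highest share).

    Ties are broken alphabetically so the ranking is deterministic.
    """
    ranks = {}
    for period, dist in shares.items():
        ordered = sorted(dist, key=lambda t: (-dist[t], t))
        ranks[period] = {t: i + 1 for i, t in enumerate(ordered)}
    return ranks


# Made-up illustration data: (publication period, dominant topic label).
docs = [
    ("2017-08", "topic_A"), ("2017-08", "topic_A"), ("2017-08", "topic_B"),
    ("2017-09", "topic_B"), ("2017-09", "topic_B"),
    ("2017-09", "topic_A"), ("2017-09", "topic_C"),
]

shares = topic_share_by_period(docs)
ranks = topic_priority(shares)

for period in shares:
    print(period, ranks[period])
```

Tabulating the ranks with periods as columns and topics as rows gives a simple grid in which a topic's rise or fall in priority can be read across time, which is the kind of view a topic priority map is meant to provide.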