Developing the high-risk drinking predictive model in Korea using the data mining technique
Journal of the Korean Data & Information Science Society 2017;28:1337-48
Published online November 30, 2017
© 2017 Korean Data & Information Science Society.

Il-Su Park1 · Jun-Tae Han2

1Department of Health Management, Uiduk University
2Department of Student Aid Policy Research, Korea Student Aid Foundation
Correspondence to: Jun-Tae Han
Team manager, Department of Student Aid Policy Research, Korea Student Aid Foundation, 125 Sinam-ro, dong-gu, Daegu, 41200, Korea. E-mail:
Received September 20, 2017; Revised October 20, 2017; Accepted November 2, 2017.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
In this paper, we develop a predictive model for high-risk drinking in Korea using cross-sectional data from the Korea Community Health Survey (2014). We apply three data mining techniques: logistic regression analysis, decision tree analysis, and neural network analysis. The logistic regression results showed that men in their forties had a high risk and that the risk was also high among office workers and sales workers. In particular, current smokers had a higher risk of high-risk drinking. Among the three models, neural network analysis and logistic regression performed best in terms of AUROC (area under the receiver operating characteristic curve). The predictive model developed in this study, together with the method for selecting the high-risk intensive drinking group, can serve as a basis for providing more effective health care services, such as hazardous drinking prevention education and the improvement of drinking programs.
Keywords : Decision tree, high-risk drinking, logistic regression, neural network
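The abstract compares models by AUROC. As a minimal sketch of what that comparison measures, the snippet below computes AUROC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels, scores, and the names `model_a` and `model_b` are hypothetical illustrations; they are not taken from the Korea Community Health Survey or from the paper's fitted models.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (1 = high-risk drinker, by assumption).
    scores: predicted risk scores, one per label; higher means riskier.
    Tied pairs contribute 0.5, matching the usual rank-based definition.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: eight respondents and two illustrative score sets.
labels = [1, 0, 1, 0, 0, 1, 0, 1]
model_scores = {
    "model_a": [0.9, 0.2, 0.7, 0.4, 0.1, 0.8, 0.3, 0.6],
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.2, 0.9, 0.3, 0.8],
}

# Score each model and list them from best to worst AUROC.
results = {name: auroc(labels, s) for name, s in model_scores.items()}
for name, value in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: AUROC = {value:.3f}")
```

The pairwise loop is quadratic in the sample size, so for survey-scale data a rank-based or library implementation would be the more practical choice; the pairwise form is used here only because it mirrors the definition directly.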