Prediction of food poisoning occurrence by using typical and social media informations
Journal of the Korean Data & Information Science Society 2018;29:1491-503
Published online November 30, 2018
© 2018 Korean Data and Information Science Society.

Hyeonjun Lyuk1 · Eunji Hwang2 · Jonghwa Na3

1Korea Health Industry Policy Development Institute ·
2,3Department of Information and Statistics, Chungbuk National University
Correspondence to: Professor, Department of Information and Statistics, Chungbuk National University, Chungbuk 28644, Korea. E-mail:
Received October 23, 2018; Revised November 15, 2018; Accepted November 19, 2018.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Predicting outbreaks of food-borne illness has long been a major concern, and predictive models have been studied with a variety of statistical methods both domestically and internationally. Most of these studies use regression methods, including the generalized additive model, and build their models mainly on meteorological variables that affect food poisoning. In this paper, we construct regression models for the number of food poisoning cases provided by the Korea Food and Drug Administration. As predictor variables, we use both typical data, such as weather and environmental information, and atypical data, such as the SNS buzz volume of food-poisoning-related keywords. Separate models are built for bacterial food poisoning, which occurs mainly in summer, and viral food poisoning, which occurs mainly in winter. We consider several regression models, construct daily prediction models for 16 major provinces nationwide, and evaluate them. The zero-inflated negative binomial regression model is found to be the most suitable for bacterial food poisoning, and the negative binomial regression model for viral food poisoning.
Keywords: atypical data, food poisoning, negative binomial regression, Poisson regression, zero-inflated model.
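
To make the two selected model families concrete, the following minimal sketch evaluates the negative binomial and zero-inflated negative binomial probability mass functions. It assumes the common NB2 parameterization (mean mu, dispersion alpha, variance mu + alpha * mu^2) and a constant zero-inflation probability pi; the abstract does not state which parameterization the authors used, and the function names and numeric values here are purely illustrative, not estimates from the paper.

```python
import math


def nb_pmf(y, mu, alpha):
    """Negative binomial (NB2) pmf with mean mu > 0 and dispersion alpha > 0.

    Assumed parameterization: variance = mu + alpha * mu**2.
    """
    r = 1.0 / alpha  # "size" parameter of the negative binomial
    log_p = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, alpha, pi):
    """Zero-inflated NB pmf.

    With probability pi the count is a structural zero; otherwise it is
    drawn from the NB2 distribution with parameters (mu, alpha).
    """
    base = (1.0 - pi) * nb_pmf(y, mu, alpha)
    return pi + base if y == 0 else base


if __name__ == "__main__":
    # Illustrative values only (hypothetical, not taken from the paper).
    mu, alpha, pi = 0.8, 1.5, 0.6
    for y in range(4):
        print(y, round(nb_pmf(y, mu, alpha), 4), round(zinb_pmf(y, mu, alpha, pi), 4))
```

Under these assumptions, the zero-inflated variant places extra mass at zero, which is why it can suit daily regional counts in which most days record no cases. In a regression setting, mu (and optionally pi) would be linked to the predictor variables rather than held constant.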