A Bayesian skewed logit model for high-risk drinking data
Journal of the Korean Data & Information Science Society 2019;30:335-48
Published online March 31, 2019;  https://doi.org/10.7465/jkdi.2019.30.2.335
© 2019 Korean Data and Information Science Society.

Soo Bin Kim¹ · Beom Seuk Hwang²

¹,²Department of Applied Statistics, Chung-Ang University
Correspondence to: Assistant professor, Department of Applied Statistics, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, Korea. E-mail: bshwang@cau.ac.kr
This research was supported by the Graduate Fellowship in 2018.
Received February 6, 2019; Revised March 2, 2019; Accepted March 10, 2019.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
In data collected by the Korea Centers for Disease Control and Prevention (KCDC) on the causes and characteristics of high-risk drinking incidents, the high-risk drinking variable is an unbalanced binary outcome with an extremely skewed distribution. In this case, models with symmetric link functions, including the logit and probit models, may yield biased parameter estimates. To address this issue, we used a skewed logit model, one of the skewed link models, to analyze such unbalanced binary data using Bayesian inference. The skewed link model is a general model that includes both symmetric and asymmetric link function models, and it has the advantage of ensuring the propriety of the posterior distribution when an improper noninformative prior is used in Bayesian inference. Our analysis of the high-risk drinking data showed that the skewed logit model is more appropriate for explaining asymmetric binary data than the other models compared.
Keywords: asymmetric link function, Bayesian inference, high-risk drinking, MCMC, unbalanced binary data.
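
To make the contrast between symmetric and skewed links concrete, here is a minimal sketch, not the authors' implementation. It assumes a commonly used latent-variable construction for skewed links, w = η + δz + ε, with z half-normal, ε standard logistic, and y = 1 when w > 0; the abstract does not state the paper's exact specification, so this form, the skewness parameter `delta`, and all function names are illustrative assumptions. Under that assumption, P(y = 1 | η) = E_z[F(η + δz)], where F is the logistic CDF, and δ = 0 recovers the ordinary logit model.

```python
import math


def logistic_cdf(x):
    """Standard logistic CDF, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def half_normal_pdf(z):
    """Density of |N(0, 1)| for z >= 0 (assumed skewing distribution)."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * z * z)


def skewed_logit_prob(eta, delta, n_grid=4000, z_max=8.0):
    """P(y = 1 | eta) under the assumed model w = eta + delta*z + eps.

    Integrates logistic_cdf(eta + delta*z) against the half-normal
    density with a midpoint rule on [0, z_max]. `delta` is a
    hypothetical name for the skewness parameter.
    """
    h = z_max / n_grid
    total = 0.0
    for k in range(n_grid):
        z = (k + 0.5) * h
        total += logistic_cdf(eta + delta * z) * half_normal_pdf(z) * h
    return total


if __name__ == "__main__":
    for eta in (-2.0, -1.0, 0.0, 1.0, 2.0):
        sym = logistic_cdf(eta)
        skew = skewed_logit_prob(eta, delta=1.0)
        print(f"eta={eta:+.1f}  logit={sym:.4f}  skewed(delta=1)={skew:.4f}")
```

A symmetric link satisfies p(η) + p(−η) = 1. In this sketch that identity holds when `delta` is 0 and fails otherwise, which is the kind of asymmetry a skewed link can absorb when one outcome class is rare.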