On regression analysis of interval-valued data
Journal of the Korean Data & Information Science Society 2018;29:351-65
Published online March 31, 2018
© 2018 Korean Data and Information Science Society.

Soohyun Im¹ · Kee-Hoon Kang²

¹,²Department of Statistics, Hankuk University of Foreign Studies
Correspondence to: Professor, Department of Statistics, Hankuk University of Foreign Studies, Yongin 17035, Korea. E-mail: khkang@hufs.ac.kr
Received February 2, 2018; Revised February 25, 2018; Accepted February 27, 2018.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Interval-valued data, a type of symbolic data, record each observation as an interval rather than as a single value. In this paper, we introduce several regression approaches for interval-valued data, focusing on linear regression analysis. In addition, we propose using a truncated normal distribution instead of a uniform distribution in the resampling approach, on the assumption that an interval carries more information near its center point. The methods are compared through simulation and applied to real data on fine dust. As the sample size increases, the differences between the methods become small. Among the resampling methods, the proposed one performs better.
Keywords: linear regression model, resampling method, statistical inference, symbolic data, truncated normal distribution.
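
The resampling idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the rejection sampler, the standard deviation of one sixth of the interval width, and the averaging of coefficients over replicates are all assumptions made here for illustration. Each replicate draws one point from every interval, either uniformly or from a normal distribution centered at the midpoint and truncated to the interval, and then fits an ordinary least squares line to the drawn points.

```python
import random


def sample_uniform(lo, hi, rng):
    """Draw one point uniformly from the interval [lo, hi]."""
    return rng.uniform(lo, hi)


def sample_truncnorm(lo, hi, rng, sd_fraction=1 / 6):
    """Draw one point from a normal centered at the midpoint, truncated to [lo, hi].

    sd_fraction (standard deviation as a share of the interval width) is an
    assumed value, not one taken from the paper.
    """
    if hi <= lo:  # degenerate interval: nothing to sample
        return lo
    center = (lo + hi) / 2
    sd = (hi - lo) * sd_fraction
    while True:  # simple rejection sampling
        z = rng.gauss(center, sd)
        if lo <= z <= hi:
            return z


def fit_ols(xs, ys):
    """Return (intercept, slope) of the simple least squares line."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def resample_fit(x_int, y_int, sampler, n_rep=500, seed=0):
    """Average OLS coefficients over n_rep resampled point data sets.

    x_int and y_int are lists of (lower, upper) pairs; sampler is one of the
    two sampling functions above.
    """
    rng = random.Random(seed)
    b0_sum = b1_sum = 0.0
    for _ in range(n_rep):
        xs = [sampler(lo, hi, rng) for lo, hi in x_int]
        ys = [sampler(lo, hi, rng) for lo, hi in y_int]
        b0, b1 = fit_ols(xs, ys)
        b0_sum += b0
        b1_sum += b1
    return b0_sum / n_rep, b1_sum / n_rep
```

Calling `resample_fit(x_int, y_int, sample_truncnorm)` and `resample_fit(x_int, y_int, sample_uniform)` on the same intervals gives a rough side-by-side view of the two sampling choices. The truncated normal sampler concentrates draws near each interval's center, which is the assumption the abstract states.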