A study on the location of the observation which has the least effect on the t statistic
Journal of the Korean Data & Information Science Society 2019;30:1221-32
Published online November 30, 2019;  https://doi.org/10.7465/jkdi.2019.30.6.1221
© 2019 Korean Data and Information Science Society.

Sunyoung Park1 · Hyunseok Kang2 · Sora Kim3 · Honggie Kim4

1Daejeon Metropolitan Police Agency
2Daejeon High School
3,4Department of Information and Statistics, Chungnam National University
Correspondence to: Professor, Department of Information and Statistics, CNU, 99, Daehak-ro, E-mail: honggiekim@cnu.ac.kr

This research was fully supported by the 2018 CNU research fund. This paper is based on part of Sunyoung Park’s master's thesis.
Received August 6, 2019; Revised September 27, 2019; Accepted September 30, 2019.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
One of the features of big data is its high volume. Therefore, when we analyze big data, it is important to recognize stable data as well as to detect outliers. In this study, we seek the location of the observation that has the least effect on the t statistic, using the influence function, which was originally introduced to detect highly influential observations. We find the value of the observation at which the influence function of the t statistic is zero, given that the null hypothesis is true. We also carry out simulations with random samples from normal distributions to compute the true changes in the t statistic, which we use to justify the location with the least effect. The simulations show that the location of the observation with a zero influence function value coincides with the location of the observation with the least effect on the t statistic.
Keywords: Empirical influence function, influence function, outliers, sample influence function, t statistic.
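
The simulation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the sample size, the grid of candidate values, the random seed, and the function names `t_stat`, `t_change`, and `least_effect_location` are all assumptions made for illustration. The sketch draws one normal sample under the null hypothesis, adds one candidate observation at a time, and records the true change in the one-sample t statistic.

```python
import numpy as np

def t_stat(sample, mu0=0.0):
    """One-sample t statistic for H0: mean == mu0."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    return np.sqrt(n) * (sample.mean() - mu0) / sample.std(ddof=1)

def t_change(sample, x, mu0=0.0):
    """True change in the t statistic when observation x is added."""
    return t_stat(np.append(sample, x), mu0) - t_stat(sample, mu0)

def least_effect_location(sample, grid, mu0=0.0):
    """Grid value whose addition changes the t statistic the least."""
    changes = np.array([t_change(sample, x, mu0) for x in grid])
    return grid[np.argmin(np.abs(changes))], changes

# Assumed settings: n = 30, standard normal data, H0: mu = 0.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=30)
grid = np.linspace(-4.0, 4.0, 801)

x_star, changes = least_effect_location(sample, grid)
print("t statistic of the sample:", t_stat(sample))
print("grid location with the least effect:", x_star)
print("change in t at that location:", t_change(sample, x_star))
```

Repeating this over many samples and comparing `x_star` with the zero of the influence function would mirror the comparison the abstract describes, although the grid search here only approximates the least-effect location to the grid resolution.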