Imputation method for missing data based on KNN and pattern consistency index in microarray data
Journal of the Korean Data & Information Science Society 2018;29:1179-87
Published online September 30, 2018
© 2018 Korean Data and Information Science Society.

Sunyoung Lee¹ · Dongjae Kim²

¹,²Department of Biomedicine · Health Science, The Catholic University of Korea
Correspondence to: Professor, Department of Biomedicine · Health Science, The Catholic University of Korea, 222, Banpo-daero, Seocho-gu, Seoul 137-701, Korea.
E-mail: djkim@catholic.ac.kr
Received August 23, 2018; Revised September 17, 2018; Accepted September 20, 2018.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
The KNN imputation method is widely used to impute missing values in time course gene expression data. It imputes a missing value using the k genes closest to the gene in which the missing value occurred. However, it has the inherent disadvantage that it may neglect the correlation between observation points. In this paper, we propose a new missing-value imputation method that applies the pattern consistency index proposed by Son and Baek to the KNN method. We also compare the performance of the established method and the proposed method through simulations on three yeast time course data sets.
Keywords: Imputation of missing values, k-nearest neighbors, microarray, pattern consistency index, time course gene expression data.
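
As an illustration of the baseline KNN imputation described in the abstract, the following is a minimal sketch and not the authors' implementation. The function name `knn_impute`, the Euclidean distance, the restriction of neighbours to genes with no missing values, and the unweighted average are all assumptions made for this example. The pattern consistency index of Son and Baek is not reproduced here, because the abstract does not define it.

```python
import math


def knn_impute(rows, k=3):
    """Return a copy of `rows` (genes x time points) with NaNs filled by KNN averaging.

    Hypothetical helper for illustration only: plain KNN imputation,
    without the pattern consistency index used in the paper.
    """
    filled = [list(r) for r in rows]
    # Assumption: only genes with no missing values may serve as neighbours.
    complete = [j for j, r in enumerate(rows)
                if not any(math.isnan(v) for v in r)]

    for i, row in enumerate(rows):
        missing = [t for t, v in enumerate(row) if math.isnan(v)]
        if not missing:
            continue
        observed = [t for t, v in enumerate(row) if not math.isnan(v)]

        # Euclidean distance over the time points observed in gene i.
        def dist(j):
            return math.sqrt(sum((rows[j][t] - row[t]) ** 2 for t in observed))

        nearest = sorted(complete, key=dist)[:k]
        if not nearest:
            continue  # no usable neighbour, leave the value missing

        # Fill each missing time point with the neighbours' unweighted mean.
        for t in missing:
            filled[i][t] = sum(rows[j][t] for j in nearest) / len(nearest)
    return filled


# Toy data: four genes measured at three time points, one missing value.
expr = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.1],
    [5.0, 6.0, 7.0],
    [1.0, 2.0, math.nan],
]
imputed = knn_impute(expr, k=2)
print(imputed[3])
```

In this toy example the two nearest genes to the last row are the first two rows, so the missing value is filled with the average of 3.0 and 3.1. Because each time point is treated as an independent coordinate of the distance, the sketch also shows the limitation the abstract points to: the ordering of, and correlation between, observation points plays no role.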