Comparison of false discovery rate procedures in microarray studies
Journal of the Korean Data & Information Science Society 2019;30:455-68
Published online March 31, 2019; https://doi.org/10.7465/jkdi.2019.30.2.455
© 2019 Korean Data and Information Science Society.

Joonsung Kang¹

¹Department of Information Statistics, Gangneung-Wonju National University
Correspondence to: Associate professor, Department of Information Statistics, Gangneung-Wonju National University, Jukheon-gil 7, Gangneung-si, Korea. E-mail: mkang@gwnu.ac.kr
Received January 11, 2019; Revised January 23, 2019; Accepted January 24, 2019.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
We consider the problem of identifying differentially expressed genes in microarray studies, which requires multiple testing procedures suited to high-dimensional, low-sample-size problems in nonstandard data setups. In particular, we need to account for unknown dependence structures among genes in microarray data. The traditional multiple testing error rate, the family-wise error rate (FWER), is too conservative a criterion for controlling type I error in these setups, whereas a less conservative error rate, the false discovery rate (FDR), has received much attention in research areas such as genomic data and fMRI data. We compare various FDR procedures on simulated data and real microarray data. For simulated data with dependence, we use a hidden Markov model (HMM) to generate test statistics. To assess power, we evaluate the FDR procedures using the false negative rate (FNR). By calculating 1-FNR for each FDR procedure, we determine which procedure is appropriate for identifying differentially expressed genes in microarray data. Numerical results show that the FDR procedure of Sun and Cai (2009) is more appropriate for controlling the FDR while minimizing the FNR in simulated data under various setups. It also controls the FDR in real microarray data.
Keywords: Family-wise error rate, false discovery rate, false negative rate, hidden Markov model, microarray data.
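
The kind of comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's simulation design. It assumes a two-state hidden Markov chain (null / non-null) with Gaussian test statistics, and it uses the Benjamini-Hochberg step-up rule only as a familiar stand-in for "an FDR procedure"; the Sun and Cai (2009) procedure is not implemented here. The transition probabilities `a00` and `a11`, the non-null mean `mu`, the level `alpha`, and all function names are hypothetical choices made for illustration. The FNR is taken under one common definition, the proportion of non-null cases among the non-rejected hypotheses.

```python
import math
import random


def simulate_hmm_zscores(m, a00=0.95, a11=0.80, mu=2.5, seed=0):
    """Simulate m z-scores whose null/non-null labels follow a two-state Markov chain.

    a00, a11: probabilities of staying in state 0 (null) and state 1 (non-null).
    mu: mean of the z-score under the non-null state (illustrative assumption).
    """
    rng = random.Random(seed)
    states, z = [], []
    s = 0  # chain starts in the null state; one transition precedes each draw
    for _ in range(m):
        stay = a00 if s == 0 else a11
        if rng.random() > stay:
            s = 1 - s
        states.append(s)
        z.append(rng.gauss(mu if s == 1 else 0.0, 1.0))
    return states, z


def two_sided_pvalues(z):
    """Two-sided p-values under N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2))."""
    return [math.erfc(abs(v) / math.sqrt(2.0)) for v in z]


def benjamini_hochberg(pvals, alpha=0.10):
    """Benjamini-Hochberg step-up rule; returns a list of rejection flags."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value falls below its threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


def fdp_and_fnp(states, reject):
    """Realized false discovery and false non-discovery proportions for one data set."""
    n_rej = sum(reject)
    n_acc = len(reject) - n_rej
    false_rej = sum(1 for s, r in zip(states, reject) if r and s == 0)
    false_acc = sum(1 for s, r in zip(states, reject) if not r and s == 1)
    return false_rej / max(n_rej, 1), false_acc / max(n_acc, 1)


if __name__ == "__main__":
    states, z = simulate_hmm_zscores(m=3000)
    reject = benjamini_hochberg(two_sided_pvalues(z), alpha=0.10)
    fdp, fnp = fdp_and_fnp(states, reject)
    print(f"FDP = {fdp:.3f}, 1 - FNP = {1.0 - fnp:.3f}")
```

Averaging the two proportions over many simulated data sets would give Monte Carlo estimates of the FDR and FNR for the procedure, and the same loop could be repeated for any other FDR procedure to compare them at a common level.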