A study on principal component analysis using penalty method
Journal of the Korean Data & Information Science Society 2017;28(4):721-731
Published online July 31, 2017
© 2017 Korean Data & Information Science Society.

Cheolyong Park1

1Major in Statistics, Keimyung University
Correspondence to: Cheolyong Park
Professor, Major in Statistics, Keimyung University, Daegu 42601, Korea. E-mail: cypark1@kmu.ac.kr
Received June 5, 2017; Revised July 8, 2017; Accepted July 11, 2017.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
In this study, principal component analysis methods using the Lasso penalty are introduced. There are two popular methods that apply the Lasso penalty to principal component analysis. The first method finds an optimal linear combination vector as the regression coefficient vector obtained by regressing each principal component on the original data matrix with a Lasso penalty (an elastic net penalty in general). The second method finds an optimal linear combination vector by minimizing, with a Lasso penalty, the residual matrix obtained from approximating the original data matrix by its singular value decomposition. We review these two Lasso-penalized principal component methods in detail and show that they are particularly advantageous for data sets with more variables than cases. The two methods are also compared in an application to a real data set using R; specifically, they are applied to the crime data in Ahamad (1967), which has more variables than cases.
Keywords : Elastic net, lasso, penalty, principal component analysis, regression model
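The following is a minimal R sketch of the two approaches described in the abstract, not the code used in the paper: the data matrix X is simulated rather than the Ahamad (1967) crime data, and the penalty levels, the number of components k, and the helper name sparse_loading are illustrative choices only. Method 1 regresses each ordinary principal component score on the original data matrix with an elastic net penalty via glmnet; Method 2 soft-thresholds the loading vector inside a rank-one regularized-SVD iteration.

## Illustrative sketch only; X is simulated and all tuning values are placeholders.
library(glmnet)

set.seed(1)
n <- 14; p <- 18                        # more variables than cases
X <- scale(matrix(rnorm(n * p), n, p))

## Method 1: regress each ordinary PC score on X with an elastic net
## penalty to obtain a sparse loading vector.
pc <- prcomp(X, center = FALSE)
k  <- 2
B  <- sapply(1:k, function(j) {
  fit <- glmnet(X, pc$x[, j], alpha = 0.9, intercept = FALSE)
  as.numeric(coef(fit, s = median(fit$lambda)))[-1]   # drop intercept row
})
B  <- apply(B, 2, function(b) if (sum(b^2) > 0) b / sqrt(sum(b^2)) else b)

## Method 2: rank-one regularized SVD; alternate between updating the
## left vector u and soft-thresholding the loading vector v.
soft <- function(z, lam) sign(z) * pmax(abs(z) - lam, 0)
sparse_loading <- function(X, lambda, iter = 100) {
  v <- svd(X)$v[, 1]
  for (i in 1:iter) {
    u <- drop(X %*% v)
    u <- u / sqrt(sum(u^2))
    v <- soft(drop(crossprod(X, u)), lambda)
    if (sum(v^2) == 0) break
    v <- v / sqrt(sum(v^2))
  }
  v
}
v1 <- sparse_loading(X, lambda = 0.5)

round(cbind(method1 = B[, 1], method2 = v1), 3)

In practice, the first approach roughly corresponds to the spca() function in the elasticnet package, and the second is close in spirit to penalized matrix decomposition routines such as SPC() in the PMA package; the penalty level is typically tuned by cross-validation or by the desired number of nonzero loadings.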

