A study on variability in leading causes of death using geographically weighted principal component analysis
Journal of the Korean Data & Information Science Society 2024;35:23-32
Published online January 31, 2024; https://doi.org/10.7465/jkdi.2024.35.1.23
© 2024 Korean Data and Information Science Society.

Myungjin Kim1

1Department of Statistics, Kyungpook National University
Correspondence to: Myungjin Kim, Assistant professor, Department of Statistics, Kyungpook National University, Daegu 41566, Korea. E-mail: myungjin_kim@knu.ac.kr
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (RS-2023-00248103).
Received December 20, 2023; Revised January 8, 2024; Accepted January 9, 2024.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
High-dimensional data are common in many fields but are difficult to handle because of the curse of dimensionality. Dimensionality reduction methods such as principal component analysis (PCA) are widely used to address this issue. PCA generates new, uncorrelated features that explain the variation in the data. However, PCA summarizes the global covariance structure of the data, so it yields the same principal components for all regions. This limitation becomes apparent when the covariance structure varies by region, because PCA then fails to capture regional characteristics. We employ geographically weighted principal component analysis, which accounts for spatial heterogeneity and thereby provides a better understanding of data variation and identifies the important variables in the principal components specific to each region. In an application to mortality data, we find that feature patterns differ notably between eastern and western areas, with the division falling along Minnesota, Iowa, Missouri, Arkansas, and Louisiana.
Keywords: component loading, kernel function, total variance, winning variable.
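
To illustrate the idea summarized in the abstract, the following is a minimal sketch of geographically weighted PCA, not the implementation used in this study. It assumes a Gaussian kernel with a fixed bandwidth and takes the winning variable at each location to be the variable with the largest absolute loading on the first local component. The function names `gaussian_kernel` and `gwpca`, the `bandwidth` parameter, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def gaussian_kernel(dist, bandwidth):
    # Assumed kernel: weights decay smoothly with distance from the target location.
    return np.exp(-0.5 * (dist / bandwidth) ** 2)

def gwpca(X, coords, bandwidth, n_components=2):
    """Local PCA at every location using a kernel-weighted covariance matrix."""
    n, p = X.shape
    loadings = np.empty((n, p, n_components))
    ptv = np.empty((n, n_components))  # proportion of local total variance
    for i in range(n):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        w = gaussian_kernel(dist, bandwidth)
        w = w / w.sum()
        local_mean = w @ X
        Xc = X - local_mean
        local_cov = (Xc * w[:, None]).T @ Xc
        evals, evecs = np.linalg.eigh(local_cov)  # eigenvalues in ascending order
        order = np.argsort(evals)[::-1]
        evals, evecs = evals[order], evecs[:, order]
        loadings[i] = evecs[:, :n_components]
        ptv[i] = evals[:n_components] / evals.sum()
    # Winning variable: index of the largest absolute loading on local PC1.
    winning = np.abs(loadings[:, :, 0]).argmax(axis=1)
    return loadings, ptv, winning

# Synthetic example (hypothetical data): variable 0 dominates the variance in
# the "west" cluster and variable 2 dominates in the "east" cluster.
rng = np.random.default_rng(0)
west = rng.normal(size=(50, 3)) * np.array([5.0, 1.0, 1.0])
east = rng.normal(size=(50, 3)) * np.array([1.0, 1.0, 5.0])
X = np.vstack([west, east])
coords = np.vstack([
    rng.uniform(0, 1, size=(50, 2)),
    rng.uniform(0, 1, size=(50, 2)) + np.array([9.0, 0.0]),
])
loadings, ptv, winning = gwpca(X, coords, bandwidth=1.0)
```

In this sketch a global PCA would return a single set of loadings for all 100 locations, whereas the local decomposition lets the winning variable change between the two clusters. In practice the variables would usually be standardized first and the bandwidth chosen by a data-driven criterion; both steps are omitted here.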