Sparse GCA and Thresholded Gradient Descent

Authors: Sheng Gao, Zongming Ma; Published: 2023; Volume: 24(135); Pages: 1-61.

Abstract

This paper studies sparse generalized correlation analysis (GCA), which aims to uncover linear relationships across multiple data sets. GCA generalizes canonical correlation analysis (CCA), which is designed for two data sets. The setting considered allows for multiple leading generalized correlation tuples and a loading matrix with a small number of nonzero rows, and it includes sparse CCA and sparse PCA of correlation matrices as special cases. The authors formulate sparse GCA as a generalized eigenvalue problem at both the population and sample levels, and propose a thresholded gradient descent algorithm for estimating GCA loading vectors and matrices in high dimensions. With proper initialization, the estimators produced by the algorithm are shown to satisfy tight estimation error bounds. The effectiveness of the algorithm is demonstrated through experiments on various synthetic data sets.
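To make the idea of thresholded gradient descent on a generalized eigenvalue problem more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a penalized objective f(A) = -tr(A^T S A) + (lam/2) ||A^T S0 A - I||_F^2, where S stands for a sample covariance matrix of the stacked data sets and S0 for its block-diagonal part. The objective, the function names, and the parameters k, lam, step, and n_iter are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def hard_threshold_rows(A, k):
    """Keep the k rows of A with the largest Euclidean norm; zero out the rest."""
    norms = np.linalg.norm(A, axis=1)
    keep = np.argsort(norms)[-k:]  # indices of the k largest row norms
    out = np.zeros_like(A)
    out[keep] = A[keep]
    return out


def objective_gradient(A, S, S0, lam):
    """Gradient of the assumed objective
    f(A) = -tr(A^T S A) + (lam / 2) * ||A^T S0 A - I||_F^2,
    valid for symmetric S and S0."""
    r = A.shape[1]
    return -2.0 * S @ A + 2.0 * lam * S0 @ A @ (A.T @ S0 @ A - np.eye(r))


def thresholded_gradient_descent(S, S0, A_init, k, lam=1.0, step=0.01, n_iter=200):
    """Alternate a gradient step with row-wise hard thresholding.

    All tuning parameters here are hypothetical defaults for illustration.
    """
    A = hard_threshold_rows(A_init, k)
    for _ in range(n_iter):
        A = hard_threshold_rows(A - step * objective_gradient(A, S, S0, lam), k)
    return A
```

In this sketch, every iterate has at most k nonzero rows, which mirrors the row-sparse loading matrix described in the abstract. The quality of the result depends on the initial matrix A_init, consistent with the abstract's requirement of proper initialization.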
