Functional L-Optimality Subsampling for Functional Generalized Linear Models with Massive Data

Authors: Hua Liu, Jinhong You, Jiguo Cao; Published in Journal of Machine Learning Research, 24(219):1–41, 2023.

Abstract

When dealing with massive data, memory and computation become significant challenges in analysis. One way to address these challenges is to use a subsample of the full data as a surrogate. For functional data, where multiple measurements are collected over the domains, the memory and computation requirements increase further when the sample size is large. The situation becomes even more computationally intensive when statistical inference requires bootstrap samples. In this paper, we propose an optimal subsampling method, functional L-optimality subsampling (FLoS), for functional generalized linear models. The method is motivated by the analysis of large-scale kidney transplant data. To the best of our knowledge, this is the first subsampling method designed specifically for functional data analysis. We also establish the asymptotic properties of the resulting estimators. Extensive simulation studies and an analysis of the kidney transplant data demonstrate that the FLoS method outperforms the uniform subsampling approach: it effectively approximates the results based on the full data while significantly reducing computation time and memory requirements.
