Incremental Learning in Diagonal Linear Networks

Raphaël Berthier; 24(171):1−26, 2023.

Abstract

This paper studies the trajectory of the gradient flow in diagonal linear networks (DLNs) when initialized at small values. DLNs are simplified artificial neural networks obtained from a quadratic reparametrization of linear regression, which induces a sparse implicit regularization. The analysis shows that DLNs undergo incremental learning: coordinates are activated one after another, and between activations the iterate is the minimizer of the loss constrained to the currently active coordinates. The analysis also shows that the sparse implicit regularization of DLNs weakens over time. Owing to technical limitations, the results are restricted to the underparametrized regime with anti-correlated features.
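The mechanism described above can be sketched numerically. The following NumPy simulation is an illustrative toy example, not code from the paper: all dimensions, parameter values, and variable names are chosen here for illustration. It reparametrizes the regression weights as w = u⊙u − v⊙v, runs plain gradient descent (a discretization of the gradient flow) from a small initialization scale α, and records the iterate so the staggered activation of coordinates can be inspected.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 5                      # underparametrized: more samples than features
X = rng.standard_normal((n, d))
w_star = np.array([2.0, -1.0, 0.5, 0.0, 0.0])   # sparse ground-truth weights
y = X @ w_star

# Quadratic reparametrization of linear regression: w = u*u - v*v.
alpha = 1e-3                      # small initialization scale
u = alpha * np.ones(d)
v = alpha * np.ones(d)
lr = 1e-3                         # step size discretizing the gradient flow

snapshots = []
for t in range(200_000):
    w = u * u - v * v
    grad_w = X.T @ (X @ w - y) / n        # gradient of 0.5*||Xw - y||^2 / n in w
    # Chain rule through the reparametrization: dL/du = 2u*grad_w, dL/dv = -2v*grad_w.
    u = u - lr * 2.0 * u * grad_w
    v = v + lr * 2.0 * v * grad_w
    if t % 2_000 == 0:
        snapshots.append(w.copy())

w_final = u * u - v * v
print(np.round(w_final, 3))
```

Plotting the entries of `snapshots` against time shows a staircase pattern: each coordinate stays near zero until it activates (larger-signal coordinates first), after which the iterate settles at the minimizer of the loss restricted to the active coordinates. Shrinking `alpha` sharpens the separation between activation times.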
