Ridges, Neural Networks, and the Radon Transform
Michael Unser. Journal of Machine Learning Research, 24(37):1-33, 2023.
Abstract
A ridge is a function characterized by a one-dimensional profile (activation) and a multidimensional direction vector. Ridges appear in neural networks as functional descriptors of the effect of a single neuron, with the direction vector encoded in the neuron's linear weights. This paper investigates the properties of the Radon transform in relation to ridges and to the characterization of neural networks. We introduce a broad category of hyper-spherical Banach subspaces (including the relevant subspace of measures) over which the back-projection operator is invertible. We also give conditions under which the back-projection operator can be extended to the full parent space, with its null space identified as a Banach complement. Starting from first principles, we then characterize the sampling functionals that lie in the range of the filtered Radon transform. Furthermore, we extend the definition of ridges to arbitrary distributional profiles and determine their (filtered) Radon transform in full generality. Finally, we apply this formalism to clarify and simplify some of the results and proofs on the optimality of ReLU networks that have appeared in the literature.
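For orientation, the central objects named above admit the following standard definitions; the notation here is generic and not necessarily that of the paper. A ridge with profile \(\rho:\mathbb{R}\to\mathbb{R}\) and direction \(\boldsymbol{w}\in\mathbb{R}^{d}\setminus\{\boldsymbol{0}\}\) is
\[
r(\boldsymbol{x})=\rho(\boldsymbol{w}^{\mathsf{T}}\boldsymbol{x}),\qquad \boldsymbol{x}\in\mathbb{R}^{d},
\]
so that a neuron with activation \(\sigma\), weights \(\boldsymbol{w}\), and bias \(b\) computes the ridge \(\boldsymbol{x}\mapsto\sigma(\boldsymbol{w}^{\mathsf{T}}\boldsymbol{x}-b)\). The Radon transform of \(f:\mathbb{R}^{d}\to\mathbb{R}\) and its adjoint, the back-projection operator, are
\[
\mathcal{R}\{f\}(\boldsymbol{\xi},t)=\int_{\mathbb{R}^{d}} f(\boldsymbol{x})\,\delta(t-\boldsymbol{\xi}^{\mathsf{T}}\boldsymbol{x})\,\mathrm{d}\boldsymbol{x},
\qquad
\mathcal{R}^{*}\{g\}(\boldsymbol{x})=\int_{\mathbb{S}^{d-1}} g(\boldsymbol{\xi},\boldsymbol{\xi}^{\mathsf{T}}\boldsymbol{x})\,\mathrm{d}\boldsymbol{\xi},
\]
with \((\boldsymbol{\xi},t)\in\mathbb{S}^{d-1}\times\mathbb{R}\); the filtered Radon transform composes \(\mathcal{R}\) with a one-dimensional ramp filter acting along the offset variable \(t\).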