Integrating Random Effects in Deep Neural Networks
The paper “Integrating Random Effects in Deep Neural Networks” by Giora Simchoni and Saharon Rosset (2023) explores the use of mixed models to handle correlated data in deep neural networks (DNNs). While DNNs typically assume that observed responses are statistically independent, large-scale real-world applications often involve correlated data with spatial, temporal, and clustering structures. Current DNN practice either ignores these correlations or relies on ad-hoc solutions tailored to specific use cases. The authors propose using mixed models to address this issue and improve predictive performance by treating the effects underlying the correlation structure as random effects.
The key to combining mixed models and DNNs is the use of the Gaussian negative log-likelihood (NLL) as a natural loss function. This loss function is minimized with DNN machinery, including stochastic gradient descent (SGD). However, using SGD with NLL presents some theoretical and implementation challenges, which the authors address in their paper. They demonstrate the effectiveness of their approach, called LMMNN, in various correlation scenarios using simulated and real datasets. Although the focus is on a regression setting and tabular datasets, some results for classification are also presented.
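To make the idea concrete, below is a minimal sketch, independent of the authors' code, of what a Gaussian NLL loss for a single random-intercept (clustering) structure might look like when combined with an ordinary feed-forward network in PyTorch. The names (`RandomInterceptNLL`, `f_net`, `cluster_ids`) are illustrative assumptions, not identifiers from the LMMNN package, and the covariance here is built only from the clusters present in the current mini-batch, which glosses over exactly the kind of SGD-with-NLL subtleties the paper analyzes.

```python
# Illustrative sketch (not the authors' implementation): a DNN for the fixed
# effects f(X) plus a marginal Gaussian NLL loss with covariance
# V = sig2e * I + sig2b * Z Z^T, where Z is the one-hot cluster design matrix.
import torch
import torch.nn as nn

class RandomInterceptNLL(nn.Module):
    def __init__(self):
        super().__init__()
        # Variance components learned on the log scale to keep them positive.
        self.log_sig2e = nn.Parameter(torch.zeros(()))
        self.log_sig2b = nn.Parameter(torch.zeros(()))

    def forward(self, y, f_pred, cluster_ids):
        n = y.shape[0]
        sig2e = self.log_sig2e.exp()
        sig2b = self.log_sig2b.exp()
        # Design matrix Z for the clusters appearing in this batch.
        levels, local_ids = torch.unique(cluster_ids, return_inverse=True)
        Z = nn.functional.one_hot(local_ids, num_classes=len(levels)).float()
        V = sig2e * torch.eye(n) + sig2b * (Z @ Z.T)
        resid = (y - f_pred).unsqueeze(1)
        # NLL up to an additive constant: 0.5 * (log|V| + r^T V^{-1} r)
        sol = torch.linalg.solve(V, resid)
        return 0.5 * (torch.logdet(V) + (resid.T @ sol).squeeze())

# Fixed-effects part f(X): an ordinary feed-forward network.
f_net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = RandomInterceptNLL()
opt = torch.optim.Adam(list(f_net.parameters()) + list(loss_fn.parameters()), lr=1e-3)

# One SGD step on toy data: 64 observations, 10 features, 5 clusters.
X = torch.randn(64, 10)
cluster_ids = torch.randint(0, 5, (64,))
y = torch.randn(64)

opt.zero_grad()
loss = loss_fn(y, f_net(X).squeeze(-1), cluster_ids)
loss.backward()
opt.step()
```

Note that the variance components and the network weights are optimized jointly by the same SGD loop, which is the basic appeal of using the NLL as a loss; handling the covariance correctly across mini-batches, and scaling beyond a single random-intercept structure, are among the challenges the paper addresses.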
The code for LMMNN is available on GitHub at https://github.com/gsimchoni/lmmnn.