TY - GEN
T1 - On the spectral bias of two-layer linear networks
AU - Varre, Aditya
AU - Vladarean, Maria Luiza
AU - Pillaud-Vivien, Loucas
AU - Flammarion, Nicolas
N1 - Publisher Copyright:
© 2023 Neural information processing systems foundation. All rights reserved.
PY - 2023/1/1
Y1 - 2023/1/1
N2 - This paper studies the behaviour of two-layer fully connected networks with linear activations trained with gradient flow on the square loss. We show how the optimization process carries an implicit bias on the parameters that depends on the scale of its initialization. The main result of the paper is a variational characterization of the loss minimizers retrieved by the gradient flow for a specific initialization shape. This characterization reveals that, in the small scale initialization regime, the linear neural network's hidden layer is biased toward having a low-rank structure. To complement our results, we showcase a hidden mirror flow that tracks the dynamics of the singular values of the weights matrices and describe their time evolution. We support our findings with numerical experiments illustrating the phenomena.
AB - This paper studies the behaviour of two-layer fully connected networks with linear activations trained with gradient flow on the square loss. We show how the optimization process carries an implicit bias on the parameters that depends on the scale of its initialization. The main result of the paper is a variational characterization of the loss minimizers retrieved by the gradient flow for a specific initialization shape. This characterization reveals that, in the small scale initialization regime, the linear neural network's hidden layer is biased toward having a low-rank structure. To complement our results, we showcase a hidden mirror flow that tracks the dynamics of the singular values of the weights matrices and describe their time evolution. We support our findings with numerical experiments illustrating the phenomena.
UR - https://www.scopus.com/pages/publications/85183597359
M3 - Conference contribution
AN - SCOPUS:85183597359
T3 - Advances in Neural Information Processing Systems
BT - Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
A2 - Oh, A.
A2 - Neumann, T.
A2 - Globerson, A.
A2 - Saenko, K.
A2 - Hardt, M.
A2 - Levine, S.
PB - Neural information processing systems foundation
T2 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
Y2 - 10 December 2023 through 16 December 2023
ER -