Joseph Rynkiewicz (SAMM, Paris 1 Panthéon-Sorbonne), April 12, 2019
Abstract: We consider models involving deep multilayer perceptrons (DMLP) with rectified linear unit (ReLU) activation functions. We study the asymptotic behavior of the difference between the cost functions (sum of squared errors or negative log-likelihood) of the estimated model and of the theoretical best model. This behavior gives us information on the overfitting properties of such models. If the model is heavily overparameterized, a loss of identifiability can occur, and the behavior of the difference of cost functions is no longer obvious, since in this case the true parameter vector cannot be identified uniquely. In this framework, we show that the studied behavior depends not only on the size of the estimated network but also on the complexity of the true, unknown model. Simulations confirm our theoretical findings and also raise new questions.
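For concreteness, the quantity whose asymptotics are studied can be written as follows; the notation ($F_\theta$, $\hat\theta_n$, $\theta^0$, $\Delta_n$) is introduced here only for illustration and is not fixed by the abstract. With observations $(X_t, Y_t)_{t=1,\dots,n}$, a DMLP $F_\theta$ with parameter vector $\theta$, the least-squares estimator $\hat\theta_n$, and $\theta^0$ a parameter of the theoretical best model, the difference of cost functions in the regression case reads
\[
\Delta_n \;=\; \sum_{t=1}^{n}\bigl(Y_t - F_{\theta^0}(X_t)\bigr)^2 \;-\; \sum_{t=1}^{n}\bigl(Y_t - F_{\hat\theta_n}(X_t)\bigr)^2 ,
\]
and the likelihood case replaces the squared errors by negative log-likelihoods.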
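The kind of simulation alluded to can be sketched minimally as follows; this is an illustrative setup, not the paper's experimental protocol, and all names and sizes (f_true, the 50-unit network, the noise level) are assumptions made for the example. It fits an overparameterized ReLU network to data generated from a small true ReLU model and reports the resulting difference of training costs.

    # Illustrative sketch: difference of training costs between the true
    # model and an overparameterized ReLU network (assumed setup, not the
    # paper's experiments; sklearn's fit also includes a small L2 penalty).
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    n = 500  # sample size (illustrative choice)

    # "True" model: two ReLU units; sizes and weights are arbitrary here.
    def f_true(x):
        return np.maximum(0.0, 2.0 * x - 1.0) - np.maximum(0.0, -x + 0.5)

    X = rng.uniform(-2.0, 2.0, size=(n, 1))
    y = f_true(X[:, 0]) + rng.normal(scale=0.3, size=n)  # Gaussian noise

    # Heavily overparameterized estimate: many more units than the truth.
    net = MLPRegressor(hidden_layer_sizes=(50,), activation="relu",
                       solver="lbfgs", max_iter=5000, random_state=0)
    net.fit(X, y)

    sse_true = np.sum((y - f_true(X[:, 0])) ** 2)  # cost of the best model
    sse_hat = np.sum((y - net.predict(X)) ** 2)    # cost of the estimate
    print("Delta_n = SSE(true) - SSE(estimate) =", sse_true - sse_hat)

A positive Delta_n quantifies how much the overparameterized estimate overfits the training sample relative to the best model; the abstract's claim is that the asymptotic law of this quantity depends on both the estimated network's size and the unknown true model's complexity.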