Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n)

Research output: Contribution to journalConference articlepeer-review

Abstract

We consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk. We focus on problems without strong convexity, for which all previously known algorithms achieve a convergence rate for function values of O(1/√n) after n iterations. We consider and analyze two algorithms that achieve a rate of O(1/n) for classical supervised learning problems. For least-squares regression, we show that averaged stochastic gradient descent with constant step-size achieves the desired rate. For logistic regression, this is achieved by a simple novel stochastic gradient algorithm that (a) constructs successive local quadratic approximations of the loss functions, while (b) preserving the same running-time complexity as stochastic gradient descent. For these algorithms, we provide a non-asymptotic analysis of the generalization error (in expectation, and also in high probability for least-squares), and run extensive experiments showing that they often outperform existing approaches.

Original languageEnglish
JournalAdvances in Neural Information Processing Systems
Publication statusPublished - 1 Jan 2013
Externally publishedYes
Event27th Annual Conference on Neural Information Processing Systems, NIPS 2013 - Lake Tahoe, NV, United States
Duration: 5 Dec 201310 Dec 2013

Fingerprint

Dive into the research topics of 'Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n)'. Together they form a unique fingerprint.

Cite this