At this seminar we will discuss stochastic optimisation methods, a key technique for training machine learning models on large datasets. First, we will cover classical stochastic gradient descent: its theoretical guarantees and the issues the method has in practice. Second, we will discuss techniques for improving the convergence rate, including variance reduction (SVRG) and adaptive step-size methods (AdaGrad, Adam). Finally, we will cover practical aspects: how to implement these algorithms in distributed settings (multi-CPU/GPU training) and which computational problems arise.
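To make the starting point concrete, here is a minimal sketch of classical stochastic gradient descent on a toy least-squares problem: one randomly sampled example per step, constant step size. The data, step size, and iteration count are illustrative choices, not details from the talk.

```python
import numpy as np

# Toy least-squares problem: recover w_true from noisy linear measurements.
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=n)

# Classical SGD: at each step, take the gradient of the loss on ONE
# randomly chosen example and move against it.
w = np.zeros(d)
lr = 0.05
for step in range(2000):
    i = rng.integers(n)                      # sample one example uniformly
    grad = (X[i] @ w - y[i]) * X[i]          # gradient of 0.5 * (x_i @ w - y_i)^2
    w -= lr * grad                           # stochastic gradient step

print(np.round(w, 2))
```

With a constant step size the iterates do not converge exactly but hover in a noise ball around the optimum; the variance of the single-sample gradient is precisely what methods like SVRG and adaptive schemes like AdaGrad/Adam try to tame.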
Speaker: Vasily Ershov.
Presentation language: Russian.
Date and time: February 14th, 20:00-21:30.
Location: Times, room 405.
Videos of the seminars will be available at http://bit.ly/MLJBSeminars