Optimization for Machine Learning, Fall 2023

This course primarily focuses on algorithms for large-scale optimization problems arising in machine learning and data science applications. The first part will cover first-order methods, including gradient and subgradient methods, mirror descent, the proximal gradient method, the accelerated gradient method, the Frank-Wolfe method, and inexact proximal point methods. The second part will introduce algorithms for nonconvex optimization, stochastic optimization, distributed optimization, manifold optimization, and reinforcement learning, as well as methods beyond first order.
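
As a small illustration of the first-order methods covered in the first part, below is a minimal gradient descent sketch for a least-squares problem. The problem data, step-size rule, and iteration count are illustrative assumptions, not course materials.

    # Minimal gradient descent sketch for min_x (1/2)||Ax - b||^2.
    # All problem data and parameters here are illustrative choices.
    import numpy as np

    def gradient_descent(A, b, step_size, num_iters):
        x = np.zeros(A.shape[1])
        for _ in range(num_iters):
            grad = A.T @ (A @ x - b)   # gradient of (1/2)||Ax - b||^2
            x = x - step_size * grad
        return x

    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 10))
    b = rng.standard_normal(50)
    # A standard safe step size is 1/L, where L is the largest eigenvalue of A^T A.
    L = np.linalg.eigvalsh(A.T @ A).max()
    x_hat = gradient_descent(A, b, step_size=1.0 / L, num_iters=500)
    print(np.linalg.norm(A @ x_hat - b))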

Course Information

  • Teaching Assistant: Lin Zang

  • Meeting Information: 9:40-10:55 am, Tuesday/Thursday, Hylan Building 203

  • Office Hours

    • 4:00-5:00 pm, Wednesday, Wegmans Hall 2403 (Jiaming Liang)

    • 4:00-5:00 pm, Friday, Wegmans Hall 1219 (Lin Zang)

  • Textbooks

    • Amir Beck. First-Order Methods in Optimization. SIAM, 2017.

    • Yurii Nesterov. Lectures on Convex Optimization. Springer, 2018.

  • Recommended Readings

    • Guanghui Lan. First-order and Stochastic Optimization Methods for Machine Learning. Springer, 2020.

    • Benjamin Recht and Stephen Wright. Optimization for Data Analysis. Cambridge University Press, 2022.

    • Suvrit Sra, Sebastian Nowozin, and Stephen Wright, eds. Optimization for Machine Learning. MIT Press, 2011.

Topics

  • Introduction

  • First-order methods II: advanced topics

    • Inexact proximal point methods [notes]

    • Augmented Lagrangian [notes]

    • Smoothing techniques [notes]

    • Optimization in relative scale [notes]

    • Randomized block coordinate descent [notes]

  • Selected topics in machine learning

  • Beyond first-order methods

    • Second- and higher-order methods