Readings and topics
References
- Nonlinear optimization
- Algorithms for Optimization by Mykel Kochenderfer and Tim A. Wheeler. MIT Press, 2019.
- Numerical Optimization by Jorge Nocedal and Stephen J. Wright. Springer, 2006.
- Primal-Dual Interior-Point Methods by Stephen J. Wright. SIAM, 1997.
- Linear and Nonlinear Optimization by Igor Griva, Stephen G. Nash, and Ariela Sofer. SIAM, 2009.
- Least squares
- Numerical Methods for Least Squares Problems by Åke Björck. SIAM, 1996.
- Convex
- Convex Optimization by Stephen Boyd and Lieven Vandenberghe. Cambridge University Press, 2004.
Week 16 (April 24-April 30)
Global optimization
- Video
- Sampling and Global Optimization
- BFGS Convergence
- Notes
- Quasi-Newton Convergence
- Slides
- Global Optimization
Quadratic Programming
- Readings
- Nocedal and Wright Chapter 16
- Notes
- Lecture 42
- Julia
- Lecture 42 (Quadratic programming) (html)
(ipynb)
- Video
- Quadratic Programming
Week 15 (April 17-April 23)
Large-scale optimization.
Also, alternating algorithms and stochastic
gradient descent.
- Reading
- Large scale optimization handout
- Nocedal and Wright Chapter 7
- Notes
- Lecture 40
- Slides
- Stochastic separable slides
- Julia
- Lecture 39 (Stochastic) (html)
(ipynb)
(See also stochastic notes below)
- Video
- Trust Region Methods
Finish up trust region methods and then dive into large-scale optimization.
Large-scale optimization: conjugate gradient, simple algorithms, and
(we didn't get to these) scalable linear algebra and limited-memory quasi-Newton.
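As a rough illustration of the conjugate gradient idea (a minimal sketch, not the lecture code), here is linear CG for A*x = b with a symmetric positive definite A; each iteration needs only one matrix-vector product with A, which is what makes it attractive at large scale.

```julia
using LinearAlgebra

# Minimal linear conjugate gradient for A*x = b with A symmetric positive definite.
# Each iteration needs only one matrix-vector product with A.
function conjugate_gradient(A, b; x=zero(b), tol=1e-8, maxiter=length(b))
    r = b - A*x              # residual
    p = copy(r)              # search direction
    rsold = dot(r, r)
    for k in 1:maxiter
        Ap = A*p
        alpha = rsold / dot(p, Ap)
        x = x + alpha*p
        r = r - alpha*Ap
        rsnew = dot(r, r)
        sqrt(rsnew) < tol && break
        p = r + (rsnew/rsold)*p
        rsold = rsnew
    end
    return x
end

# Tiny example with a symmetric positive definite matrix.
A = [4.0 1.0; 1.0 3.0]
b = [1.0, 2.0]
@show conjugate_gradient(A, b)
```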
- Readings
- Nocedal and Wright Chapter 5
- Nocedal and Wright Chapter 7
- Griva Sofer Nash, Chapter 13
- Conjugate gradient handout
- Large scale optimization handout
- Notes
- Lecture 39
- Video
- Trust Region Methods
Week 14 (April 10-April 16)
Trust region methods
- Readings
- Nocedal and Wright Chapter 4
- Trust region methods
- Notes
- Lecture 37
- Julia
- Lecture 37 (Trust region) (html)
(ipynb)
- Video
- Trust Region Methods
Nonlinear equations
- Readings
- Nocedal and Wright Chapter 11
- Solving Nonlinear Equations with Newton's Method by C. T. Kelley
- Codes for Solving Nonlinear Equations with Newton's Method by C. T. Kelley
- Nocedal and Wright Chapter 5
- Videos
- Checking gradients
- Sherman Morrison Woodbury Formula
- Nonlinear equations
- Homotopy Methods
Week 13 (April 3-April 9)
Derivative Free Optimization
Overview of derivative-free optimization and nonlinear programming.
Derivative-free optimization: gradient methods, pattern search, and Nelder-Mead,
along with penalty and augmented Lagrangian methods.
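To make the pattern-search idea concrete, here is a minimal compass-search sketch (illustrative only, not the lecture notebook): poll the objective along each coordinate direction and shrink the step when no poll point improves.

```julia
# Minimal compass (pattern) search: poll +/- each coordinate direction,
# move to any improving point, otherwise shrink the step size.
function compass_search(f, x; step=1.0, tol=1e-6, maxiter=10_000)
    fx = f(x)
    for k in 1:maxiter
        improved = false
        for i in 1:length(x), s in (step, -step)
            y = copy(x); y[i] += s
            fy = f(y)
            if fy < fx
                x, fx, improved = y, fy, true
            end
        end
        if !improved
            step /= 2
            step < tol && break
        end
    end
    return x, fx
end

rosenbrock(x) = (1 - x[1])^2 + 100*(x[2] - x[1]^2)^2
@show compass_search(rosenbrock, [-1.2, 1.0])
```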
- Reading
- Nocedal and Wright Chapter 17, 18,
- Griva, Sofer, Nash, Chapter 16.
- Nonlinear programming slides
- Nocedal and Wright Chapter 9
- Derivative-free optimization
- Julia
- Lecture 34 (Derivative free) (html)
(ipynb)
- Videos
- Derivative-free optimization
Quasi-Newton methods
- Readings
- Nocedal and Wright Chapter 6
- Quasi-Newton methods
- Notes
- Lecture 33
- Videos
- Quasi-Newton Methods
Lecture 32
A practical treatment of Newton's method, Cholesky and modified Cholesky,
and Zoutendijk's condition.
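A minimal sketch of the modified Cholesky idea (in the spirit of adding a multiple of the identity, not the exact algorithm from the notes): shift the Hessian by tau*I until a Cholesky factorization succeeds, then use it to compute a descent direction.

```julia
using LinearAlgebra

# Sketch of Newton with a modified Cholesky factorization: add tau*I to the
# Hessian until a Cholesky factorization succeeds, then solve for the step.
function modified_newton_direction(H, g; beta=1e-3)
    dmin = minimum(diag(H))
    tau = dmin > 0 ? 0.0 : -dmin + beta
    while true
        F = cholesky(Symmetric(H + tau*I); check=false)
        issuccess(F) && return -(F \ g)   # solves (H + tau*I) d = -g
        tau = max(2*tau, beta)
    end
end

H = [1.0 2.0; 2.0 1.0]    # an indefinite Hessian
g = [1.0, -1.0]
@show modified_newton_direction(H, g)
```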
- Readings
- Nocedal and Wright Chapter 3
- Moré and Sorensen, around page 22, for the proof that the modified Cholesky factorization gives a matrix with a bounded condition number
- Julia
- Lecture 32 (Modified Cholesky) (html)
(ipynb)
- Notes
- Lecture 32
- Video
- Lecture 32
Lecture 31
Matrix calculus and finite differences
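For gradient checking, here is a minimal sketch using central finite differences (illustrative only; see the Gradient Checking notebook below for the class version).

```julia
# Check an analytic gradient against central finite differences.
function gradient_check(f, grad, x; h=1e-6)
    g_fd = similar(x)
    for i in eachindex(x)
        e = zeros(length(x)); e[i] = h
        g_fd[i] = (f(x + e) - f(x - e)) / (2h)
    end
    g = grad(x)
    return maximum(abs.(g .- g_fd)) / max(1.0, maximum(abs.(g)))
end

f(x) = x[1]^2 + 3x[1]*x[2]
grad(x) = [2x[1] + 3x[2], 3x[1]]
@show gradient_check(f, grad, [1.0, 2.0])   # should be very small (around 1e-10)
```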
- Finite difference reading
- Nocedal and Wright, Section 8.1
- Matrix calculus reading
- Matrix Calculus for Deep Learning
- http://justindomke.wordpress.com/2011/03/24/matrix-calculus/ (which leads to)
- https://tminka.github.io/papers/matrix/ (which leads to)
- https://tminka.github.io/papers/matrix/minka-matrix.pdf
- https://people.cs.umass.edu/~domke/courses/sml/01background.pdf
- Older references, many of which have broken links :(
- http://www.psi.toronto.edu/~andrew/papers/matrix_calculus_for_learning.pdf (Lots of examples, focus on Machine Learning) now https://web.archive.org/web/20180127130543/http://www.psi.toronto.edu/~andrew/papers/matrix_calculus_for_learning.pdf
- Peder Olsen's talk
- https://people.cs.umass.edu/~domke/courses/sml/01background.pdf
- http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/calculus.html
- http://mathoverflow.net/questions/5209/notions-of-matrix-differentiation
- http://web.archive.org/web/20080830042052/http://www.cs.berkeley.edu/~pliang/cs281a/recitation-0924.pdf
- https://tminka.github.io/papers/matrix/minka-matrix.pdf
- http://ccrma.stanford.edu/~dattorro/matrixcalc.pdf
- http://www.atmos.washington.edu/~dennis/MatrixCalculus.pdf
- http://www.colorado.edu/engineering/CAS/courses.d/IFEM.d/IFEM.AppD.d/IFEM.AppD.pdf
- http://people.kyb.tuebingen.mpg.de/suvrit/work/mcal.pdf.gz now https://web-beta.archive.org/web/20111222131039/http://people.kyb.tuebingen.mpg.de/suvrit/work/mcal.pdf.gz
- Video
- Lecture 31
- Notes
- Lecture 31
- Julia
- Lecture 31 Gradient Checking (html)
(ipynb)
- Lecture 31 Automatic Differentiation (html)
(ipynb)
Lecture 30
Continue IP methods (see the centering step in the code below) and start
matrix calculus (see the next lecture for references).
- Video
- Lecture 30
- Julia
- Lecture 30 (Centering steps) (html)
(ipynb)
- Notes
- Lecture 30
Lecture 29
Continue IP methods
- Readings
- Nocedal and Wright Chapter 14
- Notes
- Lecture 29
- Video
- Lecture 29
Lecture 28
Review of project and intro to interior point methods including
the central path.
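As a toy illustration of the central path (a hand-built example, not the lecture code): for min c'x subject to x >= 0 with c > 0, the log-barrier minimizer has a closed form, and it slides toward the LP solution x = 0 as the barrier parameter mu shrinks.

```julia
# Toy central path: for  min c'x  subject to x >= 0  with c > 0, the barrier
# problem  min c'x - mu*sum(log.(x))  has minimizer x_i(mu) = mu / c_i
# (set each component of the gradient, c_i - mu/x_i, to zero).
# As mu -> 0 the central path approaches the LP solution x = 0.
c = [2.0, 5.0]
for mu in (1.0, 0.1, 0.01, 0.001)
    x = mu ./ c
    println("mu = $mu   x(mu) = $x   c'x = $(sum(c .* x))")
end
```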
- Readings
- Nocedal and Wright Chapter 14
- Julia
- Lecture 28 (Interior point) (html)
(ipynb)
- Notes
- Lecture 28
- Video
- Lecture 28
Lecture 27
An abbreviated lecture where we reviewed the midterm solutions.
This isn't posted as it discusses specific solutions to questions. I'm happy to
discuss these in office hours if you have further questions.
Lecture 26
The midterm.
Lecture 25
Review for the Midterm
- Video
- Lecture 25
Lecture 24
Review for the Midterm
- Notes
- Lecture 24
- Slides
- Lecture 24
- Video
- Lecture 24
Lecture 23
We covered some modern tools for stochastic Newton methods and natural gradient methods.
Then we went into review.
- Notes
- Lecture 23
- Video
- Lecture 23
Lecture 22
We covered stochastic gradient descent and a little bit on stochastic Newton.
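A minimal stochastic gradient descent sketch on a least-squares problem (illustrative only; the data and step size here are made up for the example).

```julia
using LinearAlgebra, Random

# Minimal stochastic gradient descent for least squares:
#   minimize (1/2n) * sum_i (a_i'x - b_i)^2
# using one randomly sampled row per iteration and a small constant step size
# (a decaying step-size schedule is also common).
function sgd_least_squares(A, b; iters=50_000, eta=0.01)
    n, d = size(A)
    x = zeros(d)
    for k in 1:iters
        i = rand(1:n)                            # sample one data point
        g = (dot(A[i, :], x) - b[i]) * A[i, :]   # stochastic gradient
        x = x - eta * g
    end
    return x
end

Random.seed!(0)
A = randn(200, 3)
xtrue = [1.0, -2.0, 0.5]
b = A * xtrue + 0.01 * randn(200)
@show sgd_least_squares(A, b)   # should be close to xtrue
```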
- Notes
- Lecture 22
- Video
- Lecture 22
Lecture 21
We briefly covered the rest of the proof from last class on backtracking
line search and argued why Newton's method
ought to drive the step length to 1.
- Notes
- Lecture 21
- Reading
- Nocedal and Wright, Chapter 3
- Video
- Lecture 21
Lecture 20
We (almost) proved that backtracking line search converges.
- Reading
- Convergence of line search
- Notes
- Lecture 20
- Video
- Lecture 20
Lecture 19
We saw the Wolfe conditions for line search and proved that a
point satisfying them always exists. Then we introduced backtracking
line search.
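A minimal backtracking line search sketch (illustrative only): it enforces the sufficient-decrease (Armijo) part of the Wolfe conditions by starting from a full step and shrinking until the condition holds.

```julia
using LinearAlgebra

# Backtracking line search: start from a full step and shrink by rho until the
# sufficient-decrease (Armijo) condition
#   f(x + alpha*p) <= f(x) + c1*alpha*g'p
# holds, where p is a descent direction and g is the gradient at x.
function backtracking(f, x, g, p; alpha=1.0, rho=0.5, c1=1e-4)
    fx = f(x)
    while f(x + alpha*p) > fx + c1*alpha*dot(g, p)
        alpha *= rho
    end
    return alpha
end

f(x) = (1 - x[1])^2 + 100*(x[2] - x[1]^2)^2
g(x) = [-2*(1 - x[1]) - 400*x[1]*(x[2] - x[1]^2), 200*(x[2] - x[1]^2)]
x0 = [-1.2, 1.0]
p = -g(x0)                      # steepest-descent direction
@show backtracking(f, x0, g(x0), p)
```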
- Reading
- Wolfe Conditions
- Nocedal and Wright, Lemma 3.1.
- Nocedal and Wright, Chapter 2, 3
- Griva Sofer Nash, Chapter 11
- AlgOpt Section 4.2
- Julia
- Lecture 19 (Line search) (html)
(ipynb)
- Notes
- Lecture 19
- Video
- Lecture 19
Lecture 18
We saw an intro to unconstrained optimization and line search methods for optimization.
- Reading (next few lectures)
- Unconstrained algorithms
- Line search algorithms
- Line search and the Wolfe conditions
- Convergence of line search
- Nocedal and Wright, Lemma 3.1.
- Nocedal and Wright, Chapter 2, 3
- Griva Sofer Nash, Chapter 11
- AlgOpt Chapter 4
- AlgOpt Chapter 5
- AlgOpt Chapter 6
- Reading (this lecture)
- Line search algorithms
- Unconstrained algorithms
- Nocedal and Wright, Chapter 2, 3
- Griva Sofer Nash, Chapter 11
- Notes
- Lecture 18
- Video
- Lecture 18
Lecture 17
We finished up the proof that the simplex method decreases the objective
and then saw Phase 1 / Crash.
This was it for the SIMPLEX method!
- Reading
- Griva, Sofer, and Nash, chapter 4.
- Nocedal and Wright, Chapter 13.
- Simple Simplex Code
- AlgOpt - Chapter 11 Simplex Method
- Julia
- Lecture 17 (Simplex) (html)
(ipynb)
- Notes
- Lecture 17
- Video
- Lecture 17
Lecture 16
We finished up the simplex method and went over how it works
by moving from vertex to vertex.
- Reading
- Griva, Sofer, and Nash, chapter 4.
- Nocedal and Wright, Chapter 13.
- Simple Simplex Code
- AlgOpt - Chapter 11 Simplex Method
- Julia
- Lecture 16 (Simplex) (html)
(ipynb)
- Notes
- Lecture 16
- Video
- Lecture 16
Lecture 15
We finished the fundamental theorem of linear programming and got into
the characterization of a vertex and a basic feasible point.
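To connect vertices with basic feasible points, here is a brute-force enumeration sketch for a tiny standard-form LP (illustrative only; the Enumeration notebook below is the class version): pick every pair of basic columns, solve for the basic variables, and keep the feasible ones.

```julia
using LinearAlgebra

# Enumerate the basic feasible points of {x : A*x = b, x >= 0} for a tiny
# standard-form LP by trying every pair of basic columns, and report the best
# objective value c'x. (Brute force is only for intuition; the simplex method
# moves between these vertices much more cleverly.)
function enumerate_basic_feasible_points(A, b, c)
    n = size(A, 2)
    best, bestx = Inf, nothing
    for i in 1:n, j in i+1:n
        B = A[:, [i, j]]
        rank(B) == 2 || continue           # skip singular bases
        xB = B \ b
        all(xB .>= -1e-10) || continue     # keep only feasible points
        x = zeros(n); x[[i, j]] .= xB
        obj = dot(c, x)
        println("basis ($i, $j): x = $x, c'x = $obj")
        if obj < best
            best, bestx = obj, x
        end
    end
    return best, bestx
end

A = [1.0  1.0 1.0 0.0;
     1.0 -1.0 0.0 1.0]
b = [2.0, 0.0]
c = [-1.0, -2.0, 0.0, 0.0]     # minimize c'x
@show enumerate_basic_feasible_points(A, b, c)
```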
- Reading
- Griva, Sofer, and Nash, chapter 4.
- Nocedal and Wright, Chapter 13.
- AlgOpt - Chapter 11 Simplex Method
- Julia
- Lecture 15 (Enumeration) (html)
(ipynb)
- Notes
- Lecture 15
- Video
- Lecture 15
Lecture 14
We saw optimality conditions for linear programs and the dual.
We explained the feasible polytope and defined a vertex.
We stated the fundamental theorem of linear programming, that if a solution
exists, then there is a solution at a vertex of the feasible polytope.
- Reading
- Griva, Sofer, and Nash, chapter 4.
- Nocedal and Wright, Chapter 13.
- Polytope Geometry
- LP optimality
- AlgOpt - 11.2.1
- Julia
- Lecture 14 (LP Feasible Polytopes) (html)
(ipynb)
- Notes
- Lecture 14
- Video
- Lecture 14
Lecture 13
We saw codes for steepest descent and how it behaves. Then we started into LPs
and saw how to convert an LP into standard form.
We saw optimality conditions for linear programs and the dual.
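A small sketch of converting inequalities into standard form by adding slack variables (the data here are made up for the example).

```julia
using LinearAlgebra

# Converting inequality constraints to standard (equality) form with slacks:
#   minimize c'x  subject to  A*x <= b, x >= 0
# becomes
#   minimize [c; 0]'[x; s]  subject to  [A I]*[x; s] = b, [x; s] >= 0.
A = [1.0 2.0; 3.0 1.0]
b = [4.0, 6.0]
c = [-1.0, -1.0]

m, n = size(A)
A_std = [A Matrix{Float64}(I, m, m)]   # append an identity block for the slacks
c_std = vcat(c, zeros(m))              # slacks carry zero cost
@show A_std
@show c_std
```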
- Reading
- Griva, Sofer, and Nash, chapter 4.
- Nocedal and Wright, Chapter 13.
- LP optimality
- AlgOpt - Chapter 11 - what we call standard form is their equality form from 11.1.3
- AlgOpt - 11.2.2
- Julia
- Lecture 13 (Steepest Descent) (html)
(ipynb)
- Notes
- Lecture 13
- Video
- Lecture 13
Lecture 12
We introduced the steepest descent method and showed
that it converges at a linear rate
on a quadratic objective.
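A minimal sketch of steepest descent with exact line search on a quadratic (illustrative only): on f(x) = (1/2) x'Ax - b'x the exact step is alpha = (g'g)/(g'Ag), and the error shrinks by roughly a constant factor per iteration, i.e., a linear rate.

```julia
using LinearAlgebra

# Steepest descent with exact line search on f(x) = 0.5*x'*A*x - b'*x.
# On a quadratic the exact step is alpha = (g'g)/(g'*A*g), and the A-norm error
# satisfies ||x_{k+1} - x*||_A^2 <= ((kappa-1)/(kappa+1))^2 * ||x_k - x*||_A^2,
# a linear rate that degrades as the condition number kappa grows.
function steepest_descent_quadratic(A, b, x; iters=10)
    xstar = A \ b
    for k in 1:iters
        g = A*x - b                         # gradient of the quadratic
        alpha = dot(g, g) / dot(g, A*g)     # exact minimizing step
        x = x - alpha*g
        println("iter $k   error = $(norm(x - xstar))")
    end
    return x
end

steepest_descent_quadratic([10.0 0.0; 0.0 1.0], [0.0, 0.0], [1.0, 1.0])
```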
- Reading
- Griva, Sofer, Nash Section 14.4
- Steepest Descent
- Two Remarks on the Kantorovich Inequality, by P. Henrici.
- Notes
- Lecture 12
- Video
- Lecture 12
Lecture 11
We saw the necessary conditions for linear inequality
constrained optimization. We also saw the definition of a descent
direction and an active set.
- Reading
- Griva, Sofer, Nash Section 14.4
- Nocedal and Wright Chapter 2 (for descent directions)
- Chapter 10, 11 in AlgOpt
- Notes
- Lecture 11
- Video
- Lecture 11
Lecture 10
We saw the necessary and sufficient conditions for linearly constrained
optimization and how they give a system of augmented normal equations for
constrained least squares. We also saw how to get an initial point for a
null-space method. We also saw examples of how to use constrained least
squares to rank sports teams.
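A minimal sketch of equality-constrained least squares via the augmented (KKT) normal equations (illustrative only; the constraint here is made up for the example).

```julia
using LinearAlgebra

# Equality-constrained least squares: minimize ||A*x - b||^2 subject to C*x = d.
# The optimality conditions give the augmented (KKT) system
#   [ A'A  C' ] [x]   [A'b]
#   [ C    0  ] [λ] = [ d ].
function constrained_lsq(A, b, C, d)
    n = size(A, 2)
    p = size(C, 1)
    K = [A'*A  C'; C  zeros(p, p)]
    z = K \ vcat(A'*b, d)
    return z[1:n], z[n+1:end]      # the solution x and the multipliers λ
end

A = randn(10, 3); b = randn(10)
C = [1.0 1.0 1.0]                  # constraint: the entries of x sum to 1
d = [1.0]
x, lam = constrained_lsq(A, b, C, d)
@show C*x                          # ≈ [1.0]
```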
- Reading
- Griva Sofer Nash, Chapter 14.
- Notes
- Lecture 10
- Julia
- Lecture 10 (Constrained Least Squares) (html)
(ipynb)
- Video
- Lecture 10
Lecture 9
We finished up constrained least squares and got into
constrained optimization. This included necessary and
sufficient conditions based on the reduced Hessian
and reduced gradient.
We also saw how to get an initial point for a
null-space method.
- Reading
- Griva Sofer Nash, Chapter 14.
- Notes
- Lecture 9
- Video
- Lecture 9
Lecture 8
We saw convexity and least squares problems as an intro
to constrained optimization.
- Reading
- Björck Chapter 2
- Björck Chapter 5
- Notes
- Lecture 8
- Video
- Lecture 8
Lecture 7
Multivariate Taylor expansions, gradients, Hessians, and multivariate optimality conditions.
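A small sketch of checking the multivariate optimality conditions at a critical point (the function here is made up for the example): the gradient should vanish, and the Hessian's eigenvalues determine whether the point is a minimizer, a maximizer, or a saddle.

```julia
using LinearAlgebra

# Second-order conditions at a critical point: the gradient vanishes, and the
# eigenvalues of the Hessian tell us whether it is a local minimizer (all
# positive), a local maximizer (all negative), or a saddle point (mixed signs).
f(x) = x[1]^2 + x[1]*x[2] + 2*x[2]^2
grad(x) = [2*x[1] + x[2], x[1] + 4*x[2]]
hess(x) = [2.0 1.0; 1.0 4.0]

xc = [0.0, 0.0]
@show grad(xc)              # zero, so xc is a critical point
@show eigvals(hess(xc))     # all positive, so xc is a strict local minimizer
```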
- Notes
- Lecture 7
- Reading
- Section 1.6 in AlgOpt
- Nocedal and Wright Section 2.1; you should read Theorems 2.3 and 2.4 for yourself and review the multivariate generalizations.
- No minimizer at the origin but a minimizer on all lines, an algebraic discussion of the example we saw in class
- Discussion on symmetry of Hessian
- Wikipedia on symmetric Hessian
- Julia
- Lecture 7 (Optimality Conditions) (html)
(ipynb)
- Lecture 7 (Rosenbrock) (html)
(ipynb)
- Video
- Lecture 7
Lecture 6
We did an intro to 1d optimization.
- Notes
- Lecture 6
- Reading
- Chapter 2 in Nocedal and Wright
- Section 1.5 and 1.6 in AlgOpt
- Handouts
- unconstrained-intro.pdf
- Video
- Apologies -- the video had a technical error.
Lecture 5
More on optimization software and how to set up problems
for Gurobi and Optim.jl.
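A minimal Optim.jl setup sketch (assuming the Optim package is installed; see the notebook below for the class version).

```julia
using Optim    # assumes the Optim.jl package has been added

# Minimize the Rosenbrock function starting from [0, 0] with BFGS
# (Optim approximates the gradient with finite differences if none is given).
rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

result = optimize(rosenbrock, [0.0, 0.0], BFGS())
@show Optim.minimizer(result)    # ≈ [1.0, 1.0]
@show Optim.minimum(result)
```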
- Notes
- Lecture 5
- Julia
- Lecture 5 (Optim.jl) (html)
(ipynb)
- Lecture 5 Sudoku (html)
(ipynb)
- Video
- Lecture 5
Lecture 4
We covered a quick intro to optimization software.
- Julia
- Lecture 4 (Least Squares) (html)
(ipynb)
- Notes
- Lecture 4
- Video
- Lecture 4
Lecture 2 and Lecture 3
We combined Lecture 2 and Lecture 3 into one because I'm going
to be gone on Friday.
We went over sequences and convergence after covering
more about software and Julia.
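A small sketch of the convergence-rate idea (illustrative only): for a linearly convergent sequence the ratio e_{k+1}/e_k settles to a constant, while for a quadratically convergent one e_{k+1}/e_k^2 does.

```julia
# Convergence rates: for linear convergence the ratio e_{k+1}/e_k approaches a
# constant r < 1; for quadratic convergence e_{k+1}/e_k^2 approaches a constant.
linear    = [0.5^k for k in 1:8]          # e_k = (1/2)^k
quadratic = [0.5^(2^k) for k in 1:5]      # e_k = (1/2)^(2^k)

println("linear ratios e_{k+1}/e_k:      ", [linear[k+1]/linear[k] for k in 1:7])
println("quadratic ratios e_{k+1}/e_k^2: ", [quadratic[k+1]/quadratic[k]^2 for k in 1:4])
```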
- Reading
- Appendix A.2 in Nocedal and Wright
- Julia
- Lecture 2 (Rates) (html)
(ipynb)
- Lecture 2 (Limit Points) (html)
(ipynb)
- Lecture 2 (Julia Intro) (html)
(ipynb)
- Notes
- Lecture 2
(Combined Lecture 2 and 3 notes)
- Video
- Lecture 2
- Lecture 3
Lecture 1
We reviewed the syllabus and saw the xkcd raptor problem
as a motivation for optimization.
- Reading
- Syllabus
- Slides
- Lecture 1
- Julia
- Lecture 1 (Julia) (html)
(ipynb)
- Lecture 1 (Raptor) (html)
(ipynb)
- Video
- Lecture 1