Please answer the following questions in complete sentences in a clearly prepared manuscript and submit the solution by the due date on Gradescope. The homework is due 11:59pm Dec 1st due to Purdue regulations around quiet period. You have an automatic extension to the morning of December 5th. You need to have it in by December 3rd to be guaranteed to have it graded before the exam. We may discuss solutions on December 5th, so no homework can be submitted after that day. (If the class makes a motion to avoid discussing homework in class, I can give you an extra day or so if you'd like more time.)
Remember that this is a graduate class. There may be elements of the problem statements that require you to fill in appropriate assumptions. You are also responsible for determining what evidence to include. An answer alone is rarely sufficient, but neither is an overly verbose description required. Use your judgement to focus your discussion on the most interesting pieces. The answer to "should I include 'something' in my solution?" will almost always be: Yes, if you think it helps support your answer.
Please identify anyone, whether or not they are in the class, with whom you discussed your homework. This problem is worth 1 point, but on a multiplicative scale.
Make sure you have included your source-code and prepared your solution according to the most recent Piazza note on homework submissions.
Implement a Lanczos-based MINRES code that explictly builds and then finds the minimum residual vector within the Krylov subspace.
Compare the first 25 residuals from the Lanczos-based CG code we wrote in class that explictly builds , with the a standard implementation of CG from: http://www.cs.purdue.edu/homes/dgleich/cs515-2024/homeworks/cg.jl for the linear system
n = 100
on = ones(Int64,n)
A = A = spdiagm(-1=>-2*on[1:end-1],0=>4*on,1=>-2*on[1:end-1])
b = ones(n)
as well as the MINRES code you developed above.
Using the cg.jl
function, look at how many iterations are
required for CG to converge to a tolerance of
for the matrix in the last part.
Determine how this scales with .
Let and , , .
Consider the -by- matrix with diagonal elements
(Some call this the Strako\v{s} matrix.)
Use the Lanczos method starting from a a random vector and the vector and then plot the quantity for to . Describe what you SHOULD find and what you actually find. Do your results depend on the starting vector?
Plot for the to . Also plot for to .
What is What should it be?
Plot for to .
Suppose that is non-singular. And suppose we use MINRES to solve the system . In exact arithmetic, how many steps will MINRES take to converge to the exact solution? Be sure to explain \textit{how you determine what the answer is. Note, your result should be self contained and not utilize the theory discussed in class -- this is good practice for the final; if you do use a theorem to justify your answer, you may lose a few points.}
Compare the work involved with the method in step 1 to methods that would involve estimating and then using a specialized solver explicitly for this type of structure.
It turns out that the GMRES solution to in step , , can be viewed as constructing a degree polynomial, , such that is minimized. Think of it this way: since is in , we know for some scalars . If we define the polynomial , then we can express .
Then since GMRES constructed to minimize , we know that the polynomial is the degree polynomial that minimizes , since .
Suppose GMRES converged to the exact solution to the system in iterations, i.e. it constructed in such that is the exact solution to . Show that there exists a degree polynomial such that
Note that there are few ways to turn a non-symmetric linear system into a symmetric linear system. The first is to use the normal equations . The second is to use a set of augmented equations:
Show that the solution of the augmented system of equations exists for any square, full-rank non-symmetric matrix .
Try and use CG and MINRES to solve the Chutes and Ladders linear system from the beginning of class using these approaches. Do they work? If so, how many matrix-vector products do they take compared with your Neumann series-based solver to converge to a 2-norm relative residual of .
Also report on the same measures for the PageRank system from Homework 2 on the Wikipedia matrix.
For this problem, you must use Julia.
It's been updated now, it should be good.
It's based on an assignment from Michael Saunders' iterative methods class.
Download poisson2D-data.csv
and poisson2D-rhs.csv
and
make sure you can load them via the following script. These files were originally MAT files from
https://sparse.tamu.edu/FEMLAB/
using DelimitedFiles
using SparseArrays
data = readdlm("poisson2D-data.csv",',')
A = sparse(Int.(data[:,1]), Int.(data[:,2]),(data[:,3]))
A = (A + A')./2
b = vec(readdlm("poisson2D-rhs.csv"))
The matrix A is symmetric-positive definite. But we are going to look at solving where . We'll declare a method converged if the relative residual has -norm less than .
Plot the relative residuals for 100 iterations of the MINRES (minres
from Krylov.jl
)
method and the GMRES (gmres
from Krylov.jl
) method restarted every 30 iterations on a semilogy scale. Describe what you observe, in particular, do the methods converge?
For convenience, here is some code to help you get started:
using Krylov
x, hist_min = minres(A,b; rtol = mytol, itmax = mymaxiter)
x, hist_gm = gmres(A,b;restart = true, memory = myrestart, rtol = mytol, itmax = mymaxiter)
# look at hist_min.data[:resnorm]
If you use an incomplete cholesky ichol
or incomplete LU ilu
preconditioner with GMRES, it will converge. How many iterations does
it take? Does MINRES benefit from an incomplete Cholesky preconditioner?
We provide you with the ichol
and ilu
code.
Download the file precond.jl
from here http://www.cs.purdue.edu/homes/dgleich/cs515-2024/homeworks/precond.jl. There are some peculiarities with the Krylov package. We will help you set that up. Here is the code to help you get started
using LinearOperators
include("precond.jl")
Lchol = ichol(A)
M1 = opInverse(LowerTriangular(ichol(A)))
x, hist_pgm = gmres(A,b; restart = true, memory = myrestart, rtol = mytol, itmax = mymaxiter,M=M1)
And preconditioning MINRES:
P1 = opInverse(LowerTriangular(ichol(A)))
Mprecond = P1'*P1;
x, hist_pmin = Krylov.minres(A,b,M=Mprecond,itmax=mymaxiter, rtol=mytol)
Show the eigenvalues of the matrix before and after preconditioning. Do you observe any clustering?
For this problem, if you don't use Julia, you will have to convert a series of multigrid codes to your own language. This isn't too hard as it's only about 100 lines of code. However, in one step, you will have to use a sparse direct solver. This is simple to do in Julia. If you know how to do it in other languages, then this should be easy. However, you are on your own!
Download the codes http://www.cs.purdue.edu/homes/dgleich/cs515-2024/homeworks/multigrid_functions.jl
Using these codes, you can solve Poisson's equation as follows:
X = solve_poisson_direct(poisson_setup(nx,ny, (x,y) -> 1))
and plot a solution via
using Plots
pyplot()
surface(X)
(which is what a previous homework asked you to do!)
For this problem, we will always have n = nx = ny
. How long does
it take to solve this for .
(This worked great on my computer in terms of memory usage. But your
milage may vary, if you run out of memory, show the largest problem
you can solve!)
Plot the times on a log-log scale. If you double the problem size, about how much longer does the computation need?
Explain what the apply_poisson_jacobi
function does. Where is
the matrix?
Use this method to implement the Jacobi method to solve a linear system? For a problem with , estimate how much decrease in error occurs at each iteration of Jacobi. That is, if and are sucessive iterates with error and , respectively, then what does the trend in show? (Note, you probably have to run a few hundred to a few thousand iterations to see these values start to converge.)
Write a function apply_poisson_gauss_seidel
to implement the
Gauss-Seidel without building the matrix. Show the same error trend
for Gauss-Seidel compared with Jacobi.
Estimate the same rate for Jacobi and Gauss Seidel for a problem with . Given that each iteration of Jacobi and Gauss-Seidel takes time, give a rough estimate of the number of iterations needed to reduce the error to a fixed value of .
Now we get to play around with Multigrid! The function interpolate
takes a vector and returns a vector. Use interpolate
to evaluate the error that results from solving a Poisson problem
and interpolating the solution to a solution. Show a mesh-plot
of the absolute value of the error and the norm of the error.
What happens when you run a Jacobi iteration after interpolating the solution? Describe how the error changes in terms of the mesh-plot as well as the the norm of the error.
Explain what the simple_multigrid
function does. Estimate the
decrease in error for one iteration of the loop for and .
Explain what the apply_poisson_multigrid
function does and how
poisson_multigrid_error
compares with simple_multigrid
. Estimate
the decrease in error for one iteration of the loop for the same
values of .
In practice, we can't use the error as we won't have the true solution.
Use the poisson_multigrid_residual
method to solve Poissons
equation with 25 iterations for .
Show the times on a log-log scale along with the times from the direct
method in part 1. Can you solve a 10,000-by-10,000 problem? (This is a
big problem, you'll need O(16GB) of RAM!)
Use the Lanczos method to estimate the top 5 singular values of the Chutes and Ladders iteration matrix.
Does the loss of orthogonality of the Lanczos vectors impact the accuracy of the eigenvectors? Describe your investigation here. (A rough guide, you should present evidence that shows you are familiar with the ideas in class. This could be 1 paragraph up to around a page, and could include figures.)
Report if your code from (1) is correct for the 5 largest
singular values of the sparse matrix loaded directly from the Wikipedia file via
the load_data()
function in Homework 2.
So this is the matrix P, not the PageRank system. Provide evidence to support your statements that
the code gets the correct answers or explain what you think goes wrong if it does not.
(Your explanation is more important than whether or not it works, although you may lose
points if it doesn't work due to the size of the problem as the code from 1. should
scale to this level.)