Homework 3b

Please answer the following questions in complete sentences in a clearly prepared manuscript and submit the solution by the due date on Gradescope.

Remember that this is a graduate class. There may be elements of the problem statements that require you to fill in appropriate assumptions. You are also responsible for determining what evidence to include. An answer alone is rarely sufficient, but neither is an overly verbose description required. Use your judgement to focus your discussion on the most interesting pieces. The answer to "should I include 'something' in my solution?" will almost always be: Yes, if you think it helps support your answer.

Problem 0: Homework checklist

Please identify anyone, whether or not they are in the class, with whom you discussed your homework. This problem is worth 1 point, but on a multiplicative scale.
Make sure you have included your source-code and prepared your solution according to the most recent Piazza note on homework submissions.

Problem 1: Recursive block elimination

In class, we saw "element-wise" variable elimination. That is, in the first step, we eliminated the first variable from the problem and built a new problem for every other variable.

In fact, we can do this for multiple variables at once. There's only one small catch, which has to do with pivoting, but we won't worry about that for this problem. That is, you can assume that everything you need to be non-singular actually is non-singular.

Why this matters. A recent innovation is the presence of large "tensor cores" that support matrix multiplication directly on GPUs or other accelerators. These can make matrix-multiplication operations much faster than expected. We are going to use them to solve a linear system via block substitution. As a simple example, note that Apple's recent M-series chip includes a dedicated "matrix-multiplication" instruction! As another example, Intel has its own extensions too.

Your problem. Write down the steps of an algorithm to solve $\mA \vx = \vb$ via recursive block elimination. That is, divide $\mA$ into blocks: $\mA = \bmat{\mA_1 & \mA_2 \\ \mA_3 & \mA_4}$ where all blocks have the same size. In this case, assume that $\mA$ has a power-of-two dimension so that you can do this recursively until you get down to a $1\times 1$ system. Then, if $\vx = \bmat{\vx_1 \\ \vx_2}, \qquad \vb = \bmat{\vb_1 \\ \vb_2}$ are the vectors that conform to the partitioning of $\mA$, develop an algorithm that uses as a subroutine:

    # return the matrix $A^{-1} B$, i.e. solve the k linear systems given by the k columns of B
    function solve(A::Matrix, B::Matrix)
    end

and matrix-matrix multiplication to solve the linear system. Operations you are allowed:

submatrix extraction, e.g. A1 = A[1:div(n,2),1:div(n,2)]
addition and subtraction of matrices or vectors, e.g. A .- B
matrix-vector or matrix-matrix multiplication, e.g. A*b or A*B
recursive calls to solve
division, but only in the n=1 base case.

For testing, you can use the fact that pivoting isn't required for symmetric positive definite matrices. So just generate a random square matrix $\mZ$ and let $\mA = \mZ^T \mZ$.
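The testing suggestion above can be sketched as follows. This is only a minimal harness, not part of the required solution: the helper names `spd_test_problem` and `relres`, the seed, and the size `n = 8` are all assumptions made for illustration.

```julia
using LinearAlgebra, Random

# Hypothetical helper: build a random symmetric positive definite test problem.
# n should be a power of two to match the recursive splitting in the problem.
function spd_test_problem(n::Int; seed::Int=0)
    rng = MersenneTwister(seed)
    Z = randn(rng, n, n)        # a square Gaussian Z is non-singular with probability 1
    A = Z' * Z
    A = (A + A') / 2            # remove any rounding asymmetry
    b = randn(rng, n)
    return A, b
end

# Hypothetical helper: relative residual of a candidate solution x for A x = b.
relres(A, x, b) = norm(A * x - b) / norm(b)

A, b = spd_test_problem(8)
x_ref = A \ b                   # reference solution from Julia's built-in solver
println(relres(A, x_ref, b))
```

Since `solve` takes a matrix right-hand side, one way to compare your own routine against `x_ref` is to pass `reshape(b, :, 1)` and check `relres` on the first column of the result.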

Notes. You may find it helpful to think about doing this with only two variables at a time first. Here, I write down a few small steps in the process to get you started. I have not fully debugged these notes, as they are meant as a supplementary guide to this problem and as a bridge from the lecture notes to the question.

Let $\mA = \bmat{ \alpha & \beta & \vc^T \\ \gamma & \delta & \vd^T \\ \vf & \vg & \mR } \qquad \vb = \bmat{ \theta \\ \nu \\ \vh }$ and let $\vx = \bmat{ \omega \\ \mu \\ \vy}.$
Let $\mA_1 = \bmat{\alpha & \beta \\ \gamma & \delta}$. Then note that $\mA_1 \bmat{ \omega \\ \mu} + \bmat{ \vc^T \\ \vd^T } \vy = \bmat{ \theta \\ \nu }$. So, like we found in class, we can find $\omega$ and $\mu$ given $\vy$ (assuming that $\mA_1$ is non-singular, which, to be clear, we are assuming in this problem!).
Let $\vz = \bmat{\omega \\ \mu}$. The last expression shows that we can write $\vz$ as a function of $\vy$. At this point, we can then substitute $\vz(\vy)$ in like we did for the other cases in the class notes.
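For reference, the substitution step described in these notes can be written out as below. This is a sketch that uses the same symbols and the same assumption that $\mA_1$ is non-singular; like the notes themselves, it should be checked against the class notes before you rely on it.

```latex
% Solve the first two rows for z in terms of y:
\vz(\vy) = \mA_1^{-1} \left( \bmat{ \theta \\ \nu } - \bmat{ \vc^T \\ \vd^T } \vy \right)

% The remaining rows of A x = b read:
\bmat{ \vf & \vg } \vz + \mR \vy = \vh

% Substituting z(y) leaves a smaller system in y alone:
\left( \mR - \bmat{ \vf & \vg } \mA_1^{-1} \bmat{ \vc^T \\ \vd^T } \right) \vy
  = \vh - \bmat{ \vf & \vg } \mA_1^{-1} \bmat{ \theta \\ \nu }
```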

Problem 2: More on Cholesky

Note, we often use either $\mF$ or $\mL$ as the Cholesky factor.

Your professor also boldly asserted that we preserve positive definiteness after one step of the Cholesky factorization. Recall that this was: $\mA = \bmat{\alpha & \vb^T \\ \vb & \mC} = \mF \mF^T = \bmat{\gamma & 0 \\ \vf & \mF_1} \bmat{\gamma & \vf^T \\ 0 & \mF_1^T}.$ So we set $\gamma = \sqrt{\alpha}$ and $\vf = \vb/\gamma$. Then, we recursively compute the Cholesky factorization of $\mC - \vb \vb^T/\alpha = \mF_1 \mF_1^T.$ This only works if $\mC - \vb \vb^T/\alpha$ is positive definite. This is discussed in the notes. (Hint: you can do it from the definition of positive definiteness: $\vx^T \mA \vx > 0$ for all $\vx \neq 0$.)
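As a numerical illustration of the step above (not a proof), the following sketch carries out one Cholesky step on a small symmetric positive definite matrix and inspects the eigenvalues of $\mC - \vb \vb^T/\alpha$. The matrix entries are chosen arbitrarily for illustration.

```julia
using LinearAlgebra

# A small symmetric positive definite example (values chosen for illustration).
A = [4.0 2.0 2.0;
     2.0 5.0 3.0;
     2.0 3.0 6.0]

α = A[1, 1]
b = A[2:end, 1]
C = A[2:end, 2:end]

γ = sqrt(α)           # γ = sqrt(α)
f = b / γ             # f = b / γ
S = C - b * b' / α    # the matrix that is factored recursively

# If positive definiteness is preserved, every eigenvalue of S is positive.
println(eigvals(Symmetric(S)))
```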

Briefly explain why this step justifies that a Cholesky factorization exists for any positive definite matrix. (This is really like a one or two sentence answer.)
One of the most useful aspects of the Cholesky factorization is as a way to check if a matrix is positive definite or negative definite. A matrix is negative definite if $\vx^T \mA \vx < 0$ for all $\vx \neq 0$. Change our Julia implementation to report if a matrix is not positive definite. (Hint: this relates to why a Cholesky factorization always exists for a positive definite matrix in the previous part.)

Use this to test if the matrix for Poisson's equation from Homework 1 is positive or negative definite.
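As a cross-check for your modified implementation, Julia's standard library provides `isposdef` in `LinearAlgebra`. The sketch below uses a small illustrative matrix `P`; it is an assumption made for the example and is not the Homework 1 matrix.

```julia
using LinearAlgebra

# Illustrative symmetric positive definite matrix (not the Homework 1 matrix).
P = [ 2.0 -1.0  0.0;
     -1.0  2.0 -1.0;
      0.0 -1.0  2.0]
N = -P

println(isposdef(P))    # true
println(isposdef(N))    # false
println(isposdef(-N))   # true: N is negative definite exactly when -N is positive definite
```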

Problem 3: Direct Methods for Tridiagonal systems (Even more on Cholesky!)

In class, we said the definition of a sparse matrix was one with enough zeros to take advantage of it. In this problem, we'll show how to take advantage of tridiagonal structure in a matrix.

Make sure you have read through the notes on the Cholesky factorization in our notes on Elimination methods.

Let $\mA$ be a symmetric, tridiagonal, positive definite matrix: $\mA = \bmat{\alpha_1 & \beta_1 \\ \beta_1 & \alpha_2 & \ddots \\ & \ddots & \ddots & \ddots \\ & & \beta_{n-2} & \alpha_{n-1} & \beta_{n-1} \\ & & & \beta_{n-1} & \alpha_n }$

Let $\mA_{n-1}$ be the $(n-1) \times (n-1)$ leading principal submatrix of $\mA$. Suppose you are given the Cholesky factorization $\mA_{n-1} = \mL_{n-1} \mL_{n-1}^T$. Determine the structure of $\mL_{n-1}$ (hint: $\mL_{n-1} \mL_{n-1}^T$ must be tridiagonal!).
Now, show how to compute the Cholesky factorization $\mA = \mL \mL^T$ from $\mL_{n-1}$ in a constant number of operations.
Use the results of the first two parts to write down an algorithm that computes the Cholesky factorization of $\mA$ in a linear amount of work (starting from scratch!).
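To test a linear-time routine, it helps to have a reference factor to compare against. The sketch below builds a random symmetric tridiagonal matrix that is positive definite because it is strictly diagonally dominant with a positive diagonal, then computes a dense reference factor. The names `α`, `β`, `Ad`, and `Lref`, the size, and the seed are assumptions for illustration.

```julia
using LinearAlgebra, Random

rng = MersenneTwister(1)
n = 6
α = 2.0 .+ rand(rng, n)          # diagonal entries in [2, 3)
β = rand(rng, n - 1) .- 0.5      # off-diagonal entries in [-0.5, 0.5)

A = SymTridiagonal(α, β)         # strictly diagonally dominant, hence positive definite
Ad = Matrix(A)                   # dense copy for the reference computation
Lref = cholesky(Ad).L            # dense reference factor; this costs O(n^3) work

println(norm(Lref * Lref' - Ad))
```

Comparing the entries of your factor against `Lref` checks correctness; the dense computation is only a reference and does not exploit the tridiagonal structure.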

Problem 4: Direct methods for eigenvectors.

Suppose you try to compute an eigenvector using the "elimination" techniques from class. This question will ask you to investigate the issues that arise.

First, we will assume we know the eigenvalue $\lambda$ and that this eigenvalue is simple. (Don't worry if you don't know what this means. What it means for this problem is that the linear system we build only has the mild issue we discuss now.) Consequently, we wish to find a solution of the linear system: $\mA \vx = \lambda \vx \qquad \text{ or } \qquad (\mA - \lambda \mI) \vx = 0 .$ This linear system is singular because any scalar multiple of an eigenvector is also a solution. However, this might be considered a mild singularity, as we simply get to choose the scale of the solution. (If the eigenvalue were not simple, then the solution would be more complicated. That's why we assume it is simple, which formally means that $\rank(\mA - \lambda \mI) = n-1$.)
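The mild singularity described above can be seen on a small example. In this sketch the matrix `A`, the eigenvalue `λ = 2`, and the eigenvector `v` are chosen by hand for illustration.

```julia
using LinearAlgebra

A = [2.0 1.0 0.0;
     1.0 2.0 1.0;
     0.0 1.0 2.0]
λ = 2.0                      # a simple eigenvalue of this particular A
v = [1.0, 0.0, -1.0]         # a corresponding eigenvector

M = A - λ * I                # the singular matrix A - λI
println(rank(M))             # 2, which is n - 1
println(norm(M * v))         # 0.0: v solves (A - λI) x = 0
println(norm(M * (3 * v)))   # 0.0: so does any scalar multiple of v
```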

Explain what happens when we try to use our elimination solver solve1_pivot2 on this system. Then alter this routine so that it produces an eigenvector when given the matrix $\mA$ and $\lambda$.

Problem 5: QR and Cholesky

Let $\mA$ be an $m \times n$ tall ($m \ge n$), full rank matrix.

Let $\mA = \mQ \mR$ be the QR decomposition. Let $\mF$ be the Cholesky factor of the symmetric, positive definite matrix $\mB = \mA^T \mA$. (So $\mF^T \mF = \mB$.) Show that $\mR$ and $\mF$ have an extremely close relationship and discuss any uniqueness properties of Cholesky or QR that impact this relationship.
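Before proving anything, a quick numerical experiment may help you form a conjecture. The sketch below computes both factors with Julia's standard library for a random tall matrix and displays them side by side; the size and the seed are arbitrary choices for illustration.

```julia
using LinearAlgebra, Random

rng = MersenneTwister(2)
A = randn(rng, 5, 3)                  # tall matrix; full rank with probability 1

R = qr(A).R                           # 3x3 upper-triangular factor from QR
F = cholesky(Symmetric(A' * A)).U     # upper-triangular F with F'F = A'A

display(R)
display(F)
```

Pay attention to the signs of the rows when you compare the two outputs.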

Matrix Computations

David Gleich

Purdue University

Fall 2024

Course number CS-51500

Tuesday-Thursday 10:30-11:45pm.
