Conjugate Gradient for Differential Equations
An overview of the approach and pseudocode for the solver.

1 Differential Equations

Many graphics problems include differential equations. They have the general form: the rate of change of $x$ is some function of $x$. For most graphics simulations, two approximations are used: first, we treat complicated math as if it were linear, and, second, we use a bunch of small steps in time instead of a continuous model of time. These two are related: as the time step length approaches zero, the linear solution approaches the correct solution.
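To make those two approximations concrete, here is one standard way they turn a differential equation into a linear system (a generic illustration, not a derivation from these notes): suppose the rate of change near the current state is approximated by a linear map $M$, and we take an implicit (backward) Euler step of length $\Delta t$. Then each time step asks us to solve a linear system for the new state:

$$\frac{d\vec x}{dt} \approx M \vec x
\qquad\Rightarrow\qquad
\vec x_{t+\Delta t} = \vec x_t + \Delta t\, M\, \vec x_{t+\Delta t}
\qquad\Rightarrow\qquad
(I - \Delta t\, M)\,\vec x_{t+\Delta t} = \vec x_t$$

This has exactly the $A \vec x = \vec b$ shape the next section solves, with $A = I - \Delta t\, M$ and $\vec b = \vec x_t$; when $A$ turns out to be symmetric and positive-definite, as it does for many physical systems, the conjugate gradient method below applies.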

2 Implementing a sparse matrix solver

This section presents the conjugate gradient method. It belongs to the class of solvers that is generally fastest for the kinds of differential equations we encounter in graphics. Its general process runs as follows:

  1. Start with an initial guess at a solution of the equation $A \vec x = \vec b$.
  2. Find the gradient (i.e. direction of maximal change) of the error (i.e. absolute magnitude of $A \vec x - \vec b$).
  3. Tweak that gradient so that it is conjugate (with respect to $A$) with every previous gradient we’ve considered. For symmetric positive-definite matrices, this tweak will make the method converge on the correct answer much more quickly than a simple gradient descent method.
  4. Move the guessed solution along that tweaked gradient to minimize error.
  5. Repeat.

Given a linear system $A \vec x = \vec b$, where $A$ is sparse, symmetric, and positive-definite:

  1. Store $A$ well. The operation we’ll do with $A$ is $A \vec v$ for various $\vec v$, so we need a storage format that makes multiplication by a vector fast. LIL is the easiest such storage technique to code: a matrix is a list of rows, and each row is a list of (column index, value) pairs for the non-zero entries in that row (a small code sketch of this appears after this list).

  2. Optionally, precondition the system. Instead of solving $A \vec x = \vec b$, solve $E^{-1}A(E^{-1})^T \vec y = E^{-1}\vec b$ with $\vec y = E^T \vec x$; substituting $\vec y$ back in gives $E^{-1}A\vec x = E^{-1}\vec b$, so the two systems have the same solution $\vec x$.

    Preconditioning is a topic we’ll not really discuss in this class. A good $E$ can make the algorithm converge in fewer steps, but also adds extra up-front work that becomes more involved for better choices of $E$; that means picking $E$ is something of a balancing act, one more involved than we can adequately cover here.

  3. Use the following algorithm which, given an initial estimate $\vec x$, iteratively improves it (a runnable sketch in code appears after this list):

    1. $\vec r = \vec b - A \vec x$ // the residual vector; stores the error of the current estimate

    2. $\vec p = \vec r$ // the direction we’ll move $\vec x$ in to improve it

    3. $\rho = \vec r \cdot \vec r$ // the squared magnitude of $\vec r$

    4. repeat until $\rho$ is sufficiently small:

      1. $\vec a_p = A \vec p$

      2. $\alpha = \dfrac{\rho}{\vec p \cdot \vec a_p}$ // how far to move in the $\vec p$ direction

      3. $\vec x \mathrel{+}= \alpha \vec p$ // move

      4. $\vec r \mathrel{-}= \alpha \vec a_p$ // update the error

      5. $\rho' = \vec r \cdot \vec r$

      6. $\beta = \dfrac{\rho'}{\rho}$ // used to ensure each direction is conjugate with all previous directions

      7. $\rho := \rho'$ // update the error squared magnitude

      8. $\vec p \mathrel{\times}= \beta$ // update the direction for the next step

      9. $\vec p \mathrel{+}= \vec r$ // update the direction for the next step
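
To make the LIL storage from item 1 concrete, here is a minimal Python sketch (the helper names `lil_from_dense` and `matvec` are illustrative, not from the notes): a matrix is a list of rows, each row a list of (column index, value) pairs, and multiplying by a vector only touches the non-zero entries.

```python
# A LIL (list-of-lists) sparse matrix: one list per row, each holding
# (column index, value) pairs for that row's non-zero entries.

def lil_from_dense(dense):
    """Build LIL storage from a dense list-of-lists matrix."""
    return [[(j, v) for j, v in enumerate(row) if v != 0.0]
            for row in dense]

def matvec(A_lil, v):
    """Multiply a LIL-stored matrix by a vector, touching only non-zero entries."""
    return [sum(value * v[j] for j, value in row) for row in A_lil]

# Example: a small symmetric positive-definite matrix.
A = lil_from_dense([[4.0, 1.0, 0.0],
                    [1.0, 3.0, 0.0],
                    [0.0, 0.0, 2.0]])
print(matvec(A, [1.0, 2.0, 3.0]))   # prints [6.0, 7.0, 6.0]
```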
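
And here is the iterative loop from item 3 transcribed into Python as a hedged sketch: it assumes $A$ is a symmetric positive-definite matrix stored in LIL form (a list of rows of (column index, value) pairs), and the stopping parameters `tol` and `max_iters` are illustrative choices rather than part of the pseudocode above.

```python
def conjugate_gradient(A_lil, b, x, tol=1e-10, max_iters=1000):
    """Iteratively improve the estimate x so that A x ~= b.

    A_lil is a sparse symmetric positive-definite matrix in LIL form
    (a list of rows, each a list of (column index, value) pairs);
    b and x are plain lists of floats.  Returns the improved x.
    """
    def matvec(v):
        return [sum(value * v[j] for j, value in row) for row in A_lil]
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    r = [bi - axi for bi, axi in zip(b, matvec(x))]   # residual: b - A x
    p = list(r)                                       # search direction
    rho = dot(r, r)                                   # squared magnitude of r

    for _ in range(max_iters):
        if rho < tol:                                 # residual small enough: done
            break
        ap = matvec(p)                                # a_p = A p
        alpha = rho / dot(p, ap)                      # how far to move along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]      # x += alpha p
        r = [ri - alpha * api for ri, api in zip(r, ap)]   # r -= alpha a_p
        rho_new = dot(r, r)
        beta = rho_new / rho                          # keeps directions A-conjugate
        rho = rho_new
        p = [ri + beta * pi for ri, pi in zip(r, p)]       # p = r + beta p
    return x

# Example: a small 3x3 SPD system, solved from an all-zero initial guess.
A = [[(0, 4.0), (1, 1.0)], [(0, 1.0), (1, 3.0)], [(2, 2.0)]]
b = [1.0, 2.0, 3.0]
print(conjugate_gradient(A, b, [0.0, 0.0, 0.0]))  # roughly [0.0909, 0.6364, 1.5]
```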