2 3D transformations

Consider a 3D object defined as a set of homogeneous points, including $\vec p$ ; and a set of homogeneous surface normals – that is, directions that are normal to (perpendicular to) the surface at a given point, including $\vec n$ . By definition of points, $p_w = 1$ ; by the definition of directions, $n_w = 0$ .

2.1 Translation

We can move the object to a new location by multiplying every point and surface normal by a translation matrix. By construction, the translation matrix will have no impact on the surface normals.

To move by $\Delta x$ along the $x$ axis, $\Delta y$ along the $y$ axis, and $\Delta z$ along the $z$ axis, we use $\begin{bmatrix}1&0&0&\Delta x \\ 0&1&0&\Delta y \\ 0&0&1&\Delta z \\ 0&0&0&1\end{bmatrix} \vec p$

JavaScript code to create this matrix

const m4trans = (dx,dy,dz) => new Float32Array([1,0,0,0, 0,1,0,0, 0,0,1,0, dx,dy,dz,1])

This matrix has the same effect as adding $(\Delta x, \Delta y, \Delta z, 0)$ to every point and doing nothing to the surface normals. However, because it is a matrix we can combine it with other transformations.

If $T$ is a translation matrix, $T^{-1}$ is found by replacing each value $v$ on the first three rows of the last column with $-v$ .

$\begin{bmatrix}1&0&0&\Delta x \\ 0&1&0&\Delta y \\ 0&0&1&\Delta z \\ 0&0&0&1\end{bmatrix} \begin{bmatrix}1&0&0&-\Delta x \\ 0&1&0&-\Delta y \\ 0&0&1&-\Delta z \\ 0&0&0&1\end{bmatrix} = \begin{bmatrix}1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1\end{bmatrix}$

Surface normals (and other offsets and directions) should not change when an object is translated. This is handled automatically if the surface normal has $w=0$ :

$\begin{bmatrix}n_x \\ n_y \\ n_z \\ 0\end{bmatrix} = \begin{bmatrix}1&0&0&\Delta x \\ 0&1&0&\Delta y \\ 0&0&1&\Delta z \\ 0&0&0&1\end{bmatrix} \begin{bmatrix}n_x \\ n_y \\ n_z \\ 0\end{bmatrix}$

2.2 Rotation

We can rotate the object about the origin by multiplying every point and surface normal by a rotation matrix.

To rotate $\theta$ degrees about the $x$ axis, we use $\begin{bmatrix}1&0&0&0 \\ 0&\cos(\theta)&-\sin(\theta)&0 \\ 0&\sin(\theta)&\cos(\theta)&0 \\ 0&0&0&1\end{bmatrix} \begin{bmatrix}\vec p & \vec n \end{bmatrix}$

To rotate $\theta$ degrees about the $y$ axis, we use $\begin{bmatrix}\cos(\theta)&0&\sin(\theta)&0 \\ 0&1&0&0 \\ -\sin(\theta)&0&\cos(\theta)&0 \\ 0&0&0&1\end{bmatrix} \begin{bmatrix}\vec p & \vec n \end{bmatrix}$

To rotate $\theta$ degrees about the $z$ axis, we use $\begin{bmatrix}\cos(\theta)&-\sin(\theta)&0&0 \\ \sin(\theta)&\cos(\theta)&0&0 \\ 0&0&1&0 \\ 0&0&0&1\end{bmatrix} \begin{bmatrix}\vec p & \vec n \end{bmatrix}$

If $R$ is a matrix representing a rotation about the origin, $R^{-1} = R^{T}$ .

$\begin{bmatrix}1&0&0&0 \\ 0&\cos(\theta)&-\sin(\theta)&0 \\ 0&\sin(\theta)&\cos(\theta)&0 \\ 0&0&0&1\end{bmatrix} \begin{bmatrix}1&0&0&0 \\ 0&\cos(\theta)&\sin(\theta)&0 \\ 0&-\sin(\theta)&\cos(\theta)&0 \\ 0&0&0&1\end{bmatrix} = \begin{bmatrix}1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1\end{bmatrix}$

We also have guides to some fancier rotations:

Rotate

\theta

around axis

\vec r

Normalize $\vec r$ and set $c = \cos(\theta)$ and $s = \sin(\theta)$ .
The matrix is $\begin{bmatrix} r_x^2(1-c)+c & r_x r_y (1-c) - r_z s & r_x r_z (1-c) + r_y s \\ r_x r_y (1-c) + r_z s & r_y^2(1-c)+c & r_y r_z (1-c) - r_x s \\ r_x r_z (1-c) - r_y s & r_y r_z (1-c) + r_x s & r_z^2(1-c)+c \\ \end{bmatrix}$

Rotate as little as possible to have rotated

\vec a

line up with

\vec b

Normalize $\vec a$ and $\vec b$ .
Let $\vec r = \vec a \times \vec b$ and $s = \|\vec r\|$ .
Let $c = \vec a \cdot \vec b$ .
Use that $\vec r / s$ , $c$ , and $s$ in the rotate around axis $\vec r$ formula.

Rotate so that one axis now points towards

\vec p

and another points as close to

\vec q

as possible

Set the column of the matrix that corresponds to the axis that should point toward $\vec p$ be a normalized version of $\vec p$ .
Set the column of the matrix that shouldn’t point towards either of the points be a normalized version of $\vec p \times \vec q$ .
Set the remaining column to be the cross product of the other two.
Negate an axis’s column if the negative instead of positive axis should point toward a point.
Negate the unconstrained column if $(\vec x \times \vec y) \cdot \vec z < 0$ .

Rotate so

\vec p

lies on one axis and

\vec q

as close to another axis as possible

This is the inverse (which for rotations is simply the transpose) of the rotation that points the axes towards $\vec p$ and $\vec q$ .

Rotate a forward vector to lie along the

-z

axis and an up vector to lie as close to the

+y

axis as the forward allows

This is special case of the previous item, but one often wanted as part of the view matrix.

Let $\vec f$ be a normalized version of the forward vector
Let $\vec r$ be a normalized version of forward × up
Let $\vec u = \vec r \times \vec f$
The matrix is $\begin{bmatrix} r_x&r_y&r_z&0\\ u_x&u_y&u_z&0\\ -f_x&-f_y&-f_z&0\\ 0&0&0&1\\ \end{bmatrix}$

In many cases, this is specified by an eye location and a center location, where forward = center − eye, and followed by a translate of −eye, then a perspective projection with the $\pm$ being $-$ , not $+$ .

JavaScript code to create some of these matrices

const m4rotX = (ang) => { // around x axis
  let c = Math.cos(ang), s = Math.sin(ang);
  return new Float32Array([1,0,0,0, 0,c,s,0, 0,-s,c,0, 0,0,0,1]);
}
const m4rotY = (ang) => { // around y axis
  let c = Math.cos(ang), s = Math.sin(ang);
  return new Float32Array([c,0,-s,0, 0,1,0,0, s,0,c,0, 0,0,0,1]);
}
const m4rotZ = (ang) => { // around z axis
  let c = Math.cos(ang), s = Math.sin(ang);
  return new Float32Array([c,s,0,0, -s,c,0,0, 0,0,1,0, 0,0,0,1]);
}
const m4fixAxes = (f, up) => { // f to -z, up to near +y
  f = normalize(f)
  let r = normalize(cross(f,up))
  let u = cross(r,f)
  return new Float32Array([
    r[0],u[0],-f[0],0,
    r[1],u[1],-f[1],0,
    r[2],u[2],-f[2],0,
    0,0,0,1
  ])
}

See also the page on skeletal animation.

2.3 Scaling

We can scale the object about the origin along principle axes by multiplying every point and surface normal by a scaling matrix.

To scale by $s_x$ in the $x$ axis, $s_y$ in the $y$ axis, and $s_z$ in the $z$ axis, we use $\begin{bmatrix}s_x&0&0 \\ 0&s_y&0&0\\ 0&0&s_z&0 \\ 0&0&0&1\end{bmatrix} \vec p$

JavaScript code to create this matrix

const m4scale = (sx,sy,sz) => new Float32Array([sx,0,0,0, 0,sy,0,0, 0,0,sz,0, 0,0,0,1])

If $S$ is a matrix representing a scaling along the principle axes, $S^{-1}$ is found by replacing each value $v$ on the main diagonal with $1 / v$ .

$\begin{bmatrix}s_x&0&0 \\ 0&s_y&0&0\\ 0&0&s_z&0 \\ 0&0&0&1\end{bmatrix} \begin{bmatrix}{1 / s_x}&0&0 \\ 0&{1 / s_y}&0&0\\ 0&0&{1 / s_z}&0 \\ 0&0&0&1\end{bmatrix} = \begin{bmatrix}1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1\end{bmatrix}$

Scaling surface normals by the same matrix as points messes up the perpendicular property of the normals. To keep normals perpendicular to the surface, if we use a scaling matrix $S$ to create a new point $S \vec p$ , then we use its inverse to create a new surface normal $S^{-1} \vec n$ : $\begin{bmatrix}{1 / s_x}&0&0 \\ 0&{1 / s_y}&0&0\\ 0&0&{1 / s_z}&0 \\ 0&0&0&1\end{bmatrix} \vec n$

2.4 Other origins and axes

The rotation and scaling matrices above operate along the primary axes centered on the origin. We can construct matrices that use other origins and axes be multiplying several matrices together.

To rotate or scale about an origin $\vec o$ instead of the usual origin point $(0,0,0,1)$ , we use two helper matrices: $M$ does the rotate or scale about the origin and $T_o$ translates $\vec o$ to $(0,0,0,1)$ . The matrix that rotates or scales about $\vec o$ is then $T_o^{-1} M T_o$ .

To rotate around or scale along an axis passing through the origin and point $\vec a$ , we use several helper matrices: $R_z$ rotates $\vec a$ about the z axis until it lies in the $y=0$ plane; $R_y$ rotates $R_z \vec a$ about the y axis until it lies along the $x$ axis; and $M$ does the rotate around or scale along or around $x$ axis. The rotate around or scale along an axis passing through $\vec a$ is then $R_z^{-1} R_y^{-1} M R_y R_z$ .

Combining these, our generic matrix using any axis and origin is $T_o^{-1} R_z^{-1} R_y^{-1} M R_y R_z T_o$

The usual view matrix

Given a camera location, location to center in view, and up direction we can make a view matrix $R T$ where

$T$ is a translation matrix that moves the camera location to the origin
$R$ $R$ is a rotation matrix that rotates
- the forward vector (center − camera) to lie along the $−z$ axis
- the up vector to lie as close to the +y axis as the forward allows

JavaScript code to create a view matrix

Assuming we use a $-z$ projection matrix (see below):

const m4view = (eye, center, up) => m4mul(m4fixAxes(sub(center,eye), up), m4trans(-eye[0],-eye[1],-eye[2]))

2.5 Division

Notice that all the matrices given above have the same last row, $\begin{bmatrix}0&0&0&1\end{bmatrix}$ . Matrices with that property are called affine matrices, and the transformations they encode are called affine transformations. All affine transformations can be represented by a combination of translation, rotation, and scaling.

If the bottom row is some other vector $\vec b$ , the impact on point $\vec p$ is the same as dividing by $\vec b \cdot \vec p$ after the transformation encoded in the matrix. Such a division causes lines passing through some fixed point in space to become parallel lines instead, which will be useful in projecting from a 3D perspective to a 2D screen. Because of that application, matrices with a last row different from $\begin{bmatrix}0&0&0&1\end{bmatrix}$ are often called perspective matrices or projection matrices. Because they also cause certain frustums to become rectangular prisms, they are sometimes called frustum matrices.

One of the most common perspective or projection matrices does the following:

Copy $z$ (or in some cases $-z$ into $w$ )
Change $z$ so that $\frac{z}{w} = -1$ when $w = \text{near}$ , and $\frac{z}{w} = 1$ when $w = \text{far}$ ¹¹ WebGL inherits an odd decision from OpenGL: the depth buffer is capped to $[0, 1]$ but the frustum clipping caps $z$ to $[-1, 1]$ . There is a viewport-like transformation done between the vertex and fragment shaders to move one range to the other. Thus the vertex shader assumes a $-1$ -to- $1$ range for $z$ but the fragment shader assumes a $0$ -to- $1$ range for $z$ .

$\begin{bmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&\pm \dfrac{\text{far}+\text{near}}{\text{far}-\text{near}} & \dfrac{2 \cdot \text{far} \cdot \text{near}}{\text{near}-\text{far}}\\ 0&0&\pm 1 & 0\\ \end{bmatrix}$

If the bottom row is some other vector $\vec b$ , the impact on surface normal $\vec n$ is (from a graphics perspective) destructive and meaningless. We never multiply surface normals (nor other offsets or directions) by a perspective matrix.

We also don’t adjust normals to handle display aspect ratios, which are handled by scaling $x$ and $y$ . Additionally, the simple $z$ -to- $w$ rule means we have a 90° field of view in both $x$ and $y$ , which is rarely what we want. Because of that, it is common to see a single matrix applied to positions but not normals which has the form

$\begin{bmatrix} s_x&0&0&0\\ 0&s_y&0&0\\ 0&0&\pm \dfrac{\text{far}+\text{near}}{\text{far}-\text{near}} & \dfrac{2 \cdot \text{far} \cdot \text{near}}{\text{near}-\text{far}}\\ 0&0&\pm 1 & 0\\ \end{bmatrix}$

where $s_x = s_y\dfrac{\text{screen height}}{\text{screen width}}$ and $s_y = \cot\left(\dfrac{\text{field of view in y}}{2}\right)$ .

JavaScript code to create this matrix

Assuming we want $-z$ as $w$ , the more common version:

const m4perspNegZ = (near, far, fovy, width, height) => {
  let sy = 1/Math.tan(fovy/2);
  let sx = sy*height/width;
  return new Float32Array([sx,0,0,0, 0,sy,0,0, 0,0,-(far+near)/(far-near),-1, 0,0,(2*far*near)/(near-far),0]);
}

3 Common matrix setups

Objects are commonly defined in a reference coordinate system. A model matrix $M$ moves points in that reference position into their correct location in the scene.

The viewing point may be anywhere in the scene. A view matrix $V$ positions and orients the camera from that arbitrary viewing point to the origin with the viewing direction down either the $+z$ or $-z$ axis.

Because most models are defined around the origin, most model matrices move from the origin to the scene location of a model. But view matrices move from the scene location of the viewer to the origin. These are inverses of each other, a common source of confusion.

Suppose you want to render from the viewpoint of the driver of a vehicle, moving the vehicle model and the viewpoint together. If $A$ is the model matrix of the vehicle then $A^{-1}$ is the corresponding view matrix (assuming the viewer is positioned at the origin of the vehicle’s coordinate space).

A projection matrix $P$ changes $z$ and $w$ to provide perspective and scales $x$ and $y$ to control field of view and aspect ratio. We pick the signs in the matrix so that $w>0$ in the viewing direction.

We expect

$P$ to change only if the screen is resized
$V$ to change every frame in which the viewer moves
$M$ to be different for each object in the scene

Assuming neither $M$ nor $V$ squishes or stretches things, every position $\vec p$ and normal $\vec n$ will be multiplied by both $M$ and $V$ . We can save some CPU-GPU communication and some per-vertex work by combining these into a model-view matrix $(V \cdot M)$ . The vertex shader will multiply every position and normal by this combined model-view matrix.

Positions will also be multiplied by $P$ , but any lighting information, including normals and usually also positions, will not. Thus $P$ will be supplied separately.

Thus we might expect something like the following

CPU setup routine:
    set P based on the field of view and aspect ratio
    send P to the GPU's projection uniform

CPU draw routine:
    set V from viewing position
    for each object:
        set mv as V * M
        send mv to the GPU's modelview uniform
        send the object's geometry data to the GPU

GPU vertex shader
    set output view_normal as modelview * input normal
    set output view_position as modelview * input position
    set gl_Position as projection * view_position

	matrix	Affine	uniform	Rigid	Linear
straight	✓	✓	✓	✓	✓
parallel		✓	✓	✓	✓
angles			✓	✓
size				✓
origin					✓

1 Homogeneous vectors

2 3D transformations

2.1 Translation

2.2 Rotation

2.3 Scaling

2.4 Other origins and axes

2.5 Division

3 Common matrix setups

4 Categories of motion