Library/Graphics/Matrices

Overview
In computer graphics, matrices are useful mathematical objects for encapsulating transformations: for example moving, rotating, and scaling points and vectors.

Uses include:
 * Scaling and rotating an artist's model to fit the world scene
 * Computing a camera's projection of the 3D world data onto a 2D display
 * Projecting a point on an object's surface into a shadow map to determine lighting
 * Storing a series of joint transformations in an animated character skeleton
 * etc.

A matrix, usually with 4x4 dimensions, can be used for all of these general operations, making it a versatile tool. Furthermore, a matrix can store the result of a sequential set of matrix operations (for example, moving then rotating a point), which further adds to matrices' usefulness as a compact representation for otherwise complex operations.

In pure mathematics, matrices can be generalized to any dimensions; however, in computer graphics, the most frequently encountered matrix (for accomplishing some of the tasks mentioned above) is the 4x4 matrix. In programming languages, the matrix is often stored as a 4x4 array of floating point numbers.

Additional Information

 * Wolfram MathWorld: Matrix
 * Wikipedia: Matrix

Notation


The traditional mathematical notation for matrices differs from the traditional conventions in computer graphics programming. In mathematical notation:

 * Matrix indices are 1-based
 * Matrix dimensions are specified as N rows by M columns

The differences introduce confusion as the API designer of a matrix library is left with the choice of whether to adopt the traditional mathematical notation, traditional programming notation, or a mix of the two approaches. Since there is no universal standard, this requires the programmer to identify the standard used within any particular library in order to use it correctly.

Example
Consider an element access with the indices (2, 3). Does it return the element at the 2nd row, 3rd column (mathematical notation); the 3rd column, 4th row ('standard' 2D array notation in graphics); or the 3rd row, 4th column (a mix of the approaches, using row-first but zero-based indexing)? There is no right answer: it depends on the implementation of the library.

LxEngine uses a zero-based, row-then-column indexing scheme. E.g. the indices (0, 3) refer to the element in the first row, fourth column.
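A minimal sketch of a zero-based, row-then-column accessor follows. The `Matrix4` class and its `operator()` are hypothetical names chosen for illustration; they are not taken from LxEngine's actual API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical matrix type; the name and interface are illustrative only.
class Matrix4
{
public:
    // Zero-based, row-then-column access: m(0, 3) is the first row, fourth column.
    float& operator()(std::size_t row, std::size_t col)
    {
        assert(row < 4 && col < 4);
        return mElements[row][col];
    }

    float operator()(std::size_t row, std::size_t col) const
    {
        assert(row < 4 && col < 4);
        return mElements[row][col];
    }

private:
    float mElements[4][4] = {};   // all elements start at zero
};
```

With this sketch, `m(0, 3) = 5.0f;` writes to the first row, fourth column, and `m(3, 0)` is left untouched.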

General Matrix Multiplication
An SxM matrix can be multiplied by any MxT matrix, resulting in an SxT matrix.

 * Rule: if the inner dimensions of two matrices match, they can be multiplied and will result in a new matrix with dimensions equal to the outer dimensions of the multiplicands.

In the equation:

$$C = AB$$

$$C_{(i,j)} = A_{row\,i}\: \cdot \: B_{col\,j}$$

The element at the i-th row and j-th column of C will be equal to the dot product of the i-th row of A and the j-th column of B.
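The rule can be sketched in C++ as follows. The `Matrix` alias and the `multiply` function are hypothetical names used only for this illustration; the template parameters encode the dimensions, so mismatched inner dimensions fail to compile.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-size matrix: R rows by C columns, indexed m[row][col], zero-based.
template <std::size_t R, std::size_t C>
using Matrix = std::array<std::array<float, C>, R>;

// Multiplies an SxM matrix by an MxT matrix, producing an SxT matrix.
template <std::size_t S, std::size_t M, std::size_t T>
Matrix<S, T> multiply(const Matrix<S, M>& a, const Matrix<M, T>& b)
{
    Matrix<S, T> c{};   // zero-initialized result
    for (std::size_t i = 0; i < S; ++i)
        for (std::size_t j = 0; j < T; ++j)
            for (std::size_t k = 0; k < M; ++k)
                c[i][j] += a[i][k] * b[k][j];   // dot product of row i of a and column j of b
    return c;
}
```

For example, multiplying a 2x3 matrix by a 3x2 matrix yields a 2x2 matrix.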

Column versus Row Vectors
Multiplying a generalized matrix with a generalized 4-tuple is a well-defined, unambiguous operation: C(i,j) is equal to the dot product of the i-th row of A and the j-th column of B. There is exactly one way to do it correctly and there are no API design choices involved.

That said, how geometric transformations are stored in a matrix is not unambiguous. When a matrix is used as a transformation, each element takes on a specific semantic meaning about the coordinate system, and that meaning depends on whether points and vectors are treated as column vectors or row vectors. The best explanation is likely by example:

Translating a Point
Translation, i.e. moving a point p to point p', is accomplished by adding a vector t to that point. How can this operation be represented as a matrix? There are two possible matrix representations depending on whether the point is treated as a row vector or a column vector.

Note how the position of the translation elements (tx, ty, tz) within the matrix depends on the choice of vector representation.

Row Vector Representation
$$ \begin{bmatrix} p_{x} & p_{y} & p_{z} & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ t_{x} & t_{y} & t_{z} & 1 \end{bmatrix} = \begin{bmatrix} p_{x} + t_{x} & p_{y} + t_{y} & p_{z} + t_{z} & 1 \end{bmatrix} $$

Column Vector Representation
$$ \begin{bmatrix} 1 & 0 & 0 & t_{x}\\ 0 & 1 & 0 & t_{y}\\ 0 & 0 & 1 & t_{z}\\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} p_{x} \\ p_{y} \\ p_{z} \\ 1 \end{bmatrix} = \begin{bmatrix} p_{x} + t_{x}\\ p_{y} + t_{y}\\ p_{z} + t_{z}\\ 1 \end{bmatrix} $$
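The two representations can be compared in code. The sketch below uses hypothetical type and function names (`Vec4`, `Mat4`, `mulRowVector`, and so on) that are assumptions for illustration only. Both conventions produce the same translated point as long as the matrix layout matches the multiplication order.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;   // indexed m[row][col], zero-based

// Row-vector convention: result = p * m (p treated as a 1x4 matrix).
Vec4 mulRowVector(const Vec4& p, const Mat4& m)
{
    Vec4 r{};
    for (std::size_t col = 0; col < 4; ++col)
        for (std::size_t k = 0; k < 4; ++k)
            r[col] += p[k] * m[k][col];
    return r;
}

// Column-vector convention: result = m * p (p treated as a 4x1 matrix).
Vec4 mulColumnVector(const Mat4& m, const Vec4& p)
{
    Vec4 r{};
    for (std::size_t row = 0; row < 4; ++row)
        for (std::size_t k = 0; k < 4; ++k)
            r[row] += m[row][k] * p[k];
    return r;
}

// Translation matrix for row vectors: (tx, ty, tz) occupies the last row.
Mat4 translationForRowVectors(float tx, float ty, float tz)
{
    return {{ {1.0f, 0.0f, 0.0f, 0.0f},
              {0.0f, 1.0f, 0.0f, 0.0f},
              {0.0f, 0.0f, 1.0f, 0.0f},
              {tx,   ty,   tz,   1.0f} }};
}

// Translation matrix for column vectors: (tx, ty, tz) occupies the last column.
Mat4 translationForColumnVectors(float tx, float ty, float tz)
{
    return {{ {1.0f, 0.0f, 0.0f, tx},
              {0.0f, 1.0f, 0.0f, ty},
              {0.0f, 0.0f, 1.0f, tz},
              {0.0f, 0.0f, 0.0f, 1.0f} }};
}
```

Mixing the two, for example passing the row-vector matrix to `mulColumnVector`, is still a valid matrix multiplication but no longer performs a translation.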

Example: Translation
Consider the case of creating a translation matrix that adds (1, 2, 3) to the x, y, z coordinates of the point. How should the matrix be set up?

Pre-multiply or Post-multiply
First, consider how a general matrix and a general vector need to be multiplied together. Assuming that the matrix is 4x4, the vector must be either 4x1 or 1x4. Therefore, the matrix multiplication can be written as $$x' = xA$$, where x is treated as a 1x4 matrix or 'row vector' that uses 'pre-multiplication' with the matrix; the resulting vector is also 1x4. Alternately, the equation can be written as $$x' = Ax$$, where x is treated as a 4x1 matrix or 'column vector' that uses 'post-multiplication' with the matrix and results in x' as a 4x1 column vector.

(A quick aside: in matrix notation, an NxM matrix specifies a matrix with N rows and M columns. This differs from the traditional computer graphics two-dimensional array, where N usually specifies the width and M the height.)

Second, consider the case where x is equal to (1, 0, 0, 1). What is the result of the multiplication?

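In general, for a product R = PQ, the matrix multiplication algorithm computes each element of the result as the dot product of a row of the left operand and a column of the right operand:

$$R_{(i,j)} = P_{row\,i}\: \cdot \: Q_{col\,j}$$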

This means that with pre-multiplication (row-vector), the result has only one row, R(1, 1...4). The elements of the result are the dot products of x with each column of A. That means the translation factor for the first coordinate must be located in row 4, column 1, or A(4, 1).

With column-vectors, the result is a column vector R(1...4, 1). The translation factor for the first coordinate must be in A(1, 4).

The choice of where the translation factor is put determines how the matrices must be multiplied. Note again: the general matrix multiplication algorithm itself has not changed. It is only that, since certain elements of the vector and matrix have a specific semantic meaning, the multiplication must be set up so that those values align and X and Y are not confused.

LxEngine uses a row-vector representation where operations are written in the pre-multiply form.
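As a worked illustration under the row-vector convention, take the translation (1, 2, 3) and the vector (1, 0, 0, 1) from the example above. The translation values sit in the last row of the matrix, and each element of the result is the dot product of the vector with one column:

$$ \begin{bmatrix} 1 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 1 & 2 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 2 & 3 & 1 \end{bmatrix} $$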

Alternate Approaches
Both fortunately and unfortunately, a column-vector can be correctly "pre-multiplied" by a matrix by simply transposing the vector before the multiplication and transposing the result afterward. This is fortunate because it affords a lot of flexibility. It is unfortunate because it allows the system to create 'correct' but confusing operations on graphics data.
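For reference, the standard identity relating the two forms is shown below, with A as the matrix and x as a column vector. It transposes the matrix as well as the vector, so reusing a matrix unchanged under the other convention applies the transposed transformation rather than the original one:

$$ (A x)^{T} = x^{T} A^{T} $$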

Column-major Order vs. Row-major Order
Another consideration in the implementation of matrices in C++ is the memory layout of the elements. Despite the similar terminology, the ordering of the elements is not necessarily related to the choice of treating transformations via column-vector or row-vector semantics. The column-major or row-major ordering of the matrix simply determines if the first four elements in the 16-element block of memory represent the matrix's first column or its first row.

In a good, encapsulated class, the internal memory layout should not matter. In reality, however, matrix data is often passed to external libraries such as OpenGL or DirectX via a pointer, so the internal memory layout needs to be communicated to the third-party API. Fortunately, a row-major ordered matrix can be treated as a column-major ordered matrix (and vice-versa) by simply taking the transpose of the matrix.

LxEngine uses a column-major order for internal memory layout. In other words, the first four elements of the matrix represent the first column of the matrix.
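The difference between the two layouts comes down to how a (row, column) pair maps to an offset in the 16-element block. The sketch below uses a hypothetical `ColumnMajorMatrix4` type and helper functions; these names are assumptions for illustration and do not describe LxEngine's actual classes.

```cpp
#include <array>
#include <cstddef>

// Offset of the zero-based (row, col) element in a 16-element row-major block.
inline std::size_t rowMajorIndex(std::size_t row, std::size_t col)
{
    return row * 4 + col;
}

// Offset of the zero-based (row, col) element in a 16-element column-major block.
inline std::size_t columnMajorIndex(std::size_t row, std::size_t col)
{
    return col * 4 + row;
}

// Hypothetical 4x4 matrix stored as 16 contiguous floats in column-major order.
struct ColumnMajorMatrix4
{
    std::array<float, 16> data{};

    // Elements of one column are adjacent in memory.
    float& at(std::size_t row, std::size_t col) { return data[columnMajorIndex(row, col)]; }
    float  at(std::size_t row, std::size_t col) const { return data[columnMajorIndex(row, col)]; }
};
```

Because `columnMajorIndex(r, c)` equals `rowMajorIndex(c, r)`, reading a column-major block as if it were row-major yields the transpose of the matrix.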

Also See

 * Wikipedia: Row-Major Order
 * OpenGL APIs: glLoadMatrix / glLoadTransposeMatrix

Rotation
Rotation about the x-axis, y-axis, and z-axis for column-vectors:

$$ \begin{alignat}{1} R_x(\theta) &= \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos \theta & -\sin \theta \\[3pt] 0 & \sin \theta & \cos \theta \\[3pt] \end{bmatrix} \\[6pt] R_y(\theta) &= \begin{bmatrix} \cos \theta & 0 & \sin \theta \\[3pt] 0 & 1 & 0 \\[3pt] -\sin \theta & 0 & \cos \theta \\ \end{bmatrix} \\[6pt] R_z(\theta) &= \begin{bmatrix} \cos \theta & -\sin \theta & 0 \\[3pt] \sin \theta & \cos \theta & 0\\[3pt] 0 & 0 & 1\\ \end{bmatrix}. \end{alignat} $$

These matrices can easily be derived from the two-dimensional case:

$$x' = x \cos \theta - y \sin \theta\,$$ $$y' = x \sin \theta + y \cos \theta\,$$
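As a minimal sketch, the z-axis rotation for column vectors can be written directly from the matrix above. The `Vec3`, `Mat3`, `rotationZ`, and `transform` names are hypothetical and used only for this illustration; the angle is assumed to be in radians.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<float, 3>;
using Mat3 = std::array<std::array<float, 3>, 3>;   // indexed m[row][col], zero-based

// Rotation about the z-axis by theta radians, for column vectors.
Mat3 rotationZ(float theta)
{
    const float c = std::cos(theta);
    const float s = std::sin(theta);
    return {{ {c,    -s,   0.0f},
              {s,    c,    0.0f},
              {0.0f, 0.0f, 1.0f} }};
}

// Column-vector convention: result = m * v.
Vec3 transform(const Mat3& m, const Vec3& v)
{
    Vec3 r{};
    for (std::size_t row = 0; row < 3; ++row)
        for (std::size_t k = 0; k < 3; ++k)
            r[row] += m[row][k] * v[k];
    return r;
}
```

Rotating the unit x-axis vector by a quarter turn gives approximately the unit y-axis vector, up to floating point error.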

Also See

 * Wikipedia: Rotation Matrix

Numerical Accuracy
Matrix operations involve many floating point operations. Multiplying several matrices together and then transforming points can accumulate a significant amount of floating point error. Ways to track and minimize that error include choosing an appropriate epsilon for comparisons, paying attention to the order of operations, batching operations, and deferring transformations based on coordinate systems.
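One common way to apply an epsilon is to compare values with a tolerance instead of exact equality. The sketch below is an assumption for illustration: `nearlyEqual` is a hypothetical helper, and the default tolerance is an arbitrary value that would need tuning for the scale of the data.

```cpp
#include <cmath>

// Returns true if a and b differ by no more than epsilon.
// The default tolerance is an arbitrary illustrative value, not a recommendation.
inline bool nearlyEqual(float a, float b, float epsilon = 1e-5f)
{
    return std::fabs(a - b) <= epsilon;
}
```

An absolute tolerance like this is simplest when values are near unit scale; for very large or very small coordinates, a tolerance scaled to the magnitude of the operands may be more appropriate.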

Variations

 * 3x4 / 4x3 matrices
 * SRT notation
 * special case imp pattern