The Matrix has You

So, now that we can put the world into perspective, let's do it the right way: the "needlessly overcomplicated for the time being but will make sense in a few tutorials" way.

First, let us look at the system of equations used to compute clip coordinates from camera space. Given that S is the frustum scale factor, N is the zNear and F is the zFar, we get the following four equations.

Equation 4.3. Camera to Clip Equations
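
As a reconstruction (inferred from the matrix values that the program code later in this section fills in, with the subscript notation assumed here), the four equations can be written with each term placed in its own column:

```latex
\begin{array}{rclll}
X_{clip} &=& X_{camera}\,S & & \\
Y_{clip} &=& & Y_{camera}\,S & \\
Z_{clip} &=& & & Z_{camera}\,\frac{N + F}{N - F} + \frac{2NF}{N - F} \\
W_{clip} &=& & & -Z_{camera}
\end{array}
```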


The odd spacing is intentional. For laughs, let's add a bunch of meaningless terms that do not change the equations, but start to develop an interesting pattern:

Equation 4.4. Camera to Clip Expanded Equations
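
A sketch of the expanded form, assuming the same S, N and F as above and writing every coefficient out explicitly, zeros included:

```latex
\begin{array}{rcccccccc}
X_{clip} &=& (S)\,X_{camera} &+& (0)\,Y_{camera} &+& (0)\,Z_{camera} &+& (0)\,W_{camera} \\
Y_{clip} &=& (0)\,X_{camera} &+& (S)\,Y_{camera} &+& (0)\,Z_{camera} &+& (0)\,W_{camera} \\
Z_{clip} &=& (0)\,X_{camera} &+& (0)\,Y_{camera} &+& \left(\frac{N + F}{N - F}\right) Z_{camera} &+& \left(\frac{2NF}{N - F}\right) W_{camera} \\
W_{clip} &=& (0)\,X_{camera} &+& (0)\,Y_{camera} &+& (-1)\,Z_{camera} &+& (0)\,W_{camera}
\end{array}
```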


What we have here is what is known as a linear system of equations. The equations can be specified as a series of coefficients (the numbers being multiplied by the XYZW values) which are multiplied by the input values (XYZW) to produce the single output. Each individual output value is a linear combination of all of the input values. In our case, there just happen to be a lot of zero coefficients, so the output values in this particular case only depend on a few input values.

You may be wondering at the multiplication of the additive term of Zclip's value by the camera space W. Well, our input camera space position's W coordinate is always 1. So performing the multiplication is valid, so long as this continues to be the case. Being able to do what we are about to do is part of the reason why the W coordinate exists in our camera-space position values (the perspective divide is the other).

We can re-express any linear system of equations using a special kind of formulation. You may recognize this reformulation, depending on your knowledge of linear algebra:

Equation 4.5. Camera to Clip Matrix Transformation
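
Collecting those coefficients into a block gives the following form. This is a reconstruction under the same assumptions about notation as before:

```latex
\begin{bmatrix} X \\ Y \\ Z \\ W \end{bmatrix}_{clip}
=
\begin{bmatrix}
S & 0 & 0 & 0 \\
0 & S & 0 & 0 \\
0 & 0 & \frac{N + F}{N - F} & \frac{2NF}{N - F} \\
0 & 0 & -1 & 0
\end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ W \end{bmatrix}_{camera}
```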


The two long vertical columns of XYZW labeled clip and camera are 4-dimensional vectors; namely the clip and camera space position vectors. The larger block of numbers is a matrix. You may not be familiar with matrix math; if not, it will be explained presently.

Generically speaking, a matrix is a two dimensional block of numbers (the generalization to more than 2 dimensions is loosely called a tensor). Matrices are very common in computer graphics. Thus far, we have been able to get along without them. As we get into more detailed object transformations, however, we will rely more and more on matrices to simplify matters.

In graphics work, we typically use 4x4 matrices; that is, matrices with 4 columns and 4 rows. This is due to the nature of graphics work: most of the things that we want to use matrices for are either 3 dimensional or 3 dimensional with an extra coordinate of data. Our 4D positions are just 3D positions with a 1 added to the end.

The operation depicted above is a vector-matrix multiplication. A matrix of dimension nxm (n columns by m rows) can only be multiplied by a vector of dimension n. The result of such a multiplication is a vector of dimension m. Since our matrix in this case is 4x4, it can only be multiplied with a 4D vector and this multiplication will produce a 4D vector.

Matrix multiplication does what the expanded equation example does. For every row in the matrix, each value in that row is multiplied by the corresponding component of the vector. These products are then added together; the sum becomes the single value for that row of the output vector.

Equation 4.6. Vector Matrix Multiplication
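
In general form, the multiplication can be written as below. The entry names m with row and column subscripts, and the lowercase vector components, are notation assumed here for illustration:

```latex
\begin{bmatrix}
m_{11} & m_{12} & m_{13} & m_{14} \\
m_{21} & m_{22} & m_{23} & m_{24} \\
m_{31} & m_{32} & m_{33} & m_{34} \\
m_{41} & m_{42} & m_{43} & m_{44}
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}
=
\begin{bmatrix}
m_{11}x + m_{12}y + m_{13}z + m_{14}w \\
m_{21}x + m_{22}y + m_{23}z + m_{24}w \\
m_{31}x + m_{32}y + m_{33}z + m_{34}w \\
m_{41}x + m_{42}y + m_{43}z + m_{44}w
\end{bmatrix}
```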


This results ultimately in performing 16 floating-point multiplications and 12 floating-point additions. That's quite a lot, particularly compared with our current version. Fortunately, graphics hardware is designed to make these operations very fast. Because each of the multiplications is independent of the others, they could all be done simultaneously, which is exactly the kind of thing graphics hardware does fast. Similarly, the addition operations are partially independent; each row's summation does not depend on the values from any other row. Ultimately, vector-matrix multiplication usually generates only 4 instructions in the GPU's machine language.

We can re-implement the above perspective projection using matrix math rather than explicit math. The MatrixPerspective tutorial does this.

The vertex shader is much simpler in this case:

Example 4.4. MatrixPerspective Vertex Shader

#version 330

layout(location = 0) in vec4 position;
layout(location = 1) in vec4 color;

smooth out vec4 theColor;

uniform vec2 offset;
uniform mat4 perspectiveMatrix;

void main()
{
    vec4 cameraPos = position + vec4(offset.x, offset.y, 0.0, 0.0);
    
    gl_Position = perspectiveMatrix * cameraPos;
    theColor = color;
}

The OpenGL Shading Language (GLSL), being designed for graphics operations, naturally has matrices as basic types. The mat4 is a 4x4 matrix (columns x rows). GLSL has types for all combinations of columns and rows between 2 and 4. Square matrices (matrices where the number of columns and rows are equal) only use one number, as in mat4 above. So mat3 is a 3x3 matrix. If the matrix is not square, GLSL uses notation like mat2x4: a matrix with 2 columns and 4 rows.

Note that the shader no longer computes the values on its own; it is given a matrix with all of the stored values as a uniform. This is simply because there is no need for the shader to compute them. All of the objects in a particular scene will be rendered with the same perspective matrix, so there is no need to waste potentially precious vertex shader time doing redundant computations.

Vector-matrix multiplication is such a common operation in graphics that operator * is used to perform it. So the second line of main multiplies the perspective matrix by the camera position.

Please note the order of this operation. The matrix is on the left and the vector is on the right. Matrix multiplication is not commutative, so M*v is not the same thing as v*M. Normally vectors are considered 1xN matrices (1 column by N rows, where N is the size of the vector). When you put the vector on the left of the matrix, GLSL instead considers it an Nx1 matrix (N columns by 1 row); this is the only way to make the multiplication make sense. This will multiply the single row of the vector with each column of the matrix, summing the results, creating a new vector. This is not what we want to do. We want to multiply rows of the matrix by the vector, not columns of the matrix. Put the vector on the right, not the left.

The program initialization routine has a few changes:

Example 4.5. Program Initialization of Perspective Matrix

offsetUniform = glGetUniformLocation(theProgram, "offset");

perspectiveMatrixUnif = glGetUniformLocation(theProgram, "perspectiveMatrix");

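float fFrustumScale = 1.0f;
float fzNear = 0.5f;
float fzFar = 3.0f;

// Column-major storage: array index = column * 4 + row.
float theMatrix[16];
memset(theMatrix, 0, sizeof(float) * 16);

theMatrix[0] = fFrustumScale;                              // column 0, row 0: X scale
theMatrix[5] = fFrustumScale;                              // column 1, row 1: Y scale
theMatrix[10] = (fzFar + fzNear) / (fzNear - fzFar);       // column 2, row 2: Z coefficient
theMatrix[14] = (2 * fzFar * fzNear) / (fzNear - fzFar);   // column 3, row 2: Z additive term
theMatrix[11] = -1.0f;                                     // column 2, row 3: Wclip = -Zcamera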

glUseProgram(theProgram);
glUniformMatrix4fv(perspectiveMatrixUnif, 1, GL_FALSE, theMatrix);
glUseProgram(0);

A 4x4 matrix contains 16 values. So we start by creating an array of 16 floating-point numbers called theMatrix. Since most of the values are zero, we can just set the whole thing to zero. This works because IEEE 32-bit floating-point numbers represent a zero as 4 bytes that all contain zero.
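
As a small sketch of that claim, the first function below checks that every entry compares equal to 0.0f after the memset. The second shows an alternative, value-initialization, which zeroes the array without relying on the byte representation. Both function names are hypothetical and exist only for this illustration.

```cpp
#include <cstring>

// Returns true if every entry compares equal to 0.0f after zeroing the bytes.
bool memsetProducesZeroFloats()
{
    float theMatrix[16];
    // sizeof(theMatrix) is the full 64 bytes here because theMatrix is an array, not a pointer.
    std::memset(theMatrix, 0, sizeof(theMatrix));
    for (float value : theMatrix)
        if (value != 0.0f)
            return false;
    return true;
}

// Alternative: empty braces value-initialize every element to zero.
bool valueInitProducesZeroFloats()
{
    float theMatrix[16] = {};
    for (float value : theMatrix)
        if (value != 0.0f)
            return false;
    return true;
}
```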

The next few statements set the particular values of interest into the matrix. Before we can understand what's going on here, we need to talk a bit about ordering.

A 4x4 matrix is technically 16 values, so a 16-entry array can store a matrix. But there are two ways to store a matrix as an array. One way is called column-major order, the other naturally is row-major order. Column-major order means that, for an NxM matrix (columns x rows), the first M values in the array are the first column (top-to-bottom), the next M values are the second column, and so forth. In row-major order, the first N values in the array are the first row (left-to-right), followed by another N values for the second row, and so forth.

In this example, the matrix is stored in column-major order. So array index 14 is in the third row of the fourth column.
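
The index arithmetic can be sketched with two small helpers. The names columnMajorIndex and rowMajorIndex are hypothetical, and both take zero-based column and row numbers for a 4x4 matrix.

```cpp
#include <cstddef>

// Array index of an element in a 4x4 matrix stored in column-major order.
constexpr std::size_t columnMajorIndex(std::size_t column, std::size_t row)
{
    return column * 4 + row;
}

// Array index of the same element if the matrix were stored in row-major order.
constexpr std::size_t rowMajorIndex(std::size_t column, std::size_t row)
{
    return row * 4 + column;
}

// Fourth column, third row (zero-based column 3, row 2) is index 14.
static_assert(columnMajorIndex(3, 2) == 14, "fourth column, third row");
// Third column, fourth row (zero-based column 2, row 3) is index 11.
static_assert(columnMajorIndex(2, 3) == 11, "third column, fourth row");
```

Note that the two orderings swap those positions: in row-major order, the fourth column of the third row would sit at index 11 instead.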

The entire matrix is a single uniform. To transfer the matrix to OpenGL, we use the glUniformMatrix4fv function. The first parameter is the uniform location that we are uploading to. This function can be used to transfer an entire array of matrices (yes, uniform arrays of any type are possible), so the second parameter is the number of array entries. Since we're only providing one matrix, this value is 1.

The third parameter tells OpenGL what the ordering of the matrix data is. If it is GL_TRUE, then the matrix data is in row-major order. Since our data is column-major, we set it to GL_FALSE. The last parameter is the matrix data itself.
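
If the array had instead been written out row by row, the way the matrix looks on paper, there would be two options: pass GL_TRUE as the third parameter, or rearrange the data on the CPU first. The following is a sketch of the second option; rowMajorToColumnMajor and Mat4 are hypothetical names used only for this illustration.

```cpp
#include <array>

using Mat4 = std::array<float, 16>;

// Converts a 4x4 matrix from row-major to column-major storage.
// For a square matrix this is the same as transposing the array layout.
Mat4 rowMajorToColumnMajor(const Mat4 &rowMajor)
{
    Mat4 columnMajor = {};
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            columnMajor[col * 4 + row] = rowMajor[row * 4 + col];
    return columnMajor;
}

// The converted data would then be uploaded with GL_FALSE as before, for example:
//   glUniformMatrix4fv(perspectiveMatrixUnif, 1, GL_FALSE, converted.data());
// Passing the original row-major data with GL_TRUE should give the same result.
```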

Running this program will give us:

Figure 4.9. Perspective Matrix


The same thing we had before. Only now done with matrices.