So far, we have learned how to write GLSL shaders, and we have introduced abstractions to make working with them easier. What we can draw so far with OpenGL are images that can be computed in the fragment shader (Chapter 13) and colored meshes (Chapter 14 and many previous chapters).
In this chapter, we introduce a new concept: transformations. A transformation is a mathematical operation that we can apply to the geometric data (points, lines, triangles, camera positions, and so on) that make up our scene so that such data change their positions and shapes. For example, given a mesh, we can apply a transformation to change its position in the scene, rotate it to face another direction, or scale it so that it becomes 2 or 3 times bigger. We can also use transformations to change how the scene is rendered, for example by changing the zoom level or moving the final rendering around the canvas. Transformation is a fundamental concept in computer graphics, and mastering it is a requirement for a competent graphics practitioner.
Transformation, however, is quite a complicated topic in itself. In the end, we want to cover transformations on 3D objects because WebGL is all about rendering 3D scenes. Nevertheless, such 3D transformations are quite complex by themselves. Moreover, we have not really discussed how to deal with 3D data so far, besides setting the $z$-coordinates to zero so as not to worry about the third dimension, as in the last chapter. So, in this chapter, we shall limit ourselves to the discussion of 2D transformations. They are much easier to deal with because they are easy to visualize with what we have learned so far, and there are fewer types of 2D transformations we need to discuss.
Mathematically, a 2D transformation is a function that consumes a 2D point and outputs a 2D point. More formally, we say that $f$ is a 2D transformation if it maps $\Real^2$ to $\Real^2$. In other words, its signature is $$f: \Real^2 \ra \Real^2.$$ Let us look at some examples. One of the simplest transformations is the identity transformation $\mathtt{I}$, which simply outputs whatever the input is: \begin{align*} \mathtt{I}( \ve{x} ) = \ve{x} \end{align*} for every $\ve{x} \in \Real^2$. So, \begin{align*} \mathtt{I}((0,0)) &= (0,0), & \mathtt{I}((1,2)) &= (1,2),& \mathtt{I}((-5,3)) &= (-5, 3), \end{align*} and so on. (To simplify the notation, we will omit double parentheses and write $f(-5, 3)$ in place of $f((-5,3))$ from now on.)
Other transformations are more complicated to define because they involve parameters that determine their behavior. An example of a transformation with a parameter is the constant function $\mathtt{C}_{\ve{a}}$, where $\ve{a} \in \Real^2$ is the parameter. The constant function always outputs $\ve{a}$ no matter what the input is: \begin{align*} \mathtt{C}_{\ve{a}}(\ve{x}) = \ve{a} \end{align*} for all $\ve{x} \in \Real^2$. So, \begin{align*} \mathtt{C}_{(2,1)}(10, 15) &= (2,1), & \mathtt{C}_{(-1,1)}(0, 0) &= (-1,1), & \mathtt{C}_{(46,49)}(39, 888) &= (46,49). \end{align*} (Again, when the parameter is a 2D point, we may drop the parentheses around it to make the notation simpler. That is, we may write $\mathtt{C}_{2,1}$ to mean $\mathtt{C}_{(2,1)}$.) Note that, while there is only one identity transformation, there are as many constant functions as there are points in $\Real^2$.
A type of transformation that is used extensively in computer graphics is the translation. Like a constant function, a translation $\mathtt{T}_{\ve{a}}$ is parameterized by a point $\ve{a} \in \Real^2$. What it does is add $\ve{a}$ to the input: \begin{align*} \mathtt{T}_{\ve{a}}(\ve{x}) = \ve{x} + \ve{a}. \end{align*} So, \begin{align*} \mathtt{T}_{1,1}(5,6) &= (6,7), & \mathtt{T}_{-2,3}(7,8) &= (5,11), & \mathtt{T}_{-10,-20}(100,200) &= (90,180). \end{align*}
Lastly, another type of transformation that is used extensively in computer graphics is scaling. It is also parameterized by a point $\ve{a} \in \Real^2$, but it is better to treat the point as an ordered pair $(a_1, a_2)$. The scaling $\mathtt{S}_{a_1,a_2}$ simply multiplies the $x$-component of the input by $a_1$ and the $y$-component of the input by $a_2$. In other words, \begin{align*} \mathtt{S}_{a_1, a_2}\bigg( \begin{bmatrix} x \\ y \end{bmatrix} \bigg) = \begin{bmatrix} a_1 x \\ a_2 y \end{bmatrix}. \end{align*} So, \begin{align*} \mathtt{S}_{2,2}(10, 9) &= (20, 18), & \mathtt{S}_{4,5}(-3, 2) &= (-12, 10), & \mathtt{S}_{20,10}(-1,-2) &= (-20, -20). \end{align*}
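The four families above can be sketched in JavaScript as functions that take a point and return a point. This is only an illustration: the names `identity`, `constant`, `translate`, and `scale`, and the choice to store a point as a two-element array, are assumptions made for this sketch and not part of any library.

```javascript
// A 2D point is represented as a two-element array [x, y] (an assumption of this sketch).

// The identity transformation I: returns the input unchanged.
const identity = ([x, y]) => [x, y];

// The constant transformation C_{a1,a2}: ignores the input and returns (a1, a2).
const constant = (a1, a2) => (_point) => [a1, a2];

// The translation T_{a1,a2}: adds (a1, a2) to the input.
const translate = (a1, a2) => ([x, y]) => [x + a1, y + a2];

// The scaling S_{a1,a2}: multiplies each component by its own factor.
const scale = (a1, a2) => ([x, y]) => [a1 * x, a2 * y];

console.log(identity([-5, 3]));         // [-5, 3]
console.log(constant(2, 1)([10, 15]));  // [2, 1]
console.log(translate(-2, 3)([7, 8]));  // [5, 11]
console.log(scale(4, 5)([-3, 2]));      // [-12, 10]
```

Writing the parameterized families as functions that return functions mirrors the notation: `translate(-2, 3)` plays the role of $\mathtt{T}_{-2,3}$, which can then be applied to any point.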
So far, we have introduced transformations as a mathematical concept. This is very abstract and quite removed from the visual world of computer graphics. However, transformations are used in graphics because we can apply them to graphics data and see the results with our eyes. In this section, we shall see how we can apply transformations to meshes and why they are useful to graphics practitioners.
Meshes are made of vertices, and all vertices must have one attribute: their positions. For a 2D mesh, the position is a point in $\Real^2$. If we apply a 2D transformation to it, we get another position. In other words, when we apply a transformation to a vertex, we move it to a new position. So, if we apply the same transformation to all vertices in a mesh, we can change both the mesh's position and its shape. Let's see how the transformations we have learned so far affect 2D meshes.
For the identity transformation $\mathtt{I}$, we know that it does not change any vertex's position at all, so any mesh would remain unchanged if we apply the identity transformation to it. For the constant transformation $\mathtt{C}_{a,b}$, all vertices are moved to $(a,b)$, and so the mesh is reduced to a single point. So, these two transformations are not useful on their own.
The translation $\mathtt{T}_{a,b}$ is a little more interesting. We know that it adds the vector $(a,b)$ to all vertex positions. For example, let us say we have a square mesh made with 4 vertices at $(0, 0)$, $(1,0)$, $(1,1)$, and $(0,1)$. If we apply $\mathtt{T}_{-1,2}$ to the mesh, then the vertex positions become $(-1,2)$, $(0, 2)$, $(0, 3)$, and $(-1, 3)$, respectively. The overall effect is presented in the figure below.
[Figure: the square mesh before and after applying $\mathtt{T}_{-1,2}$.]
In general, the effect of applying $\mathtt{T}_{a,b}$ to a mesh is to move the mesh to the right by $a$ units and upward by $b$ units. Note that $a$ and $b$ are signed. So, moving to the right by a negative number of units means moving to the left, and moving upward by a negative number of units means moving downward.
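The square example can be reproduced with a few lines of JavaScript. This is a minimal CPU-side sketch meant only to illustrate the arithmetic; the helper names `translate` and `applyToMesh` and the array-of-pairs mesh layout are assumptions of this example.

```javascript
// Vertex positions of the square mesh from the text, as [x, y] pairs.
const square = [[0, 0], [1, 0], [1, 1], [0, 1]];

// The translation T_{a,b} as a function on points (hypothetical helper).
const translate = (a, b) => ([x, y]) => [x + a, y + b];

// Applying a transformation to a mesh means applying it to every vertex.
const applyToMesh = (f, vertices) => vertices.map((v) => f(v));

const moved = applyToMesh(translate(-1, 2), square);
console.log(moved); // [[-1, 2], [0, 2], [0, 3], [-1, 3]]
```

Note that `applyToMesh` returns a new array and leaves `square` untouched, so the original mesh can be reused with other transformations.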
Next, let's look at scaling $\mathtt{S}_{a,b}$. When we apply it to the square mesh above, the vertices become $(0,0)$, $(a,0)$, $(a,b)$, and $(0,b)$. What the result looks like depends on the signs of $a$ and $b$. When $a$ and $b$ are both positive, it scales the mesh $a$ times in the $x$-direction and $b$ times in the $y$-direction. Note that $a$ and $b$ can be less than $1$, which means that a scaling can also shrink (i.e., scale down) the mesh.
[Figure: the effect of applying $\mathtt{S}_{3,2}$ and $\mathtt{S}_{0.5,0.5}$ to a mesh.]
When $a$ is negative, the mesh "flips" horizontally. Any point to the right of the vertical line $x = 0$ is moved to the left side, and any point to the left is moved to the right. The $x$-coordinate is also scaled by a factor of $|a|$. Similarly, when $b$ is negative, the mesh "flips" vertically. Points above the horizontal line $y = 0$ go under, and points under the line go above.
[Figure: the effect of applying $\mathtt{S}_{-1,-1}$, $\mathtt{S}_{-1,2}$, and $\mathtt{S}_{3,-2}$ to a mesh.]
One thing to notice is that, for scaling, there is a special point: the origin $(0,0)$. The origin is never changed by a scaling because multiplying $0$ by any number always results in $0$. Other points are moved according to the scaling factors $a$ and $b$. It is as though "space" itself is warped around $(0,0)$. So, we may call $(0,0)$ the center of scaling. To repeat, it is very important to remember that scaling always happens around a point. On the other hand, there is no "center of translation" because a translation moves all points in the same way.
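The behavior of scaling around the origin can be observed on a small example. The sketch below uses a hypothetical triangle mesh and a hypothetical `scale` helper; neither comes from a library.

```javascript
// The scaling S_{a,b} as a function on [x, y] points (hypothetical helper).
const scale = (a, b) => ([x, y]) => [a * x, b * y];

// A small triangle to the upper right of the origin (made-up example data).
const triangle = [[1, 1], [2, 1], [1, 2]];

// Positive factors stretch the mesh away from the origin.
const stretched = triangle.map((v) => scale(3, 2)(v));
console.log(stretched); // [[3, 2], [6, 2], [3, 4]]

// A negative first factor flips the mesh across the line x = 0.
const flipped = triangle.map((v) => scale(-1, 2)(v));
console.log(flipped); // [[-1, 2], [-2, 2], [-1, 4]]

// The origin stays where it is: it is the center of the scaling.
console.log(scale(3, 2)([0, 0])); // [0, 0]
```

Every vertex of the triangle moves, and vertices farther from the origin move farther, while the origin itself stays fixed.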
From the examples we have seen so far, we can see why transformations are useful to computer graphics. Using translation, we can move a mesh around the scene. Using scaling, we can make a mesh larger or smaller to our liking. Using these two transformations together allows us to resize a mesh and place it where we like in a scene.
As with many other mathematical objects, we can perform operations such as addition, subtraction, and multiplication on transformations to obtain another transformation. This allows us to build more complex transformations from simpler ones, and it makes transformations another kind of "number," much like vectors. In this section, we will learn about three operations on transformations that are used widely in computer graphics: composition, addition, and scalar multiplication.
First, though, let us talk about how to decide whether two transformations are the same. This will help us make more sense of the operations we are about to learn. We say that two transformations $f$ and $g$ are equal if the results of applying $f$ and $g$ to any 2D point $\ve{x}$ are the same. In mathematical notation:
$f = g$ if $f(\ve{x}) = g(\ve{x})$ for all $\ve{x} \in \Real^2$.
In other words, $f$ and $g$ have the exact same effect on every point. We can swap $g$ in place of $f$, apply the function, and no one would notice any differences in the result. If there is a point $\ve{x}$ where $f(\ve{x}) \neq g(\ve{x})$, then $f$ and $g$ are not equal, and we would write $f \neq g$.
Let us observe equality between the transformations we have seen so far. The identity transformation $\mathtt{I}$ returns the input $\ve{x}$ exactly without change. So does the translation by $(0,0)$, $\mathtt{T}_{0,0}$, and the scaling by $(1,1)$, $\mathtt{S}_{1,1}$. As a result, $$ \mathtt{I} = \mathtt{T}_{0,0} = \mathtt{S}_{1,1}.$$
The case above is the only case where a translation can be equal to a scaling. If $(a,b) \neq (0,0)$, then $\mathtt{T}_{a,b} \neq \mathtt{S}_{c,d}$ for any $(c, d) \in \Real^2$. This is because there is a point, $(0,0)$, where the effects of the two transformations differ: \begin{align*} \mathtt{T}_{a,b}(0,0) = (a,b) \neq (0,0) = \mathtt{S}_{c,d}(0,0). \end{align*}
With the discussion on equality out of the way, let us talk about the first operation on transformations. Given a 2D transformation $f: \Real^2 \ra \Real^2$ and another 2D transformation $g: \Real^2 \ra \Real^2$, we can create another transformation $h$ by just applying $f$ and then $g$. So, the definition of $h$ is given by the equation $$h(\ve{x}) = g(f(\ve{x}))$$ for all $\ve{x} \in \Real^2$. Here, $h$ is called the composition of $f$ and $g$ and is denoted by $g \circ f$. As a result, we can write: $$(g \circ f)(\ve{x}) = g(f(\ve{x})).$$
Let us see an example. Let $f = \mathtt{T}_{1,0}$ and $g = \mathtt{S}_{2,2}$. We have that \begin{align*} (g \circ f)(x,y) &= g(f(x,y)) = \mathtt{S}_{2,2}(\mathtt{T}_{1,0}(x,y)) = \mathtt{S}_{2,2}(x+1, y) = (2x+2, 2y), \\ (f \circ g)(x,y) &= f(g(x,y)) = \mathtt{T}_{1,0}(\mathtt{S}_{2,2}(x,y)) = \mathtt{T}_{1,0}(2x, 2y) = (2x+1, 2y). \end{align*} It should be clear that, in this case, $f \circ g \neq g \circ f$ because their effects are different on all points. In general, scaling then translation is not the same as translation then scaling. Composition is not commutative. The order in which we apply transformations matters.
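The example above can be replayed numerically. In this sketch, `compose` is a hypothetical helper written for illustration, with the same argument order as the notation $g \circ f$: the second argument is applied first.

```javascript
// Translation and scaling as functions on [x, y] points (hypothetical helpers).
const translate = (a, b) => ([x, y]) => [x + a, y + b];
const scale = (a, b) => ([x, y]) => [a * x, b * y];

// compose(g, f) is "g after f": apply f first, then g.
const compose = (g, f) => (p) => g(f(p));

const f = translate(1, 0);
const g = scale(2, 2);

console.log(compose(g, f)([3, 4])); // [8, 8], that is (2*3 + 2, 2*4)
console.log(compose(f, g)([3, 4])); // [7, 8], that is (2*3 + 1, 2*4)
```

The two results differ in the first component, matching the formulas $(2x+2, 2y)$ and $(2x+1, 2y)$ evaluated at $(3,4)$.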
On the other hand, if the order of transformations is fixed, it does not matter how we group the transformations in this order. More concretely, suppose we have three transformations $f$, $g$, and $h$ that we will apply to a point $\ve{x}$ in that order. The result is always $$h(g(f(\ve{x}))).$$ It does not matter if we first compose $f$ and $g$ to get $g\circ f$ and then compose it with $h$ before applying the result. In other words, \begin{align*} (h \circ (g \circ f))(\ve{x}) = h((g\circ f)(\ve{x})) = h(g(f(\ve{x}))). \end{align*} It also does not matter if we compose $g$ and $h$ to get $h\circ g$ and then compose it with $f$ later: \begin{align*} ((h \circ g) \circ f)(\ve{x}) = (h \circ g)(f(\ve{x})) = h(g(f(\ve{x}))). \end{align*} This is because, in the end, all the transformations are applied in the same order. Mathematically, this fact is captured by the associative property: \begin{align*} (h \circ g) \circ f = h \circ (g \circ f) \end{align*} for any transformations $f$, $g$, and $h$. As a result, when we have a long chain of transformations such as $f_5 \circ f_4 \circ f_3 \circ f_2 \circ f_1$, there is no need to write any parentheses between them because any parenthesization would yield the same transformation. To repeat, composition is associative. As long as the order does not change, different groupings of transformations within the order do not produce different results.
While composition is in general not commutative, there are special cases where it is. The first special case is that the identity transformation commutes with any transformation: for any $f: \Real^2 \ra \Real^2$, we have that \begin{align*} \mathtt{I} \circ f = f \circ \mathtt{I}. \end{align*} The second special case is that two translations commute with each other: \begin{align*} \mathtt{T}_{a,b} \circ \mathtt{T}_{c,d} = \mathtt{T}_{a+c, b+d} = \mathtt{T}_{c,d} \circ \mathtt{T}_{a,b}. \end{align*} So do two scalings: \begin{align*} \mathtt{S}_{a,b} \circ \mathtt{S}_{c,d} = \mathtt{S}_{ac, bd} = \mathtt{S}_{c,d} \circ \mathtt{S}_{a,b}. \end{align*} Commutativity generally breaks when we start mixing different types of transformations together.
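The formulas for composing two translations or two scalings can be checked by expanding the definitions. This is a routine verification that uses nothing beyond the definitions of $\mathtt{T}$, $\mathtt{S}$, and composition: \begin{align*} (\mathtt{T}_{a,b} \circ \mathtt{T}_{c,d})(x,y) &= \mathtt{T}_{a,b}(x+c, y+d) = (x+c+a, y+d+b) = \mathtt{T}_{a+c,b+d}(x,y), \\ (\mathtt{S}_{a,b} \circ \mathtt{S}_{c,d})(x,y) &= \mathtt{S}_{a,b}(cx, dy) = (acx, bdy) = \mathtt{S}_{ac,bd}(x,y). \end{align*} Swapping the roles of $(a,b)$ and $(c,d)$ yields the same right-hand sides because addition and multiplication of real numbers are commutative.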
A 2D transformation produces a 2D vector, and 2D vectors can be added. We can take this operation on vectors and then "lift" it to become an operation on transformations. More precisely, let $f$ and $g$ be two 2D transformations. The sum of $f$ and $g$ is the transformation $h$ such that \begin{align*} h(\ve{x}) = f(\ve{x}) + g(\ve{x}). \end{align*} For convenience, we shall write the sum of $f$ and $g$ as $f + g$. This gives: \begin{align*} (f+g)(\ve{x}) = f(\ve{x}) + g(\ve{x}). \end{align*} Transformation addition inherits many properties from vector addition (and addition in general). It is commutative and associative.
For the composition operator, the identity transformation $\mathtt{I}$ is the magic transformation that, when composed with another transformation, leaves the other transformation unchanged. For addition, this is no longer the case. This is because \begin{align*} (\mathtt{I} + f)(\ve{x}) = \mathtt{I}(\ve{x}) + f(\ve{x}) = \ve{x} + f(\ve{x}). \end{align*} The right-hand side is equal to $f(\ve{x})$ only when $\ve{x} = (0,0)$, and so $\mathtt{I}+f \neq f$. The magic transformation in this case is the constant transformation $\mathtt{C}_{0,0}$: \begin{align*} (\mathtt{C}_{0,0} + f)(\ve{x}) = \mathtt{C}_{0,0}(\ve{x}) + f(\ve{x}) = (0,0) + f(\ve{x}) = f(\ve{x}). \end{align*}
Adding two scalings together results in a scaling where the parameters are added. \begin{align*} (\mathtt{S}_{a,b} + \mathtt{S}_{c,d})(x,y) &= \mathtt{S}_{a,b}(x,y) + \mathtt{S}_{c,d}(x,y) \\ &= (ax, by) + (cx, dy) \\ &= \big( (a+c)x, (b+d) y \big) \\ &= \mathtt{S}_{a+c,b+d}(x,y). \end{align*} So, \begin{align*} \mathtt{S}_{a,b} + \mathtt{S}_{c,d} = \mathtt{S}_{a+c,b+d} \end{align*}
However, adding two translations together does not produce a translation whose parameters are added. \begin{align*} (\mathtt{T}_{a,b} + \mathtt{T}_{c,d})(x,y) &= \mathtt{T}_{a,b}(x,y) + \mathtt{T}_{c,d}(x,y) \\ &= (x+a, y+b) + (x+c,y+d) \\ &= (2x+a+c, 2y+b+d) \\ &= (x,y) + \big((x,y) +(a+c,b+d)\big) \\ &= \mathtt{I}(x,y) + \mathtt{T}_{a+c,b+d}(x,y). \end{align*} So, \begin{align*} \mathtt{T}_{a,b} + \mathtt{T}_{c,d} = \mathtt{I} + \mathtt{T}_{a+c,b+d}. \end{align*}
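The difference between adding scalings and adding translations can be seen on a concrete point. The helper `add` below is a hypothetical implementation of transformation addition written for this illustration.

```javascript
// Basic transformations as functions on [x, y] points (hypothetical helpers).
const identity = ([x, y]) => [x, y];
const translate = (a, b) => ([x, y]) => [x + a, y + b];
const scale = (a, b) => ([x, y]) => [a * x, b * y];

// add(f, g) is f + g: apply both to the same input and add the results.
const add = (f, g) => (p) => {
  const [fx, fy] = f(p);
  const [gx, gy] = g(p);
  return [fx + gx, fy + gy];
};

const p = [5, 7];

// Adding two scalings adds their parameters: S_{1,2} + S_{3,4} = S_{4,6}.
console.log(add(scale(1, 2), scale(3, 4))(p)); // [20, 42]
console.log(scale(4, 6)(p));                   // [20, 42]

// Adding two translations gives I + T_{4,6}, not T_{4,6}.
console.log(add(translate(1, 2), translate(3, 4))(p)); // [14, 20]
console.log(add(identity, translate(4, 6))(p));        // [14, 20]
console.log(translate(4, 6)(p));                       // [9, 13]
```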
In fact, we can say that a translation is the identity transformation added to a constant transformation:
\begin{align*} \mathtt{T}_{a,b}(x,y) &= (x+a,y+b) \\ &= (x,y) + (a,b) \\ &= \mathtt{I}(x,y) + \mathtt{C}_{a,b}(x,y) \\ &= (\mathtt{I} + \mathtt{C}_{a,b})(x,y), \end{align*} which gives us \begin{align*} \mathtt{T}_{a,b} = \mathtt{I} + \mathtt{C}_{a,b}. \end{align*}
Similar to the previous section, we can take another vector operation, multiplication by a scalar, and define an operation on transformations with it. Let $f$ be a 2D transformation and $c \in \Real$ be a scalar. The product of $f$ and $c$ is a transformation $g$ such that \begin{align*} g(\ve{x}) = c f(\ve{x}) \end{align*} for all $\ve{x} \in \Real^2$. We write the product as $cf$. Note that the scalar multiplication is carried out after we apply the transformation. The order is important here.
In fact, multiplication by a scalar is the same as composition with a scaling: \begin{align*} (c f)(\ve{x}) = cf(\ve{x}) = \mathtt{S}_{c,c}(f(\ve{x})) = (\mathtt{S}_{c,c} \circ f)(\ve{x}), \end{align*} and so \begin{align*} cf = \mathtt{S}_{c,c} \circ f. \end{align*}
The reader might think it is pointless to introduce a new operation that can be defined in terms of previous ones. However, scalar multiplication gives us a way to write expressions involving transformations more succinctly. A scaling $\mathtt{S}_{c,c}$ where the two scaling parameters are equal is called a uniform scaling. We can replace a composition with such a scaling (which we would write $\mathtt{S}_{c,c} \circ$) with just $c$.
Another convenience afforded by scalar multiplication is the definition of transformation subtraction, which we can define as: \begin{align*} f - g = f + (-1)g. \end{align*}
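Scalar multiplication and subtraction can be sketched in the same style as the other operations. The helper names `times` and `sub` are assumptions of this example; `sub` is built exactly as in the definition $f - g = f + (-1)g$.

```javascript
// Helpers from earlier in the chapter, repeated so the sketch is self-contained.
const translate = (a, b) => ([x, y]) => [x + a, y + b];
const scale = (a, b) => ([x, y]) => [a * x, b * y];
const compose = (g, f) => (p) => g(f(p));
const add = (f, g) => (p) => {
  const [fx, fy] = f(p);
  const [gx, gy] = g(p);
  return [fx + gx, fy + gy];
};

// times(c, f) is cf: apply f first, then multiply the result by c.
const times = (c, f) => (p) => {
  const [x, y] = f(p);
  return [c * x, c * y];
};

// sub(f, g) is f - g, defined as f + (-1)g.
const sub = (f, g) => add(f, times(-1, g));

const f = translate(1, 2);
const p = [3, 4];

console.log(times(3, f)(p));                   // [12, 18]
console.log(compose(scale(3, 3), f)(p));       // [12, 18], since 3f = S_{3,3} after f
console.log(sub(scale(5, 5), scale(2, 3))(p)); // [9, 8]
```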
Scalar multiplication inherits properties from the multiplication operator for real numbers. For one, the order in which we multiply by scalars does not matter: \begin{align*} c(df) = (cd)f = (dc)f = d(cf). \end{align*} When paired with transformation addition, we also have the distributive rules: \begin{align*} c(f + g) &= cf + cg, \\ (c + d)f &= cf + df. \end{align*} In fact, one may show that the distributive rules also apply to scaling transformations: \begin{align*} \mathtt{S}_{a,b} \circ (f + g) &= \mathtt{S}_{a,b} \circ f + \mathtt{S}_{a,b} \circ g, \\ (\mathtt{S}_{a,b} + \mathtt{S}_{c,d}) \circ f &= \mathtt{S}_{a,b} \circ f + \mathtt{S}_{c,d} \circ f. \end{align*}
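As a sketch of how the distributive rule $\mathtt{S}_{a,b} \circ (f + g) = \mathtt{S}_{a,b} \circ f + \mathtt{S}_{a,b} \circ g$ can be shown, write $f(\ve{x}) = (f_1, f_2)$ and $g(\ve{x}) = (g_1, g_2)$; these component names are introduced only for this check. Then \begin{align*} (\mathtt{S}_{a,b} \circ (f+g))(\ve{x}) &= \mathtt{S}_{a,b}(f_1 + g_1, f_2 + g_2) \\ &= (a f_1 + a g_1, b f_2 + b g_2) \\ &= (a f_1, b f_2) + (a g_1, b g_2) \\ &= (\mathtt{S}_{a,b} \circ f)(\ve{x}) + (\mathtt{S}_{a,b} \circ g)(\ve{x}). \end{align*} The rule $(\mathtt{S}_{a,b} + \mathtt{S}_{c,d}) \circ f = \mathtt{S}_{a,b} \circ f + \mathtt{S}_{c,d} \circ f$ follows directly from the definition of transformation addition, since $\big((\mathtt{S}_{a,b} + \mathtt{S}_{c,d}) \circ f\big)(\ve{x}) = \mathtt{S}_{a,b}(f(\ve{x})) + \mathtt{S}_{c,d}(f(\ve{x}))$.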
We have studied some simple types of transformations. It is time to study another type of transformation that is more complicated than the ones we have encountered before: linear transformations. Linear transformations are widely used in computer graphics because they encompass many useful transformations and also because they can easily be represented by data, not programs.
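We say that a transformation (or a function) $f: \Real^2 \ra \Real^2$ is linear if the following conditions are true: \begin{align*} f(\ve{x} + \ve{y}) &= f(\ve{x}) + f(\ve{y}), \\ f(c\ve{x}) &= c f(\ve{x}) \end{align*} for all $\ve{x}, \ve{y} \in \Real^2$ and all $c \in \Real$.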
Let's classify the transformations we have seen so far according to whether they are linear or not. The constant function is generally not linear. If $(a,b) \neq (0,0)$, we have that \begin{align*} \mathtt{C}_{a,b}\big((x,y) + (0,0)\big) &= \mathtt{C}_{a,b}(x,y) \\ &= (a,b) \\ &\neq (a,b) + (a,b) \\ &= \mathtt{C}_{a,b}(x,y) + \mathtt{C}_{a,b}(0,0). \end{align*} The reader should check that $\mathtt{C}_{0,0}$ is linear.
Translation is also not linear if $(a,b) \neq (0,0)$. This is because \begin{align*} \mathtt{T}_{a,b}((x,y) + (0,0)) &= \mathtt{T}_{a,b}(x,y) \\ &= (x,y) + (a,b) \\ &\neq (x,y) + (a,b) + (a,b) \\ &= \mathtt{T}_{a,b}(x,y) + \mathtt{T}_{a,b}(0,0). \end{align*} The only translation that is linear is $\mathtt{T}_{0,0} = \mathtt{I}$.
On the other hand, scaling is linear. This is because \begin{align*} \mathtt{S}_{a,b}((x_0, y_0) + (x_1, y_1)) &= \mathtt{S}_{a,b}(x_0 + x_1, y_0 + y_1) \\ &= (a(x_0 + x_1), b(y_0 + y_1)) \\ &= (ax_0 + ax_1, by_0 + by_1) \\ &= (ax_0, by_0) + (ax_1, by_1) \\ &= \mathtt{S}_{a,b}(x_0, y_0) + \mathtt{S}_{a,b}(x_1, y_1), \end{align*} and \begin{align*} \mathtt{S}_{a,b}(c(x, y)) &= \mathtt{S}_{a,b}(cx, cy) \\ &= (acx, bcy) \\ &= c(ax,by) \\ &= c\,\mathtt{S}_{a,b}(x, y). \end{align*} Because $\mathtt{I} = \mathtt{S}_{1,1}$, we have that the identity transformation is linear as well.
One characteristic of a linear transformation is that it must preserve the point $(0,0)$. Let us say that $f$ is linear. Then \begin{align*} f(0,0) = f((0,0) + (0,0)) = f(0,0) + f(0,0) = 2f(0,0), \end{align*} and subtracting $f(0,0)$ from both sides gives $f(0,0) = (0,0)$. This is one of the reasons why, if $(a,b) \neq (0,0)$, then $\mathtt{C}_{a,b}$ and $\mathtt{T}_{a,b}$ are not linear.
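The classification above can be spot-checked numerically. The sketch below tests the two properties used in the derivations, $f(\ve{x} + \ve{y}) = f(\ve{x}) + f(\ve{y})$ and $f(c\ve{x}) = cf(\ve{x})$, on a handful of sample points. The helper `looksLinear` is hypothetical, and passing the test does not prove linearity because only finitely many inputs are tried; failing it, however, does show that a transformation is not linear.

```javascript
// Transformations and vector helpers on [x, y] points (hypothetical names).
const translate = (a, b) => ([x, y]) => [x + a, y + b];
const scale = (a, b) => ([x, y]) => [a * x, b * y];
const addPts = ([x0, y0], [x1, y1]) => [x0 + x1, y0 + y1];
const mulPt = (c, [x, y]) => [c * x, c * y];
const same = ([x0, y0], [x1, y1]) => x0 === x1 && y0 === y1;

// Returns false as soon as one sample violates either property.
// Integer samples are used so that exact comparison is safe here.
const looksLinear = (f, points, scalars) =>
  points.every(
    (p) =>
      points.every((q) => same(f(addPts(p, q)), addPts(f(p), f(q)))) &&
      scalars.every((c) => same(f(mulPt(c, p)), mulPt(c, f(p))))
  );

const points = [[0, 0], [1, 2], [-3, 5]];
const scalars = [0, 2, -1];

console.log(looksLinear(scale(2, 3), points, scalars));     // true
console.log(looksLinear(translate(1, 0), points, scalars)); // false
console.log(looksLinear(translate(0, 0), points, scalars)); // true
```

The translation $\mathtt{T}_{1,0}$ already fails on the very first sample because it does not send $(0,0)$ to $(0,0)$.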
In Section 15.1, we learned about two types of transformations that are important to computer graphics: translation and scaling. There is a third type: rotation. In this chapter, we shall learn about rotations in 2D, and we will study rotations in 3D later in Chapter XXX.
In 2D, a rotation has a parameter, the angle $\theta$, and is denoted by $\mathtt{R}_\theta$. Given a point $\ve{x}$, the action of $\mathtt{R}_\theta$ on $\ve{x}$ is as follows.