CHAPTER 5
VIEWING
We have completed our discussion of the first half of the synthetic camera
model—specifying objects in three dimensions. We now investigate the multi-
tude of ways in which we can describe our virtual camera. Along the way, we examine
related topics, such as the relationship between classical viewing techniques and com-
puter viewing and how projection is implemented using projective transformations.
There are three parts to our approach. First, we look at the types of views that we
can create and why we need more than one type of view. Then we examine how an
application program can specify a particular view within OpenGL. We will see that
the viewing process has two parts. In the first, we use the model-view matrix to switch
vertex representations from the object frame in which we defined our objects to their
representation in the eye frame, in which the camera is at the origin. This represen-
tation of the geometry will allow us to use canonical viewing procedures. The second
part of the process deals with the type of projection we prefer (parallel or perspec-
tive) and the part of the world we wish to image (the clipping or view volume). These
specifications will allow us to form a projection matrix that is concatenated with the
model-view matrix. Once more, we use a simple example program to demonstrate
how the OpenGL API handles viewing. Finally, we derive the projection matrices that
describe the most important parallel and perspective views.
[Figure: perspective projection, in which projectors converge at the center of projection (COP), and parallel projection, in which projectors are parallel to a direction of projection (DOP); in each, objects are projected onto a projection plane.]
each case with the projection plane parallel to one of the principal faces of the object.
Usually, we use three views—such as the front, top, and right—to display the object.
The reason that we produce multiple views should be clear from Figure 5.5. For a
box-like object, only the faces parallel to the projection plane appear in the image. A
viewer usually needs more than two views to visualize what an object looks like from
its multiview orthographic projections. Visualization from these images can require
skill on the part of the viewer. The importance of this type of view is that it preserves
both distances and angles, and because there is no distortion of either distance or
shape, multiview orthographic projections are well suited for working drawings.
orthogonal to the projection plane, as shown in Figure 5.6, but the projection plane
can have any orientation with respect to the object. If the projection plane is placed
symmetrically with respect to the three principal faces that meet at a corner of our
rectangular object, then we have an isometric view. If the projection plane is placed
symmetrically with respect to two of the principal faces, then the view is dimetric.
The general case is a trimetric view. These views are shown in Figure 5.7. Note that
in an isometric view, a line segment’s length in the image space is shorter than its
length measured in the object space. This foreshortening of distances is the same
in the three principal directions, so we can still make distance measurements. In the
dimetric view, however, there are two different foreshortening ratios; in the trimetric
view, there are three. Also, although parallel lines are preserved in the image, angles
are not. A circle is projected into an ellipse. This distortion is the price we pay for the
ability to see more than one principal face in a view that can be produced easily either
by hand or by computer. Axonometric views are used extensively in architectural and
mechanical design.
FIGURE 5.8 Oblique view. (a) Construction. (b) Top view. (c) Side view.
Physical viewing devices, including the human visual system, have a lens that is in a fixed
relationship with the image plane—usually, the lens is parallel to the plane. Although
these devices produce perspective views, if the viewer is far from the object, the views
are approximately parallel, but orthogonal, because the projection plane is parallel
to the lens. The bellows camera that we used to develop the synthetic-camera model
in Section 1.6 has the flexibility to produce approximations to parallel oblique views.
One use of such a camera is to create images of buildings in which the sides of the
building are parallel rather than converging as they would in an image created with an
orthogonal view with the camera on the ground.
From the application programmer’s point of view, there is no significant differ-
ence among the different parallel views. The application programmer specifies a type
of view—parallel or perspective—and a set of parameters that describe the camera.
The problem for the application programmer is how to specify these parameters in
the viewing procedures so as best to view an object or to produce a specific classical
view.
pyramid. This symmetry is caused by the fixed relationship between the back (retina)
and lens of the eye for human viewing, or between the back and lens of a camera
for standard cameras, and by similar fixed relationships in most physical situations.
Some cameras, such as the bellows camera, have movable film backs and can produce
general perspective views. The model used in computer graphics includes this general
case.
The classical perspective views are usually known as one-, two-, and three-point
perspectives. The differences among the three cases are based on how many of the
three principal directions in the object are parallel to the projection plane. Consider
the three perspective projections of the building shown in Figure 5.10. Any corner
of the building includes the three principal directions. In the most general case—
the three-point perspective—parallel lines in each of the three principal directions
converge to a finite vanishing point (Figure 5.10(a)). If we allow one of the principal
directions to become parallel to the projection plane, we have a two-point projection
(Figure 5.10(b)), in which lines in only two of the principal directions converge.
Finally, in the one-point perspective (Figure 5.10(c)), two of the principal directions
are parallel to the projection plane, and we have only a single vanishing point. As
with parallel viewing, it should be apparent from the programmer’s point of view that
the three situations are merely special cases of general perspective viewing, which we
implement in Section 5.4.
To obtain a particular type of view, the application programmer may well have to determine where
to place the camera.
In terms of the pipeline architecture, viewing consists of two fundamental oper-
ations. First, we must position and orient the camera. This operation is the job of the
model-view transformation. After vertices pass through this transformation, they are
represented in eye or camera coordinates. The second step is the application of the
projection transformation. This step applies the specified projection—orthographic
or perspective—to the vertices and puts objects within the specified clipping volume
in a normalized clipping volume. We will examine these steps in detail in the next
few sections, but at this point it would help to review the default camera, that is, the
camera that OpenGL uses if we do not specify any viewing functions.
OpenGL starts with the camera at the origin of the object frame, pointing in the
negative z-direction. This camera is set up for orthogonal views and has a viewing
volume that is a cube, centered at the origin and with sides of length 2. The default
projection plane is the plane z = 0 and the direction of the projection is along the z-
axis. Thus, objects within this box are visible and projected as shown in Figure 5.11.
Until now, we were able to ignore any complex viewing procedures by exploiting our
knowledge of this camera. Thus, we were able to define objects in the application
programs that fit inside this cube and we knew that they would be visible. In this ap-
proach, both the model-view and projection matrices were left as the default identity
matrices.
Subsequently, we altered the model-view matrix, initially an identity matrix, by
rotations and translations, so as to place the camera where we desired. The param-
eters that we set in glOrtho alter the projection matrix, also initially an identity
matrix, so as to allow us to see objects inside an arbitrary right parallelepiped. In
this chapter, we will generate a wider variety of views by using the model-view ma-
trix to position the camera and the projection matrix to produce both orthographic
and perspective views.
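To make this concrete, here is a minimal sketch, not taken from the text, of how an application might set the two matrices with the fixed-function API; the eye position and clipping limits are arbitrary values chosen for illustration.

#include <GL/glu.h>   /* also pulls in GL/gl.h on most systems */

void set_viewing(void)
{
    /* Projection matrix: replace the default 2 x 2 x 2 viewing cube
       with an application-chosen orthographic view volume. */
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glOrtho(-2.0, 2.0, -2.0, 2.0, 0.1, 10.0);   /* left, right, bottom, top, near, far */

    /* Model-view matrix: position and orient the camera so that
       object-frame vertices are re-expressed in the eye frame. */
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    gluLookAt(1.0, 1.0, 1.0,    /* eye position */
              0.0, 0.0, 0.0,    /* point the camera looks toward */
              0.0, 1.0, 0.0);   /* approximate up direction */
}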
We conclude the chapter with a short discussion of the two most important
methods for handling global lighting effects: ray tracing and radiosity.
We focus on a simpler model, based on the Phong lighting model, that provides a com-
promise between physical correctness and efficient calculation. We will consider the
rendering equation, radiosity, and ray tracing in greater detail in Chapter 12.
Rather than looking at a global energy balance, we follow rays of light from
light-emitting (or self-luminous) surfaces that we call light sources. We then model
what happens to these rays as they interact with reflecting surfaces in the scene.
This approach is similar to ray tracing, but we consider only single interactions
between light sources and surfaces and do not consider the possibility that light from
a source may be blocked from reaching the surface by another surface. There are
two independent parts of the problem. First, we must model the light sources in the
scene. Then we must build a reflection model that describes the interactions between
materials and light.
To get an overview of the process, we can start following rays of light from a
point source, as shown in Figure 6.2. As we noted in Chapter 1, our viewer sees only
the light that leaves the source and reaches her eyes—perhaps through a complex
path and multiple interactions with objects in the scene. If a ray of light enters her
eye directly from the source, she sees the color of the source. If the ray of light hits
a surface visible to our viewer, the color she sees is based on the interaction between
the source and the surface material: She sees the color of the light reflected from the
surface toward her eyes.
In terms of computer graphics, we can place the projection plane between the
center of projection and the objects, as shown in Figure 6.3. Conceptually, the clip-
ping window in this plane is mapped to the display; thus, we can think of the pro-
jection plane as ruled into rectangles, each corresponding to a pixel on the display.
Because we only need to consider light that enters the camera by passing through
the center of projection, we can start at the center of projection and follow a projec-
tor through each pixel in the clipping window. If we assume that all our surfaces are
opaque, then the color of the first surface intersected along each projector determines
the color of the corresponding pixel in the frame buffer.
Note that most rays leaving a light source do not contribute to the image and are
thus of no interest to us. Hence, by starting at the center of projection and casting
rays we have the start of an efficient ray tracing method, one that we explore further
in Section 6.10 and Chapter 13.
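The paragraph above is essentially a ray-casting loop. The sketch below is our own illustration, with hypothetical helpers projector_through_pixel and first_hit_color and arbitrary image dimensions; it shows only the structure: one projector per pixel, starting at the COP, with the first opaque intersection fixing the pixel color.

#define WIDTH  512
#define HEIGHT 512

typedef struct { float x, y, z; } vec3;
typedef struct { vec3 origin, direction; } ray;
typedef struct { float r, g, b; } color;

/* Assumed elsewhere: build the projector from the COP through pixel (i, j),
   and return the shade of the first opaque surface that projector hits. */
ray   projector_through_pixel(int i, int j);
color first_hit_color(ray r);

void render(color image[HEIGHT][WIDTH])
{
    for (int j = 0; j < HEIGHT; j++)
        for (int i = 0; i < WIDTH; i++)
            image[j][i] = first_hit_color(projector_through_pixel(i, j));
}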
Figure 6.2 shows both single and multiple interactions between rays and objects.
It is the nature of these interactions that determines whether an object appears red or
brown, light or dark, dull or shiny. When light strikes a surface, some of it is absorbed
and some of it is reflected. If the surface is opaque, reflection and absorption account
for all the light striking the surface. If the surface is translucent, some of the light is
transmitted through the material and emerges to interact with other objects. These
interactions depend on the wavelength of the light. An object illuminated by white
light appears red because it absorbs most of the incident light but reflects light in
the red range of frequencies. A shiny object appears so because its surface is smooth.
Conversely, a dull object has a rough surface. The shading of objects also depends
on the orientation of their surfaces, a factor that we will see is characterized by the
normal vector at each point. These interactions between light and materials can be
classified into the three groups depicted in Figure 6.4.
1. Specular surfaces appear shiny because most of the light that is reflected or
scattered is in a narrow range of angles close to the angle of reflection. Mirrors
are perfectly specular surfaces; the light from an incoming light ray may be
partially absorbed, but all reflected light emerges at a single angle, obeying the
rule that the angle of incidence is equal to the angle of reflection.
6.2 LIGHT SOURCES
Light can leave a surface through two fundamental processes: self-emission and re-
flection. We usually think of a light source as an object that emits light only through
internal energy sources. However, a light source, such as a light bulb, can also re-
flect some light that is incident on it from the surrounding environment. We neglect
the emissive term in our simple models. When we discuss OpenGL lighting in Sec-
tion 6.7, we will see that we can add a self-emission term.
FIGURE 6.5 Light source.
If we consider a source such as the one shown in Figure 6.5, we can look at
it as an object with a surface. Each point (x, y, z) on the surface can emit light
that is characterized by the direction of emission (θ, φ) and the intensity of energy
emitted at each wavelength λ. Thus, a general light source can be characterized by a
six-variable illumination function I(x, y, z, θ , φ, λ). Note that we need two angles
to specify a direction, and we are assuming that each frequency can be considered
independently. From the perspective of a surface illuminated by this source, we can
obtain the total contribution of the source (Figure 6.6) by integrating over its surface,
a process that accounts for the emission angles that reach this surface and must also
account for the distance between the source and the surface. For a distributed light
source, such as a light bulb, the evaluation of this integral is difficult, whether we
use analytic or numerical methods. Often, it is easier to model the distributed source
with polygons, each of which is a simple source, or with an approximating set of point
sources.
We consider four basic types of sources: ambient lighting, point sources, spot-
lights, and distant light. These four lighting types are sufficient for rendering most
simple scenes.
as having three components—red, green, and blue—and can use each of the three
color sources to obtain the corresponding color component that a human observer
sees. Thus, we can describe a color source through a three-component intensity or
illumination function
L = (Lr , Lg , Lb),
each of whose components is the intensity of the independent red, green, and blue
components. Thus, we use the red component of a light source for the calculation of
the red component of the image. Because light–material computations involve three
similar but independent calculations, we tend to present a single scalar equation, with
the understanding that it can represent any of the three color components. Rather
than write what will turn out to be identical expressions for each component of L, we
will use the scalar L to denote any of its components. That is,
L ∈ {Lr , Lg , Lb}.
2. Measuring all the light information at each point in an environment determines the light field,
which can be used to create new images. Its capture, which is a complex and expensive data-rich
process, is the subject of much present research. See [Lev96].
be some light in the environment so that objects in the viewing volume that are not
blocked by other objects will always appear in the image.
We use L(p0) to refer to any of the components. The intensity of illumination received
from a point source located at p0 at a point p is proportional to the inverse square of
the distance from the source. Hence, at a point p (Figure 6.7), any component of the
intensity of light received from the point source is given by a function of the form
L(p, p0) = L(p0) / |p − p0|² .
The use of point sources in most applications is determined more by their ease
of use than by their resemblance to physical reality. Scenes rendered with only point
sources tend to have high contrast; objects appear either bright or dark. In the real
world, it is the large size of most light sources that contributes to softer scenes, as
we can see from Figure 6.8, which shows the shadows created by a source of finite
size. Some areas are fully in shadow, or in the umbra, whereas others are in partial
shadow, or in the penumbra. We can mitigate the high-contrast effect from point-
source illumination by adding ambient light to a scene.
FIGURE 6.8 Shadows created by finite-size light source.
The distance term also contributes to the harsh renderings with point sources.
Although the inverse-square distance term is correct for point sources, in practice it
is usually replaced by a term of the form (a + bd + cd²)⁻¹, where d is the distance
between p and p0. The constants a, b, and c can be chosen to soften the lighting. In
addition, a small amount of ambient light also softens the effect of point sources.
Note that if the light source is far from the surfaces in the scene, then the intensity
of the light from the source is sufficiently uniform that the distance term is almost
constant over the surfaces.
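As a hedged sketch of the attenuated point-source term just described (the function and parameter names are ours), any single color component received at p is the source intensity divided by the quadratic attenuation factor:

#include <math.h>

/* L0: one color component of the source intensity; (px, py, pz) and
   (qx, qy, qz): the surface point p and the source position p0;
   a, b, c: constants chosen to soften the lighting. */
float attenuated_intensity(float L0,
                           float px, float py, float pz,
                           float qx, float qy, float qz,
                           float a, float b, float c)
{
    float dx = px - qx, dy = py - qy, dz = pz - qz;
    float d  = sqrtf(dx*dx + dy*dy + dz*dz);   /* d = |p - p0| */
    return L0 / (a + b*d + c*d*d);             /* inverse-quadratic falloff */
}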
6.2.4 Spotlights
Spotlights are characterized by a narrow range of angles through which light is emit-
ted. We can construct a simple spotlight from a point source by limiting the angles at
which light from the source can be seen. We can use a cone whose apex is at ps, which
points in the direction ls, and whose width is determined by an angle θ, as shown in
Figure 6.9. If θ = 180°, the spotlight becomes a point source.
FIGURE 6.9 Spotlight.
More realistic spotlights are characterized by the distribution of light within the
cone—usually with most of the light concentrated in the center of the cone. Thus, the
intensity is a function of the angle φ between the direction of the source and a vector
s to a point on the surface (as long as this angle is less than θ; Figure 6.10). Although
this function could be defined in many ways, it is usually defined by cos^e φ, where the
exponent e (Figure 6.11) determines how rapidly the light intensity drops off.
FIGURE 6.10 Attenuation of a spotlight.
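As a small illustration (our own helper, assuming the cosines have already been computed from unit vectors with dot products, as discussed next), the falloff is cos^e φ inside the cone and zero outside it:

#include <math.h>

/* cos_phi: cosine of the angle between the spotlight direction ls and the
   vector s to the surface point; cos_theta: cosine of the cone half-angle;
   e: exponent controlling how rapidly the intensity drops off. */
float spot_attenuation(float cos_phi, float cos_theta, float e)
{
    if (cos_phi < cos_theta)     /* outside the cone: no contribution */
        return 0.0f;
    return powf(cos_phi, e);     /* inside the cone: cos^e(phi) */
}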
As we will see throughout this chapter, cosines are convenient functions for
lighting calculations. If u and v are any unit-length vectors, we can compute the
cosine of the angle θ between them with the dot product
cos θ = u . v,
In contrast, the distant light source is described by a direction vector whose represen-
tation in homogeneous coordinates is the matrix
p0 = [ x  y  z  0 ]^T .
The graphics system can carry out rendering calculations more efficiently for distant
light sources than for near ones. Of course, a scene rendered with distant light sources
looks different than a scene rendered with near light sources. Fortunately, OpenGL
allows both types of sources.
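In the fixed-function API, the distinction is made through the fourth (w) component of the position passed to glLightfv: w = 0 specifies a distant (directional) source, and a nonzero w specifies a point source at that position. The values below are arbitrary and shown only as an illustration.

#include <GL/gl.h>

void set_light_positions(void)
{
    GLfloat distant_light[] = { 1.0f, 1.0f, 1.0f, 0.0f };  /* w = 0: a direction */
    GLfloat point_light[]   = { 1.0f, 1.0f, 1.0f, 1.0f };  /* w = 1: a position  */

    glLightfv(GL_LIGHT0, GL_POSITION, distant_light);
    glLightfv(GL_LIGHT1, GL_POSITION, point_light);
}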
     ⎡ Lira  Liga  Liba ⎤
Li = ⎢ Lird  Ligd  Libd ⎥ .
     ⎣ Lirs  Ligs  Libs ⎦
The first row of the matrix contains the ambient intensities for the red, green, and
blue terms from source i. The second row contains the diffuse terms; the third con-
tains the specular terms. We assume that any distance-attenuation terms have not yet
been applied.
We build our lighting model by summing the contributions for all the light
sources at each point we wish to light. For each light source, we have to compute
the amount of light reflected for each of the nine terms in the illumination array. For
example, for the red diffuse term from source i, Lird , we can compute a reflection
term Rird , and the latter’s contribution to the intensity at p is Rird Lird . The value of
Rird depends on the material properties, the orientation of the surface, the direction
of the light source, and the distance between the light source and the viewer. Thus,
for each point, we have nine coefficients that we can place in an array of reflection
terms:
     ⎡ Rira  Riga  Riba ⎤
Ri = ⎢ Rird  Rigd  Ribd ⎥ .
     ⎣ Rirs  Rigs  Ribs ⎦
We can then compute the contribution for each color source by adding the ambient,
diffuse, and specular components. For example, the red intensity that we see at p from
source i is the sum of red ambient, red diffuse, and red specular intensities from this
source:
Iir = Rira Lira + Rird Lird + Rirs Lirs
= Iira + Iird + Iirs .
We obtain the total intensity by adding the contributions of all sources and, possibly,
a global ambient term. Thus, the red term is
Ir = Σi (Iira + Iird + Iirs) + Iar ,
I = Ia + Id + Is = La Ra + Ld Rd + Ls Rs ,
with the understanding that the computation will be done for each of the primaries
and each source; a global ambient term can be added at the end.
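A small sketch of this bookkeeping (our own arrangement of the arrays above): multiplying the reflection and illumination arrays term by term and summing down each column gives the red, green, and blue intensities contributed by one source.

/* Rows: ambient, diffuse, specular; columns: red, green, blue.
   I[j] accumulates the three products for color component j. */
void add_source_contribution(const float R[3][3], const float L[3][3],
                             float I[3])
{
    for (int j = 0; j < 3; j++)
        I[j] += R[0][j] * L[0][j]    /* ambient  */
              + R[1][j] * L[1][j]    /* diffuse  */
              + R[2][j] * L[2][j];   /* specular */
}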
0 ≤ ka ≤ 1,
and thus
Ia = ka La .
Here La can be any of the individual light sources, or it can be a global ambient term.
A surface has, of course, three ambient coefficients—kar , kag , and kab—and they
can differ. Hence, for example, a sphere appears yellow under white ambient light if
its blue ambient coefficient is small and its red and green coefficients are large.
Rd ∝ cos θ ,
where θ is the angle between the normal at the point of interest n and the direction
of the light source l. If both l and n are unit-length vectors,3 then
3. Direction vectors, such as l and n, are used repeatedly in shading calculations through the dot
product. In practice, both the programmer and the graphics software should seek to normalize all
such vectors as soon as possible.
cos θ = l . n.
Id = kd (l . n)Ld .
about the angle of a perfect reflector—a mirror or a perfectly specular surface. Mod-
eling specular surfaces realistically can be complex because the pattern by which the
light is scattered is not symmetric. It depends on the wavelength of the incident light,
and it changes with the reflection angle.
Phong proposed an approximate model that can be computed with only a slight
increase over the work done for diffuse surfaces. The model adds a term for specular
reflection. Hence, we consider the surface as being rough for the diffuse term and
smooth for the specular term. The amount of light that the viewer sees depends on
the angle φ between r, the direction of a perfect reflector, and v, the direction of the
viewer. The Phong model uses the equation
Is = ks Ls cos^α φ.
We can add a distance term, as we did with diffuse reflections. What is referred to as
the Phong model, including the distance term, is written
I = (1/(a + bd + cd²)) (kd Ld max(l . n, 0) + ks Ls max((r . v)^α , 0)) + ka La .
This formula is computed for each light source and for each primary.
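Putting the pieces together, here is a sketch of this formula for one color component of one source (our own function; l, n, r, and v are assumed to be unit vectors, and r . v is clamped to zero before exponentiation so that a negative cosine contributes nothing):

#include <math.h>

typedef struct { float x, y, z; } vec3;

static float dot(vec3 a, vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

float phong(vec3 l, vec3 n, vec3 r, vec3 v,       /* unit vectors          */
            float kd, float Ld,                    /* diffuse coefficients  */
            float ks, float Ls, float alpha,       /* specular coefficients */
            float ka, float La,                    /* ambient coefficients  */
            float a, float b, float c, float d)    /* distance attenuation  */
{
    float diffuse  = fmaxf(dot(l, n), 0.0f);
    float specular = powf(fmaxf(dot(r, v), 0.0f), alpha);
    return (kd * Ld * diffuse + ks * Ls * specular) / (a + b*d + c*d*d)
           + ka * La;
}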
It might seem counterintuitive to have a single light source characterized by
different amounts of red, green, and blue light for the ambient, diffuse, and specular
terms in the lighting model. However, the white highlight we might see on a red ball
is the distorted reflection of a light source or perhaps some other bright white object
in the environment. To calculate this highlight correctly would require a global rather
than local lighting calculation. Because we cannot solve the full rendering equation,
we instead use various tricks in an attempt to obtain realistic renderings, one of which
is to allow different ambient, specular, and diffuse lighting colors.
Consider, for example, an environment with many objects. When we turn on a
light, some of that light hits a surface directly. These contributions to the image can
be modeled with specular and diffuse components of the source. However, much of
the rest of the light from the source is scattered from multiple reflections from other
objects and makes a contribution to the light received at the surface under considera-
tion. We can approximate this term by having an ambient component associated with
the source. The shade that we should assign to this term depends on both the color of
the source and the color of the objects in the room—an unfortunate consequence of
our use of approximate models. To some extent, the same analysis holds for diffuse
light. Diffuse light reflects among the surfaces, and the color that we see on a partic-
ular surface depends on other surfaces in the environment. Again, by using carefully
chosen diffuse and specular components with our light sources, we can approximate
a global effect with local calculations.
We have developed the Phong lighting model in object space. The actual lighting
calculation, however, can be done in a variety of ways within the pipeline. In OpenGL,
the default is to do the calculations for each vertex in eye coordinates and then
interpolate the shades for each fragment later in the pipeline. We must be careful
of the effect of the model-view and projection transformations on the vectors used
in the model because these transformations can affect the cosine terms in the model
(see Exercise 6.20). Consequently, to make a correct shading calculation, we must
either preserve spatial relationships as vertices and vectors pass through the pipeline,
perhaps by sending additional information through the pipeline from object space,
or go backward through the pipeline to obtain the required shading information.
2ψ = φ.
f (p) = ax + by + cz + d = 0,
r = αl + βn.
n . r = αl . n + β = l . n.
We can get a second condition between α and β from our requirement that r also be
of unit length; thus,
1 = r . r = α² + 2αβ l . n + β² .
r = 2(l . n)n − l.
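As a quick sketch (our own helper, valid when l and n are unit length), the formula translates directly into code:

typedef struct { float x, y, z; } vec3;

/* r = 2 (l . n) n - l */
vec3 reflect_direction(vec3 l, vec3 n)
{
    float ln = l.x*n.x + l.y*n.y + l.z*n.z;   /* l . n */
    vec3 r = { 2.0f*ln*n.x - l.x,
               2.0f*ln*n.y - l.y,
               2.0f*ln*n.z - l.z };
    return r;
}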
Although the fixed-function OpenGL pipeline uses the modified Phong model
and thus avoids having to calculate the reflection vector, in Chapter 9 we introduce
programmable shaders that can use the reflection vector. Methods such as environ-
ment maps will use the reflected-view vector (see Exercise 6.24) that is used to de-
termine what a viewer would see if she looked at a reflecting surface such as a highly
polished sphere.
glShadeModel(GL_FLAT);
5. We can make this assumption in OpenGL by setting the local-viewer flag to false.
If flat shading is in effect, OpenGL uses the normal associated with the first vertex of
a single polygon for the shading calculation. For primitives such as a triangle strip,
OpenGL uses the normal of the third vertex for the first triangle, the normal of the
fourth for the second, and so on. Similar rules hold for other primitives, such as
quadrilateral strips.
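For example (an illustrative fragment with made-up coordinates, relying on the rule just stated), only the shade computed at the first vertex of a GL_POLYGON colors the whole polygon under flat shading:

glShadeModel(GL_FLAT);
glBegin(GL_POLYGON);
    glNormal3f(0.0f, 0.0f, 1.0f);   /* normal at the first vertex: determines the shade */
    glVertex3f(0.0f, 0.0f, 0.0f);
    glNormal3f(0.0f, 1.0f, 0.0f);   /* shades computed at later vertices do not color the polygon */
    glVertex3f(1.0f, 0.0f, 0.0f);
    glNormal3f(1.0f, 0.0f, 0.0f);
    glVertex3f(0.0f, 1.0f, 0.0f);
glEnd();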
Flat shading will show differences in shading for the polygons in our mesh. If
the light sources and viewer are near the polygon, the vectors l and v will be differ-
ent for each polygon. However, if our polygonal mesh has been designed to model
a smooth surface, flat shading will almost always be disappointing because we can
see even small differences in shading between adjacent polygons, as shown in Fig-
ure 6.25. The human visual system has a remarkable sensitivity to small differences in
light intensity, due to a property known as lateral inhibition. If we see an increasing
sequence of intensities, as shown in Figure 6.26, we perceive the increases in bright-
ness as overshooting on one side of an intensity step and undershooting on the other,
as shown in Figure 6.27. We see stripes, known as Mach bands, along the edges. This
phenomenon is a consequence of how the cones in the eye are connected to the optic
nerve, and there is little that we can do to avoid it, other than to look for smoother
shading techniques that do not produce large differences in shades at the edges of
polygons.
FIGURE 6.26 Step chart.
glShadeModel(GL_SMOOTH);
Suppose that we have enabled both smooth shading and lighting and that we assign
to each vertex the normal of the polygon being shaded. The lighting calculation is
made at each vertex using the material properties and the vectors v and l computed
for each vertex. Note that if the light source is distant, and either the viewer is distant
or there are no specular reflections, then smooth (or interpolative) shading shades a
polygon in a constant color.
If we consider our mesh, the idea of a normal existing at a vertex should cause
concern to anyone worried about mathematical correctness. Because multiple poly-
gons meet at interior vertices of the mesh, each of which has its own normal, the
normal at the vertex is discontinuous. Although this situation might complicate the
mathematics, Gouraud realized that the normal at the vertex could be defined in such
a way as to achieve smoother shading through interpolation. Consider an interior ver-
tex, as shown in Figure 6.28, where four polygons meet. Each has its own normal. In
Gouraud shading, we define the normal at a vertex to be the normalized average of
the normals of the polygons that share the vertex. For our example, the vertex normal
is given by
n = (n1 + n2 + n3 + n4) / |n1 + n2 + n3 + n4| .
FIGURE 6.28 Normals near interior vertex.
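In code, the averaging might look like the following sketch (our own helper for the four-polygon case of Figure 6.28); the sum of the shared polygons' normals is scaled back to unit length:

#include <math.h>

typedef struct { float x, y, z; } vec3;

vec3 gouraud_vertex_normal(vec3 n1, vec3 n2, vec3 n3, vec3 n4)
{
    vec3 s = { n1.x + n2.x + n3.x + n4.x,
               n1.y + n2.y + n3.y + n4.y,
               n1.z + n2.z + n3.z + n4.z };
    float len = sqrtf(s.x*s.x + s.y*s.y + s.z*s.z);
    vec3 n = { s.x/len, s.y/len, s.z/len };   /* normalized average */
    return n;
}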
computed shades for the vertices and has interpolated these shades over the faces of
the polygons.
Color Plate 21 contains another illustration of the smooth shading provided by
OpenGL. We used this color cube as an example in Chapters 2 and 3, and the pro-
grams are in Appendix A. The eight vertices are colored black, white, red, green, blue,
cyan, magenta, and yellow. Once smooth shading is enabled, OpenGL interpolates
the colors across the faces of the polygons automatically.
We can do a similar interpolation on all the edges. The normal at any interior point
can be obtained from points on the edges by
n(α, β) = (1 − β)nC + βnD.
Once we have the normal at each point, we can make an independent shading calcu-
lation. Usually, this process can be combined with rasterization of the polygon. Until
recently, Phong shading could only be carried out off-line because it requires the in-
terpolation of normals across each polygon. In terms of the pipeline, Phong shading
requires that the lighting model be applied to each fragment; hence, the name per-
fragment shading. The latest graphics cards allow the programmer to write programs
that operate on each fragment as it is generated by the rasterizer. Consequently, we
can now do per-fragment operations and thus implement Phong shading in real time.
We will discuss this topic in detail in Chapter 9.
FIGURE 6.31 Interpolation of normals in Phong shading.
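A sketch of the interpolation step (our own helper; renormalization is needed because a linear blend of unit vectors is generally not unit length). The same helper is applied twice: once along two edges of the polygon to form nC and nD from the vertex normals, and once between nC and nD to produce the normal used in the per-fragment lighting calculation.

#include <math.h>

typedef struct { float x, y, z; } vec3;

/* (1 - t) a + t b, followed by renormalization. */
vec3 lerp_normal(vec3 a, vec3 b, float t)
{
    vec3 n = { (1 - t)*a.x + t*b.x,
               (1 - t)*a.y + t*b.y,
               (1 - t)*a.z + t*b.z };
    float len = sqrtf(n.x*n.x + n.y*n.y + n.z*n.z);
    n.x /= len;  n.y /= len;  n.z /= len;
    return n;
}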
6. The regular icosahedron is composed of 20 equilateral triangles; it makes a nice starting point for
generating spheres. See [Ope05].