Poly1. A stellation of the icosahedron. It is also five interlocking tetrahedra. The model is lighted by eight light sources.

Automatic Creation of Object Hierarchies for Ray Tracing

Jeffrey Goldsmith
Jet Propulsion Laboratory

John Salmon
California Institute of Technology

Intersection calculations dominate the run time of canonical ray tracers. A common algorithm to reduce the number of intersection tests required is the intersection of rays with a tree of extents, rather than the whole database of objects. A shortcoming of this method is that these trees are difficult to generate. Additionally, manually generated trees can be poor, greatly reducing the run-time improvement available. We present methods for evaluation of these trees in terms of the approximate number of intersection calculations required and for automatic generation of good trees. These methods run in O(n log n) expected time, where n is the number of objects in the scene. We report some examples of speedups.
IEEE CG&A, May 1987. 0272-1716/87/0500-0014$01.00 © 1987 IEEE

Due to the increased demand for visual realism in computer generated images, ray tracing has become a very popular rendering algorithm. Ray tracing programs simulate the interaction of light with the environment, thus simply determining such optical effects as reflection, refraction, and shadowing. The basic algorithm [1, 2] traces rays from an eyepoint through a simulated screen at a set of objects to be seen.

Since ray tracing is a brute force algorithm, taking very little advantage of global information about the picture to be rendered, it is very costly in computer time. Whitted [3] discovered that 75 percent of the time required for simple scenes was taken in the calculations that determined which objects were hit by each ray. The original algorithm required intersecting each ray with each object in the scene. Since then the algorithm has been improved so that each ray need not be intersected with each object [4-8]. Each method involves intersection calculations with simple bounding volumes to determine if a more complex intersection calculation can be avoided. Warren, Glassner, Fujimoto, and Kaplan [4, 5, 7, 8] differ from Rubin and Whitted [6] in that they trade space (computer memory) for time in attempts to speed up the algorithm. In a limited space machine, such as a hypercube parallel processor [9], these methods are currently not practical to use, though in the future that will likely not be so, as memory space on these machines increases.

The method of hierarchical extents [6] uses a tree search to find the objects that are hit by a ray. In the best case it yields O(log n) intersection calculations per ray. Since tree nodes are small in comparison to object data, only about 30 percent extra space is required to store the hierarchy required for machine use in a typical implementation.

To use the method of hierarchical extents, a tree of bounding volumes must be constructed. Bounding volumes are simple geometric objects that fit around the objects that make up the model. Bounding volumes are chosen to be objects that are simple to intersect with a ray, such as spheres or rectangular prisms that have sides parallel to the coordinate planes. See Figure 1 for some examples. These bounding volumes are combined into a tree by picking some of them and surrounding them with another bounding volume. This process is repeated recursively until a bounding volume is generated that surrounds the whole scene. See Figure 2 for an example of a hierarchy of bounding volumes.

Figure 1. Some objects surrounded by their bounding boxes.

Figure 2. The bounding boxes represented by the structure of the tree you will see in Figure 4. Note that the boxes fit much more tightly than they are drawn here.

Seven spheres. Seven diffusely reflecting spheres.

Many different trees can be built for a given scene, and they require differing computation times to render the image. The time required to render a simple image can easily vary by a factor of fifty due simply to the choice of different trees. Thus, it is important to find a way to choose a tree that reduces that time. Normally, the tree is built from the structure used to model the scene. Sometimes, trees are built manually for simple or combined models. A tree so generated is generally not a particularly good one, because building these trees manually is very difficult. This is due to the large amount of data that must be manipulated all at once, as well as the difficulty of performing complex tasks in three dimensions.

Trees that a modeler generates are also usually not good because they were not generated to speed up the ray-tracing operation, but instead to simplify modeling of the scene. Trees built for modeling are generally too simple, and often have large branching ratios, whereas ray-tracer trees tend to be better if they have small branching ratios to generate as many tree-pruning operations as possible. In fact, for a checkerboard, a binary tree is the optimal configuration for the tree; yet few would choose to model it using two repeated polygons and a very deep hierarchy.

Since it is possible to use many different trees to render a scene, and since manual construction of trees is tedious and not as effective as desired, computer programs have been written to build these trees automatically. The simplest such algorithm constructs a complete n-ary tree, filling the leaf nodes with objects in some simple order. Not surprisingly, this method yields poor results, since it takes no model information into account.

Another method, the median cut algorithm, divides the scene into halves along some spatial axis and surrounds each half with a bounding volume. It then repeats this procedure recursively on each half. This method works better than the previous one, but it still does not adequately account for the intended use of the tree during the rendering process. In the next section we describe how to build good trees based on a metric described in detail in the subsequent section.

Poly2. Another stellation of the icosahedron. This, too, is illuminated by eight lights. Both Poly1 and Poly2 were computed by Roy Williams on a hypercube.

Automatic generation of trees

A useful tree generation algorithm must be applicable to scenes with hundreds of thousands of objects; thus, any algorithm that runs in O(n^2) time or worse is likely to be too slow. This means that when considering placement of a node, one must not use information about all of the other nodes, just some small portion of them. This constraint will not permit the optimal tree to be found. For most models, however, many trees exist with rendering costs that are only slightly higher than the optimal one. This is because the number of possible trees grows exponentially with the number of objects, and local changes in the structure of the tree tend to have small effects on the overall cost. Thus, generating a suboptimal tree is of little consequence, as long as it is much better than trees generated by other methods, since the choice of trees only affects the amount of computation needed to render the image, and small differences of time are not critical.

The general strategy used to construct the tree is a heuristic tree search. Objects are added successively and the tree is searched to find a suitable insertion point for each new node. Since not all nodes in the tree can be considered as a point for insertion, the search must follow only a few paths. The choice of subtrees to search from a given node is determined by the smallest increase in surface area of the node's bounding volume that would occur if the new node were to be inserted as a child of it. (See the next section for a justification of this heuristic.) During the search process, two or more children of a node may have the same increase in bounding volume surface area after adding the new node. This occurs most frequently when the children are large and the increase is zero, usually near the root of the tree. In that case, all equivalently costly subtrees can be searched to find the best location for insertion, or the tie can be broken by closeness of the object to the center of the old bounding volume, or even random selection. In practice, only the first two or three levels of the tree have nodes with big enough bounding volumes for this to happen, and then only near the end of the construction. Thus, all those paths usually can be searched without undue cost. Also, since the tree setup usually takes only a small fraction of the computer time that rendering the image takes, many paths can usually be searched without significant extra time.

At each level of the tree during the search, the new node is considered a prospective child of each node that will be searched (see Figure 3a). The tree is evaluated with the proposed insertion and the location with the smallest increase in tree cost is saved. When the search reaches leaf nodes, the new node and the leaf node are proposed as siblings of a new nonleaf node constructed in the position of the old leaf node (see Figure 3b). When the search is complete, the node is inserted in the tree wherever the increased cost of the tree will be minimized.

Figure 3. In (a) is the insertion of node 4 as child of node 1. In (b) is insertion of node 2 at position of leaf node 1 to create new node 3.

Thus, for each object to be inserted, an O(log n) tree search must be done. This yields a total asymptotic time complexity of O(n log n).

Since the tree is being constructed without complete knowledge of the model, the order in which objects are added is significant. Several data orders were tested. The obvious choice is the order in which the modeler supplies objects. This has some spatial coherence and some
randomness. Sometimes, however, the modeler's output is in very bad order for the tree generation program. To try to correct for this, the data is shuffled once [10] before being used as input to the program. Since a random seed is used for the shuffle algorithm, the tree produced by the algorithm can be represented by that seed, just one additional number. With shuffled data, the resulting trees tend to be slightly worse than ones generated from data in model order, but for models without a symmetry represented in the original data order, by trying several seeds, a tree can be found that is better than the one generated from unshuffled data.

The effect of sorting the input data along a line was also investigated. It seems that sorting is the worst thing to do to the data, because the top levels of the thus generated tree do not represent the whole scene adequately. They are based on only a local section of the database and cannot be revised with this algorithm. Sorted data tends to yield results similar to worst case shuffled data.

The results of each of these programs are summarized in Table 1. Though no results are available from other methods for it, a model with 16,373 objects was run through the automatic tree generator. The resulting hierarchy predicted (and ran with) an average of 30.1 intersection tests per ray.

Checker. 256 polygons and one sphere.

Evaluation of bounding box trees

In the tree construction algorithm we needed to estimate the additional cost of insertion of an object into the hierarchy. This required evaluating the cost of the whole tree, at least initially. To achieve the desired results, that is, a speedup of the rendering operation, it is necessary to use a cost function based on the intended use of the tree. Ray-tracing hierarchies are used to avoid intersecting rays with the objects in the hierarchy by finding that a ray fails to hit a simple bounding volume in the hierarchy, and thus determining that all objects below that node in the tree can be eliminated from further consideration. To do this, bounding volumes must be tested for intersection with the ray, incurring a cost in computer time. The quality of a tree is determined by the number of bounding volume intersections incurred during rendering, using the tree. Thus, we will evaluate a tree as the expected number of bounding volume intersections required to determine which objects to test for intersection during an average traversal (ray).

The simplest bounding volumes to intersect are the sphere and the rectangular prism with all sides parallel to coordinate planes. Each requires about ten floating-point operations on average to check for an intersection. The most effective bounding volumes are more complex [11] and vary with the primitive objects they surround [12], but the advantages of improving the structure of the hierarchy will be seen regardless of the types of bounding volume used, since the structure of the hierarchy does not affect the size of the bounding volumes around the primitive objects (leaves). The examples all
use only orthogonal prisms, but mixed types would work as well by prorating each bounding volume by its cost to intersect.

Checker. Same model as before, but with transparent sphere, illustrating spherical aberration and internal reflection.

The first step in estimating the number of intersection calculations that will be needed to intersect rays with a hierarchy is determining the probability with which an arbitrary ray will hit a given bounding volume. Since any ray that does not hit the root-level bounding volume will require exactly one intersection calculation, only rays that do hit it need be considered. Thus, the conditional probability that a ray hits a bounding volume if it hits the root bounding volume can be used instead.

For rays with an endpoint at a fixed distance from a bounding volume, the probability that an arbitrary one will hit the bounding volume is proportional to the solid angle subtended by the surface of the bounding volume. For convex objects, such as prisms and spheres, at large distances this is approximately proportional to the surface area of the object [13]. For an orthogonal prism of size l by m by n, the surface area is $2lm + 2ln + 2mn$; for a sphere of radius r the surface area is $4\pi r^2$. For more complex bounding volumes, approximations to their surface area can be used.

The relationship between the surface area of a bounding volume and the likelihood that an arbitrary ray emanating at some large distance will hit it is approximately linear. Since all bounding volumes are contained within the root node's bounding volume, the conditional probability of a ray hitting a given node if it hits the root node can be approximated as the area of the given node's bounding volume divided by the area of the root node's bounding volume. In general, this division need not be done during the generation of the tree, since only comparisons between conditional probabilities are needed in most cases.

To compute the expected number of intersection calculations of a whole tree, the conditional probabilities must be scaled correctly, but the division can be factored out and done only once. In fact, since bounding volume surface areas are only used in ratios, they need only be calculated to within a constant factor. If all the bounding volumes for a model are orthogonal prisms, their areas can be computed as $(l + m)n + lm$, which only costs two multiplications and two additions to compute. If only spherical bounding volumes are used, $r^2$ can be used as their area. (l, m, n, and r are as above.)

These conditional probabilities can be combined with the structure of the tree to estimate how many nodes of the tree will be hit by an arbitrary ray. Only rays that hit the root node are considered, since the structure of the tree is irrelevant to all other rays, and the structure of the tree has no effect on the size of the root bounding box. This assumes that the tightest fit volume of the given type is used for the set of objects it contains, which is natural for orthogonal prisms and space slices [12]. If a ray hits the root, then k more intersection calculations need to be done, where k is the number of children it has.

In fact, if any node is hit, the minimum number of additional intersection calculations that must be done is equal to the number of children it has. Since the cost of performing intersection calculations with the bounding volumes of the children of a node is incurred by a ray hitting that node, and the conditional probability of hitting it is as above, the total average cost of a node in units of intersection calculations is the product of the number of children it has and its surface area divided by the root node's surface area. Note that the root node's cost is its number of children, and the cost of all leaf nodes is zero; however, each leaf node's existence adds to its parent's cost. The estimated number of intersections required to intersect a ray with a tree is the sum of the costs of its nodes.

Figure 4 contains a simple example of a hierarchy. The numbers on the nodes are the areas of the bounding volumes. The expected number of intersection calculations required to intersect a ray with this tree is 7.3, broken down as follows. One intersection is required for the root node, which is assumed to be a hit. This causes three more intersections at level two. The leftmost node at level two has probability .6 (6/10) and two children, yielding a cost of 1.2 intersections. The second node at that level has probability .3 and three children, for .9 more calculations. The third node has no children and thus no additional cost. Two level-three nodes have children. They each have two children and probabilities of .2 and .4, giving costs of .4 and .8, respectively. Thus, the tree costs one intersection at level zero, three at level one, 2.1 at level two, and 1.2 at level three, for a total of 7.3.

Figure 4. A tree with 7.3 expected intersections per ray. Numbers in the nodes are bounding volume areas.

Spheres. 378 spheres arranged as a twisted cylinder.

Evaluating the cost of a tree is an O(n) calculation. During the construction of the tree, different possible trees may need to be evaluated; an incremental cost for the addition of a node can be calculated in O(log n) time. This
calculation can also be done while building the tree
using the algorithm in the previous section, incurring no
additional time complexity. If the node is being added as a child of another node (see Figure 3a), then the incremental cost of its parent is $(\mathrm{area}_{new} - \mathrm{area}_{old})k + \mathrm{area}_{new}$, where k is the number of siblings the new node will have, and the areas are those of the parent's bounding volume before and after the proposed insertion. If a new node is being combined with a leaf node to cause a new parent node to be created (see Figure 3b), its incremental cost is $2\,\mathrm{area}_{new}$. Also, the increased cost to grandparents and other ancestors must be included. It is just $(\mathrm{area}_{new} - \mathrm{area}_{old})k$ for each of these ancestor nodes, where k is the number of children they have. Note that this is just a term in the general node cost increase.
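
The cost estimate and the incremental update described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The `Node` class and all function names are hypothetical. In `build_figure4_tree`, only the areas 10, 6, 3, 2, and 4 and the child counts follow from the Figure 4 discussion; which level-two node each interior level-three node hangs from, and the leaf areas, are assumptions (leaf areas do not enter the cost).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node: a bounding-volume area plus child nodes."""
    area: float
    children: List["Node"] = field(default_factory=list)


def unnormalized_cost(node: Node) -> float:
    """Sum over the subtree of (number of children) * (bounding-volume area)."""
    own = len(node.children) * node.area
    return own + sum(unnormalized_cost(child) for child in node.children)


def expected_intersections(root: Node) -> float:
    """One test against the root, plus the node costs scaled once by the root area."""
    return 1.0 + unnormalized_cost(root) / root.area


def child_insertion_delta(area_old: float, area_new: float, k: int) -> float:
    """Unnormalized cost change of a parent that already has k children
    when it gains one child and its area grows from area_old to area_new."""
    return (area_new - area_old) * k + area_new


def leaf() -> Node:
    # Assumed leaf area; a leaf has no children, so its own cost is zero.
    return Node(1.0)


def build_figure4_tree() -> Node:
    """Tree matching the areas and child counts quoted for Figure 4."""
    node4 = Node(4.0, [leaf(), leaf()])          # probability .4, two children
    node2 = Node(2.0, [leaf(), leaf()])          # probability .2, two children
    left = Node(6.0, [node4, leaf()])            # probability .6, two children
    middle = Node(3.0, [node2, leaf(), leaf()])  # probability .3, three children
    return Node(10.0, [left, middle, leaf()])    # root with three children
```

With these assumptions, `round(expected_intersections(build_figure4_tree()), 1)` gives 7.3, the total quoted for Figure 4. Keeping the costs unnormalized and dividing by the root area once mirrors the remark that the division can be factored out.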
To test our estimate of a tree's cost, the number of
bounding volume intersections done was measured for
some simple pictures. Only rays shot from the eye (pri-
mary rays) were used for the test because secondary
(shadow, reflection, and refraction) rays emanate from
a surface and thus must intersect that object's (and all its
ancestors') bounding volumes. This causes the minimum
number of intersection tests needed for a secondary ray
to be at least equal to the depth in the tree of the node
that contains the object from which it emanates. Thus,
the average number of intersection tests needed for a
secondary ray should be somewhat higher than it would
be for a truly arbitrary ray. Subtracting the average depth
of the tree from the number of intersections found for
secondary rays yields results about the same as for pri-
mary rays, only somewhat noisier. Also, only primary
rays that hit the largest bounding volume were counted,
since others are not affected by tree structure. Since pri-
mary rays tend to be similar to each other, three test runs
were done, from three different directions.

Table 2 shows the results of these tests. Overall, the predicted results agree fairly well with the measured results. Data for all (including secondary) rays is also included for comparison. Note that the intersection count for secondary rays is about log(the number of objects) higher than it is for primary rays. (See, especially, data for Poly2.)
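
The surface-area heuristic behind these predictions can also be checked numerically, independently of any renderer. The sketch below is an illustrative experiment, not the measurement reported in Table 2. It draws random lines through a sphere that encloses a hypothetical root box and counts how often a line that hits the root box also hits a box nested inside it. All function names, the sample count, and the seed are assumptions chosen for the example, and infinite lines stand in for rays whose origins are far away.

```python
import math
import random


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def box_area(lo, hi):
    """Surface area of an axis-aligned box given its min and max corners."""
    l, m, n = (hi[i] - lo[i] for i in range(3))
    return 2.0 * (l * m + l * n + m * n)


def random_line(radius, rng):
    """Sample a line uniformly from those crossing a sphere about the origin."""
    z = rng.uniform(-1.0, 1.0)                  # uniform direction on the unit sphere
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    d = (s * math.cos(phi), s * math.sin(phi), z)
    helper = (1.0, 0.0, 0.0) if abs(d[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = cross(d, helper)
    norm = math.sqrt(sum(c * c for c in u))
    u = tuple(c / norm for c in u)              # u and v span the plane normal to d
    v = cross(d, u)
    r = radius * math.sqrt(rng.random())        # uniform point on the disk in that plane
    t = rng.uniform(0.0, 2.0 * math.pi)
    origin = tuple(r * (math.cos(t) * u[i] + math.sin(t) * v[i]) for i in range(3))
    return origin, d


def line_hits_box(origin, d, lo, hi):
    """Slab test for an infinite line against an axis-aligned box."""
    t_near, t_far = -math.inf, math.inf
    for i in range(3):
        if abs(d[i]) < 1e-12:
            if origin[i] < lo[i] or origin[i] > hi[i]:
                return False
        else:
            t1 = (lo[i] - origin[i]) / d[i]
            t2 = (hi[i] - origin[i]) / d[i]
            t_near = max(t_near, min(t1, t2))
            t_far = min(t_far, max(t1, t2))
    return t_near <= t_far


def estimate_conditional_hit(outer, inner, samples=20000, seed=1):
    """Estimate P(line hits inner | line hits outer); inner must lie inside outer."""
    rng = random.Random(seed)
    radius = math.sqrt(sum(max(abs(outer[0][i]), abs(outer[1][i])) ** 2
                           for i in range(3)))
    hits_outer = hits_inner = 0
    for _ in range(samples):
        origin, d = random_line(radius, rng)
        if line_hits_box(origin, d, *outer):
            hits_outer += 1
            if line_hits_box(origin, d, *inner):
                hits_inner += 1
    return hits_inner / hits_outer
```

As a usage example, take an outer box from (-1, -1, -1) to (1, 1, 1) and an inner box from (-1, -1, -1) to (0, 0, 0). The area ratio is 6/24 = 0.25, and the estimate returned by `estimate_conditional_hit` should land close to that value, up to sampling noise.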
Antenna. The high-gain antenna from Planet A, a Japanese spacecraft investigating Halley's comet.

Applications

Several other techniques have been found to speed up the method of hierarchical extents [11, 12]. The gains from improving the tree are independent of the gains from these other methods; so the speedups obtained by using more than one of these methods will multiply.

Better trees are especially valuable in animation. The small preprocessing cost of building the trees is further reduced by being distributed over many frames' worth of rendering time. If objects move, the whole tree does not have to be rebuilt. All the static objects in the scene can be combined into a tree, and the moving objects can be added before each frame. Objects that do not move very far can be time/space bounded and included in the static tree.

The ability to estimate tree costs is valuable to some parallel processing applications. The tree evaluator allows one to determine the size of the workload represented by a portion of the hierarchy. If the model database must be split up to fit in the local memory of a processor of a parallel machine, then the intersection calculation, which is the bulk of the ray-tracing computation [13], will be split up. It is important that this split is into roughly equal parts, or the processors that do the smaller amounts of calculation will not be used effectively [14].

Acknowledgments

The pictures were computed on the 64-node Mark II Caltech Hypercube. Thanks to Roy Williams for the models for Poly1 and Poly2. Thanks to the JPL Computer Graphics Lab for the production of the slides. Also, thanks to the reviewers for pointing out some details that we missed.

This project was funded by the JPL Director's Discretionary Fund, Department of Energy grants DE-AS03-ER13118 and DE-FG03-85ER25009, the Parsons Foundation, and the Systems Development Foundation.

References

1. A. Appel, "Some Techniques for Machine Renderings of Solids," AFIPS Conf. Proc., SJCC, AFIPS, Reston, Va., 1968, pp. 37-45.
2. J. Kajiya, Tutorial on Ray Tracing, "State of the Art in Image Synthesis" (seminar notes), Computer Graphics (Proc. SIGGRAPH 83), July 1983.
3. T. Whitted, "An Improved Illumination Model for Shaded Display," Comm. ACM, 1980, pp. 343-349.
4. L.V. Warren, "Geometric Hashing for Processing Complex Scenes," CS Dept. Memorandum, Univ. of Utah, Salt Lake City, 1985.
5. A. Glassner, "Space Subdivision for Fast Ray Tracing," IEEE CG&A, Oct. 1984, pp. 15-22.
6. S. Rubin and T. Whitted, "A 3-Dimensional Representation for Fast Rendering of Complex Scenes," Computer Graphics (Proc. SIGGRAPH 80), July 1980, pp. 110-116.
7. A. Fujimoto, "ARTS: Accelerated Ray-Tracing System," IEEE CG&A, April 1986, pp. 16-26.
8. M. Kaplan, "Space-Tracing, a Constant Time Ray-Tracer," Computer Graphics (Proc. SIGGRAPH 85), seminar notes from "State of the Art in Image Synthesis," July 1985.
9. G. Fox and S. Otto, "Algorithms for Concurrent Processors," Physics Today, May 1984, pp. 50-59.
10. D. Knuth, The Art of Computer Programming, Vol. 2, Addison Wesley, Reading, Mass., 1969, p. 139.
11. T. Kay and J. Kajiya, "Ray Tracing Complex Scenes," Computer Graphics (Proc. SIGGRAPH 86), Aug. 1986, pp. 269-278.
12. H. Weghorst, G. Hooper, and D. Greenberg, "Improved Computational Methods for Ray Tracing," ACM Trans. on Graphics, Jan. 1984, pp. 52-69.
13. L. Stone, Theory of Optimal Search, Academic Press, New York, 1975, pp. 27-28.
14. J. Goldsmith and J. Salmon, "A Ray Tracing System for the Hypercube," Caltech Concurrent Computing Project Memorandum HM154, Calif. Inst. of Technology, Pasadena, Calif., 1985.

Jeffrey Goldsmith has been at Jet Propulsion Lab's Computer Graphics Laboratory since 1983. He is currently working on computer graphics on hypercube parallel processors as part of California Institute of Technology's Concurrent Computation Program. He was a contributor to The Magic Egg, the first computer-generated movie in Omnimax format, first shown at SIGGRAPH. Goldsmith received his BS and MS in computer science from Rensselaer Polytechnic Institute in 1981 and 1983.

John Salmon is a graduate student in computation and neural systems at the California Institute of Technology. His research interests include parallel processing, computer graphics, and astrophysics. He received his BS in electrical engineering, computer science, and physics from the Massachusetts Institute of Technology in 1981 and an MS in physics from the University of California at Berkeley.

The authors can be contacted at the California Institute of Technology, Synchrotron Laboratory 206-49, Pasadena, California 91125.