2.1 Levels and Types of Benchmarks
Based on the level of performance they measure, benchmarks can be grouped into two
levels:
Component-level Benchmarks
System-level Benchmarks
Based on their composition, benchmarks can be categorized into two types:
Synthetic Benchmarks
Application Benchmarks
2.2 Component-Level Benchmarks
Component-level benchmarks test a specific component of a computer system, such as the
video board, the audio card, or the microprocessor. They are useful for selecting the
component of a computer system that performs a particular function. Instead of
testing the performance of the system running real applications, component-level
benchmarks focus on the performance of subsystems within a system. These subsystems
may include the operating system, the integer arithmetic unit, the floating-point
arithmetic unit, the memory system, the disk subsystem, etc. Examples of component-level
benchmarks include:
SPECweb96 - measures web server performance
GPC - measures graphics performance for displaying 3-D images
2.3 System-Level Benchmarks
System-Level benchmarks evaluate the overall performance of a computer running real
programs or applications. These benchmarks are useful when comparing systems of
different architectures. They take each subsystem into account, and indicate the effect of
each subsystem on the overall performance. Examples of system-level benchmarks include:
SYSmark/NT 4.0 - measures the performance of computers running popular
business applications under Windows NT 4.0
TPC-C - measures the performance of transaction processing system like
2.4 Synthetic Benchmarks
Synthetic benchmarks are created by combining basic computer functions in proportions
that developers feel will yield an indicative measure of the performance capabilities of the
machine under test. These benchmarks try to match the average frequency of operations
and operands observed across a large set of programs. Because the functions included in
synthetic benchmarks are usually created artificially to match an average execution
profile, rather than taken from real programs, their credibility is impaired.
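As an illustration of how such a mix might be assembled, the following is a minimal Python sketch, not any published benchmark. The operation names, the three stand-in operations, and the proportions in OPERATION_MIX are all assumptions chosen for the example; a real synthetic benchmark would derive its proportions from measured program profiles.

```python
import random
import time

# Hypothetical operation mix: the fraction of each basic operation in an
# assumed "average" execution profile. The numbers are illustrative only.
OPERATION_MIX = {"int_add": 0.5, "float_mul": 0.3, "mem_copy": 0.2}


def int_add(state):
    # Integer addition, masked to 32 bits.
    state["i"] = (state["i"] + 7) & 0xFFFFFFFF


def float_mul(state):
    # Floating-point multiplication.
    state["f"] = state["f"] * 1.000001


def mem_copy(state):
    # Copy a small buffer to exercise memory traffic.
    state["buf"] = state["buf"][:]


OPERATIONS = {"int_add": int_add, "float_mul": float_mul, "mem_copy": mem_copy}


def build_schedule(mix, total_ops, seed=0):
    """Return a shuffled list of operation names in the proportions of `mix`."""
    schedule = []
    for name, fraction in mix.items():
        schedule.extend([name] * round(fraction * total_ops))
    random.Random(seed).shuffle(schedule)
    return schedule


def run_synthetic(mix, total_ops=100_000, seed=0):
    """Execute the mixed schedule and return operations per second."""
    state = {"i": 0, "f": 1.0, "buf": list(range(256))}
    schedule = build_schedule(mix, total_ops, seed)
    start = time.perf_counter()
    for name in schedule:
        OPERATIONS[name](state)
    elapsed = time.perf_counter() - start
    return len(schedule) / elapsed


if __name__ == "__main__":
    print(f"{run_synthetic(OPERATION_MIX):.0f} operations per second")
```

In an interpreted language the reported figure mostly reflects interpreter overhead, which is itself an example of why the composition of a synthetic benchmark has to be understood before its result is trusted.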
Synthetic benchmarks are component-level benchmarks, and they evaluate a particular
capability of a subsystem. For example, a disk subsystem performance benchmark may
combine a series of basic seek, read, and write operations involving varying numbers of
disk blocks of varying sizes.
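The disk example above could be sketched roughly as follows. This is a hedged illustration only: the function name disk_benchmark, the block sizes, the 50/50 read/write split, and the 1 MiB scratch file are assumptions made for the example, and because the file is small the timings will largely reflect the operating system's cache rather than the physical disk.

```python
import os
import random
import tempfile
import time


def disk_benchmark(file_size=1 << 20, operations=200,
                   block_sizes=(512, 4096, 16384), seed=0):
    """Run a random mix of seek, read, and write operations on a scratch file.

    All parameters are illustrative assumptions, not values from any
    standard benchmark.
    """
    rng = random.Random(seed)
    counts = {"seek": 0, "read": 0, "write": 0}
    bytes_moved = 0
    with tempfile.TemporaryFile() as f:
        # Pre-fill the scratch file so reads always return data.
        f.write(os.urandom(file_size))
        f.flush()
        start = time.perf_counter()
        for _ in range(operations):
            # Vary both the block size and the number of blocks per transfer.
            block_size = rng.choice(block_sizes)
            n_blocks = rng.randint(1, 4)
            length = block_size * n_blocks
            offset = rng.randrange(0, file_size - length)
            f.seek(offset)
            counts["seek"] += 1
            if rng.random() < 0.5:
                bytes_moved += len(f.read(length))
                counts["read"] += 1
            else:
                bytes_moved += f.write(b"\0" * length)
                counts["write"] += 1
        f.flush()
        elapsed = time.perf_counter() - start
    return {"counts": counts, "bytes": bytes_moved, "seconds": elapsed}


if __name__ == "__main__":
    result = disk_benchmark()
    print(result["counts"], result["bytes"], "bytes in",
          f"{result['seconds']:.4f} s")
```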
When evaluating the results of synthetic benchmarks, observe the following rules:
understand the composition of the benchmark,
appreciate the factors contributing to the results,
determine if the benchmark tests functions that are typical of your workload and
environment.
Examples of synthetic benchmarks include:
WinBench 97 - measures the performance of a PC's graphics, disk, processor, video,
and CD-ROM subsystems in a Windows environment
MacBench 97 - measures the processor, floating-point, graphics, video, and CD-ROM
performance of a Mac OS system
2.5 Application Benchmarks
Application benchmarks employ actual application programs. Developers of these
benchmarks include applications that they feel perform common functions within a
particular industry segment or class of products. These application programs are driven
by a macro, a scripted sequence of program operations that attempts to model the way
users operate their systems.
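The macro-driven approach can be sketched as a harness that replays a fixed script of operations against an application and times the run. In the minimal Python sketch below, TinyEditor is a purely hypothetical stand-in for a real application, and MACRO and run_macro are names invented for the example; commercial application benchmarks drive real applications through their own scripting facilities.

```python
import time


class TinyEditor:
    """Hypothetical stand-in for a real application under test."""

    def __init__(self):
        self.lines = []

    def type_line(self, text):
        self.lines.append(text)

    def replace_all(self, old, new):
        self.lines = [line.replace(old, new) for line in self.lines]

    def sort_lines(self):
        self.lines.sort()


# The macro: an ordered script of (method name, arguments) pairs that
# models what a user might do. The contents are illustrative assumptions.
MACRO = [
    ("type_line", ("quarterly report draft",)),
    ("type_line", ("budget summary",)),
    ("replace_all", ("draft", "final")),
    ("sort_lines", ()),
]


def run_macro(app, macro, repetitions=100):
    """Replay the macro against the application and return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repetitions):
        for method_name, args in macro:
            getattr(app, method_name)(*args)
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_macro(TinyEditor(), MACRO)
    print(f"macro replay took {elapsed:.4f} s")
```

The choice of operations in the macro determines what the benchmark reports, so the result is only as representative as the script itself.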
Most application benchmarks are system-level benchmarks, and they measure the overall
performance of a system. When an application benchmark is run, it tests the contribution
of each component of the system to the overall performance. These benchmarks are usually
large and difficult to execute, and they are not useful for assessing future needs.
The major drawback is that application benchmarks are subject to the benchmark
developer's interpretation of a "typical workload". Examples of application benchmarks
include:
Winstone 97 - tests a PC's overall performance when running Windows-based 32-bit
business applications
SYSmark/NT 4.0 - measures the performance of computers running popular
business applications under Windows NT 4.0