UNIT 2.1 CPU Building Blocks Reg Org

This chapter covers processor structure and function, including processor organization, register organization, instruction cycles, and instruction pipelining. It explains the roles of user-visible and control/status registers, the instruction cycle process, and compares the x86 and ARM processor architectures. The chapter concludes with key terms and review questions to reinforce learning objectives.

Uploaded by

vishnupriy4n

CHAPTER

Processor Structure
and Function
14.1 Processor Organization
14.2 Register Organization
User-Visible Registers
Control and Status Registers
Example Microprocessor Register Organizations
14.3 Instruction Cycle
The Indirect Cycle
Data Flow
14.4 Instruction Pipelining
Pipelining Strategy
Pipeline Performance
Pipeline Hazards
Dealing with Branches
Intel 80486 Pipelining
14.5 The x86 Processor Family
Register Organization
Interrupt Processing
14.6 The Arm Processor
Processor Organization
Processor Modes
Register Organization
Interrupt Processing
14.7 Key Terms, Review Questions, and Problems


Learning Objectives
After studying this chapter, you should be able to:
■■ Distinguish between user-visible and control/status registers, and discuss the
purposes of registers in each category.
■■ Summarize the instruction cycle.
■■ Discuss the principle behind instruction pipelining and how it works in
practice.
■■ Compare and contrast the various forms of pipeline hazards.
■■ Present an overview of the x86 processor structure.
■■ Present an overview of the ARM processor structure.

This chapter discusses aspects of the processor not yet covered in Part Three and sets
the stage for the discussion of RISC and superscalar architecture in Chapters 15 and 16.
We begin with a summary of processor organization. Registers, which form
the internal memory of the processor, are then analyzed. We are then in a position
to return to the discussion (begun in Section 3.2) of the instruction cycle. A
description of the instruction cycle and a common technique known as instruction
pipelining complete our description. The chapter concludes with an examination of some
aspects of the x86 and ARM organizations.

14.1 PROCESSOR ORGANIZATION

To understand the organization of the processor, let us consider the requirements
placed on the processor, the things that it must do:
■■ Fetch instruction: The processor reads an instruction from memory (register,
cache, main memory).
■■ Interpret instruction: The instruction is decoded to determine what action is
required.
■■ Fetch data: The execution of an instruction may require reading data from
memory or an I/O module.
■■ Process data: The execution of an instruction may require performing some
arithmetic or logical operation on data.
■■ Write data: The results of an execution may require writing data to memory
or an I/O module.
To do these things, it should be clear that the processor needs to store some
data temporarily. It must remember the location of the last instruction so that it can
know where to get the next instruction. It needs to store instructions and data
temporarily while an instruction is being executed. In other words, the processor needs
a small internal memory.
Figure 14.1 is a simplified view of a processor, indicating its connection to the
rest of the system via the system bus. A similar interface would be needed for any
of the interconnection structures described in Chapter 3. The reader will recall that
the major components of the processor are an arithmetic and logic unit (ALU) and
a control unit (CU). The ALU does the actual computation or processing of data.
The control unit controls the movement of data and instructions into and out of the
processor and controls the operation of the ALU. In addition, the figure shows a
minimal internal memory, consisting of a set of storage locations, called registers.

Figure 14.1 The CPU with the System Bus (registers, ALU, and control unit
connected to the control, data, and address buses of the system bus)

Figure 14.2 is a slightly more detailed view of the processor. The data transfer
and logic control paths are indicated, including an element labeled internal
processor bus. This element is needed to transfer data between the various registers
and the ALU because the ALU in fact operates only on data in the internal processor
memory. The figure also shows typical basic elements of the ALU. Note the
similarity between the internal structure of the computer as a whole and the internal
structure of the processor. In both cases, there is a small collection of major elements
(computer: processor, I/O, memory; processor: control unit, ALU, registers)
connected by data paths.

Figure 14.2 Internal Structure of the CPU (registers and the ALU, with its status
flags, shifter, complementer, and arithmetic and Boolean logic, joined by the
internal CPU bus; control paths lead from the control unit)

14.2 REGISTER ORGANIZATION

As we discussed in Chapter 4, a computer system employs a memory hierarchy. At
higher levels of the hierarchy, memory is faster, smaller, and more expensive (per
bit). Within the processor, there is a set of registers that function as a level of memory
above main memory and cache in the hierarchy. The registers in the processor
perform two roles:
■■ User-visible registers: Enable the machine- or assembly-language programmer
to minimize main memory references by optimizing use of registers.
■■ Control and status registers: Used by the control unit to control the operation
of the processor and by privileged, operating system programs to control the
execution of programs.
There is not a clean separation of registers into these two categories. For
example, on some machines the program counter is user visible (e.g., x86), but on
many it is not. For purposes of the following discussion, however, we will use these
categories.

User-Visible Registers
A user-visible register is one that may be referenced by means of the machine
language that the processor executes. We can characterize these in the following
categories:
■■ General purpose
■■ Data
■■ Address
■■ Condition codes
General-purpose registers can be assigned to a variety of functions by the programmer.
Sometimes their use within the instruction set is orthogonal to the operation.
That is, any general-purpose register can contain the operand for any opcode.
This provides true general-purpose register use. Often, however, there are restrictions.
For example, there may be dedicated registers for floating-point and stack operations.
In some cases, general-purpose registers can be used for addressing functions
(e.g., register indirect, displacement). In other cases, there is a partial or clean
separation between data registers and address registers. Data registers may be used
only to hold data and cannot be employed in the calculation of an operand address.
Address registers may themselves be somewhat general purpose, or they may be
devoted to a particular addressing mode. Examples include the following:
■■ Segment pointers: In a machine with segmented addressing (see Section 8.3),
a segment register holds the address of the base of the segment. There may be
multiple registers: for example, one for the operating system and one for the
current process.
■■ Index registers: These are used for indexed addressing and may be autoindexed.
■■ Stack pointer: If there is user-visible stack addressing, then typically there is
a dedicated register that points to the top of the stack. This allows implicit
addressing; that is, push, pop, and other stack instructions need not contain an
explicit stack operand.
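To make implicit stack addressing concrete, the following Python sketch models a
dedicated stack pointer. The class name StackMachine, the 16-word memory, and the
convention that the stack grows toward lower addresses are assumptions chosen for
illustration; they do not describe any particular processor.

```python
class StackMachine:
    """Toy model: a word-addressed memory plus a dedicated stack pointer (SP)."""

    def __init__(self, size=16):
        self.memory = [0] * size
        # Assumed convention: the stack grows toward lower addresses,
        # so an empty stack has SP just past the end of memory.
        self.sp = size

    def push(self, value):
        # The instruction names only the value; the address comes from SP.
        self.sp -= 1
        self.memory[self.sp] = value

    def pop(self):
        # Again no address operand: SP supplies it implicitly.
        value = self.memory[self.sp]
        self.sp += 1
        return value


machine = StackMachine()
machine.push(7)
machine.push(9)
top = machine.pop()  # the most recently pushed value
```

Neither push nor pop takes an address argument, which is the point of the
dedicated register: the operand location is implied by the instruction itself.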
There are several design issues to be addressed here. An important issue
is whether to use completely general-purpose registers or to specialize their use.
We have already touched on this issue in the preceding chapter because it affects
instruction set design. With the use of specialized registers, it can generally be
implicit in the opcode which type of register a certain operand specifier refers to. The
operand specifier need only identify one of a set of specialized registers rather than
one out of all the registers, thus saving bits. On the other hand, this specialization
limits the programmer’s flexibility.
Another design issue is the number of registers, either general purpose or data
plus address, to be provided. Again, this affects instruction set design because more
registers require more operand specifier bits. As we previously discussed, somewhere
between 8 and 32 registers appears optimum [LUND77]. Fewer registers result in more
memory references; more registers do not noticeably reduce memory references (e.g.,
see [WILL90]). However, a new approach, which finds advantage in the use of
hundreds of registers, is exhibited in some RISC systems and is discussed in Chapter 15.
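The relationship between the number of registers and the number of operand
specifier bits can be sketched with a short calculation. The function name
specifier_bits and the example of 16 registers split into two groups of 8 are
illustrative assumptions, not figures taken from a specific instruction set.

```python
def specifier_bits(register_count):
    """Bits needed to select one register out of register_count."""
    if register_count < 1:
        raise ValueError("register_count must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using integers only.
    return (register_count - 1).bit_length()


# Hypothetical machine with 16 registers in one general-purpose set.
unified_bits = specifier_bits(16)

# The same 16 registers split into two specialized sets of 8, where the
# opcode implies which set an operand specifier refers to.
split_bits = specifier_bits(8)

bits_saved = unified_bits - split_bits
```

Under these assumptions the specialized organization needs one bit fewer per
register specifier, at the cost of the flexibility discussed above.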
Finally, there is the issue of register length. Registers that must hold addresses
obviously must be at least long enough to hold the largest address. Data registers
should be able to hold values of most data types. Some machines allow two contiguous
registers to be used as one for holding double-length values.
A final category of registers, which is at least partially visible to the user, holds
condition codes (also referred to as flags). Condition codes are bits set by the processor
hardware as the result of operations. For example, an arithmetic operation
may produce a positive, negative, zero, or overflow result. In addition to the result
itself being stored in a register or memory, a condition code is also set. The code
may subsequently be tested as part of a conditional branch operation.
Condition code bits are collected into one or more registers. Usually, they
form part of a control register. Generally, machine instructions allow these bits to
be read by implicit reference, but the programmer cannot alter them.
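As an illustration of how condition codes might be derived from an operation, the
following Python sketch adds two 8-bit values and sets four flags as a side effect.
The 8-bit width, the function name add8, and the flag names are assumptions made
for the example; real processors differ in which flags exist and which
instructions affect them.

```python
def add8(a, b):
    """Add two 8-bit values; return (result, flags) as hardware might set them."""
    total = a + b
    result = total & 0xFF  # keep only the low 8 bits
    flags = {
        "sign": bool(result & 0x80),   # high-order bit of the result
        "zero": result == 0,
        "carry": total > 0xFF,         # carry out of the high-order bit
        # Signed overflow: both operands have the same sign bit,
        # but the result's sign bit differs from theirs.
        "overflow": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags


# 127 + 1 does not fit in a signed 8-bit value.
result_a, flags_a = add8(0x7F, 0x01)

# 255 + 1 wraps to zero with a carry out.
result_b, flags_b = add8(0xFF, 0x01)

# A conditional branch would then test a flag rather than recompute the comparison.
branch_taken = flags_b["zero"]
```

The flags are produced by the addition itself, so a later branch can test them
without a separate compare instruction.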
Many processors, including those based on the IA-64 architecture and the
MIPS processors, do not use condition codes at all. Rather, conditional branch
instructions specify a comparison to be made and act on the result of the comparison,
without storing a condition code. Table 14.1, based on [DERO87], lists key
advantages and disadvantages of condition codes.

Table 14.1 Condition Codes

Advantages
1. Because condition codes are set by normal arithmetic and data movement
instructions, they should reduce the number of COMPARE and TEST instructions needed.
2. Conditional instructions, such as BRANCH, are simplified relative to composite
instructions, such as TEST and BRANCH.
3. Condition codes facilitate multiway branches. For example, a TEST instruction
can be followed by two branches, one on less than or equal to zero and one on
greater than zero.
4. Condition codes can be saved on the stack during subroutine calls along with
other register information.

Disadvantages
1. Condition codes add complexity, both to the hardware and software. Condition
code bits are often modified in different ways by different instructions, making
life more difficult for both the microprogrammer and compiler writer.
2. Condition codes are irregular; they are typically not part of the main data
path, so they require extra hardware connections.
3. Often condition code machines must add special non-condition-code instructions
for special situations anyway, such as bit checking, loop control, and atomic
semaphore operations.
4. In a pipelined implementation, condition codes require special synchronization
to avoid conflicts.

In some machines, a subroutine call will result in the automatic saving of all
user-visible registers, to be restored on return. The processor performs the saving
and restoring as part of the execution of call and return instructions. This allows
each subroutine to use the user-visible registers independently. On other machines,
it is the responsibility of the programmer to save the contents of the relevant
user-visible registers prior to a subroutine call, by including instructions for this
purpose in the program.

Control and Status Registers


There are a variety of processor registers that are employed to control the operation
of the processor. Most of these, on most machines, are not visible to the user. Some
of them may be visible to machine instructions executed in a control or operating
system mode.
Of course, different machines will have different register organizations and
use different terminology. We give here a reasonably complete list of register types,
with a brief description.
Four registers are essential to instruction execution:
■■ Program counter (PC): Contains the address of an instruction to be fetched.
■■ Instruction register (IR): Contains the instruction most recently fetched.
■■ Memory address register (MAR): Contains the address of a location in
memory.
■■ Memory buffer register (MBR): Contains a word of data to be written to
memory or the word most recently read.
Not all processors have internal registers designated as MAR and MBR, but
some equivalent buffering mechanism is needed whereby the bits to be transferred
to the system bus are staged and the bits to be read from the data bus are temporarily
stored.
Typically, the processor updates the PC after each instruction fetch so that the
PC always points to the next instruction to be executed. A branch or skip instruction
will also modify the contents of the PC. The fetched instruction is loaded into
an IR, where the opcode and operand specifiers are analyzed. Data are exchanged
with memory using the MAR and MBR. In a bus-organized system, the MAR connects
directly to the address bus, and the MBR connects directly to the data bus.
User-visible registers, in turn, exchange data with the MBR.
The four registers just mentioned are used for the movement of data between
the processor and memory. Within the processor, data must be presented to the
ALU for processing. The ALU may have direct access to the MBR and user-visible
registers. Alternatively, there may be additional buffering registers at the boundary
to the ALU; these registers serve as input and output registers for the ALU and
exchange data with the MBR and user-visible registers.
Many processor designs include a register or set of registers, often known as
the program status word (PSW), that contains status information. The PSW typically
contains condition codes plus other status information. Common fields or flags
include the following:
■■ Sign: Contains the sign bit of the result of the last arithmetic operation.
■■ Zero: Set when the result is 0.
■■ Carry: Set if an operation resulted in a carry (addition) into or borrow
(subtraction) out of a high-order bit. Used for multiword arithmetic operations.
■■ Equal: Set if a logical compare result is equality.
■■ Overflow: Used to indicate arithmetic overflow.
■■ Interrupt Enable/Disable: Used to enable or disable interrupts.
■■ Supervisor: Indicates whether the processor is executing in supervisor or
user mode. Certain privileged instructions can be executed only in supervisor
mode, and certain areas of memory can be accessed only in supervisor mode.
A number of other registers related to status and control might be found in a
particular processor design. There may be a pointer to a block of memory containing
additional status information (e.g., process control blocks). In machines using
vectored interrupts, an interrupt vector register may be provided. If a stack is used
to implement certain functions (e.g., subroutine call), then a system stack pointer is
needed. A page table pointer is used with a virtual memory system. Finally, registers
may be used in the control of I/O operations.
A number of factors go into the design of the control and status register
organization. One key issue is operating system support. Certain types of control
information are of specific utility to the operating system. If the processor designer has
a functional understanding of the operating system to be used, then the register
organization can to some extent be tailored to the operating system.
Another key design decision is the allocation of control information between
registers and memory. It is common to dedicate the first (lowest) few hundred or
thousand words of memory for control purposes. The designer must decide how
much control information should be in registers and how much in memory. The
usual trade-off of cost versus speed arises.

Figure 14.3 Example Microprocessor Register Organizations
(a) MC68000
    Data registers: D0, D1, D2, D3, D4, D5, D6, D7
    Address registers: A0, A1, A2, A3, A4, A5, A6, A7, A7´
    Program status: Program counter, Status register
(b) 8086
    General registers: AX (Accumulator), BX (Base), CX (Count), DX (Data)
    Pointers and index: SP (Stack ptr), BP (Base ptr), SI (Source index), DI (Dest index)
    Segment: CS (Code), DS (Data), SS (Stack), ES (Extra)
    Program status: Flags, Instr ptr
(c) 80386–Pentium 4
    General registers: EAX (AX), EBX (BX), ECX (CX), EDX (DX), ESP (SP), EBP (BP),
    ESI (SI), EDI (DI)
    Program status: FLAGS register, Instruction pointer

Example Microprocessor Register Organizations


It is instructive to examine and compare the register organization of comparable
systems. In this section, we look at two 16-bit microprocessors that were designed at
about the same time: the Motorola MC68000 [STRI79] and the Intel 8086 [MORS78].
Figures 14.3a and b depict the register organization of each; purely internal registers,
such as a memory address register, are not shown.
The MC68000 partitions its 32-bit registers into eight data registers and nine
address registers. The eight data registers are used primarily for data manipulation
and are also used in addressing as index registers. The width of the registers allows
8-, 16-, and 32-bit data operations, determined by opcode. The address registers
contain 32-bit (no segmentation) addresses; two of these registers are also used as stack
pointers, one for users and one for the operating system, depending on the current
execution mode. Both registers are numbered 7, because only one can be used at a
time. The MC68000 also includes a 32-bit program counter and a 16-bit status register.
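The idea of two stack pointers sharing register number 7 can be sketched as
follows. This is a simplified model under stated assumptions: the class name
AddressRegisters and the boolean supervisor attribute are hypothetical, and the
sketch ignores how the execution mode is actually changed on the real machine.

```python
class AddressRegisters:
    """Toy model of eight address-register numbers backed by nine registers."""

    def __init__(self):
        self._regs = [0] * 7        # A0 through A6
        self._user_sp = 0           # register 7 as seen in user mode
        self._supervisor_sp = 0     # register 7 as seen in supervisor mode
        self.supervisor = False     # assumed stand-in for the execution mode

    def read(self, n):
        if not 0 <= n <= 7:
            raise ValueError("register number must be 0..7")
        if n < 7:
            return self._regs[n]
        # Register 7 selects one of two physical registers by mode.
        return self._supervisor_sp if self.supervisor else self._user_sp

    def write(self, n, value):
        if not 0 <= n <= 7:
            raise ValueError("register number must be 0..7")
        value &= 0xFFFFFFFF         # registers are 32 bits wide
        if n < 7:
            self._regs[n] = value
        elif self.supervisor:
            self._supervisor_sp = value
        else:
            self._user_sp = value


regs = AddressRegisters()
regs.write(7, 0x1000)               # user stack pointer
regs.supervisor = True
regs.write(7, 0x8000)               # operating system stack pointer
seen_in_supervisor = regs.read(7)
regs.supervisor = False
seen_in_user = regs.read(7)
```

Because only one of the two is reachable in a given mode, a single register
number suffices and no extra specifier bit is needed.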
The Motorola team wanted a very regular instruction set, with no special-purpose
registers. A concern for code efficiency led them to divide the registers into
two functional components, saving one bit on each register specifier. This seems a
reasonable compromise between complete generality and code compaction.
The Intel 8086 takes a different approach to register organization. Every
register is special purpose, although some registers are also usable as general
purpose. The 8086 contains four 16-bit data registers that are addressable on a byte
or 16-bit basis, and four 16-bit pointer and index registers. The data registers can
be used as general purpose in some instructions. In others, the registers are used
implicitly. For example, a multiply instruction always uses the accumulator. The
four pointer registers are also used implicitly in a number of operations; each
contains a segment offset. There are also four 16-bit segment registers. Three of
the four segment registers are used in a dedicated, implicit fashion, to point to
the segment of the current instruction (useful for branch instructions), a segment
containing data, and a segment containing a stack, respectively. These dedicated
and implicit uses provide for compact encoding at the cost of reduced flexibility.
The 8086 also includes an instruction pointer and a set of 1-bit status and control
flags.
The point of this comparison should be clear. There is no universally accepted
philosophy concerning the best way to organize processor registers [TOON81]. As
with overall instruction set design and so many other processor design issues, it is
still a matter of judgment and taste.
A second instructive point concerning register organization design is illustrated
in Figure 14.3c. This figure shows the user-visible register organization for
the Intel 80386 [ELAY85], which is a 32-bit microprocessor designed as an extension
of the 8086 (see footnote 1). The 80386 uses 32-bit registers. However, to provide upward
compatibility for programs written on the earlier machine, the 80386 retains the
original register organization embedded in the new organization. Given this design
constraint, the architects of the 32-bit processors had limited flexibility in designing
the register organization.
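One way to picture a 16-bit register embedded in a 32-bit one is as overlapping
views of the same storage. The Python sketch below is an assumption-laden
illustration: the class name Reg32 and the attribute names e, x, high, and low
are hypothetical, chosen to echo the EAX, AX, and byte-addressable naming
described above.

```python
class Reg32:
    """Toy 32-bit register with an embedded 16-bit view and two byte views."""

    def __init__(self, value=0):
        self.e = value & 0xFFFFFFFF          # full 32-bit contents

    @property
    def x(self):
        return self.e & 0xFFFF               # low 16 bits (the original register)

    @x.setter
    def x(self, value):
        # Writing the 16-bit view leaves the upper 16 bits untouched.
        self.e = (self.e & 0xFFFF0000) | (value & 0xFFFF)

    @property
    def high(self):
        return (self.e >> 8) & 0xFF          # upper byte of the 16-bit view

    @property
    def low(self):
        return self.e & 0xFF                 # lower byte of the 16-bit view


eax = Reg32(0x12345678)
ax_before = eax.x
eax.x = 0xABCD                               # code for the older machine writes 16 bits
full_after = eax.e
```

Programs that only know about the 16-bit view continue to work, which is the
compatibility constraint the paragraph above describes.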

14.3 INSTRUCTION CYCLE

In Section 3.2, we described the processor’s instruction cycle (Figure 3.9). To recall,
an instruction cycle includes the following stages:
■■ Fetch: Read the next instruction from memory into the processor.
■■ Execute: Interpret the opcode and perform the indicated operation.
■■ Interrupt: If interrupts are enabled and an interrupt has occurred, save the
current process state and service the interrupt.
We are now in a position to elaborate somewhat on the instruction cycle. First,
we must introduce one additional stage, known as the indirect cycle.

1. Because the MC68000 already uses 32-bit registers, the MC68020 [MACD84], which is a full 32-bit
architecture, uses the same register organization.

The Indirect Cycle


We have seen, in Chapter 13, that the execution of an instruction may involve one
or more operands in memory, each of which requires a memory access. Further, if
indirect addressing is used, then additional memory accesses are required.
We can think of the fetching of indirect addresses as one more instruction
stage. The result is shown in Figure 14.4. The main line of activity consists of
alternating instruction fetch and instruction execution activities. After an instruction is
fetched, it is examined to determine if any indirect addressing is involved. If so, the
required operands are fetched using indirect addressing. Following execution, an
interrupt may be processed before the next instruction fetch.
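The ordering of stages just described can be sketched as a small function that
lists the stages one instruction passes through. The function name
instruction_stages and its boolean parameters are assumptions made for
illustration; the sketch captures only the sequencing, not what each stage does.

```python
def instruction_stages(uses_indirect, interrupt_pending, interrupts_enabled=True):
    """Return the stages visited by one instruction, in order."""
    stages = ["fetch"]
    if uses_indirect:
        # Extra memory accesses to resolve indirect addresses.
        stages.append("indirect")
    stages.append("execute")
    if interrupts_enabled and interrupt_pending:
        # An interrupt is processed only after execution completes.
        stages.append("interrupt")
    return stages


plain = instruction_stages(uses_indirect=False, interrupt_pending=False)
full = instruction_stages(uses_indirect=True, interrupt_pending=True)
masked = instruction_stages(True, True, interrupts_enabled=False)
```

Fetch and execute always occur; the indirect and interrupt stages are visited
only when the instruction or the system state calls for them.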
Another way to view this process is shown in Figure 14.5, which is a revised
version of Figure 3.12. This illustrates more correctly the nature of the instruction
cycle. Once an instruction is fetched, its operand specifiers must be identified. Each
input operand in memory is then fetched, and this process may require indirect
addressing. Register-based operands need not be fetched. Once the opcode is executed,
a similar process may be needed to store the result in main memory.

Data Flow
The exact sequence of events during an instruction cycle depends on the design of
the processor. We can, however, indicate in general terms what must happen. Let us
assume a processor that employs a memory address register (MAR), a memory
buffer register (MBR), a program counter (PC), and an instruction register (IR).
During the fetch cycle, an instruction is read from memory. Figure 14.6 shows
the flow of data during this cycle. The PC contains the address of the next instruction
to be fetched. This address is moved to the MAR and placed on the address
bus. The control unit requests a memory read, and the result is placed on the data
bus and copied into the MBR and then moved to the IR. Meanwhile, the PC is
incremented by 1, preparatory for the next fetch.
Once the fetch cycle is over, the control unit examines the contents of the IR
to determine if it contains an operand specifier using indirect addressing. If so, an
indirect cycle is performed. As shown in Figure 14.7, this is a simple cycle. The
right-most N bits of the MBR, which contain the address reference, are transferred to
the MAR. Then the control unit requests a memory read, to get the desired address
of the operand into the MBR.

Figure 14.4 The Instruction Cycle (stages: fetch, indirect, execute, interrupt)

Figure 14.5 Instruction Cycle State Diagram (states: instruction address
calculation, instruction fetch, instruction operation decoding, operand address
calculation, operand fetch, data operation, operand address calculation, operand
store, interrupt check, interrupt)

Figure 14.6 Data Flow, Fetch Cycle (PC, MAR, IR, MBR, and control unit in the CPU;
memory; address, data, and control buses)

The fetch and indirect cycles are simple and predictable. The execute cycle
takes many forms; the form depends on which of the various machine instructions
is in the IR. This cycle may involve transferring data among registers, reading from
or writing to memory or I/O, and/or invoking the ALU.
Like the fetch and indirect cycles, the interrupt cycle is simple and predictable
(Figure 14.8). The current contents of the PC must be saved so that the processor
can resume normal activity after the interrupt. Thus, the contents of the PC are
transferred to the MBR to be written into memory. The address of the special memory
location reserved for this purpose is loaded into the MAR from the control unit. It might,
for example, be a stack pointer. The PC is loaded with the address of the interrupt
routine. As a result, the next instruction cycle will begin by fetching the appropriate
instruction.
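The register transfers in the fetch, indirect, and interrupt cycles can be traced
with a small simulation. This is a minimal sketch under stated assumptions: the
class name ToyCPU, the word-addressed list used as memory, the 8-bit address
field in the low-order bits of an instruction, and the addresses used in the
example are all hypothetical choices for illustration.

```python
class ToyCPU:
    """Toy model of the PC, MAR, MBR, and IR transfers described in the text."""

    def __init__(self, memory):
        self.memory = memory        # word-addressed memory (a list of ints)
        self.pc = 0
        self.mar = 0
        self.mbr = 0
        self.ir = 0

    def _memory_read(self):
        self.mbr = self.memory[self.mar]

    def _memory_write(self):
        self.memory[self.mar] = self.mbr

    def fetch_cycle(self):
        self.mar = self.pc          # address of the next instruction to the MAR
        self._memory_read()         # instruction arrives in the MBR
        self.pc += 1                # PC incremented, ready for the next fetch
        self.ir = self.mbr          # instruction moved to the IR

    def indirect_cycle(self, address_bits=8):
        # The right-most N bits of the MBR hold the address reference.
        self.mar = self.mbr & ((1 << address_bits) - 1)
        self._memory_read()         # MBR now holds the operand's address

    def interrupt_cycle(self, save_address, handler_address):
        self.mbr = self.pc          # PC contents to be written to memory
        self.mar = save_address     # location reserved for the saved PC
        self._memory_write()
        self.pc = handler_address   # next fetch starts the interrupt routine


memory = [0] * 32
memory[0] = 0x0105                  # assumed encoding: opcode 0x01, address field 5
memory[5] = 20                      # location 5 holds the address of the operand
memory[20] = 99                     # the operand itself

cpu = ToyCPU(memory)
cpu.fetch_cycle()
after_fetch = (cpu.pc, cpu.ir)
cpu.indirect_cycle()
operand_address = cpu.mbr
cpu.interrupt_cycle(save_address=31, handler_address=16)
```

After the interrupt cycle, the saved PC sits in memory at the reserved location
and the PC holds the address of the interrupt routine, so the next fetch cycle
begins there.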

Figure 14.7 Data Flow, Indirect Cycle (MAR, MBR, and control unit in the CPU;
memory; address, data, and control buses)

Figure 14.8 Data Flow, Interrupt Cycle (PC, MAR, MBR, and control unit in the CPU;
memory; address, data, and control buses)
