
ARM Processors for EIE Students

The document provides information about ARM processors including the RISC design philosophy, ARM design philosophy for embedded systems, embedded system hardware components like AMBA bus protocol, and embedded system software components like boot code and operating systems. It discusses key differences between RISC and CISC architectures and how ARM implements a hybrid approach to achieve benefits of both.

Uploaded by

yukthiprema

ARM PROCESSOR NOTES CODE: 21EI53 5TH SEM (EIE) 2023-24

ARM (Advanced RISC Machine) NOTES

TEACHER : G N SRIKANTH
CLASS : 5Th SEM
SUBJECT/CODE : ARM PROCESSOR 21EI53
DEPT : ELECTRONICS AND INSTRUMENTATION ENGINEERING
COLLEGE : R N S INSTITUTE OF TECHNOLOGY

Distinguish between RISC & CISC Processor w.r.t. architectural features

GN SRIKANTH, EIEDEPT, RNSIT PAGE 1



Module -1
ARM Embedded Systems [ PART-A ]
Introduction, RISC design philosophy, ARM design philosophy, Embedded system hardware
– AMBA bus protocol, ARM bus technology, Memory, Peripherals, Embedded system
software – Initialization (BOOT) code, Operating System, Applications.
MODULE-1 PART-A

ARM EMBEDDED SYSTEMS

The ARM processor core is a key component of many successful 32-bit embedded
systems. You probably own one yourself and may not even realize it! ARM cores are widely
used in mobile phones, handheld organizers, and a multitude of other everyday portable
consumer devices.
The ARM core is not a single core, but a whole family of designs sharing similar design
principles and a common instruction set.
For example, one of ARM’s most successful cores is the ARM7TDMI. It provides up to
120 Dhrystone MIPS and is known for its high code density and low power consumption,
making it ideal for mobile embedded devices.
(Dhrystone MIPS version 2.1 is a small benchmarking program.)

1.1 The RISC design philosophy


The ARM core uses a RISC (Reduced Instruction Set Computing) architecture.
RISC is a design philosophy aimed at delivering simple but powerful instructions that
execute within a single cycle at a high clock speed.
 The RISC philosophy concentrates on reducing the complexity of instructions performed
by the hardware because it is easier to provide greater flexibility and intelligence in
software (the compiler) than in hardware. CISC, by contrast, relies more on the hardware.
 As a result, a RISC design places greater demands on the compiler.
 Figure 1.1 illustrates these major differences.

The RISC philosophy is implemented with four major design rules:

1. Instructions—
 RISC processors have a reduced number of instruction classes. These classes provide
simple operations that can each execute in a single cycle.
 The compiler or programmer synthesizes complicated operations (for example, a divide
operation) by combining several simple instructions.
 Each instruction is a fixed length to allow the pipeline to fetch future instructions before
decoding the current instruction.
2. Pipelines—
 The processing of instructions is broken down into smaller units that can be executed in
parallel by pipelines.
 Ideally the pipeline advances by one step on each cycle for maximum throughput.
Instructions can be decoded in one pipeline stage.
3. Registers—
 RISC machines have a large general-purpose register set. Any register can
contain either data or an address.
 Registers act as the fast (but costly) local memory store for all data processing operations. In
contrast, CISC processors have dedicated registers for specific purposes.

4. Load-store architecture—
 The processor operates on data held in registers.
 Separate load and store instructions transfer data between the register bank and external
memory.
 Memory accesses are costly, so separating memory accesses from data processing
provides an advantage because you can use data items held in the register bank multiple
times without needing multiple memory accesses.

These design rules allow a RISC processor to be simpler, and thus the core can operate
at higher clock frequencies.
[However, note that ARM is neither pure RISC nor pure CISC (complex instruction
set computing).]

1.2 The ARM Design Philosophy


There are a number of physical features that have driven the ARM processor design.
 The ARM processor has been specifically designed to be small to reduce power
consumption and extend battery operation, which is essential for applications such as mobile
phones and personal digital assistants (PDAs).
 High code density is another major requirement since embedded systems have limited
memory due to cost and/or physical size restrictions. High code density is useful for
applications that have limited on-board memory, such as mobile phones and mass storage
devices.
 Embedded systems are price sensitive and use slow, low-cost memory devices. The
ability to use low-cost memory devices produces substantial savings.
 Another important requirement is to reduce the area of the die taken up by the embedded
processor. For a single-chip solution, the smaller the area used by the embedded
processor, the more available space for specialized peripherals. This in turn reduces
the cost of the design and manufacturing since fewer discrete chips are required for
the end product.
 ARM has incorporated hardware debug technology within the processor so that software
engineers can view what is happening while the processor is executing code. With greater
visibility, software engineers can resolve issues faster, which has a direct effect on the time to
market and reduces overall development costs.
 The ARM core is not a pure RISC architecture because of the constraints of its primary
application, the embedded system.

1.2.1 Instruction Set for Embedded Systems


The ARM instruction set differs from the pure RISC definition in several ways
that make the ARM instruction set suitable for embedded applications:
 Variable cycle execution for certain instructions—
Not every ARM instruction executes in a single cycle. For example, load-store-multiple
instructions vary in the number of execution cycles depending upon the number
of registers being transferred. The transfer can occur on sequential memory addresses,
which increases performance since sequential memory accesses are often faster than
random accesses. Code density is also improved since multiple register transfers are
common operations at the start and end of functions.
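
As a rough illustration of a load-multiple transfer, the Python sketch below models an LDMIA-style instruction: the listed registers are filled from ascending sequential word addresses, lowest-numbered register first. The function name ldmia and the dictionary used as memory are assumptions made for this sketch; they are not part of any ARM tool.

```python
def ldmia(memory, base, reg_list):
    """Model of an LDMIA-style load multiple.

    memory   -- dict mapping word-aligned byte addresses to 32-bit values
    base     -- starting address held in the base register
    reg_list -- register numbers to load (order in the list is irrelevant)
    Returns (dict of loaded registers, base address after the transfer).
    """
    regs = {}
    addr = base
    # The lowest-numbered register always receives the lowest address.
    for r in sorted(reg_list):
        regs[r] = memory[addr]
        addr += 4  # move to the next sequential word
    return regs, addr


memory = {0x1000: 11, 0x1004: 22, 0x1008: 33}
regs, new_base = ldmia(memory, 0x1000, [3, 1, 2])
print(regs, hex(new_base))
```

One 4-byte load-multiple instruction here does the work of three separate 4-byte single-register loads, which is where the code density gain comes from.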

 Inline barrel shifter leading to more complex instructions—


The inline barrel shifter is a hardware component that preprocesses one of the
input registers before it is used by an instruction. This expands the capability of many
instructions to improve core performance and code density.
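
A minimal sketch of what the barrel shifter buys, written as a Python model of an instruction of the form ADD Rd, Rn, Rm, LSL #shift. The helper names lsl and add_with_shift and the example addresses are assumptions chosen for illustration.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def lsl(value, amount):
    """Logical shift left of a 32-bit value."""
    return (value << amount) & MASK32


def add_with_shift(rn, rm, shift):
    """Model of ADD Rd, Rn, Rm, LSL #shift: Rd = Rn + (Rm << shift)."""
    return (rn + lsl(rm, shift)) & MASK32


# Address of element 5 in an array of 4-byte words: base + index * 4
base, index = 0x2000, 5
address = add_with_shift(base, index, 2)
print(hex(address))
```

Without the inline shifter, the multiply-by-four and the add would need two separate instructions; here they are folded into one.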

 Thumb 16-bit instruction set—


ARM enhanced the processor core by adding a second 16-bit instruction set
called Thumb that permits the ARM core to execute either 16- or 32-bit instructions. The
16-bit instructions improve code density by about 30% over 32-bit fixed-length
instructions.

 Conditional execution—
An instruction is only executed when a specific condition has been satisfied. This
feature improves performance and code density by reducing branch instructions.
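
The condition attached to an instruction is tested against the processor's condition flags (Negative, Zero, Carry, oVerflow). The Python sketch below models that test for the standard ARM condition mnemonics; the function name condition_passed and its argument layout are assumptions made for this example.

```python
def condition_passed(cond, n, z, c, v):
    """Return True if an instruction with condition `cond` would execute.

    n, z, c, v are the Negative, Zero, Carry and oVerflow flags (0 or 1).
    """
    checks = {
        "EQ": z == 1,              # equal
        "NE": z == 0,              # not equal
        "CS": c == 1,              # carry set
        "CC": c == 0,              # carry clear
        "MI": n == 1,              # negative
        "PL": n == 0,              # positive or zero
        "VS": v == 1,              # overflow
        "VC": v == 0,              # no overflow
        "HI": c == 1 and z == 0,   # unsigned higher
        "LS": c == 0 or z == 1,    # unsigned lower or same
        "GE": n == v,              # signed greater than or equal
        "LT": n != v,              # signed less than
        "GT": z == 0 and n == v,   # signed greater than
        "LE": z == 1 or n != v,    # signed less than or equal
        "AL": True,                # always
    }
    return checks[cond]


# After a compare of two equal values the Z flag is set, so an
# instruction suffixed EQ executes and one suffixed NE is skipped.
print(condition_passed("EQ", 0, 1, 1, 0), condition_passed("NE", 0, 1, 1, 0))
```

Because the skipped instruction simply does nothing, short if/else sequences can be written without any branch instructions.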

 Enhanced instructions—
The enhanced digital signal processor (DSP) instructions were added to the
standard ARM instruction set to support fast 16×16-bit multiplier operations and
saturation. These instructions allow a faster-performing ARM processor in some cases to
replace the traditional combinations of a processor plus a DSP.
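
Saturation means that a result which would overflow is clamped to the largest or smallest representable value instead of wrapping around. A small Python sketch of a saturating signed 32-bit add follows; the name saturating_add is an assumption for this example, and the model ignores any status flag the hardware may set on saturation.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def saturating_add(a, b):
    """Add two signed 32-bit values, clamping instead of wrapping around."""
    return max(INT32_MIN, min(INT32_MAX, a + b))


print(saturating_add(INT32_MAX, 1) == INT32_MAX)  # clamped at the top
print(saturating_add(100, 23))                    # ordinary add when in range
```

Clamping is usually what signal-processing code wants: a loud audio sample that overflows stays loud rather than flipping sign.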

These additional features have made the ARM processor one of the most commonly used 32-bit
embedded processor cores. Many of the top semiconductor companies around the world produce
products based around the ARM processor.

1.3 Embedded System Hardware

Embedded systems can control many different devices, from small sensors found on a
production line, to the real-time control systems used on a NASA space probe. All these devices
use a combination of software and hardware components. Each component is chosen for
efficiency and, if applicable, is designed for future extension and expansion.


Figure 1.2 shows a typical embedded device based on an ARM core. Each box represents
a feature or function. The lines connecting the boxes are the buses carrying data. We can
separate the device into four main hardware components:

 The ARM processor controls the embedded device. Different versions of the ARM
processor are available to suit the desired operating characteristics. An ARM processor
comprises a core (the execution engine that processes instructions and manipulates data)
plus the surrounding components that interface it with a bus. These components can
include memory management and caches.
 Controllers coordinate important functional blocks of the system. Two commonly found
controllers are interrupt and memory controllers.
 The peripherals provide all the input-output capability external to the chip and are
responsible for the uniqueness of the embedded device.
 A bus is used to communicate between different parts of the device.

1.3.1 ARM Bus Technology

Embedded systems use different bus technologies than those designed for x86 PCs. The most
common PC bus technology, the Peripheral Component Interconnect (PCI) bus, connects
such devices as video cards and hard disk controllers to the x86 processor bus. This type
of technology is external or off-chip (i.e., the bus is designed to connect mechanically and
electrically to devices external to the chip) and is built into the motherboard of a PC.

In contrast, embedded devices use an on-chip bus that is internal to the chip and that
allows different peripheral devices to be interconnected with an ARM core.
There are two different classes of devices attached to the bus. The ARM processor core is
a bus master—a logical device capable of initiating a data transfer with another device across
the same bus. Peripherals tend to be bus slaves—logical devices capable only of responding
to a transfer request from a bus master device.
A bus has two architecture levels. The first is a physical level that covers the electrical
characteristics and bus width (16, 32, or 64 bits). The second level deals with protocol—the
logical rules that govern the communication between the processor and a peripheral.

1.3.2 AMBA Bus Protocol (Advanced Microcontroller Bus Architecture)

The Advanced Microcontroller Bus Architecture (AMBA) has been widely adopted as the on-chip
bus architecture used for ARM processors.
 The first AMBA buses introduced were the ARM System Bus (ASB) and the ARM
Peripheral Bus (APB).
 Later ARM introduced another bus design, called the ARM High Performance Bus
(AHB).
 Using AMBA, peripheral designers can reuse the same design on multiple projects. Because
there are a large number of peripherals developed with an AMBA interface, hardware designers
have a wide choice of tested and proven peripherals for use in a device.
 AHB provides higher data throughput than ASB because it is based on a centralized
multiplexed bus scheme rather than the ASB bidirectional bus design. This change
allows the AHB bus to run at higher clock speeds and to be the first ARM bus to support
widths of 64 and 128 bits.

ARM has introduced two variations on the AHB bus: Multi-layer AHB and AHB-Lite. In
contrast to the original AHB, which allows only a single bus master to be active on the bus at any
time, the Multi-layer AHB bus allows multiple active bus masters. AHB-Lite is a subset of the AHB
bus and it is limited to a single bus master. This bus was developed for designs that do not
require the full features of the standard AHB bus.
AHB and Multi-layer AHB support the same protocol for master and slave but have
different interconnects. The new interconnects in Multi-layer AHB are good for systems with
multiple processors. They permit operations to occur in parallel and allow for higher throughput
rates.

The example device shown in Figure 1.2 has three buses: an AHB bus for the high-performance
peripherals, an APB bus for the slower peripherals, and a third bus for external peripherals,
proprietary to this device. This external bus requires a specialized bridge to connect with the
AHB bus.

1.3.3 Memory
An embedded system has to have some form of memory to store and execute code. You
have to compare
 price,
 performance,
 power consumption
when deciding upon specific memory characteristics, such as hierarchy, width, and type. If
memory has to run twice as fast to maintain a desired bandwidth, then the memory power
requirement may be higher.

1.3.3.1 Hierarchy
All computer systems have memory arranged in some form of hierarchy. Figure 1.2 shows
a device that supports external off-chip memory. Internal to the processor there is an option
of a cache (not shown in Figure 1.2) to improve memory performance.
Figure 1.3 shows the memory trade-offs: the fastest memory cache is physically located
nearer the ARM processor core and the slowest secondary memory is set further away.


The closer memory is to the processor core, the more it costs and the smaller its capacity. The
cache is placed between main memory and the core. It is used to speed up data transfer between
the processor and main memory. A cache provides an overall increase in performance but with a
loss of predictable execution time. Although the cache increases the general performance of the
system, it does not help real-time system response. Note that many small embedded systems do
not require the performance benefits of a cache.
The main memory is large—around 256 KB to 256 MB (or even greater), depending on the
application—and is generally stored in separate chips. Load and store instructions access the
main memory unless the values have been stored in the cache for fast access. Secondary storage
is the largest and slowest form of memory. Hard disk drives and CD-ROM drives are examples
of secondary storage. These days secondary storage may vary from 600 MB to 60 GB.

1.3.3.2 Width
The memory width is the number of bits the memory returns on each access—typically 8, 16, 32,
or 64 bits. The memory width has a direct effect on the overall performance and cost ratio.
If you have an uncached system using 32-bit ARM instructions and 16-bit-wide memory chips,
then the processor will have to make two memory fetches per instruction, because each 32-bit
instruction fetch requires two 16-bit loads. This obviously has the effect of reducing system
performance, but the benefit is that 16-bit memory is less expensive.
In contrast, if the core executes 16-bit Thumb instructions, it will achieve better performance
with a 16-bit memory. The higher performance is a result of the core making only a single fetch
to memory to load an instruction. Hence, using Thumb instructions with 16-bit-wide memory
devices provides both improved performance and reduced cost.
Table 1.1 summarizes theoretical cycle times on an ARM processor using different
memory width devices.
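
The reasoning above can be turned into a small calculation. The Python sketch below assumes an uncached system in which every memory access returns exactly one memory word, so the number of accesses per instruction fetch is the instruction size divided by the memory width, rounded up. The function name fetches_per_instruction is an assumption for this example.

```python
def fetches_per_instruction(instruction_bits, memory_width_bits):
    """Memory accesses needed to fetch one instruction (ceiling division)."""
    return -(-instruction_bits // memory_width_bits)


for name, bits in (("ARM 32-bit", 32), ("Thumb 16-bit", 16)):
    row = [fetches_per_instruction(bits, width) for width in (8, 16, 32)]
    print(name, row)  # accesses for 8-, 16- and 32-bit-wide memory
```

Under these assumptions a Thumb instruction needs a single access on 16-bit memory, while an ARM instruction needs two, which is the trade-off described in the text.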

Flash ROM: can be written to as well as read, but it is slow to write so you shouldn’t use it for
holding dynamic data. Its main use is for holding the device firmware or storing long term data
that needs to be preserved after power is off. The erasing and writing of flash ROM are
completely software controlled with no additional hardware circuitry required, which reduces the
manufacturing costs. Flash ROM has become the most popular of the read-only memory types
and is currently being used as an alternative for mass or secondary storage.


Dynamic random access memory (DRAM): is the most commonly used RAM for devices. It
has the lowest cost per megabyte compared with other types of RAM. DRAM is dynamic: its
storage cells need to be refreshed and given a new electronic charge every few milliseconds, so
you need to set up a DRAM controller before using the memory.

Static random access memory (SRAM): is faster than the more traditional DRAM, but requires
more silicon area. SRAM is static—the RAM does not require refreshing. The access time for
SRAM is considerably shorter than the equivalent DRAM because SRAM does not require a
pause between data accesses. Because of its higher cost, it is used mostly for smaller high-speed
tasks, such as fast memory and caches.

Synchronous dynamic random access memory (SDRAM): is one of many subcategories of
DRAM. It can run at much higher clock speeds than conventional memory. SDRAM
synchronizes itself with the processor bus because it is clocked. Internally the data is fetched
from memory cells, pipelined, and finally brought out on the bus in a burst. The old-style DRAM
is asynchronous, so does not burst as efficiently as SDRAM.

1.3.4 Peripherals

The interaction of embedded systems with the outside world is possible only with peripheral
devices.
A peripheral device performs input and output functions for the chip by connecting to other
devices or sensors that are off-chip.
 Each peripheral device usually performs a single function and may reside on-chip.
 Peripherals range from a simple serial communication device to a more complex
802.11 wireless device.
All ARM peripherals are memory mapped—the programming interface is a set of
memory-addressed registers. The address of these registers is an offset from a specific peripheral
base address.
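
A minimal Python sketch of the base-plus-offset idea. The peripheral, its base address, and its register offsets are all hypothetical values invented for this example, and a dictionary stands in for the address space.

```python
UART_BASE = 0x10000000  # hypothetical peripheral base address
UART_REGS = {"DATA": 0x00, "STATUS": 0x04, "CONTROL": 0x08}  # hypothetical offsets

bus = {}  # stands in for the address space: address -> 32-bit value


def register_address(name):
    """Address of a register = peripheral base address + register offset."""
    return UART_BASE + UART_REGS[name]


def write_reg(name, value):
    bus[register_address(name)] = value & 0xFFFFFFFF


def read_reg(name):
    return bus.get(register_address(name), 0)


write_reg("CONTROL", 0x1)  # e.g. set an enable bit
print(hex(register_address("STATUS")), read_reg("CONTROL"))
```

On real hardware the same pattern is an ordinary load or store to the computed address; no special I/O instructions are involved.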
Controllers are specialized peripherals that implement higher levels of functionality
within an embedded system.
Two important types of controllers are:
1. Memory controllers
2. Interrupt controllers.
1.3.4.1 Memory Controllers
Memory controllers connect different types of memory to the processor bus. On
power-up a memory controller is configured in hardware to allow certain memory devices to be
active. These memory devices allow the initialization code to be executed.
Some memory devices must be set up by software; for example, when using DRAM, you
first have to set up the memory timings and refresh rate before it can be accessed.
1.3.4.2 Interrupt Controllers
When a peripheral or device requires attention, it raises an interrupt to the processor.
An interrupt controller provides a programmable governing policy that allows software
to determine which peripheral or device can interrupt the processor at any specific time by
setting the appropriate bits in the interrupt controller registers.
There are two types of interrupt controller available for the ARM processor:
1. Standard interrupt controller
2. Vector interrupt controller (VIC).
The standard interrupt controller sends an interrupt signal to the processor core when an
external device requests servicing. It can be programmed to ignore or mask an individual device
or set of devices. The interrupt handler determines which device requires servicing by reading a
device bitmap register in the interrupt controller.
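
The behaviour of a standard interrupt controller can be sketched as two bitmaps, one for enabled devices and one for raised requests. This is a simplified Python model; the class name, method names, and device numbers are assumptions and do not correspond to any particular controller's register layout.

```python
class StandardInterruptController:
    def __init__(self):
        self.enable_mask = 0  # bit i set: device i is allowed to interrupt
        self.raw_status = 0   # bit i set: device i is requesting service

    def enable(self, device):
        self.enable_mask |= 1 << device

    def disable(self, device):
        self.enable_mask &= ~(1 << device)

    def raise_request(self, device):
        self.raw_status |= 1 << device

    def pending(self):
        """Device bitmap read by the handler: requests that are not masked."""
        return self.raw_status & self.enable_mask

    def irq_asserted(self):
        """True when the controller would signal the processor core."""
        return self.pending() != 0


ic = StandardInterruptController()
ic.enable(2)
ic.raise_request(2)  # enabled device: reaches the core
ic.raise_request(5)  # masked device: ignored
print(bin(ic.pending()), ic.irq_asserted())
```

The handler still has to inspect the bitmap to find out which device to service, which is the step a VIC removes.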

The VIC is more powerful than the standard interrupt controller because it prioritizes
interrupts and simplifies the determination of which device caused the interrupt. After
associating a priority and a handler address with each interrupt, the VIC only asserts an interrupt
signal to the core if the priority of a new interrupt is higher than the currently executing interrupt
handler. Depending on its type, the VIC will either call the standard interrupt exception handler,
which can load the address of the handler for the device from the VIC, or cause the core to jump
to the handler for the device directly.
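
The priority rule can be sketched in Python as follows. This is a simplified model under stated assumptions: a larger number means a higher priority, a request that is not taken is simply dropped rather than held pending, and handlers do not nest. The class name, method names, and handler addresses are hypothetical.

```python
class VectoredInterruptController:
    def __init__(self):
        self.sources = {}             # source id -> (priority, handler address)
        self.current_priority = None  # priority of the handler now running

    def register(self, source, priority, handler_address):
        self.sources[source] = (priority, handler_address)

    def request(self, source):
        """Return the handler address if the core is interrupted, else None."""
        priority, handler = self.sources[source]
        if self.current_priority is None or priority > self.current_priority:
            self.current_priority = priority
            return handler
        return None

    def complete(self):
        """Called when the running handler finishes."""
        self.current_priority = None


vic = VectoredInterruptController()
vic.register(source=0, priority=1, handler_address=0x8000)  # e.g. a timer
vic.register(source=1, priority=5, handler_address=0x9000)  # e.g. a UART
print(vic.request(0), vic.request(1), vic.request(0))
```

The second request preempts the first because its priority is higher; the third is not signalled because a higher-priority handler is already running.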

1.4 Embedded System Software


An embedded system needs software to drive it.
Figure 1.4 shows four typical software components required to control an embedded device.
 Each software component in the stack uses a higher level of abstraction to separate the
code from the hardware device.
 The initialization code is the first code executed on the board and is specific to a
particular target or group of targets. It sets up the minimum parts of the board before
handing control over to the operating system.

 The operating system provides an infrastructure to control applications and manage
hardware system resources. Many embedded systems do not require a full operating
system; a simple task scheduler that is either event or poll driven is sufficient.
 The device drivers are the third component shown in Figure 1.4. They provide
a software interface to the peripherals on the hardware device.
 An application performs one of the tasks required for a device. For example, a mobile
phone might have a diary application. There may be multiple
applications running on the same device, controlled by the operating system.

[The software components can run from ROM or RAM. ROM code that is fixed on the device
(for example, the initialization code) is called firmware.]

1.4.1 Initialization (Boot) Code:

Initialization code (or boot code) takes the processor from the reset state to a state where
the operating system can run.
It
1) Configures the memory controller
2) Configures the processor caches
3) Initializes some devices


The initialization code handles a number of administrative tasks prior to handing control
over to an operating system image. We can group these different tasks into three phases:
1) Initial hardware configuration,
2) Diagnostics,
3) Booting.
Initial hardware configuration involves setting up the target platform so it can boot
an image (a piece of software). Sometimes the standard configuration must be modified
to satisfy the requirements of the booted image.
For example, the memory system normally requires reorganization of the memory map.

Diagnostics are often embedded in the initialization code.
 Diagnostic code tests the system by exercising the hardware target to check if the target
is in working order.
 It also tracks down standard system-related issues.
 The primary purpose of diagnostic code is fault identification and isolation.

Booting an image is the final phase, but first you must load the image (s/w).
 Loading an image involves anything from copying an entire program, including code and
data, into RAM, to just copying a data area containing volatile variables into RAM.
 Once booted, the system hands over control by modifying the program counter to point
to the start of the image.

Example: Initializing or organizing memory is an important part of the initialization code because
many operating systems expect a known memory layout before they can start.

Figure 1.5 shows memory before and after reorganization. It is common for ARM-based
embedded systems to provide for memory remapping because it allows the system to start the
initialization code from ROM at power-up. The initialization code then redefines or remaps the
memory map to place RAM at address 0x00000000—an important step because then the
exception vector table can be in RAM and thus can be reprogrammed.
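
The effect of remapping can be sketched as two memory maps and an address lookup. The region size and the base addresses in this Python model are hypothetical values chosen for illustration, not those of any real device or of Figure 1.5.

```python
REGION_SIZE = 0x10000  # hypothetical 64 KB regions

# (start address, device) pairs; all addresses are hypothetical.
MAP_AT_RESET = [(0x00000000, "ROM"), (0x10000000, "RAM")]
MAP_AFTER_REMAP = [(0x00000000, "RAM"), (0x10000000, "ROM")]


def device_at(address, memory_map):
    """Return the device that responds at `address`, or None if unmapped."""
    for start, device in memory_map:
        if start <= address < start + REGION_SIZE:
            return device
    return None


VECTOR_TABLE = 0x00000000  # the exception vector table starts at address 0
print(device_at(VECTOR_TABLE, MAP_AT_RESET),
      device_at(VECTOR_TABLE, MAP_AFTER_REMAP))
```

At reset the vector table address falls in ROM, so the core can start executing the initialization code; after the remap the same address falls in RAM, so the vectors can be rewritten.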

1.4.2 Operating System:

First the initialization process prepares the hardware for an operating system to take control.

 An operating system organizes the system resources: the peripherals, memory, and
processing time. With an operating system controlling these resources, they can be used
efficiently by different applications running within the operating system environment.
 ARM processors support over 50 operating systems. These can be divided into two main
categories:
1) Real-time operating systems (RTOSs)
2) Platform operating systems.


Real-time operating system:
 Provides guaranteed response times to events.
 A hard real-time application requires a guaranteed response to work at all.
 A soft real-time application requires a good response time, but the performance
degrades more gracefully if the response time overruns.
 Systems running an RTOS generally do not have secondary storage.

Platform operating systems:
 Require a memory management unit to manage large, non-real-time
applications and tend to have secondary storage.

ARM has developed a set of processor cores that specifically target each category.

1.4.3 Applications

The operating system schedules applications, giving each one a slice of CPU time. An
application is code dedicated to handling a particular task: the application implements a
processing task, while the operating system controls the environment.
An embedded system can have one active application or several applications running
simultaneously.
ARM processors are found in numerous market segments, including:
 Networking
 Automotive
 Mobile and consumer devices
 Mass storage
 Imaging
Within each segment ARM processors can be found in multiple applications. For example:
 In networking, the ARM processor is found in home gateways, DSL modems for
high-speed Internet communication, and 802.11 wireless communication.
 The mobile device segment is the largest application area for ARM processors
because of mobile phones.
 ARM processors are also found in mass storage devices such as hard drives and in
imaging products such as inkjet printers, applications that are cost sensitive and
high volume.

1.5 Summary:

Pure RISC is aimed at high performance, but ARM uses a modified RISC design philosophy that also targets good
code density and low power consumption.
An embedded system consists of a processor core surrounded by caches, memory, and peripherals.
The system is controlled by operating system software that manages application tasks.
The key points in a RISC design philosophy are to improve performance by reducing
the complexity of instructions, to speed up instruction processing by using a pipeline, to
provide a large register set to store data near the core, and to use a load-store architecture.
The ARM design philosophy also incorporates some non-RISC ideas:
 It allows variable cycle execution on certain instructions to save power, area, and code size.
 It adds a barrel shifter to expand the capability of certain instructions.
 It uses the Thumb 16-bit instruction set to improve code density.
 It improves code density and performance by conditionally executing instructions.
 It includes enhanced instructions to perform DSP type functions.

An embedded system includes the following hardware components:
 ARM processors are found embedded in chips.
 Programmers access peripherals through memory-mapped registers.
 There is a special type of peripheral called a controller, which embedded systems use to configure
higher-level functions such as memory and interrupts.
 The AMBA on-chip bus is used to connect the processor and peripherals together.

An embedded system also includes the following software components:
 Initialization code configures the hardware to a known state. Once configured,
operating systems can be loaded and executed.
 Operating systems provide a common programming environment for the use of hardware
resources and infrastructure.
 Device drivers provide a standard interface to peripherals.
 An application performs the task-specific duties of an embedded system.

[ EXTRA INFORMATION]

CISC and RISC


CISC stands for Complex Instruction Set Computer: a computer whose instruction set contains a large number of instructions, many of them complex.
In the early 1980s, computer designers recommended that computers should use fewer instructions with simple
constructs so that they can be executed much faster within the CPU without having to use memory. Such
computers are classified as Reduced Instruction Set Computer or RISC.


ARM Processor Fundamentals

Module-1 [ PART-B ]
ARM core dataflow model, registers, current program status register, Pipeline, Exceptions,
Interrupts and Vector Table, Core extensions.

ARM Processor Fundamentals:


This part of Module 1 focuses on the processor itself. It will
 Give an overview of the processor core.
 Describe how data moves between its different parts.
 Describe the programmer’s model from a software developer’s view of the ARM
processor.
 Show the functions of the processor core and how its different parts interact.
 Show some core extensions that form an ARM processor. Core extensions speed up and
organize main memory as well as extend the instruction set.
 Cover the revisions to the ARM core architecture by describing the ARM core naming
conventions used to identify them and the chronological changes to the ARM instruction
set architecture.
 Introduce the architecture implementations by subdividing them into specific
ARM processor core families.

A programmer can think of an ARM core as functional units connected by data buses, as shown
in Figure 2.1, where, the arrows represent the flow of data, the lines represent the buses, and the
boxes represent either an operation unit or a storage area. The figure shows not only the flow of
data but also the abstract components that make up an ARM core.


 Data enters the processor core through the Data bus.
 The data may be an instruction to execute or a data item.
 Figure 2.1 shows a Von Neumann implementation of the ARM, in which data items and
instructions share the same bus. In contrast, Harvard implementations of the ARM use
two different buses.
 The instruction decoder translates instructions before they are executed.
 Each instruction executed belongs to a particular instruction set.
 The ARM processor, like all RISC processors, uses a load-store architecture. This
means it has two instruction types for transferring data in and out of the processor:
load instructions copy data from memory to registers in the core, and conversely the
store instructions copy data from registers to memory.
There are no data processing instructions that directly manipulate data in memory. Thus,
data processing is carried out solely in registers.

Data items are placed in the register file—a storage bank made up of 32-bit registers. Since the
ARM core is a 32-bit processor, most instructions treat the registers as holding signed or
unsigned 32-bit values. The sign extend hardware converts signed 8-bit and 16-bit numbers to
32-bit values as they are read from memory and placed in a register. ARM instructions typically
have two source registers, Rn and Rm, and a single result or destination register, Rd. Source
operands are read from the register file using the internal buses A and B, respectively.
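
What the sign extend hardware does can be sketched in a few lines of Python. The function name sign_extend is hypothetical, and the sketch only models the arithmetic, not the hardware. On ARM this conversion happens for signed byte and halfword loads (LDRSB and LDRSH).

```python
def sign_extend(value, bits):
    """Extend a signed `bits`-wide value to a 32-bit register pattern."""
    value &= (1 << bits) - 1                  # keep only the low `bits` bits
    sign_bit = 1 << (bits - 1)                # position of the sign bit
    signed = (value ^ sign_bit) - sign_bit    # interpret as a signed number
    return signed & 0xFFFFFFFF                # 32-bit two's complement pattern

print(hex(sign_extend(0xFF, 8)))      # prints 0xffffffff  (8-bit -1)
print(hex(sign_extend(0x7F, 8)))      # prints 0x7f        (8-bit +127)
print(hex(sign_extend(0x8000, 16)))   # prints 0xffff8000  (16-bit -32768)
```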

The ALU (arithmetic logic unit) or MAC (multiply-accumulate unit) takes the register
values Rn and Rm from the A and B buses and computes a result. Data processing instructions
write the result in Rd directly to the register file. Load and store instructions use the ALU to
generate an address to be held in the address register and broadcast on the Address bus.

One important feature of the ARM is that register Rm can alternatively be preprocessed
in the barrel shifter before it enters the ALU. Together, the barrel shifter and ALU can calculate a
wide range of expressions and addresses. After passing through the functional units, the result in
Rd is written back to the register file using the Result bus. For load and store instructions the
incrementor updates the address register before the core reads or writes the next register value
from or to the next sequential memory location. The processor continues executing instructions
until an exception or interrupt changes the normal execution flow.
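
The cooperation of the barrel shifter, ALU, and incrementor can be sketched in Python as follows. This is a simplified model under stated assumptions: the function names lsl, add_shifted, and load_multiple are hypothetical, only a logical shift left is modeled, and memory is a dictionary keyed by byte address.

```python
MASK32 = 0xFFFFFFFF

def lsl(value, amount):
    """Barrel shifter: logical shift left, result kept to 32 bits."""
    return (value << amount) & MASK32

def add_shifted(rn, rm, shift):
    """ALU add with Rm preprocessed by the shifter.
    Models: ADD Rd, Rn, Rm, LSL #shift"""
    return (rn + lsl(rm, shift)) & MASK32

def load_multiple(memory, base, count):
    """Read `count` words from sequential locations; the incrementor
    steps the address by 4 bytes after each transfer."""
    values = []
    address = base
    for _ in range(count):
        values.append(memory[address])
        address += 4
    return values

# Address of element 3 in an array of 32-bit words starting at 0x1000
print(hex(add_shifted(0x1000, 3, 2)))   # prints 0x100c
print(load_multiple({0x1000: 10, 0x1004: 20, 0x1008: 30}, 0x1000, 3))  # prints [10, 20, 30]
```

Shifting the index left by 2 multiplies it by 4, so a single instruction computes base + index * 4.
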
The sections that follow look at the key components of the processor: the registers, the
current program status register (cpsr), and the pipeline.
2.1 Registers

General-purpose registers hold either data or an address.
Registers are identified with the letter r prefixed to the register number.
For example, register 4 is given the label r4.
Figure 2.2 shows the active registers available in user mode, a protected mode normally
used when executing applications. The processor can operate in seven different modes.
 All the registers shown are 32 bits in size.
 There are up to 18 active registers:
 16 data registers
 2 processor status registers.
The data registers are visible to the programmer as r0 to r15.
The ARM processor has three registers assigned to a particular task or special function:
r13, r14, and r15. They are frequently given different labels to differentiate them from the
other registers.

 Register r13 is traditionally used as the stack pointer (sp) and stores the head of the
stack in the current processor mode.
 Register r14 is called the link register (lr) and is where the core puts the return
address whenever it calls a subroutine.

 Register r15 is the program counter (pc) and contains the address of the next
instruction to be fetched by the processor.
 Depending upon the context, registers r13 and r14 can also be used as general-purpose
registers, which can be particularly useful since these registers are banked during a
processor mode change.
 In ARM state the registers r0 to r13 are orthogonal: any instruction that you can
apply to r0 you can equally well apply to any of the other registers.
 However, there are instructions that treat r14 and r15 in a special way.
 In addition to the 16 data registers, there are two program status registers: cpsr and
spsr (the current and saved program status registers, respectively).
 The register file contains all the registers available to a programmer. Which
registers are visible to the programmer depends upon the current mode of the
processor.
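
The roles of r14 (lr) and r15 (pc) during a subroutine call can be sketched in Python. This is a simplified model, not the behavior of any specific core: the helper names branch_with_link and return_from_subroutine are hypothetical, the address of the calling instruction is passed in explicitly, and pipeline effects on the value read from pc are ignored.

```python
SP, LR, PC = 13, 14, 15   # conventional aliases for r13, r14, r15

def branch_with_link(regs, bl_address, target):
    """Model a BL instruction located at `bl_address`."""
    regs[LR] = bl_address + 4   # return address: the instruction after BL
                                # (ARM-state instructions are 4 bytes long)
    regs[PC] = target           # continue execution at the subroutine

def return_from_subroutine(regs):
    """Model the usual return: copy lr back into pc (MOV pc, lr)."""
    regs[PC] = regs[LR]

regs = [0] * 16
branch_with_link(regs, 0x8000, 0x9000)
print(hex(regs[LR]), hex(regs[PC]))   # prints 0x8004 0x9000
return_from_subroutine(regs)
print(hex(regs[PC]))                  # prints 0x8004
```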

2.2 Current Program Status Register

 The ARM core uses the cpsr to monitor and control internal operations.
 The cpsr is a dedicated 32-bit register and resides in the register file.
 Figure 2.3 shows the basic layout of a generic program status register.
 The cpsr is divided into four fields, each 8 bits wide: flags, status, extension, and control.
 In current designs the extension and status fields are reserved for future use.
 The control field contains the processor mode, state, and interrupt mask bits.
 The flags field contains the condition flags.
 Some ARM processor cores have extra bits allocated. For example, the J bit, which can
be found in the flags field, is only available on Jazelle-enabled processors.
Figure 2.3 A Generic Program status register (psr)
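
The four 8-bit fields can be separated with shifts and masks, as in the Python sketch below. The function name decode_psr and the sample value are assumptions made for the example. The individual bit positions used (N, Z, C, V in bits 31 to 28; I, F, T in bits 7 to 5; mode in bits 4 to 0) follow the standard ARM psr layout and should be checked against Figure 2.3.

```python
def decode_psr(psr):
    """Split a 32-bit program status register value into its fields."""
    return {
        "flags":     (psr >> 24) & 0xFF,   # bits 31..24
        "status":    (psr >> 16) & 0xFF,   # bits 23..16
        "extension": (psr >> 8) & 0xFF,    # bits 15..8
        "control":   psr & 0xFF,           # bits 7..0
        "N": (psr >> 31) & 1,              # condition flags
        "Z": (psr >> 30) & 1,
        "C": (psr >> 29) & 1,
        "V": (psr >> 28) & 1,
        "I": (psr >> 7) & 1,               # interrupt mask bits
        "F": (psr >> 6) & 1,
        "T": (psr >> 5) & 1,               # state bit
        "mode": psr & 0x1F,                # processor mode, bits 4..0
    }

fields = decode_psr(0x600000D3)            # sample value chosen for illustration
print(hex(fields["flags"]), hex(fields["control"]), bin(fields["mode"]))
# prints 0x60 0xd3 0b10011
```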

Processor Modes:

 The processor mode determines which registers are active and the access rights to the
cpsr register itself.
Each processor mode is either privileged or nonprivileged:
 A privileged mode allows full read-write access to the cpsr. Conversely, a nonprivileged
mode only allows read access to the control field in the cpsr but still allows read-write
access to the condition flags.
There are seven processor modes in total: six privileged modes (abort, fast interrupt
request, interrupt request, supervisor, system, and undefined) and one nonprivileged mode
(user).
 The processor enters abort mode when there is a failed attempt to access memory.
 Fast interrupt request and interrupt request modes correspond to the two interrupt levels
available on the ARM processor.
 Supervisor mode is the mode that the processor is in after reset and is generally the mode
that an operating system kernel operates in.
 System mode is a special version of user mode that allows full read-write access to the
cpsr.
 Undefined mode is used when the processor encounters an instruction that is undefined or
not supported by the implementation.
 User mode is used for programs and applications.

Table 2.1 Processor mode.


Mode                    Abbreviation  Privileged  Mode[4:0]
Abort                   abt           yes         10111
Fast interrupt request  fiq           yes         10001
Interrupt request       irq           yes         10010
Supervisor              svc           yes         10011
System                  sys           yes         11111
Undefined               und           yes         11011
User                    usr           no          10000
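
Table 2.1 can be written as a lookup table in Python. The names MODES and decode_mode are hypothetical. The bit patterns and privilege column are copied from the table.

```python
# Mode[4:0] pattern -> (abbreviation, privileged?)
MODES = {
    0b10111: ("abt", True),
    0b10001: ("fiq", True),
    0b10010: ("irq", True),
    0b10011: ("svc", True),
    0b11111: ("sys", True),
    0b11011: ("und", True),
    0b10000: ("usr", False),
}

def decode_mode(cpsr):
    """Return (abbreviation, privileged) for the mode bits of a cpsr value,
    or None if the pattern is not one of the seven modes in Table 2.1."""
    return MODES.get(cpsr & 0x1F)

print(decode_mode(0xD3))   # prints ('svc', True)
print(decode_mode(0x10))   # prints ('usr', False)
```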

Another important feature to note is that the cpsr is not copied into the spsr when a mode change is forced due to a
program writing directly to the cpsr. The saving of the cpsr only occurs when an exception or interrupt is raised.
Figure 2.3 shows that the current active processor mode occupies the five least significant bits of the cpsr. When
power is applied to the core, it starts in supervisor mode, which is privileged. Starting in a privileged mode is
useful since initialization code can use full access to the cpsr to set up the stacks for each of the other modes.
Table 2.1 lists the various modes and the associated binary patterns. The last column of the table gives the bit
patterns that represent each of the processor modes in the cpsr.
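
The difference between a mode change forced by software and one caused by an exception can be sketched as a small Python model. It is deliberately simplified: the class ToyCore and its method names are hypothetical, the starting cpsr value 0xD3 (supervisor mode with both interrupt mask bits set) is an assumption, and a real core also changes the pc, lr, and interrupt mask bits on exception entry.

```python
class ToyCore:
    MODE_MASK = 0x1F

    def __init__(self):
        self.cpsr = 0xD3    # assumed start value: supervisor mode (10011)
        self.spsr = {}      # saved psr, one entry per exception mode

    def write_mode(self, mode):
        """Program writes the mode bits of the cpsr directly: no save."""
        self.cpsr = (self.cpsr & ~self.MODE_MASK) | mode

    def raise_exception(self, mode):
        """Exception or interrupt: cpsr is copied to the spsr of the new mode."""
        self.spsr[mode] = self.cpsr
        self.cpsr = (self.cpsr & ~self.MODE_MASK) | mode

core = ToyCore()
core.write_mode(0b11111)          # switch to system mode by writing the cpsr
print(core.spsr)                  # prints {}  (nothing was saved)
core.raise_exception(0b10010)     # an interrupt request arrives
print(hex(core.spsr[0b10010]))    # prints 0xdf  (cpsr as it was in system mode)
```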
