ARM
Advanced RISC Machines
Introduction and Architecture
ARM Applications
• Apple iPod Nano
• Ford Sync In-Car Communications & Entertainment System
• Nokia N93
• Sony PlayStation 3 (60GB)
ARM Partnership
ARM Advantage
Why ARM here?
• ARM is one of the most widely licensed, and thus most widespread, processor cores in the world
• Used especially in portable devices because of its low power consumption and reasonable performance
• Several interesting extensions are available, such as the Thumb instruction set and the Jazelle Java machine
ARM History
• ARM – Acorn RISC Machine (1983–1985)
– Acorn Computers Limited, Cambridge, England
• ARM – Advanced RISC Machine (1990)
– ARM Limited, 1990
– ARM has been licensed to many semiconductor manufacturers
ARM History
• Key component of many 32-bit embedded systems
• Portable consumer devices
• ARM1 prototype in 1985
• One of ARM’s most successful cores is the ARM7TDMI, which provides high code density and low power consumption
Advanced RISC Machines
• ARM Core uses a ________
architecture
• ARM is a physical hardware design company.
• ARM licenses its cores out, and other companies make controllers based on those cores.
Companies licensing with ARM
• 3Com                    • Motorola
• Agilent Technologies    • Panasonic
• Altera                  • Qualcomm
• Epson                   • Sharp
• Freescale               • Sanyo
• Fujitsu                 • Sun Microsystems
• NEC                     • Sony
• Nokia                   • Symbian
• Intel                   • Texas Instruments
• IBM                     • Toshiba
• Microsoft               • Wipro
… and many more
Day 1
RISC Advantages
• A Smaller Die Size
• A Shorter Development Time
• Higher Performance (a bit tricky)
– Smaller things have higher natural
frequencies
RISC Disadvantages
• Generally poor code density (fixed-length instructions)
CISC vs. RISC
• Both follow the same path: Compiler, Code Generation, Processor
• CISC: greater complexity in the processor
• RISC: greater complexity in the compiler
Features used from RISC
• A Load/Store Architecture
• Fixed Length 32-bit Instructions
• 3-Address Instruction Formats
3-Address Instruction Format
f bits      n bits        n bits        n bits
Function    op 1 addr.    op 2 addr.    dest. addr.
Example
ADD d, s1, s2 ; d = s1 + s2
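As a rough illustration of the field layout above, the sketch below packs a function code and three addresses into a single word and unpacks it again. The field widths (f = 8, n = 8), the opcode value for ADD, and the helper names `encode` and `decode` are assumptions made for this example; this is not the real ARM instruction encoding.

```python
# Hypothetical field widths: f = 8 function bits, n = 8 bits per address.
F_BITS = 8
N_BITS = 8

OPCODES = {"ADD": 0x01}  # illustrative opcode value, not a real ARM encoding


def encode(function, op1, op2, dest):
    """Pack function | op1 addr | op2 addr | dest addr into one integer."""
    word = OPCODES[function]
    if not 0 <= word < (1 << F_BITS):
        raise ValueError("function code does not fit in f bits")
    for addr in (op1, op2, dest):
        if not 0 <= addr < (1 << N_BITS):
            raise ValueError("address does not fit in n bits")
        word = (word << N_BITS) | addr
    return word


def decode(word):
    """Unpack a word back into (function code, op1, op2, dest)."""
    mask = (1 << N_BITS) - 1
    dest = word & mask
    op2 = (word >> N_BITS) & mask
    op1 = (word >> (2 * N_BITS)) & mask
    function = word >> (3 * N_BITS)
    return function, op1, op2, dest


# ADD d, s1, s2 with s1 = register 1, s2 = register 2, d = register 0
word = encode("ADD", op1=1, op2=2, dest=0)
print(hex(word))
print(decode(word))
```

The point of the layout is that all three operands are named explicitly, so the destination does not have to overwrite one of the sources.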
Features Rejected from Berkeley RISC
• Single-cycle execution of ALL instructions
– Single memory for instructions and data
– Even a simple load/store requires at least two cycles
– Separate data and instruction memories were the solution, but were too costly at the time
ARM Design Policy
• Reduce power consumption
• High code density
• Price sensitive
• Reduce the area of the die taken up
by the embedded processor
• ARM incorporated hardware debug technology
Pipeline
• A pipeline is the mechanism a RISC processor uses to execute instructions
• Using a pipeline speeds up execution by fetching the next instruction while other instructions are being decoded and executed
ARM7 Three stage pipeline
Fetch Decode Execute
• Fetch loads an instruction from
memory
• Decode identifies the instruction to be
executed
• Execute processes the instruction and
writes the result back to a register
Pipelined instruction sequence
Time       Fetch    Decode    Execute
Cycle 1    ADD
Cycle 2    SUB      ADD
Cycle 3    CMP      SUB       ADD
• Once the pipeline is filled, the core can execute an instruction every cycle
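The filling behaviour above can be sketched with a small simulation. This is a minimal model that assumes every instruction spends exactly one cycle in each stage; the function name `pipeline_trace` is made up for the example.

```python
STAGES = ("Fetch", "Decode", "Execute")


def pipeline_trace(instructions):
    """Return, per cycle, the instruction held in each stage (None if empty)."""
    trace = []
    total_cycles = len(instructions) + len(STAGES) - 1
    for cycle in range(total_cycles):
        row = []
        for stage_index in range(len(STAGES)):
            # The instruction in this stage entered Fetch stage_index cycles ago.
            i = cycle - stage_index
            row.append(instructions[i] if 0 <= i < len(instructions) else None)
        trace.append(tuple(row))
    return trace


for n, row in enumerate(pipeline_trace(["ADD", "SUB", "CMP"]), start=1):
    cells = "  ".join(f"{name or '-':<7}" for name in row)
    print(f"Cycle {n}: {cells}")
```

From cycle 3 onward the Execute stage is occupied on every cycle for as long as new instructions keep arriving, which is where the one-instruction-per-cycle rate comes from.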
ARM9 Five stage pipeline
Fetch Decode Execute Memory Write
• Higher operating frequency gives higher performance
• Latency increases
• Instruction throughput increases by around 13% with the 5-stage pipeline, compared with ARM7
• 1.1 Dhrystone MIPS per MHz
ARM9 Five stage pipeline
• Fetch
– The instruction is fetched from memory and placed in the
instruction pipeline
• Decode
– The instruction is decoded and register operands read from the
register file
• Execute
– An operand is shifted and the ALU result generated
• Memory (Buffer/Data)
– Data memory is accessed if required. Otherwise the ALU result
is buffered for one clock cycle to give the same pipeline flow
for all instructions
• Write (Write-Back)
– The results generated by the instruction are written back to
the register file, including any data loaded from memory
ARM10 Six stage pipeline
Fetch Issue Decode Execute Memory Write
• Instruction throughput increases by around 34% with the 6-stage pipeline, compared with ARM7
• 1.3 Dhrystone MIPS per MHz
• Code written for the ARM7 will execute on ARM9 and ARM10
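The 13% and 34% throughput figures appear to follow from the Dhrystone MIPS per MHz ratings, taking the ARM7 value of 0.97 MIPS/MHz from the family comparison table as the baseline. The sketch below reproduces that arithmetic; the choice of baseline is an assumption, and the function name is made up for the example.

```python
ARM7_MIPS_PER_MHZ = 0.97  # assumed baseline, from the family comparison table


def throughput_gain_percent(mips_per_mhz, baseline=ARM7_MIPS_PER_MHZ):
    """Percentage increase in MIPS/MHz relative to the baseline core."""
    return (mips_per_mhz / baseline - 1.0) * 100.0


for core, rating in (("ARM9", 1.1), ("ARM10", 1.3)):
    print(f"{core}: about {throughput_gain_percent(rating):.0f}% over ARM7")
```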
ARM Instruction Sequence
Time       Fetch    Decode    Execute    cpsr
Cycle 1    MSR                           IFt_SVC
Cycle 2    ADD      MSR                  IFt_SVC
Cycle 3    AND      ADD       MSR        iFt_SVC
Cycle 4    SUB      AND       ADD
Pipeline Characteristics
• An instruction in the execute stage
will complete even though an
interrupt has been raised
• The execution of a branch instruction
or branching by the direct
modification of the PC causes the
ARM core to flush its pipeline
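The flush on a branch can be sketched with a simplified three-stage model. The assumptions here are that every branch is unconditional and resolves in the Execute stage, that each instruction takes one cycle per stage, and that the program is a list of hypothetical `(mnemonic, branch_target_index)` pairs; real cores differ in the details.

```python
def run(program, max_cycles=50):
    """Simulate a three-stage pipeline; a branch in Execute flushes Fetch and Decode.

    program: list of (mnemonic, branch_target_index or None).
    Returns (mnemonics in the order they executed, number of flushed instructions).
    """
    fetch = decode = execute = None  # each slot holds a program index or None
    pc = 0
    executed, flushed = [], 0
    for _ in range(max_cycles):
        # Advance every instruction by one stage and fetch the next one.
        execute, decode = decode, fetch
        fetch = pc if pc < len(program) else None
        if fetch is not None:
            pc += 1
        if execute is None:
            if fetch is None and decode is None:
                break  # pipeline has drained
            continue
        mnemonic, target = program[execute]
        executed.append(mnemonic)
        if target is not None:
            # Branch: discard whatever was already fetched and decoded.
            flushed += sum(slot is not None for slot in (fetch, decode))
            fetch = decode = None
            pc = target


    return executed, flushed


# B jumps over SUB to CMP (index 3).
program = [("ADD", None), ("B", 3), ("SUB", None), ("CMP", None)]
print(run(program))
```

In this run the two instructions behind the branch are thrown away and the pipeline has to refill from the branch target, which is the cost the slide refers to.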
Coprocessors
• Coprocessors can be attached to the ARM processor
• A coprocessor is a separate chip that performs many calculations for the microprocessor, relieving the CPU of some of its work and thus enhancing the overall speed of the system
• It is a secondary processor used to speed up operation by taking over a specific part of the main processor’s work
• The ARM processor uses coprocessor 15 registers to control the cache, TCMs, and memory management
ARM processor families
• ARM7, ARM9, ARM10 and ARM11
• 7, 9, 10, 11 indicate different core
designs
ARM family attribute comparison
                  ARM7           ARM9                     ARM10                   ARM11
Pipeline depth    three-stage    five-stage               six-stage               eight-stage
Typical MHz       80             150                      260                     335
mW/MHz            0.06 mW/MHz    0.19 mW/MHz (+ cache)    0.5 mW/MHz (+ cache)    0.4 mW/MHz (+ cache)
MIPS/MHz          0.97           1.1                      1.3                     1.2
Architecture      Von Neumann    Harvard                  Harvard                 Harvard
Multiplier        8 x 32         8 x 32                   16 x 32                 16 x 32
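As a back-of-envelope reading of the comparison table, the sketch below multiplies each family's typical clock by its MIPS/MHz and mW/MHz figures. The results are rough estimates only: the table marks the ARM9, ARM10, and ARM11 power figures as "+ cache", and cache power is not included here. The names `FAMILIES` and `estimate` are made up for the example.

```python
# (typical MHz, mW/MHz, MIPS/MHz) as listed in the comparison table
FAMILIES = {
    "ARM7": (80, 0.06, 0.97),
    "ARM9": (150, 0.19, 1.1),
    "ARM10": (260, 0.5, 1.3),
    "ARM11": (335, 0.4, 1.2),
}


def estimate(mhz, mw_per_mhz, mips_per_mhz):
    """Return (approximate MIPS, approximate core power in mW) at the typical clock."""
    return mhz * mips_per_mhz, mhz * mw_per_mhz


for name, attrs in FAMILIES.items():
    mips, mw = estimate(*attrs)
    print(f"{name}: ~{mips:.1f} MIPS, ~{mw:.1f} mW (cache power excluded)")
```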
ARM Processor Families
ARM Processors
• ARM7 Family
– ARM7EJ-S
– ARM7TDMI
– ARM7TDMI-S
– ARM720T
• ARM9/9E Families
– ARM920T
– ARM922T
– ARM926EJ-S
– ARM940T
– ARM946E-S
– ARM966E-S
– ARM968E-S
• Vector Floating Point Families
– VFP10
• ARM10 Family
– ARM1020E
– ARM1022E
– ARM1026EJ-S
• ARM11 Family
– ARM1136J-S
– ARM1136JF-S
– ARM1156T2(F)-S
– ARM1176JZ(F)-S
– ARM11 MPCore
• Cortex Family
– Cortex-A8
– Cortex-M1
– Cortex-M3
– Cortex-R4
• Other Processors/Microarchitectures
– StrongARM (DEC-Intel)
– XScale (Intel-Marvell Tech)
– Other
Cortex Family
• ARM Cortex-A Series – Application processors for complex OS and user applications
– ARM Cortex-A8, ARM Cortex-A9
• ARM Cortex-R Series – Embedded processors for real-time systems
– ARM Cortex-R4(F)
• ARM Cortex-M Series – Embedded processors optimized for cost-sensitive applications, such as mobile devices
– ARM Cortex-M0, ARM Cortex-M1, ARM Cortex-M3
Switching States
• ARM to Thumb
– Execute the BX instruction with state
bit=1
• Thumb to ARM
– Execute the BX instruction with state
bit=0
– An interrupt or exception occurs
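The BX switching rule can be sketched as follows, reading the "state bit" as bit 0 of the branch target passed to BX. The function name `bx` and the returned tuple are made up for this example, and alignment checks for ARM-state targets are left out.

```python
def bx(target):
    """Model BX: bit 0 of the target selects the state, the rest is the new PC.

    Returns (state, pc), where state is "Thumb" or "ARM".
    """
    state = "Thumb" if target & 1 else "ARM"
    pc = target & ~1 & 0xFFFFFFFF  # bit 0 is not part of the branch address
    return state, pc


print(bx(0x8001))  # state bit = 1: continue in Thumb state
print(bx(0x8000))  # state bit = 0: continue in ARM state
```

The exception case is different: the core switches to ARM state on its own when an interrupt or exception is taken, with no BX involved.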
“In today’s systems the key is not raw processor speed but total effective system performance and power consumption”