U2 - ARM Processor
04/02/24 1
Instruction sets
von Neumann architecture
[Figure: CPU and a single memory joined by address and data lines. The PC holds 200; memory location 200 holds ADD r5,r1,r3, which is fetched into the IR.]
Harvard architecture
[Figure: CPU with PC connected to a data memory and a separate program memory, each over its own address and data lines.]
von Neumann vs. Harvard
RISC vs. CISC
Instruction set
characteristics
Programming model
Multiple implementations
Assembly language
ARM assembly language
example
Pseudo-ops
ARM FEATURES
Some general features of ARM processors:
ARM processors have a good ratio of execution speed to power consumption.
Clock frequencies range from about 1 MHz to a few GHz.
Cores with ARM's Jazelle DBX extension can execute Java bytecodes directly.
ARM processors have built-in hardware for debugging.
They support enhanced instructions for DSP operations.
ARM Processor Family
ARM processors are grouped into a number of families based on the
processor core they implement. The ARM architecture has continued to
evolve with each family. Well-known classic families are ARM7, ARM9,
ARM10 and ARM11. The following table shows some of the commonly found ARM
families along with their architectures.
ARM Nomenclature
The letters after “ARM” indicate the features of a processor, in the pattern ARM{x}{y}{z}{T}{D}{M}{I}{E}{J}{F}{-S}.
x – Family or series
y – Memory Management/Protection Unit
z – Cache
T – 16-bit Thumb decoder
D – JTAG Debugger
M – Fast Multiplier
I – Embedded In-circuit Emulator (ICE) Macrocell
E – Enhanced Instructions for DSP (assumes TDMI)
J – Jazelle (for accelerated Java execution)
F – Vector Floating-point Unit
S – Synthesizable Version
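As an illustration of how the suffix letters read, the sketch below maps each letter from the list to its feature and counts the recognised letters in a core name. The function names `arm_suffix_feature` and `arm_count_features` are hypothetical helpers invented for this example, and the parsing assumes the simple form "ARM", then digits, then letters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map one suffix letter from the nomenclature list to its feature.
   Returns NULL for letters that are not in the list. */
const char *arm_suffix_feature(char letter)
{
    switch (letter) {
    case 'T': return "16-bit Thumb decoder";
    case 'D': return "JTAG debugger";
    case 'M': return "fast multiplier";
    case 'I': return "EmbeddedICE macrocell";
    case 'E': return "enhanced DSP instructions";
    case 'J': return "Jazelle";
    case 'F': return "vector floating-point unit";
    case 'S': return "synthesizable version";
    default:  return NULL;
    }
}

/* Count the feature letters after the leading "ARM<digits>" part of a
   core name such as "ARM7TDMI". Returns -1 if the name does not fit the
   assumed pattern. Hypothetical helper, for illustration only. */
int arm_count_features(const char *name)
{
    assert(name != NULL);
    if (strncmp(name, "ARM", 3) != 0)
        return -1;                      /* not an "ARM..." name */
    const char *p = name + 3;
    while (*p >= '0' && *p <= '9')      /* skip the x, y, z digits */
        p++;
    int count = 0;
    for (; *p != '\0'; p++) {
        if (*p == '-')                  /* e.g. the "-S" in ARM926EJ-S */
            continue;
        if (arm_suffix_feature(*p) == NULL)
            return -1;                  /* letter not in the list */
        count++;
    }
    return count;
}
```

For example, `arm_count_features("ARM7TDMI")` walks the letters T, D, M and I, and `arm_suffix_feature('T')` returns the Thumb description.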
ARM Processors
Even though ARM7 or other classic ARM Processors can be used for
small scale embedded systems, newer embedded systems are built using
the advanced ARM embedded processors or the Cortex-M processors and
Cortex-R Processors.
ARM Embedded
Processors
ARM Cortex-M processors have a microcontroller profile, while Cortex-R
processors have a real-time profile.
ARM Cortex-M processors are energy efficient, simple to implement and
mainly intended for advanced embedded applications. The Cortex-M family
includes processor cores such as Cortex-M0, Cortex-M0+, Cortex-M3, Cortex-M4
and Cortex-M7.
ARM Cortex-R processors target real-time embedded systems. They provide
high reliability, high fault tolerance and real-time response, and are used
in systems where high performance is required and timing deadlines are important.
The Cortex-R family includes processor cores such as Cortex-R4, Cortex-R5,
Cortex-R7 and Cortex-R8.
ARM Application
Processors
The ARM Cortex-A series contains ARM's highest-performance
processors. They are used in powerful mobile devices and in
products such as network devices, consumer appliances,
automation systems, automobiles and other embedded systems.
Cortex-A processors are further divided into high-performance,
high-efficiency and ultra-high-efficiency types. Each
subdivision has several processor cores.
ARM instruction set
ARM versions.
ARM assembly language.
ARM programming model.
ARM memory organization.
ARM data operations.
ARM flow of control.
ARM versions
ARM assembly language
ARM programming model
[Figure: registers r0 to r15, where r15 is the PC, and the CPSR (bits 31 to 0) holding the N, Z, C, V flags.]
[Figure: little-endian and big-endian byte ordering.]
ARM data types
ARM status bits
ARM data instructions
Basic format:
ADD r0,r1,r2
Computes r1+r2, stores in r0.
Immediate operand:
ADD r0,r1,#2
Computes r1+2, stores in r0.
ARM data instructions
ADD, ADC : add (w. carry)
SUB, SBC : subtract (w. carry)
RSB, RSC : reverse subtract (w. carry)
MUL, MLA : multiply (and accumulate)
AND, ORR, EOR
BIC : bit clear
LSL, LSR : logical shift left/right
ASL, ASR : arithmetic shift left/right
ROR : rotate right
RRX : rotate right extended with C
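One common use of the add-with-carry pair is building a 64-bit addition out of 32-bit registers: a flag-setting add on the low words (ADDS), then ADC on the high words. The C sketch below models that idea, along with the operand order of RSB. The type `add_result` and the functions `add_with_carry`, `add64` and `rsb` are names made up for this illustration, not ARM or library APIs.

```c
#include <assert.h>
#include <stdint.h>

/* A 32-bit sum plus the carry it would leave in the C flag. */
typedef struct {
    uint32_t value;
    unsigned carry;
} add_result;

/* Model of ADD/ADC: value = a + b + carry_in, carry = carry out of bit 31. */
add_result add_with_carry(uint32_t a, uint32_t b, unsigned carry_in)
{
    assert(carry_in <= 1);
    uint64_t wide = (uint64_t)a + b + carry_in;
    add_result r = { (uint32_t)wide, (unsigned)(wide >> 32) };
    return r;
}

/* 64-bit addition from 32-bit halves: add the low words first,
   then add the high words together with the carry from the low add. */
uint64_t add64(uint32_t a_lo, uint32_t a_hi, uint32_t b_lo, uint32_t b_hi)
{
    add_result lo = add_with_carry(a_lo, b_lo, 0);         /* like ADDS */
    add_result hi = add_with_carry(a_hi, b_hi, lo.carry);  /* like ADC  */
    return ((uint64_t)hi.value << 32) | lo.value;
}

/* Model of RSB: reverse subtract computes op2 - op1 rather than op1 - op2. */
uint32_t rsb(uint32_t op1, uint32_t op2)
{
    return op2 - op1;
}
```

The carry travels between the two adds through `lo.carry`, which plays the role of the C bit in the CPSR.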
Data operation varieties
Logical shift:
fills with zeroes.
Arithmetic shift right:
fills with copies of the sign bit (ones for a negative value).
RRX performs 33-bit rotate, including C bit
from CPSR above sign bit.
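The difference between the shift varieties is easiest to see on bit patterns. The following is a minimal C sketch of LSR, ASR, ROR and RRX on 32-bit values. The function names are chosen for this example, shift amounts are assumed to be 1 to 31, and the carry is passed explicitly instead of living in a CPSR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* LSR: vacated high bits are filled with zeroes. */
uint32_t lsr(uint32_t x, unsigned n)
{
    assert(n >= 1 && n <= 31);
    return x >> n;
}

/* ASR: vacated high bits are filled with copies of the sign bit (bit 31). */
uint32_t asr(uint32_t x, unsigned n)
{
    assert(n >= 1 && n <= 31);
    uint32_t fill = (x & 0x80000000u) ? (0xFFFFFFFFu << (32 - n)) : 0u;
    return (x >> n) | fill;
}

/* ROR: bits shifted out at the bottom re-enter at the top. */
uint32_t ror(uint32_t x, unsigned n)
{
    assert(n >= 1 && n <= 31);
    return (x >> n) | (x << (32 - n));
}

/* RRX: one-bit rotate through the carry. The old carry enters at bit 31
   and the old bit 0 becomes the new carry (written to *carry_out when
   the pointer is not NULL). */
uint32_t rrx(uint32_t x, unsigned carry_in, unsigned *carry_out)
{
    assert(carry_in <= 1);
    if (carry_out != NULL)
        *carry_out = x & 1u;
    return (x >> 1) | ((uint32_t)carry_in << 31);
}
```

With a negative input such as 0x80000000, `lsr` and `asr` by the same amount give different top bits, which is the whole distinction between the two.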
ARM comparison
instructions
CMP : compare (subtraction)
CMN : negated compare (addition)
TST : bit-wise test (AND)
TEQ : bit-wise test for equality (EX-OR)
These instructions update only the NZCV bits
of the CPSR; the computed result is discarded.
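To make the flag behaviour concrete, here is a small C sketch that computes the flags each comparison would produce and evaluates the GE condition used by BGE. The names (`cmp_flags`, `cmn_flags`, `tst_flags`, `teq_flags`, `cond_ge`) and the 4-bit NZCV packing are assumptions of this example. For TST and TEQ only N and Z are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Flags packed into four bits in the order N Z C V. */
enum { FLAG_N = 8, FLAG_Z = 4, FLAG_C = 2, FLAG_V = 1 };

/* N is bit 31 of the result, Z is set when the result is zero. */
static unsigned nz(uint32_t result)
{
    return ((result >> 31) ? FLAG_N : 0) | (result == 0 ? FLAG_Z : 0);
}

/* CMP a,b: flags of a - b. C is set when no borrow occurs. */
unsigned cmp_flags(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    unsigned c = (a >= b) ? FLAG_C : 0;
    unsigned v = (((a ^ b) & (a ^ r)) >> 31) ? FLAG_V : 0;  /* signed overflow */
    return nz(r) | c | v;
}

/* CMN a,b: flags of a + b. C is the carry out of bit 31. */
unsigned cmn_flags(uint32_t a, uint32_t b)
{
    uint32_t r = a + b;
    unsigned c = (r < a) ? FLAG_C : 0;
    unsigned v = ((~(a ^ b) & (a ^ r)) >> 31) ? FLAG_V : 0; /* signed overflow */
    return nz(r) | c | v;
}

/* TST a,b: N and Z of a AND b. TEQ a,b: N and Z of a EX-OR b.
   C and V are left out of this sketch. */
unsigned tst_flags(uint32_t a, uint32_t b) { return nz(a & b); }
unsigned teq_flags(uint32_t a, uint32_t b) { return nz(a ^ b); }

/* GE condition (signed greater-or-equal): true when N equals V. */
int cond_ge(unsigned flags)
{
    assert(flags <= 15u);
    return ((flags & FLAG_N) != 0) == ((flags & FLAG_V) != 0);
}
```

A sequence such as CMP followed by BGE corresponds to `cond_ge(cmp_flags(a, b))`: the comparison leaves only flags behind, and the branch reads them.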
ARM move instructions
ARM load/store
instructions
ARM ADR pseudo-op
Example: C assignments
C:
x = (a + b) - c;
Assembler:
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
ADR r4,b ; get address for b, reusing r4
LDR r1,[r4] ; get value of b
ADD r3,r0,r1 ; compute a+b
ADR r4,c ; get address for c
LDR r2,[r4] ; get value of c
C assignment, cont’d.
SUB r3,r3,r2 ; complete computation of x
ADR r4,x ; get address for x
STR r3,[r4] ; store value of x
Example: C assignment
C:
y = a*(b+c);
Assembler:
ADR r4,b ; get address for b
LDR r0,[r4] ; get value of b
ADR r4,c ; get address for c
LDR r1,[r4] ; get value of c
ADD r2,r0,r1 ; compute partial result
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
C assignment, cont’d.
MUL r2,r2,r0 ; compute final value for y
ADR r4,y ; get address for y
STR r2,[r4] ; store y
Example: C assignment
C:
z = (a << 2) | (b & 15);
Assembler:
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
MOV r0,r0,LSL #2 ; perform shift
ADR r4,b ; get address for b
LDR r1,[r4] ; get value of b
AND r1,r1,#15 ; perform AND
ORR r1,r0,r1 ; perform OR
C assignment, cont’d.
ADR r4,z ; get address for z
STR r1,[r4] ; store value for z
Additional addressing
modes
Base-plus-offset addressing:
LDR r0,[r1,#16]
Loads from location r1+16
Auto-indexing updates the base register before the access:
LDR r0,[r1,#16]!
Adds 16 to r1, then loads r0 from the new r1.
Post-indexing fetches, then applies the offset:
LDR r0,[r1],#16
Loads r0 from r1, then adds 16 to r1.
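The three addressing modes differ only in which address is used for the load and whether the base register changes. The sketch below models them in C over a small byte array standing in for memory. The function names and the `base` pointer that plays the role of r1 are assumptions of this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit word at byte address addr from a simulated memory. */
uint32_t load_word(const uint8_t *mem, uint32_t addr)
{
    assert(addr % 4 == 0);              /* word-aligned access */
    uint32_t w;
    memcpy(&w, mem + addr, sizeof w);
    return w;
}

/* LDR r0,[r1,#off] : load from r1+off, r1 is unchanged. */
uint32_t ldr_offset(const uint8_t *mem, uint32_t *base, uint32_t off)
{
    return load_word(mem, *base + off);
}

/* LDR r0,[r1,#off]! : add off to r1 first, then load from the new r1. */
uint32_t ldr_pre_index(const uint8_t *mem, uint32_t *base, uint32_t off)
{
    *base += off;
    return load_word(mem, *base);
}

/* LDR r0,[r1],#off : load from r1, then add off to r1. */
uint32_t ldr_post_index(const uint8_t *mem, uint32_t *base, uint32_t off)
{
    uint32_t value = load_word(mem, *base);
    *base += off;
    return value;
}
```

Post-indexing is the natural fit for walking an array: each call returns the current element and leaves the base pointing at the next one.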
ARM flow of control
Example: if statement
C:
if (a < b) { x = 5; y = c + d; } else x = c - d;
Assembler:
; compute and test condition
ADR r4,a ; get address for a
LDR r0,[r4] ; get value of a
ADR r4,b ; get address for b
LDR r1,[r4] ; get value for b
CMP r0,r1 ; compare a < b
BGE fblock ; if a >= b, branch to false block
If statement, cont’d.
; true block
MOV r0,#5 ; generate value for x
ADR r4,x ; get address for x
STR r0,[r4] ; store x
ADR r4,c ; get address for c
LDR r0,[r4] ; get value of c
ADR r4,d ; get address for d
LDR r1,[r4] ; get value of d
ADD r0,r0,r1 ; compute y
ADR r4,y ; get address for y
STR r0,[r4] ; store y
B after ; branch around false block
If statement, cont’d.
; false block
fblock ADR r4,c ; get address for c
LDR r0,[r4] ; get value of c
ADR r4,d ; get address for d
LDR r1,[r4] ; get value for d
SUB r0,r0,r1 ; compute c-d
ADR r4,x ; get address for x
STR r0,[r4] ; store value of x
after ...
Summary
Load/store architecture
Most instructions are RISC-style and execute in a
single cycle.
Some multi-register operations take longer.
All instructions can be executed
conditionally.
CPUs
I/O devices
[Figure: CPU connected to a device through its status register and data register, which sit in front of the device mechanism.]
Application: 8251 UART
Serial communication
[Figure: serial line waveform plotted against time, starting from the no-char (idle) level.]
Serial communication
parameters
8251 CPU interface
[Figure: CPU connected to the 8251 through an 8-bit status register and an 8-bit data register; the 8251 connects to the serial port through xmit/rcv lines.]
Programming I/O
Interrupt I/O
Interrupt interface
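[Figure: CPU (PC, IR) connected to a device by intr request and intr ack lines plus data/address lines; the device has a status register and a data register in front of its mechanism.]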
Interrupt behavior
Priorities and vectors
Prioritized interrupts
[Figure: interrupt request lines L1, L2, .. Ln entering the CPU, with an interrupt acknowledge line.]
Interrupt prioritization
Interrupt vectors
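[Figure: interrupt vector table. The interrupt vector is used as an offset from the table head to select one of the entries for handler 0, handler 1, handler 2, handler 3.]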
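A vector table of this kind can be pictured in C as an array of function pointers indexed by the vector number. The sketch below is only an analogy under stated assumptions: the handler signature, the four handlers that return their own index, and the function `dispatch_interrupt` are all invented for illustration and do not describe any particular CPU's vectoring hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler signature: each handler returns its own number
   so that the dispatch can be observed. */
typedef int (*handler_fn)(void);

static int handler0(void) { return 0; }
static int handler1(void) { return 1; }
static int handler2(void) { return 2; }
static int handler3(void) { return 3; }

/* The vector table: entry i holds the address of handler i. */
static const handler_fn vector_table[] = {
    handler0, handler1, handler2, handler3
};

#define NUM_VECTORS (sizeof vector_table / sizeof vector_table[0])

/* The device supplies a vector number; it is used as an index from the
   table head to find and call the matching handler. */
int dispatch_interrupt(size_t vector)
{
    assert(vector < NUM_VECTORS);
    return vector_table[vector]();
}
```

Because the device picks the vector, changing which handler serves a device only requires changing a table entry, not the dispatch logic.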
Interrupt sequence
Sources of interrupt
overhead
Supervisor mode
May want to provide protective barriers between
programs.
Avoid memory corruption.
Need supervisor mode to manage the various programs.
C55x does not have a supervisor mode.
The ARM instruction that puts the CPU in supervisor mode is called SWI.
The value of the CPSR just before the SWI
is saved in a register called the saved program status
register (SPSR).
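The state changes around an SWI can be sketched as a small C model. It assumes a classic ARM core running in ARM state (4-byte instructions); the struct `cpu_state` and the functions `take_swi` and `return_from_swi` are names invented here, and `pc` holds the address of the SWI instruction itself, not the pipelined r15 value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODE_MASK  0x1Fu        /* CPSR mode field, bits 4..0 */
#define MODE_USR   0x10u        /* user mode */
#define MODE_SVC   0x13u        /* supervisor mode */
#define I_BIT      0x80u        /* IRQ disable bit */
#define SWI_VECTOR 0x00000008u  /* address the core jumps to on SWI */

/* Simplified CPU state for this sketch. */
typedef struct {
    uint32_t pc;        /* address of the current instruction */
    uint32_t cpsr;      /* current program status register */
    uint32_t lr_svc;    /* supervisor-mode link register */
    uint32_t spsr_svc;  /* saved program status register */
} cpu_state;

/* Entering supervisor mode through SWI. */
void take_swi(cpu_state *cpu)
{
    assert(cpu != NULL);
    cpu->spsr_svc = cpu->cpsr;          /* save the old CPSR in the SPSR */
    cpu->lr_svc = cpu->pc + 4;          /* return address: next instruction */
    cpu->cpsr = (cpu->cpsr & ~MODE_MASK) | MODE_SVC | I_BIT;
    cpu->pc = SWI_VECTOR;               /* continue at the SWI vector */
}

/* Leaving the handler: restore the CPSR from the SPSR and resume. */
void return_from_swi(cpu_state *cpu)
{
    assert(cpu != NULL);
    cpu->cpsr = cpu->spsr_svc;
    cpu->pc = cpu->lr_svc;
}
```

Saving the CPSR is what lets the handler return the program to exactly the mode and flag state it had before the SWI.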
Exception
Trap
Co-processor
CPU performance
CPU power consumption.
Elements of CPU
performance
Cycle time.
CPU pipeline.
Memory system.
Pipelining
Performance measures
ARM7 pipeline
ARM pipeline execution
[Figure: instructions overlapping in the pipeline, plotted against time steps 1, 2, 3.]
Pipeline stalls
CPU power consumption
CMOS power consumption
CPU power-saving
strategies
Power management