A0b36apo Prednaska03 Memory en
Memory
Pavel Píša, Michal Štepanovský, Miroslav Šnorek
Main source of inspiration: Patterson
A:
int matrix[M][N];
int i, j, sum = 0;
for(i=0; i<M; i++)
    for(j=0; j<N; j++)
        sum += matrix[i][j];

B:
int matrix[M][N];
int i, j, sum = 0;
for(j=0; j<N; j++)
    for(i=0; i<M; i++)
        sum += matrix[i][j];
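Both variants compute the same sum; they differ only in the order in which memory is touched. C stores a two-dimensional array row by row, so the inner loop of A steps through neighbouring elements, while the inner loop of B jumps by a whole row on every access. A minimal sketch that makes the two strides visible follows; the helper names `stride_a` and `stride_b` and the sizes chosen for M and N are assumptions made only for illustration.

```c
#include <stddef.h>

#define M 4   /* assumed number of rows, for illustration only */
#define N 8   /* assumed number of columns, for illustration only */

static int matrix[M][N];

/* Byte distance between two consecutive accesses of the inner loop in A:
   matrix[i][j] -> matrix[i][j+1] (neighbouring elements of one row). */
ptrdiff_t stride_a(void)
{
    return (char *)&matrix[0][1] - (char *)&matrix[0][0];
}

/* Byte distance between two consecutive accesses of the inner loop in B:
   matrix[i][j] -> matrix[i+1][j] (same column, next row). */
ptrdiff_t stride_b(void)
{
    return (char *)&matrix[1][0] - (char *)&matrix[0][0];
}
```

Here `stride_a()` equals `sizeof(int)` and `stride_b()` equals `N * sizeof(int)`. When a cache block holds several words, A can use every word of a fetched block, whereas B may use only one word per fetched block once N is large.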
[Figure: von Neumann architecture block diagram: control unit (ctrl), ALU, input, output. John von Neumann, 28. 12. 1903 - 8. 2. 1957]
5 functional units: control unit, arithmetic logic unit, memory, input (devices),
output (devices)
A computer architecture should be independent of the problems being solved. It has
to provide a mechanism to load a program into memory. The program controls what the
computer does with the data, that is, which problem it solves.
Programs and results/data are stored in the same memory. That memory consists
of cells of the same size, and these cells are numbered sequentially (addresses).
The instruction to be executed next is stored in the cell immediately after the one holding the current instruction (unless a jump changes the order).
[Figure: variants of PC memory organization: CPU 1, CPU 2, memory controller (MC), RAM, Northbridge, Southbridge with SATA, USB and PCI-E]
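[Figure: CPU (control unit, ALU) connected to memory; addresses start at 000000H]
An address of a bits selects one of 2^a memory cells; for a = 32 that is 4G cells (4096M, M = K^2).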
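The relation between the address width and the number of addressable cells (a bits give 2^a cells, 32 bits give 4G) can be checked with a short sketch. The function name `addressable_cells` is hypothetical and serves only this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct addresses for an address that is `bits` wide.
   Limited to bits < 64 so that the result fits into 64 bits. */
uint64_t addressable_cells(unsigned bits)
{
    assert(bits < 64);
    return (uint64_t)1 << bits;
}
```

For example, `addressable_cells(32)` gives 4096 * 1024 * 1024, that is 4G with K = 1024 and M = K^2.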
Basic memory parameters:
Access time: delay or latency between a request and the access being completed or the requested data returned
Memory latency: time between a request and the data being available (does not include the time required for refresh and deactivation)
Throughput/bandwidth: main performance indicator; rate of transferred data units per time
Maximal, average and other latency parameters
[Figure: Processor-memory performance gap, years 1980 to 2010, performance axis from 1.00 to 10,000.00. Source: Hennessy, Patterson, CaaQA 4th ed., 2006]
[Figure: storage hierarchy with input/output channels, secondary storage, off-line storage and a robotic access system. Source: Wikipedia.org]
AE0B36APO Computer Architectures 20
Contemporary price/size examples
[Figure: cache lookup: the CPU sends an address, an address comparator produces the hit signal, data is returned]
Capacity C
Number of sets S
Block size b
Number of blocks B
Degree of associativity N
C = 8 (8 words),
S = B = 8,
b = 1 (one word in the block),
N=1
Direct mapped cache
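For the direct mapped configuration listed above (C = 8 words, b = 1, S = B = 8, N = 1), a lookup can be sketched as follows. The type and function names (`dm_cache`, `dm_access`) are hypothetical, the cache works with word addresses, and only tags are modelled, not the data itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DM_SETS 8u   /* S = B = 8, one word per block (b = 1) */

struct dm_line {
    bool     valid;
    uint32_t tag;
};

struct dm_cache {
    struct dm_line line[DM_SETS];
};

/* Access one word address. Returns true on a hit. On a miss the single
   line of the selected set is refilled, evicting its previous content. */
bool dm_access(struct dm_cache *c, uint32_t word_addr)
{
    assert(c != NULL);
    uint32_t set = word_addr % DM_SETS;   /* low bits select the set */
    uint32_t tag = word_addr / DM_SETS;   /* remaining bits form the tag */
    struct dm_line *l = &c->line[set];

    if (l->valid && l->tag == tag)
        return true;
    l->valid = true;
    l->tag = tag;
    return false;
}
```

Starting from a zero-initialized `struct dm_cache`, word addresses 3 and 11 both select set 3, so accessing them alternately misses every time.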
Capacity C
Number of sets S
Block size b
Number of blocks B
Degree of associativity N
C = 8 (8 words),
S = 4,
b = 1 (one word in the block),
B = 8,
N = 2
What is the main advantage of higher associativity?
4-way set associative cache
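A set associative lookup differs from the direct mapped one only in that each set holds N lines that are all compared against the tag. The sketch below uses the parameters S = 4 and N = 2 listed above; the names (`sa_cache`, `sa_access`) are hypothetical, and the LRU replacement policy is an assumption, since the slides do not name a policy here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SA_SETS 4u   /* S = 4 */
#define SA_WAYS 2u   /* N = 2; change to 4 for a 4-way cache */

struct sa_line {
    bool     valid;
    uint32_t tag;
    uint32_t last_used;   /* timestamp for LRU (assumed policy) */
};

struct sa_cache {
    struct sa_line line[SA_SETS][SA_WAYS];
    uint32_t       clock;   /* incremented on every access */
};

/* Access one word address. Returns true on a hit; on a miss an empty
   way is filled, or the least recently used way is replaced. */
bool sa_access(struct sa_cache *c, uint32_t word_addr)
{
    assert(c != NULL);
    uint32_t set = word_addr % SA_SETS;
    uint32_t tag = word_addr / SA_SETS;
    struct sa_line *ways = c->line[set];
    struct sa_line *victim = &ways[0];

    c->clock++;
    for (unsigned w = 0; w < SA_WAYS; w++) {
        if (ways[w].valid && ways[w].tag == tag) {
            ways[w].last_used = c->clock;
            return true;
        }
        if (!ways[w].valid) {
            if (victim->valid)
                victim = &ways[w];          /* prefer an empty way */
        } else if (victim->valid && ways[w].last_used < victim->last_used) {
            victim = &ways[w];              /* otherwise the oldest way */
        }
    }
    victim->valid = true;
    victim->tag = tag;
    victim->last_used = c->clock;
    return false;
}
```

Word addresses 0 and 4 select the same set, yet after the first two misses both stay resident and hit on every further access. Fewer conflict misses of this kind are the usual benefit of higher associativity.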
[Figure: virtual address spaces of process-A and process-B mapped by pages to page frames in physical memory and to disk. Page size = frame size.]
Virtual/physical address and data
Page Table
Root pointer/page directory base register (x86 CR3 = PDBR)
Page table directory (PTD)
Page table entries (PTE)
The basic mapping unit is a page (page frame)
A page is the basic unit of data transfer between main memory and secondary storage
Mapping is implemented as a look-up table in most cases
Address translation is performed by the Memory Management Unit (MMU)
Example follows on the next slide:
The page directory is a data structure stored in main memory. The task of the OS is
to allocate a physically contiguous block of memory for it (for each process/memory
context) and to assign its start address to a special CPU/MMU register.
PDBR (page directory base register): on x86, register CR3 holds the physical
address of the start of the page directory. Alternative names: PTBR (page table
base register), which is the same thing; page table root pointer (URP, SRP on m68k).
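How the MMU uses the base register can be sketched as a two-level table walk. Everything specific here is an assumption for illustration: a 32-bit virtual address split into 10 + 10 + 12 bits, 4-byte entries with a "present" flag in bit 0, and physical memory simulated by a plain array. The names `translate` and `phys_read32` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_PRESENT 0x1u          /* assumed "present" bit of an entry */
#define FRAME_MASK  0xFFFFF000u   /* upper 20 bits: frame start address */

/* Read one 32-bit entry from simulated physical memory. */
static uint32_t phys_read32(const uint32_t *mem, uint32_t phys_addr)
{
    return mem[phys_addr / 4];
}

/* Walk the page directory and one page table. Returns true and stores
   the physical address in *pa; returns false when an entry is not
   present, which is the case the OS handles as a page fault. */
bool translate(const uint32_t *mem, uint32_t pdbr, uint32_t va, uint32_t *pa)
{
    assert(mem != NULL && pa != NULL);
    uint32_t dir_idx = va >> 22;              /* top 10 bits */
    uint32_t tbl_idx = (va >> 12) & 0x3FFu;   /* middle 10 bits */
    uint32_t offset  = va & 0xFFFu;           /* low 12 bits */

    uint32_t pde = phys_read32(mem, (pdbr & FRAME_MASK) + 4u * dir_idx);
    if (!(pde & PTE_PRESENT))
        return false;

    uint32_t pte = phys_read32(mem, (pde & FRAME_MASK) + 4u * tbl_idx);
    if (!(pte & PTE_PRESENT))
        return false;

    *pa = (pte & FRAME_MASK) | offset;
    return true;
}
```

Each level costs one memory read, and the second read cannot start before the first one has finished, because its address depends on the directory entry.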
PA: physical address
[Figure: page fault: the processor issues virtual address a; address translation yields physical address a' in main memory; on a page fault the OS processes it and transfers data between the secondary store and main memory]
Translation would take a long time even if the entries for all levels were
present in the cache. (One access per level is needed, and the accesses cannot
be done in parallel.)
The solution is to cache the found/computed physical addresses.
Such a cache is called a Translation Look-Aside Buffer (TLB).
Even multi-level translation caches are in use today.
Fast MMU/address translation using TLB
Translation-Lookaside Buffer, or perhaps a more descriptive name:
Translation-Cache
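The idea of caching finished translations can be sketched as a small fully associative table that maps virtual page numbers to frame numbers. The size of 8 entries, the 4 KiB page size, the round-robin replacement and all names (`tlb`, `tlb_lookup`, `tlb_insert`) are assumptions for illustration, not a description of any particular MMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 8u    /* assumed number of entries */
#define PAGE_SHIFT  12u   /* assumed 4 KiB pages */

struct tlb_entry {
    bool     valid;
    uint32_t vpn;   /* virtual page number */
    uint32_t pfn;   /* physical frame number */
};

struct tlb {
    struct tlb_entry e[TLB_ENTRIES];
    unsigned         next;   /* round-robin replacement (assumed policy) */
};

/* Returns true and the physical address in *pa on a TLB hit.
   On a miss the caller has to walk the page tables. */
bool tlb_lookup(const struct tlb *t, uint32_t va, uint32_t *pa)
{
    assert(t != NULL && pa != NULL);
    uint32_t vpn = va >> PAGE_SHIFT;
    uint32_t offset = va & ((1u << PAGE_SHIFT) - 1u);

    for (unsigned i = 0; i < TLB_ENTRIES; i++) {
        if (t->e[i].valid && t->e[i].vpn == vpn) {
            *pa = (t->e[i].pfn << PAGE_SHIFT) | offset;
            return true;
        }
    }
    return false;
}

/* Remember a translation obtained from a page table walk. */
void tlb_insert(struct tlb *t, uint32_t vpn, uint32_t pfn)
{
    assert(t != NULL);
    t->e[t->next] = (struct tlb_entry){ true, vpn, pfn };
    t->next = (t->next + 1u) % TLB_ENTRIES;
}
```

A hit replaces the per-level memory reads with a single table lookup, which hardware performs by comparing all entries in parallel rather than in a loop.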
[Figure: record on the media (one track): pits and lands along the track and the corresponding encoded data]
Ones are encoded by a signal change!
Zeros are encoded as no change. Bit stuffing etc.
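The rule that a one is a signal change and a zero is no change corresponds to what is commonly called NRZI coding. A minimal sketch follows; the function names and the representation of the signal as an array of 0/1 levels are assumptions for illustration, and bit stuffing is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode n data bits (0 or 1 each) into n signal levels (0 or 1 each).
   start_level is the signal level before the first bit. */
void nrzi_encode(const uint8_t *bits, uint8_t *levels, size_t n,
                 uint8_t start_level)
{
    assert(bits != NULL && levels != NULL);
    uint8_t level = start_level & 1u;
    for (size_t i = 0; i < n; i++) {
        if (bits[i])
            level ^= 1u;      /* a one toggles the signal */
        levels[i] = level;    /* a zero leaves it unchanged */
    }
}

/* Decode: a change against the previous level is a one, otherwise zero. */
void nrzi_decode(const uint8_t *levels, uint8_t *bits, size_t n,
                 uint8_t start_level)
{
    assert(levels != NULL && bits != NULL);
    uint8_t prev = start_level & 1u;
    for (size_t i = 0; i < n; i++) {
        bits[i] = (uint8_t)((levels[i] ^ prev) & 1u);
        prev = levels[i];
    }
}
```

A long run of zeros produces no transitions at all, which is why additional measures such as the bit stuffing mentioned above are needed to keep the reader synchronized.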
Read: