Elec6036-Mod0-Intro HPC Upd 2022


ELEC 6036 – High Perf. Comp. Architecture

Module 0:
AN INTRODUCTION to
Basic Computer Systems Concepts
& High Performance Computing (HPC)

ELEC 6036 - HPC, written by Dr. V. Tam
About the Evolution of
Computer Systems..
[**mainly taken from Chp. 1, pp. 6-12, of “Advanced Computer
Architecture – Parallelism, Scalability & Programmability” by Kai
Hwang]
• From the birth of the 1st-generation electronic computers
(like the IBM 701, based on vacuum tubes), thru’ the 4th
generation (incl. the VAX 9000 or IBM 3090, based on
VLSI), and now up to the 5th-generation massively
parallel computers (like the Fujitsu VPP500, Cray/MPP and
CM-5, using ULSI processors) built to achieve teraflops
(10^12 floating-point operations per sec.), it has long been
recognized that the concept of computing architecture is

NO LONGER restricted to the bare machine hardware!

System Software…
• A modern computer is an integrated system
consisting of machine hardware, an
instruction set [to be elaborated later],
system software [e.g. pre-processor,
compiler, linker, loader, or
instruction/process scheduler],
application programs [e.g. the assembly
programs shown in this course or in the
reference books], and user interfaces.
Architecture of A Modern Computer
System = H/W + S/W

System S/W Supports…
• System software (including the compiler
and the loader) is needed to develop
“efficient programs”, esp. for parallel
computation, in high-level languages (HLL);

System S/W Supports…
• The compiler is generally used to translate
source code written in an HLL into object
code. The (optimizing) compiler assigns
variables to registers or to memory words,
and reserves functional units (FUs),
sometimes called processing elements (PEs)
and similar to the processor cores in each
CPU, for operators;
System S/W Supports…
• An assembler is used to translate the
compiled object code into machine code
which can be directly recognized by the
machine H/W;

• A loader is used to initiate the program
execution through the OS kernel /
manager.
To Achieve Greater Performance
Gain Thru’ Parallelism !!
• Over the past few decades (~ 40 yrs),
we have found that a greater performance
gain can be achieved by moving the
execution of computer instructions or
blocks of code from a sequential mode
(one after another) to concurrent /
parallel execution.
Assuming: all the involved instructions
are independent of each other!
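As a rough illustration of the gain described above, the sketch below counts the time steps needed for a batch of independent instructions. The helper name `steps_needed`, the one-step cost per instruction, and the unit counts are assumptions made for this example only; they are not taken from the notes.

```python
import math


def steps_needed(num_instructions: int, num_units: int) -> int:
    """Time steps to finish independent, equal-cost instructions.

    Assumes every instruction takes one step and none depends on another,
    so up to `num_units` instructions can run in the same step.
    """
    return math.ceil(num_instructions / num_units)


sequential_steps = steps_needed(8, 1)  # one after another
parallel_steps = steps_needed(8, 4)    # 4 instructions per step
speedup = sequential_steps / parallel_steps

print(sequential_steps, parallel_steps, speedup)  # 8 2 4.0
```

If the instructions were NOT independent, some of them would have to wait for earlier results, and the step count would lie somewhere between the two figures.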
Models of the Parallel
Computation…

• Basically, the development of computer
architecture for parallel computation in the
past decades has gone thru’ evolutionary
rather than revolutionary changes, as
depicted in the diagram that follows.

Evolutionary Models of the
Parallel Computation..

The scope of this course runs all the way
from “scalar” thru’ “pipeline” to “implicit
vector” in the diagram.

[diagram taken from Fig 1.2 of Kai Hwang’s “Advanced Comp. Arch.”]


Further Notes…
• SIMD stands for “single instruction
stream over multiple data streams”.
SIMD machines are an example of vector
computers.
• MIMD stands for “multiple instruction
streams over multiple data streams”.
MIMD machines are an example of parallel,
specifically MPP, computers.
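The difference between the two definitions can be sketched in a few lines. The function names `simd` and `mimd` and the sample data are hypothetical; plain Python runs the elements one after another, so this only models which instruction is applied to which data stream, not real parallel hardware.

```python
def simd(operation, data_streams):
    """Single instruction stream: ONE operation applied to every data item."""
    return [operation(x) for x in data_streams]


def mimd(operations, data_streams):
    """Multiple instruction streams: each data item gets its OWN operation."""
    return [op(x) for op, x in zip(operations, data_streams)]


data = [1, 2, 3, 4]

simd_result = simd(lambda x: x * 2, data)
mimd_result = mimd(
    [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x ** 2],
    data,
)

print(simd_result)  # [2, 4, 6, 8]
print(mimd_result)  # [2, 4, 0, 16]
```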

KEY Modules of this Course..
• This course covers 5 KEY modules in a fully
integrative approach covering BOTH
fundamental and NEW developments in HPC!

Mod. 1 – Basic Issues in Pipelining
Mod. 2 – Adv. Pipelining & Dynamic Scoreboard
Mod. 3 – Dynamic Tomasulo’s Approach
Mod. 4 – * ARM Design & Predictive Approach
Mod. 5 – * Cloud + GPU Computing & Sys. Architectures

Core / Fundamental HPC Concepts
Fundamental Concepts/Def’s
for Computer Systems
• Recall from p. 10 of “Motivational Notes on
HPC” that a central processing unit (CPU) is the
“brain” of each computer system;
• Each CPU consists of a few [say 2-8]
(processor) cores, where each processor
core is a processing unit that reads in
instructions to perform specific actions.
• Thus, to look into the behavior / performance
of a computer system, we first
study the structure of its CPU / processor.
Structure of the ARM
processor…
• Below is the structure of the ARM1176JZF-S processor,
commonly used in many micro-controllers / mobile devices.

[diagram: structure of the ARM1176JZF-S processor]

We can see that each processor / core is made up of
different components / functional units to serve
various purposes.

About the core/processor…
• The ARM1176JZF-S is a 32-bit
processor/core, i.e. its computer word
length is 32 bits, meaning the
processor / core can handle one 32-bit
instruction / data item in each clock cycle.
(32-bit instruction / data)
00101011 01101001 10110110 00110101
>> each byte = 8 bits, 32 bits = 4 bytes
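The 32-bit pattern on this slide can be checked with a short sketch. The variable names and the big-endian byte order are choices made for illustration only; the slide does not say how the bytes are ordered in memory.

```python
# The 32-bit pattern shown on the slide, written as four 8-bit groups.
word_bits = "00101011 01101001 10110110 00110101"

# Join the groups and read them as one base-2 number.
word = int(word_bits.replace(" ", ""), 2)

# Split the 32-bit word into 4 bytes (big-endian order assumed here).
word_bytes = word.to_bytes(4, byteorder="big")

print(hex(word))                               # 0x2b69b635
print(len(word_bytes))                         # 4
print([format(b, "08b") for b in word_bytes])  # the four 8-bit groups again
```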
Supporting Units to the Core…
• Each core / processor is well supported by a no. of
components/units inside the processor chip for fast
computation and data storage/retrieval;
• The ARM1176JZF-S has:
  - 33 general-purpose 32-bit registers in total, counting the
    copies banked across processor modes (16 of them, R0…R15, are
    visible at any one time), for integer arithmetic / instructions;
  - 7 dedicated / specialized registers (floating-pt values);
  - Arithmetic Logic Unit (ALU) - the ALU performs all
    arithmetic and logic operations, and generates the
    condition codes for instructions to set specific flags;
  - Vector Floating Point (VFP) Co-Processor - for much
    faster floating-point arithmetic.


Memory Management Unit..
(main = primary memory, e.g. RAM or ROM)

• The processor memory management unit (MMU)
works with the cache memory system to control
accesses to and from external/main memory;
• The MMU also controls the translation of virtual
addresses to physical addresses;
• Capacity of the Main / External Memory:
Storage Size (i.e. No. of Memory Addresses) ×
[Size of Each Memory Address / Cell]
e.g. Capacity = 2^32 × [32 bits]
= (2^10 = 1K) × 1K × 1K × 2^2 × [4 × 8 bits]
= 1K × 1K × 1K × 4 × [4 bytes]
= 16 G bytes (since 8 bits = 1 byte)
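The capacity arithmetic above can be re-checked with a minimal sketch. The variable names are illustrative, and "G" is taken to mean 2^30, as implied by the 1K × 1K × 1K factorization on the slide.

```python
num_addresses = 2 ** 32   # storage size: no. of memory addresses
bits_per_cell = 32        # size of each memory address / cell

bytes_per_cell = bits_per_cell // 8          # 8 bits = 1 byte -> 4 bytes
capacity_bytes = num_addresses * bytes_per_cell

# 1 G = 1K x 1K x 1K = 2^30 (binary prefix assumed, as on the slide)
capacity_g_bytes = capacity_bytes // (2 ** 30)

print(capacity_g_bytes)  # 16
```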
The Prefetch Unit &
Instruction Cache..
• The prefetch unit fetches (16-bit or 32-bit)
instructions from the instruction cache (also
called I-cache), the Instruction Tightly Coupled
Memory (TCM), or external memory, and
predicts the outcome of branches in the
instruction stream (to be covered in Mod-4);
• Modern microprocessors make extensive use of
caches, i.e. (L1/L2/L3) D-caches or I-caches,
for fast access and storage of data / instructions.
Speed of access: Registers >> Caches >>
Main memory (>> means faster)
Load & Store Unit (LSU)…
register of a CPU << (load) << Main Memory
register of a CPU >> (store) >> Main Memory

• The Load Store Unit (LSU) manages all
LOAD and STORE operations, e.g. LOAD
a value from the memory address
$A0A1B007 to register R0;
• The load-store pipeline decouples load
and store operations from the other
pipelines such as those for the ALU
operations.
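A toy model of the two directions of data movement handled by the LSU is sketched below. The dictionaries, the helper names `load` / `store`, the stored value 42 and the second address are all hypothetical; a real memory is byte-addressed and far more involved.

```python
# Toy machine state: a few registers and a sparse "main memory".
registers = {"R0": 0, "R1": 0}
memory = {0xA0A1B007: 42}  # hypothetical value at the slide's address


def load(reg: str, addr: int) -> None:
    """LOAD: main memory -> register."""
    registers[reg] = memory.get(addr, 0)


def store(reg: str, addr: int) -> None:
    """STORE: register -> main memory."""
    memory[addr] = registers[reg]


load("R0", 0xA0A1B007)                  # LOAD from $A0A1B007 to R0
registers["R1"] = registers["R0"] + 1   # ALU work touches registers only
store("R1", 0xA0A1B00B)                 # hypothetical destination address

print(registers["R0"], memory[0xA0A1B00B])  # 42 43
```

Only `load` and `store` touch `memory` here, which mirrors the idea of keeping load/store work separate from ALU work.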
5 Basic Types of Instructions
for ALL Computer Systems..
• To facilitate the discussion in all upcoming lecture
notes on pipelined computer systems such as the MIPS
R3000/R4000 architecture, we generally categorize ALL assembly
instructions into 5 BASIC TYPES for MOST computer
systems;
• In the 2nd (Interpretation) column of the following tables,
  - the first comment, highlighted in blue, is the
    interpretation/meaning of the instruction on the
    MIPS R3000/R4000 pipeline or, similarly, the DLX
    architecture commonly adopted in our subsequent
    lecture notes;
  - the second comment, highlighted in orange, is a
    possible interpretation on other architectures.

5 Basic Types of Instructions
for ALL Computer Systems..
Types of Instructions & Examples | Interpretation/Meaning

• R-type: instructions relating to registers only
  e.g. ADD R1, R2, R3
  - the interpretation is ADD [DST], [SRC1], [SRC2]
    So, R2 + R3 → (i.e. assign to) R1 as the destination
  - other possible interpretation: ADD [SRC1], [SRC2], [DST]
    So, R1 + R2 → R3
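The two operand orders give different results for the same text `ADD R1, R2, R3`, as the sketch below shows. The function names and the initial register values are made up for illustration.

```python
def add_dst_first(regs, a, b, c):
    """ADD [DST], [SRC1], [SRC2]: regs[a] = regs[b] + regs[c]."""
    regs[a] = regs[b] + regs[c]


def add_dst_last(regs, a, b, c):
    """ADD [SRC1], [SRC2], [DST]: regs[c] = regs[a] + regs[b]."""
    regs[c] = regs[a] + regs[b]


initial = {"R1": 1, "R2": 2, "R3": 4}  # hypothetical register contents

mips_style = dict(initial)
add_dst_first(mips_style, "R1", "R2", "R3")   # R2 + R3 -> R1

other_style = dict(initial)
add_dst_last(other_style, "R1", "R2", "R3")   # R1 + R2 -> R3

print(mips_style)   # {'R1': 6, 'R2': 2, 'R3': 4}
print(other_style)  # {'R1': 1, 'R2': 2, 'R3': 3}
```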

5 Basic Types of Instructions
for ALL Computer Systems..

• ORi: Operands (as already stored in memory) and Register i
  e.g. ADD R1, $1010, R3
  ($ denotes hexadecimal (base 16), i.e. $1010 = 1010_(16);
  each digit is 0…9, A (=10), B, …, F (=15))
  - the interpretation is ADD [DST], [SRC1], [SRC2]
    So, [$1010] + R3 → (i.e. assign to) R1 as the destination
  - other possible interpretation: ADD [SRC1], [SRC2], [DST]
    So, R1 + [$1010] → R3
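A small sketch of the first (destination-first) interpretation with a memory operand follows. The helpers `read_operand` and `execute_add`, and the values held in R3 and at address $1010, are assumptions for illustration only.

```python
registers = {"R1": 0, "R3": 5}  # hypothetical register contents
memory = {0x1010: 7}            # hypothetical value stored at $1010


def read_operand(operand: str) -> int:
    """Return a register's value, or the memory content at a $hex address."""
    if operand.startswith("$"):
        address = int(operand[1:], 16)  # "$1010" -> base-16 number
        return memory[address]
    return registers[operand]


def execute_add(dst: str, src1: str, src2: str) -> None:
    """ADD [DST], [SRC1], [SRC2] with register or memory sources."""
    registers[dst] = read_operand(src1) + read_operand(src2)


execute_add("R1", "$1010", "R3")  # [$1010] + R3 -> R1

print(int("1010", 16))   # 4112, i.e. $1010 in decimal
print(registers["R1"])   # 12
```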

5 Basic Types of Instructions
for ALL Computer Systems..
• LW: Load a (computer) Word from the main memory to a register
  e.g. LW R1, $1010
  - the interpretation is [$1010] → R1 as the destination
  - other possible interpretation: SAME AS ABOVE

• SW: Store a (computer) Word from a register to the main memory
  e.g. SW R1, $1010
  - the interpretation is R1 → [$1010] as the destination
  - other possible interpretation: SAME AS ABOVE

5 Basic Types of Instructions
for ALL Computer Systems..
• BEQ: Branch on Equal (or Branch instructions in general)
  e.g. BEQ R1, $1832
  - the interpretation is IF (R1 == 0) then jump to addr. $1832
    to execute the instr. there
  - other possible interpretation: SAME AS ABOVE
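The effect of the branch on the program counter (PC) can be sketched as below. The function `next_pc`, the starting PC of $1000 and the 4-byte instruction size are assumptions for illustration; the notes only give the branch condition and the target $1832.

```python
def next_pc(pc: int, r1_value: int, target: int, instr_size: int = 4) -> int:
    """BEQ R1, target: jump to `target` if R1 == 0, else fall through.

    `instr_size` (bytes per instruction) is an assumed value.
    """
    return target if r1_value == 0 else pc + instr_size


taken = next_pc(0x1000, 0, 0x1832)       # R1 == 0 -> branch taken
not_taken = next_pc(0x1000, 5, 0x1832)   # R1 != 0 -> next instruction

print(hex(taken))      # 0x1832
print(hex(not_taken))  # 0x1004
```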

ARM Instructions & MIPS/DLX..
• In fact, ARM instructions (for mobile devices
like smartphones, or the Raspberry Pi) are also very
similar to the 5 basic formats of MIPS/DLX
(i.e. the first [blue] highlighted
interpretation), e.g.

Common ARM Instructions | Description
ADD x, y, z             | y + z → x [DST] (ref. : R-type)
LDR r, addr             | Load into register r from addr (ref. )
STR r, addr             | Store from register r to addr (ref. )
BEQ <label>             | Branch on Equal to <label> (ref. )
~~ END of Module 0 ~~
