AC CourseWork MihaiTrofim
Course Work
Computer Architecture
Topic: Floating point multiplication (algorithm nr.1)
Performed by:
Trofim Mihai
Verified by:
Sudacevschi Viorica
Chișinău 2016
Contents
Introduction
1. Central Processing Unit
1.1 CPU basics
1.2 The register set
1.3 Instruction cycle
2. I8086 microprocessor architecture
2.1 Execution Unit
2.2 Bus Interface Unit
2.3 Registers set of I8086
3. Instruction set architecture
3.1 Instruction Format
3.2 Instruction Types
Introduction
A computer consists of a set of physical components (hardware) and system
programs (system software) that are responsible for data processing according to
an algorithm, specified by the user through an application program (application
software).
Computer systems have conventionally been defined through their interfaces
at a number of abstraction levels, each providing functional support to its
predecessor. Included among the levels are the application programs, the high-level
languages, and the set of machine instructions.
In the past, the term computer architecture often referred only to
instruction set design that represents an interface between hardware and the lowest
level software - machine instructions (binary coded programs).
A different definition of computer architecture is built on four basic
viewpoints:
structure (defines the interconnection of various hardware components),
organization (defines the dynamic interplay and management of the various
components),
implementation (defines the detailed design of hardware components),
performance (specifies the behavior of the computer system).
program counter (PC), the register that contains the address of the next
instruction to be fetched. After a successful instruction fetch, the PC is
updated to point to the next instruction to be executed.
instruction register (IR), into which the fetched instruction is loaded.
Two registers are essential in memory write and read operations:
memory data register (MDR)
memory address register (MAR).
The MDR and MAR are used exclusively by the CPU and are not directly
accessible to programmers.
In order to perform a write operation into a specified memory location, the
MDR and MAR are used as follows:
1. The word to be stored into the memory location is first loaded by the CPU
into MDR.
2. The address of the location into which the word is to be stored is loaded by
the CPU into the MAR.
3. A write signal is issued by the CPU.
Similarly, to perform a memory read operation, the MDR and MAR are used
as follows:
1. The address of the location from which the word is to be read is loaded into
the MAR.
2. A read signal is issued by the CPU.
3. The required word will be loaded by the memory into the MDR ready for
use by the CPU.
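The two sequences above can be sketched as a small Python model. This is only an illustration, not a description of any real bus protocol: the class name `SimpleMemoryInterface`, the 16-word memory size and the method names are assumptions made for this example.

```python
class SimpleMemoryInterface:
    """Toy model of a CPU that reaches memory only through MAR and MDR."""

    def __init__(self, size=16):
        self.memory = [0] * size  # main memory, one word per cell (size is arbitrary)
        self.mar = 0              # memory address register
        self.mdr = 0              # memory data register

    def write(self, address, word):
        self.mdr = word                    # 1. the word to store is loaded into MDR
        self.mar = address                 # 2. the target address is loaded into MAR
        self.memory[self.mar] = self.mdr   # 3. write signal: memory stores MDR at MAR

    def read(self, address):
        self.mar = address                 # 1. the source address is loaded into MAR
        self.mdr = self.memory[self.mar]   # 2-3. read signal: memory places the word in MDR
        return self.mdr                    # the word is now ready for use by the CPU


cpu = SimpleMemoryInterface()
cpu.write(5, 0x1234)
value = cpu.read(5)
```

After the final read, both `value` and `cpu.mdr` hold the word that was written, and `cpu.mar` holds the address that was used.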
Some architectures contain a special program status word (PSW) register
or a Flag register. The PSW contains bits that are set by the CPU to indicate the
current status of an executing program. These indicators are typically for arithmetic
operations, interrupts, memory protection information, or processor status.
This scheme only works if the BIU keeps ahead of the EU. For this reason the BIU
has a 6-byte instruction queue. If the execution of an instruction takes too long,
the queue fills to its maximum capacity and the buses stay idle. The BIU starts to
fetch again whenever there is room for 2 bytes in the queue.
When there is a jump instruction, the microprocessor must flush out the
queue. When a jump instruction is executed, the BIU starts to fetch instructions
from the new location in memory. In this situation the EU must wait until the BIU
has fetched the new instruction. This is known as the branch penalty.
The Execution Unit does not connect directly to the system bus. It obtains
instructions from a queue maintained by the Bus Interface Unit. When an
instruction requires access to memory or a peripheral device, the Execution Unit
requests the Bus Interface Unit to read and write data.
The data registers can be addressed by their upper or lower halves. Each data
register can be used interchangeably as a 16-bit register or two 8-bit registers. The
pointer and index registers are always accessed as 16-bit values. The CPU can use
the data registers without constraint in most arithmetic and logic operations.
Arithmetic and logic operations can also use the pointer and index registers. Some
instructions use certain registers implicitly allowing compact encoding.
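As a rough illustration of how one 16-bit data register doubles as two 8-bit registers, the following Python sketch models a register such as AX with its AH and AL halves. The class `DataRegister` and its attribute names are hypothetical and exist only for this example.

```python
class DataRegister:
    """Toy model of a 16-bit 8086 data register such as AX (AH:AL)."""

    def __init__(self, value=0):
        self.word = value & 0xFFFF  # the full 16-bit register

    @property
    def high(self):
        # upper half, e.g. AH
        return (self.word >> 8) & 0xFF

    @high.setter
    def high(self, value):
        self.word = ((value & 0xFF) << 8) | (self.word & 0x00FF)

    @property
    def low(self):
        # lower half, e.g. AL
        return self.word & 0xFF

    @low.setter
    def low(self, value):
        self.word = (self.word & 0xFF00) | (value & 0xFF)


ax = DataRegister(0x1234)
ax.low = 0xFF  # comparable to "mov al, 0FFh": only the low byte changes
```

Writing to one half leaves the other half untouched, so `ax.word` becomes 12FFh here.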
SP - Stack Pointer: always points to the top item of the stack;
BP - Base Pointer: used to access any item in the stack;
SI - Source Index: contains the address of the current element in the source string;
DI - Destination Index: contains the address of the current element in the
destination string.
2. Segment registers
The 8086 microprocessor has a 20-bit address bus, which addresses 1 Mbyte of
external memory, but its internal registers are 16 bits wide and can address
only 64 Kbytes. The 8086 family memory space is therefore divided into logical
segments of up to 64 Kbytes each. The segment registers contain the base
addresses (starting locations) of these memory segments.
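How a 16-bit segment register and a 16-bit offset together reach a 20-bit address can be sketched as follows. The text above does not spell out the rule; the sketch relies on the standard 8086 real-mode convention that the segment value is shifted left by 4 bits before the offset is added. The function name `physical_address` is chosen only for this example.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    segment &= 0xFFFF
    offset &= 0xFFFF
    # segment * 16 gives the 20-bit base address of the segment;
    # the result wraps around at 1 Mbyte because there are only 20 address lines
    return ((segment << 4) + offset) & 0xFFFFF


address = physical_address(0x1234, 0x0010)
```

Here the segment 1234h starts at physical address 12340h, so offset 0010h inside it maps to 12350h.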
Control flags:
Bit 8 - Trap Flag (TF), system flag - used for on-chip debugging (single-step
mode) when TF=1. In this case an interrupt (int 1) is generated after each
instruction, which calls a special routine to show the state of the internal
registers. There are no instructions that change this flag directly. To
change it, the content of the PSW is copied through the stack into a
general-purpose register.
Bit 9 - Interrupt enable Flag (IF), system flag - when this flag is set to 1,
the CPU reacts to interrupts from external devices on the INTR input of the
microprocessor. When IF=0, these interrupts are not allowed (masked). IF does
not affect NMI (non-maskable) interrupts or internal interrupts raised by the
INT instruction. The instructions CLI (clear interrupt flag) and STI (set
interrupt flag) are used to control this flag.
Bit 10 - Direction Flag (DF) - this flag is used by instructions that process
data strings. When this flag is set to 0, processing is done forward (SI and
DI are incremented); when it is set to 1, processing is done backward (SI and
DI are decremented). The instructions CLD and STD clear and set this flag.
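The bit positions listed above can be written as masks over a 16-bit flags word. The Python sketch below is only a model: the helper names mirror the instructions CLI, STI, CLD and STD, and `string_step` is a hypothetical helper that shows the direction in which SI and DI would move for byte-sized string operations.

```python
TF = 1 << 8    # Trap Flag, bit 8
IF = 1 << 9    # Interrupt enable Flag, bit 9
DF = 1 << 10   # Direction Flag, bit 10


def sti(flags):
    """Model of STI: allow maskable interrupts."""
    return flags | IF


def cli(flags):
    """Model of CLI: mask interrupts arriving on INTR."""
    return flags & ~IF & 0xFFFF


def std(flags):
    """Model of STD: process strings backward."""
    return flags | DF


def cld(flags):
    """Model of CLD: process strings forward."""
    return flags & ~DF & 0xFFFF


def string_step(flags):
    """Step applied to SI and DI per byte: +1 when DF=0, -1 when DF=1."""
    return -1 if flags & DF else 1


flags = std(sti(0))
```

With both STI and STD applied to an empty flags word, `flags` has bits 9 and 10 set and `string_step(flags)` is -1.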
and fast code could be critical in some embedded and portable applications, where
resources may be very limited. In such cases, small portions of the program that
may be heavily used can be written in assembly language.
Assembly programmers have access to all the hardware features of the target
machine that might not be accessible to high-level language programmers.
Learning assembly languages can be of great help in understanding the low
level details of computer organization and architecture.
Machine language is the native language of a given processor. Since
assembly language is the symbolic form of machine language, each different type
of processor has its own unique assembly language. Before we study the assembly
language of a given processor, we need first to understand the details of that
processor. We need to know the memory size and organization, the processor
registers, the instruction format, and the entire instruction set.
Labels are used to provide symbolic names for memory addresses. A label is
an identifier that can be used on a program line in order to branch to the labeled
line. It can also be used to access data using symbolic names. The operation code
(opcode) field contains the symbolic abbreviation of a given operation. The
operand field consists of additional information or data that the opcode requires.
The operand field may be used to specify constant, label, immediate data, register,
or a memory address. The comments field provides a space for documentation to
explain what has been done for the purpose of debugging and maintenance. In the
I8086, an instruction consists of one to six bytes.
Arithmetic operations can use all addressing modes, but one operand should
be a register.
ADD dst, src: dst ← (dst) + (src). src can also be an immediate value of 8 or 16
bits.
ADC dst, src: dst ← (dst) + (src) + CF. It is used in multiple-precision
operations.
SUB dst, src: dst ← (dst) - (src). Subtracts byte from byte or word from word.
SBB dst, src: dst ← (dst) - (src) - CF.
INC opr: opr ← (opr) + 1. Does not change CF.
DEC opr: opr ← (opr) - 1.
NEG opr: opr ← -(opr). Negates the operand: inverts each bit of the specified
byte or word and adds 1 (forms the two's complement).
CMP opr1, opr2: Compares two specified bytes or two specified words and does
not keep the result; it only sets the flags (OF, SF, ZF, AF, PF, CF according to
the result). It is used with conditional jump instructions.
CBW: (no operand) converts a signed byte to a word. If the high bit in AL is
0, all AH bits are set to 0; if the high bit in AL is 1, all AH bits are set to 1.
CWD: converts a word to a double word. Works with AX and DX (high word).
MUL src:
(AX) ← (AL) * (src) for bytes
(DX : AX) ← (AX) * (src) for words
CF and OF = 1 if the high half of the result (AH or DX) is not 0.
IMUL src: Multiplies signed byte by byte or signed word by word. CF and OF = 1
if the high half of the result is not the sign extension of the low half.
DIV src:
divisor is a byte
(AL) ← quotient of (AX) / (src)
(AH) ← remainder of (AX) / (src)
divisor is a word
(AX) ← quotient of (DX : AX) / (src)
(DX) ← remainder of (DX : AX) / (src)
IDIV src: Divides signed word by byte or signed double word by word.
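The byte forms of MUL and DIV can be modelled in a few lines of Python. This is a sketch of the register-level effect only; the function names `mul8` and `div8` are hypothetical, and the divide-error case is modelled with Python exceptions, whereas the real processor raises interrupt 0.

```python
def mul8(al, src):
    """Model of MUL with a byte operand: AX <- AL * src (unsigned).

    Returns (ax, carry), where carry stands for CF and OF:
    it is 1 when the high byte AH of the result is not 0.
    """
    ax = (al & 0xFF) * (src & 0xFF)
    carry = 1 if (ax >> 8) != 0 else 0
    return ax, carry


def div8(ax, src):
    """Model of DIV with a byte divisor: AL <- quotient, AH <- remainder."""
    ax &= 0xFFFF
    src &= 0xFF
    if src == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    quotient, remainder = divmod(ax, src)
    if quotient > 0xFF:
        raise OverflowError("divide error: quotient does not fit in AL")
    return quotient, remainder


product, carry = mul8(0x6E, 0x73)
al, ah = div8(product, 0x73)
```

Multiplying 6Eh by 73h gives 316Ah with the carry set, and dividing the product by 73h returns the original 6Eh with remainder 0.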
binary format. The Cray T90 series had an IEEE version, but the SV1 still uses
Cray floating-point format.
The standard provides for many closely related formats, differing in only a few
details. Five of these formats are called basic formats, and others are
termed extended formats; three of these are especially widely used in computer
hardware and languages:
Single precision, usually used to represent the "float" type in the C language
family (though this is not guaranteed). This is a binary format that occupies 32
bits (4 bytes) and its significand has a precision of 24 bits (about 7 decimal
digits).
Increasing the precision of the floating point representation generally reduces the
amount of accumulated round-off error caused by intermediate calculations.[8]
Less common IEEE formats include:
Any integer with absolute value less than 2^24 can be exactly represented in the
single precision format, and any integer with absolute value less than 2^53 can be
exactly represented in the double precision format. Furthermore, a wide range of
powers of 2 times such a number can be represented. These properties are
sometimes used for purely integer data, to get 53-bit integers on platforms that
have double precision floats but only 32-bit integers.
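The integer limits can be checked directly. The sketch below assumes that Python's `float` is an IEEE double, which holds on all common platforms, and uses the standard `struct` module to round a value to single precision; the helper name `to_float32` is chosen only for this example.

```python
import struct


def to_float32(x):
    """Round a number to IEEE single precision and return it as a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Integers below 2^24 survive the round trip through single precision.
single_ok = to_float32(2**24 - 1) == 2**24 - 1
# 2^24 + 1 needs 25 significant bits, so it cannot be stored exactly.
single_lost = to_float32(2**24 + 1) != 2**24 + 1

# The same holds for double precision at 2^53.
double_ok = float(2**53 - 1) == 2**53 - 1
double_lost = int(float(2**53 + 1)) != 2**53 + 1
```

All four flags evaluate to `True`, which matches the 24-bit and 53-bit significand precision of the two formats.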
The standard specifies some special values, and their representation:
positive infinity (+∞), negative infinity (−∞), a negative zero (−0) distinct from
ordinary ("positive") zero, and "not a number" values (NaNs).
Comparison of floating-point numbers, as defined by the IEEE standard, is a bit
different from usual integer comparison. Negative and positive zero compare
equal, and every NaN compares unequal to every value, including itself. All values
except NaN are strictly smaller than +∞ and strictly greater than −∞. Finite
floating-point numbers are ordered in the same way as their values (in the set of
real numbers).
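These comparison rules can be observed in any language whose floating-point type follows IEEE 754. A short Python check, with variable names chosen only for this example:

```python
import math

nan = float("nan")
inf = float("inf")

zeros_compare_equal = (0.0 == -0.0)                 # +0 and -0 compare equal
zero_signs_differ = math.copysign(1.0, -0.0) < 0    # but -0 still carries its sign
nan_equals_itself = (nan == nan)                    # NaN is unequal even to itself
nan_is_unordered = not (nan < 1.0) and not (nan > 1.0)
finite_within_infinities = -inf < -1.0e308 < 1.0e308 < inf
```

Every flag is `True` except `nan_equals_itself`, which is `False`.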
A project for revising the IEEE 754 standard was started in 2000 (see IEEE 754
revision); it was completed and approved in June 2008. It includes decimal
floating-point formats and a 16-bit floating-point format ("binary16"). binary16 has
the same structure and rules as the older formats, with 1 sign bit, 5 exponent bits
and 10 trailing significand bits. It is being used in the NVIDIA Cg graphics
language, and in the OpenEXR standard.
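The binary16 layout of 1 sign bit, 5 exponent bits and 10 trailing significand bits can be inspected with Python's `struct` module, whose `"e"` format code packs a half-precision value. The helper name `binary16_fields` is an assumption made for this sketch.

```python
import struct


def binary16_fields(value):
    """Split a number, rounded to binary16, into (sign, exponent, fraction)."""
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    sign = bits >> 15                # 1 sign bit
    exponent = (bits >> 10) & 0x1F   # 5 exponent bits, stored with a bias of 15
    fraction = bits & 0x3FF          # 10 trailing significand bits
    return sign, exponent, fraction


one = binary16_fields(1.0)
minus_two = binary16_fields(-2.0)
```

The value 1.0 has a biased exponent of 15 and an empty fraction, while -2.0 sets the sign bit and has a biased exponent of 16.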
$y = m_y \cdot 2^{e_y}$
$z = m_z \cdot 2^{e_z}$
The result is stored in
Here are the steps of the algorithm:
sign(my)
As the 8086 processor treats numbers as two's complement code, the sign
of a number is its MSB (Most Significant Bit).
2. Find the moduli of the mantissas
If a number is positive, its modulus remains unchanged. If it is negative (MSB
= 1), it is converted to its two's complement code (CC) by inverting all bits and
adding 1 to the number.
if( sign(mx) = 0 ) | mx | = mx
if( sign(mx) = 1 ) | mx | = neg( mx )
neg is an assembly instruction which performs the conversion to CC.
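The sign test and the modulus step can be sketched for 8-bit mantissas as follows. The function names are hypothetical, and the sample values 3Ah and 0D1h are the mantissas declared in the appendix.

```python
def sign_bit(m):
    """Sign of an 8-bit two's complement number: its most significant bit."""
    return (m >> 7) & 1


def neg8(m):
    """Model of NEG on a byte: invert all bits and add 1."""
    return ((~m) + 1) & 0xFF


def modulus(m):
    """|m| for an 8-bit two's complement mantissa."""
    m &= 0xFF
    return neg8(m) if sign_bit(m) else m


abs_mx = modulus(0x3A)  # 0011 1010b is positive, so it stays unchanged
abs_my = modulus(0xD1)  # 1101 0001b is negative, so it is negated
```

The first value stays 3Ah, while 0D1h (which encodes -47) becomes 2Fh.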
5. Result normalization
Shift mz to the left as many times as its MSB is repeated.
if ( mz = 1.1... or mz = 0.0... ) mz = mz << 1
and ez = ez - 1
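The normalization rule can be sketched in Python for a 16-bit mantissa. This is only a model of the step as described, not the author's macro: the function name `normalize` is hypothetical, and the guard for a zero mantissa is an added assumption that keeps the loop from running forever.

```python
def normalize(mz, ez):
    """Shift mz left while its two leading bits are equal (00 or 11).

    Each shift is compensated by decrementing the exponent ez.
    """
    mz &= 0xFFFF
    # mz == 0 is excluded explicitly: shifting zero never changes its leading bits
    while mz != 0 and (mz & 0xC000) in (0x0000, 0xC000):
        mz = (mz << 1) & 0xFFFF
        ez -= 1
    return mz, ez


mz, ez = normalize(0x316A, 0x13)
```

Starting from 0011 0001 0110 1010b with exponent 13h, one shift is enough: the mantissa becomes 0110 0010 1101 0100b (62D4h) and the exponent drops to 12h.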
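|my|        mx (shifted)   partial sum          COMMENTS
01110011    01101110       00000000 01101110    +|mx|
00111001    0 1101110      00000001 01001010    shift mx left, my right; +|mx|
00011100    01 101110      00000001 01001010    shift mx left, my right; LSB = 0, no addition
00000111    0110 1110      00001000 00101010    shift mx left, my right (2 times); +|mx|
00000011    01101 110      00010101 11101010    shift mx left, my right; +|mx|
00000001    011011 10      00110001 01101010    shift mx left, my right; +|mx|
00000000    0110111 0                           shift mx left, my right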
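The trace above can be reproduced with a short Python model of the shift-and-add loop. It follows the same idea as the FirstAlgMult macro in the appendix (test the LSB of my, add mx if it is 1, shift mx left and my right until my is 0), but it is a sketch with hypothetical names rather than a translation of the macro.

```python
def shift_add_multiply(mx, my):
    """Multiply two unsigned 8-bit values with the shift-and-add method.

    Returns the 16-bit product and the list of partial sums
    recorded after each addition.
    """
    multiplicand = mx & 0xFF
    multiplier = my & 0xFF
    total = 0
    partial_sums = []
    while multiplier:
        if multiplier & 1:                          # LSB of my is 1
            total = (total + multiplicand) & 0xFFFF
            partial_sums.append(total)
        multiplicand = (multiplicand << 1) & 0xFFFF  # shift mx left
        multiplier >>= 1                             # shift my right
    return total, partial_sums


product, partial_sums = shift_add_multiply(0b01101110, 0b01110011)
```

For these operands the partial sums are 006Eh, 014Ah, 082Ah, 15EAh and 316Ah, and the final product 316Ah equals 0011 0001 0110 1010b.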
5. Normalization
mz = 0.0110001 01101010
mz = 0.1100010 11010100 (after one shift to the left)
ez = ez - 1 = 0.0010011 - 1 = 0.0010010
6. The sign of the result remains the same
5. Code explanation
The assembly program performs the floating point multiplication (algorithm nr.1).
It uses the following macros:
FirstAlgMult macro x,y
It performs the fixed-point multiplication (algorithm nr.1) of the parameters
x and y and stores the result in the mz variable.
Normalize macro z,exp
It checks whether the number is denormalized. It uses a mask (C000h) to
isolate the first 2 bits of the number and checks whether they are 00 or 11. If
so, it shifts z to the left and decrements exp.
print macro message
Prints a message to the screen.
print8Bits macro x
Prints an 8-bit number in binary.
print16Bits macro x
Prints a 16-bit number in binary.
6. Results
7. Conclusion
Assembler is a low-level programming language. It allows the programmer to
interact directly with the processor and to manage memory.
While performing this course work I learned a lot about assembly
language. In my code I used macros, which are very useful because they accept
parameters and make the program more modular.
Implementing the floating point multiplication algorithm was an interesting
exercise. Arithmetic algorithms are very important because they are the
basis of data processing in a computer. Their implementation and complexity
influence the performance of the computer.
Appendix
Source code in Assembler:
.model small
.stack 100h
.data
mx db 3Ah ; 0011 1010 b
my db 0D1h ; 1101 0001 b
ex db 42h
ey db 0D1h
sign db ?
mz dw ?
ez db ?
mask dw 0C000h ; 1100 0000 0000 0000 b - for checking 1st two bits at normalization
.code
;First Multiplication Fixed-Point Algorithm
;-----------------------------------------
FirstAlgMult macro x, y
local CheckLSB, Shift
; save data from registers
push dx
push ax
push bx
xor dx, dx ; adder (partial sum)
xor ax, ax
xor bx, bx
mov al, x ; AX <-- mx
mov bl, y ; BX <-- my
CheckLSB:
test bl, 1
jz Shift ; skip the addition if LSB = 0
add dx, ax ; if LSB = 1: adder = adder + mx
Shift:
shl ax, 1 ; shift mx left
shr bl, 1 ; shift my right
test bl, 0FFh ; check if my is 0
jnz CheckLSB
mov mz, dx ; store the product
; restore registers in reverse order of the pushes
pop bx
pop ax
pop dx
endm
jmp Jump16
Zero16:
mov al, '0'
mov ah, 0Eh
int 10h
Jump16:
test dx, 0FFFFh
jz Quit16
jmp show16
Quit16:
push dx
push cx
push bx
endm
xor ax, ax
mov al, mx
xor al, my ; MSB of AL = sign(mx) xor sign(my)
mov sign, al
neg my
myIsPositive:
; find exponent of result ez
mov al, ex
add al, ey
mov ez, al
; perform fixed point multiplication of mantissas
FirstAlgMult mx, my
Normalize mz, ez
; convert the resulting mz in CC
; by checking the sign stored at the beginning
test sign, 80h
jz DoNotConvert
neg mz
DoNotConvert:
print mxInput
print8Bits mx
print exInput
print8Bits ex
print myInput
print8Bits my
print eyInput
print8Bits ey
print mzResult
print16Bits mz
print ezResult
print8Bits ez
end start