1.2 Assembler Notes


Syllabus: Assemblers: First pass and second pass of assembler and their algorithms.

Assemblers for CISC Machines: case study of x85 and x86 machines.

1. Why Do We Need a Translator?

• A computer understands only programs written in its own machine language. Programs
written in any other language must be translated into machine language before they can run.
• This translation is performed by software. Examples: compiler, assembler, interpreter.
• A program that translates an assembly language program into a machine language
program is called an assembler.

2. Basic Concept of Assembler:

• An assembler is a translator that accepts an assembly language program as input and
converts it into machine code.
• Assembly language is easier to write than binary because it uses mnemonics.
• The assembler's job is to
(a) Build table data structures for the operators (mnemonics), pseudo opcodes, symbols,
constants (literals), and registers used in the assembly program.
(b) Generate machine code.

Figure 1. Assembler Basic Architecture

Terminologies:

1. Location Counter (LC): Holds the address that will be assigned to the next instruction or
data item being assembled. It is the assembler's counterpart of the program counter.

2. Literals: Constant values written directly in an instruction. Example: in R1 = a + 8, the
value 8 is a literal. Literals are stored in the literal table data structure.

3. Symbols/labels: Names that stand for memory locations, much like variables in a high-level
language. Example: in int a = 5; the name a is a symbol. The symbol table keeps track of them.

4. Mnemonics: Short words that identify instructions. Mnemonics are written in the code
segment. In the following examples MOV, SUB, ADD, JMP, CALL, and MUL are the mnemonics:

MOV   Move/assign one value to another (label)
SUB   Subtract one value from another
ADD   Add two values
JMP   Jump to a specific location
CALL  Call a procedure/module
MUL   Multiply two values

Example: ADD R1, R2

Advantages of assembly language

• Because mnemonics replace machine instructions, assembly language is easier to write,
debug, and understand than machine code.

• It is useful for writing lightweight applications (for example, embedded systems such as
traffic light controllers) because the resulting program is typically smaller than one written
in a high-level language.

Disadvantages of assembly language

• Mnemonics are abbreviated and numerous, so they are hard to remember.

• Programs written in assembly language are machine dependent, so they are not portable
across different types of machines.

• A program written in assembly language cannot run directly. It must first be translated by
an assembler, a step that the same program written in machine language does not need.

• Mnemonics can differ from machine to machine depending on the manufacturer, so
assembly language suffers from a lack of standardization.

5 Data Structures/Tables Used by the Assembler:

1. Machine-Opcode Table (MOT), also called Operation Code Table or Mnemonics Table:

• A mnemonic is an abbreviation for an operation.

• This table consists of the fields: name of mnemonic, binary value, instruction length,
format of instruction.

• The MOT is used to look up mnemonic operation codes and translate them to their
machine language equivalents.

Example:

Name of Mnemonic   Binary Value    Instruction Length   Instruction Format
MUL                010111010101    4                    CISC
ADD                11010101010     4                    CISC
SUB                01010010010     4                    CISC
Load               011001010101    4                    CISC
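
The MOT lookup described above can be sketched in Python as a dictionary keyed by mnemonic. This is only an illustrative sketch: the names MotEntry, MOT, and lookup_mnemonic are assumptions made for this example, and the bit patterns are copied from the example table rather than taken from a real instruction set.

```python
from typing import NamedTuple


class MotEntry(NamedTuple):
    opcode: str   # binary value as a bit string, copied from the example table
    length: int   # instruction length (the table does not state the unit)
    fmt: str      # instruction format label


# Hypothetical Machine-Opcode Table mirroring the example rows.
MOT = {
    "MUL": MotEntry("010111010101", 4, "CISC"),
    "ADD": MotEntry("11010101010", 4, "CISC"),
    "SUB": MotEntry("01010010010", 4, "CISC"),
    "LOAD": MotEntry("011001010101", 4, "CISC"),
}


def lookup_mnemonic(name: str) -> MotEntry:
    """Return the MOT entry for a mnemonic, ignoring letter case."""
    try:
        return MOT[name.upper()]
    except KeyError:
        raise ValueError(f"unknown mnemonic: {name}") from None


entry = lookup_mnemonic("add")
print(entry.opcode, entry.length, entry.fmt)
```

A dictionary gives constant-time lookup per mnemonic, which suits a table that is fixed before assembly starts.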

2. Pseudo Opcode Table (POT):

This table consists of the fields:

• Name of the pseudo opcode
• Action associated with the pseudo opcode

The POT is a fixed-length table. It tells the assembler what action to take for each
pseudo opcode that appears in the program.

Pseudo opcode   Action
START           Specifies the start of the program (where execution begins).
USING           Specifies the base register that is being used.
DROP            Releases the register that was in use in the base table.

3. Symbol Table

• The symbol table keeps track of the symbols that are defined in the program.
• It gives the location (address) of each symbol.
• The assembler creates the symbol table section of the object file. It makes an entry in the
symbol table for each symbol that is defined or referenced in the input file and is needed
during linking.
• In pass 1, whenever a symbol is defined, a corresponding entry is made in the symbol table.
• In pass 2, the symbol table is used to generate the machine code for each symbol reference.

Symbol   Description                               Address assigned in RAM
A        Variable A defined in assembly language   0040 0000
B        Variable B defined in assembly language   0040 0004
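
As a rough illustration of how such a table could be filled, the Python sketch below assigns consecutive addresses to symbols in the order they are defined. The function name build_symbol_table is hypothetical, and the base address 0040 0000 and the 4-byte size per variable are assumptions read off the example table, not rules of any particular assembler.

```python
def build_symbol_table(names, base=0x0040_0000, size=4):
    """Assign consecutive addresses to symbols in definition order."""
    table = {}
    lc = base                      # location counter starts at the assumed base address
    for name in names:
        if name in table:
            raise ValueError(f"symbol defined twice: {name}")
        table[name] = lc           # record the address of this symbol
        lc += size                 # assumed: every variable occupies `size` bytes
    return table


symbols = build_symbol_table(["A", "B"])
for name, address in symbols.items():
    print(name, f"{address:08X}")
# prints:
# A 00400000
# B 00400004
```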

4. Literal Table

The literal table keeps track of the literals that are encountered in the program.
• A literal is written directly as a value; the literal table gives that value a location.
• In pass 1, whenever a literal is encountered, an entry for it is made in the literal table.
• In pass 2, the literal table is used to generate the binary code for each literal.
Literals are always encountered in the operand field of an instruction.

Value of literal   Length of literal
28                 4 bytes
15                 4 bytes

5. Base Table

This table stores the availability of the hardware registers of the system. Example:
R1 is free, but R2 and R3 are not free.

Register   Available
R1         Free
R2         Not Free
R3         Not Free
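
The interaction between the base table and the USING and DROP pseudo opcodes can be sketched as follows. This is a simplified Python illustration, assuming that USING marks a free register as in use and DROP marks it free again; the function names using and drop are made up for this example.

```python
# Base table from the example: register name -> availability.
base_table = {"R1": "Free", "R2": "Not Free", "R3": "Not Free"}


def using(table, register):
    """USING: claim a free register as the base register in use."""
    if table.get(register) != "Free":
        raise ValueError(f"{register} is not available")
    table[register] = "Not Free"


def drop(table, register):
    """DROP: release a register so that it becomes available again."""
    if register not in table:
        raise KeyError(register)
    table[register] = "Free"


using(base_table, "R1")   # R1 was free, so it can be claimed
drop(base_table, "R2")    # R2 is released
print(base_table)
# prints: {'R1': 'Not Free', 'R2': 'Free', 'R3': 'Not Free'}
```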

2.1. TWO TYPES OF ASSEMBLER

a) One-pass assembler:

Purpose:

• A one-pass assembler converts assembly code into machine language in a single scan.

• Define symbols and literals.

• Determine the length of each machine instruction.

• Keep track of the location counter (LC).

• Remember the values of symbols for later use (in a two-pass design, until pass 2).

• Remember literals.

A one-pass assembler assigns memory addresses to the variables (i.e. label definitions) and
translates the source code into machine code (i.e. assembly) simultaneously, in a single pass.

A one-pass assembler passes over the source file exactly once, in the same pass collecting the
labels, resolving future references, and doing the actual assembly. The difficult part is
resolving future label references (the problem of forward referencing) while assembling code in
one pass. When a first pass is instead used as part of a two-pass design, it prepares an
intermediate file that is used as input by the second pass.

Forward Reference Problem in a One-Pass (Pass-1) Assembler:

A forward reference is the use of a symbol before the point where it is defined. In the example
below, R2 is used by MUL before its value is available (R2 = ?), so a one-pass assembler reports
an error. The solution is a two-pass assembler: the first pass analyzes the complete code and
stores all values in the tables, and the second pass uses those tables to generate the code.

START              % START is a pseudo opcode
R1 = 3
MUL R1, R2
A = B + 28
MUL = 1101001
R1 = 0011
R2 = ?             (R2 not available here)
……..
R1 DC ‘2’          (Decrement value of R1 by 2)
R2 = 10, DC ‘7’    (Decrement value of R2 by 7)
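
The forward reference problem can be reproduced with a small Python sketch. The three-instruction program, the tuple layout (label, opcode, operand), and the function names are all assumptions made for this illustration; each instruction is assumed to occupy exactly one location. The naive one-pass version tries to resolve every operand immediately and fails on a symbol that is defined later, while the two-pass version collects all labels first.

```python
# Hypothetical program: JMP refers to DONE before DONE is defined.
PROGRAM = [
    (None, "JMP", "DONE"),
    (None, "ADD", None),
    ("DONE", "HLT", None),
]


def one_pass_naive(program):
    """Resolve operands immediately; fails on a forward reference."""
    symtab, out = {}, []
    for lc, (label, op, operand) in enumerate(program):
        if label:
            symtab[label] = lc
        if operand is not None and operand not in symtab:
            raise LookupError(f"undefined symbol {operand!r} at LC={lc}")
        out.append((op, symtab.get(operand)))
    return out


def two_pass(program):
    """Pass 1 collects every label, pass 2 resolves the operands."""
    symtab = {label: lc for lc, (label, _, _) in enumerate(program) if label}
    return [
        (op, symtab[operand] if operand is not None else None)
        for _, op, operand in program
    ]


print(two_pass(PROGRAM))   # [('JMP', 2), ('ADD', None), ('HLT', None)]
try:
    one_pass_naive(PROGRAM)
except LookupError as err:
    print("one pass failed:", err)
```

Practical one-pass assemblers avoid this failure by recording unresolved references and patching them once the symbol is defined; the naive version here only demonstrates why the problem arises.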

Figure 2. Assembler Pass 1 Flow Chart

ALGORITHM STEPS FOR PASS 1 ASSEMBLER

Vertical Left Part Explanation:

Step 1: Initialize LC = 0, i.e. initialize the Pass 1 assembler.

Step 2: Read the assembly language program and enter all the symbols into the symbol table.
Then check whether each opcode is a machine opcode or a pseudo opcode.

Step 3: If it is a pseudo opcode, search the POT and check the type of pseudo opcode.

Step 4: If it is a machine opcode, search the MOT to obtain the binary code, format, and
length of the instruction.

Step 5: Process the literals.

Step 6: Increment the location counter by the length of the instruction.

Vertical Right Part Explanation:

Step 7: If a label is present, store it in the symbol table.

Step 8: Determine the length of the instruction and update the location counter.

Step 9: Free the register released by a DROP command during the operation.

Step 10: Finally, forward all tables to Pass 2 for conversion into machine code.
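
The pass 1 steps can be sketched in Python as shown below. This is a minimal sketch under several assumptions that are not part of the notes: labels are written as NAME: at the start of a line, literals are written with a leading = sign, START takes a numeric starting address, and USING and DROP occupy no memory. The instruction lengths follow the MOT example.

```python
# Assumed tables for this sketch.
MOT_LENGTH = {"MUL": 4, "ADD": 4, "SUB": 4, "LOAD": 4}
POT = {"START", "USING", "DROP", "END"}


def pass1(lines):
    """Scan the source once; return (symbol table, literal table, final LC)."""
    lc = 0                        # Step 1: initialise the location counter
    symtab = {}                   # label -> address
    littab = []                   # literals in order of first appearance
    for raw in lines:
        label, _, rest = raw.rpartition(":")
        label = label.strip()
        if label:                 # a label takes the current LC as its address
            if label in symtab:
                raise ValueError(f"duplicate symbol: {label}")
            symtab[label] = lc
        fields = rest.replace(",", " ").split()
        if not fields:
            continue
        opcode, operands = fields[0].upper(), fields[1:]
        if opcode in POT:         # Step 3: pseudo opcode found in the POT
            if opcode == "START" and operands:
                lc = int(operands[0])
            elif opcode == "END":
                break
            continue              # USING and DROP take no space in this sketch
        if opcode not in MOT_LENGTH:
            raise ValueError(f"unknown opcode: {opcode}")
        for operand in operands:  # Step 5: process the literals
            if operand.startswith("=") and operand not in littab:
                littab.append(operand)
        lc += MOT_LENGTH[opcode]  # Step 6: advance LC by the instruction length
    return symtab, littab, lc


source = [
    "START 100",
    "LOAD R1, =28",
    "LOOP: ADD R1, R2",
    "SUB R1, =15",
    "END",
]
print(pass1(source))   # ({'LOOP': 104}, ['=28', '=15'], 112)
```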

b) Two-pass assembler:

A two-pass assembler reads the source code twice. In the first pass, it reads all the variables
and assigns them memory addresses. In the second pass, it reads the source code again and
translates it into object code.

OR

A two-pass assembler makes two passes over the source file (the second pass can be over an
intermediate file generated in the first pass). In the first pass, it only looks for label
definitions and enters them in the symbol table (a dynamic table that holds the name and
address of each label in the source program). In the second pass, after the symbol table is
complete, it does the actual assembly by translating the operations into machine codes and so on.

• Note: In the first pass the complete code is analyzed and all variables with their values are
entered in the tables. In pass 2 the code is translated into binary format with the help of these
tables. Forward references therefore cause no errors, because all values have already been
recorded in the tables.

Figure 3. Flow Chart and Algorithm Steps for Pass 2 Assembler

DC: Declare Constant; generates the binary form of constant values.

DS: Declare Storage; reserves memory.

USING: Specifies the base register, whose value is evaluated into binary format.

DROP: Frees the base register.

END: Clears the tables and exits.

TWO PASS ASSEMBLER ALGORITHM STEPS

Pass 1: Define symbols and literals

Step 1: Determine the length of each machine instruction (using the MOT).

Step 2: Keep track of the location counter.

Step 3: Remember the values of symbols until pass 2.

Step 4: Process some pseudo opcodes.

Step 5: Remember literals.

Pass 2: Generate machine code

Step 6: Look up the values of symbols.

Step 7: Generate instructions.

Step 8: Process pseudo opcodes and convert them into binary.
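
Pass 2 can be sketched in Python as follows, assuming that pass 1 has already produced a symbol table mapping each symbol to its address. The listing format (address, opcode bits, operands), the function name pass2, and the source syntax are assumptions made for this illustration; the bit patterns come from the MOT example. Operands that are not in the symbol table, such as register names, are left as text.

```python
# Assumed MOT for this sketch: mnemonic -> (binary value, instruction length).
MOT = {
    "MUL": ("010111010101", 4),
    "ADD": ("11010101010", 4),
    "SUB": ("01010010010", 4),
    "LOAD": ("011001010101", 4),
}
PSEUDO_OPS = {"START", "USING", "DROP", "END"}


def pass2(lines, symtab):
    """Emit (address, opcode bits, operands) with symbols replaced by addresses."""
    lc, listing = 0, []
    for raw in lines:
        rest = raw.rpartition(":")[2]          # labels were handled in pass 1
        fields = rest.replace(",", " ").split()
        if not fields:
            continue
        opcode, operands = fields[0].upper(), fields[1:]
        if opcode in PSEUDO_OPS:               # Step 8: process pseudo opcodes
            if opcode == "START" and operands:
                lc = int(operands[0])
            elif opcode == "END":
                break
            continue
        bits, length = MOT[opcode]
        # Step 6: look up symbol values; other tokens are kept as text.
        resolved = [symtab.get(tok, tok) for tok in operands]
        listing.append((lc, bits, resolved))   # Step 7: generate the instruction
        lc += length
    return listing


program = ["START 100", "ADD R1, TOTAL", "SUB R1, R2", "END"]
for row in pass2(program, {"TOTAL": 200}):
    print(row)
# prints:
# (100, '11010101010', ['R1', 200])
# (104, '01010010010', ['R1', 'R2'])
```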

Difference between Pass 1 and Pass 2 Assembler

Pass 1 Assembler                                Pass 2 Assembler

Performs a single pass.                         Performs two passes.

Converts the complete assembly program into     Performs the same operation in two passes.
machine code in the first pass itself: it       The first pass collects the symbols or
collects the symbols or labels, pseudo          labels, pseudo keywords, operators, and
keywords, operators, and literals in their      literals in their respective table data
respective table data structures and, within    structures. The second pass then converts
the same pass, immediately fetches the stored   everything into binary format.
information from the tables and converts it
into machine code. Both of these processes
can be done together.

All entries for symbols (i.e. variables) and    Literals are stored in the literal table and
literals (i.e. constants like 18) are entered   symbols are stored in the symbol table.
into the symbol table only.

Suffers from the forward reference problem.     Does not have the forward reference problem.

Less accurate.                                  More accurate.

Topic: Case Study of Assemblers for x85 and x86 machines

X85 (Intel 8085) Processor Introduction:

1. It is an 8-bit microprocessor introduced by Intel in 1976 using NMOS technology.

2. The 8085 is a conventional von Neumann design based on the Intel 8080.

3. The 8085 supports up to 256 input/output (I/O) ports.

4. It has the following configuration:

• 8-bit data bus

• 16-bit address bus, which can address up to 64 KB

• A 16-bit program counter

• A 16-bit stack pointer

• Six 8-bit registers arranged in pairs: BC, DE, HL

• Requires a +5 V supply to operate at 3.2 MHz with a single-phase clock

5. It is used in washing machines, microwave ovens, mobile phones, etc.

X86 (Intel 8086) Processor Introduction:

1. The 8086 microprocessor is an enhanced version of the 8085 microprocessor, designed
by Intel in 1978. (In common usage, x86 = 32-bit and x64 = 64-bit.)

2. It is a 16-bit microprocessor with 20 address lines and 16 data lines, so it can address
up to 1 MB of memory.

3. It supports two modes of operation: maximum mode and minimum mode. Maximum mode
is suitable for systems with multiple processors, and minimum mode is suitable for
systems with a single processor.

4. x86 assembly languages are used to produce object code for the x86 class of processors.

5. Like all assembly languages, x86 assembly uses short mnemonics to represent the
fundamental instructions that the CPU can understand and follow.

6. The x86 architecture (in its 32-bit form) has:

• 8 General-Purpose Registers (GPR)
• 6 Segment Registers (the original 8086 has 4: CS, DS, SS, ES)
• 1 Flags Register
• 1 Instruction Pointer

1. Difference between X85 and X86 Architecture

8085 (X85)                                      8086 (X86)

The Intel 8085 is an 8-bit microprocessor       The Intel 8086 is a 16-bit microprocessor
created in 1976.                                created in 1978.

The 8085 is an 8-bit processor with 5 flags     The 8086 is a 16-bit processor with 9 flags
and a memory capacity of 64 KB.                 and a memory capacity of 1 MB.

The 8085 does not have an instruction queue.    The 8086 has an instruction queue.

The 8085 does not support a pipelined           The 8086 supports a pipelined architecture.
architecture.
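
The memory capacities in this comparison follow from the width of each address bus: n address lines can select 2^n distinct byte locations. The short Python check below illustrates the arithmetic; the function name addressable_bytes is chosen only for this example.

```python
def addressable_bytes(address_lines: int) -> int:
    """Number of distinct byte addresses reachable with the given address lines."""
    return 2 ** address_lines


for name, lines in (("8085", 16), ("8086", 20)):
    size = addressable_bytes(lines)
    print(f"{name}: {lines} address lines -> {size} bytes = {size // 1024} KB")
# prints:
# 8085: 16 address lines -> 65536 bytes = 64 KB
# 8086: 20 address lines -> 1048576 bytes = 1024 KB
```

Since 1024 KB equals 1 MB, the result for 20 address lines matches the 1 MB figure given for the 8086.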

Table 1: Assembly Language Instructions for X85 and X86

Data Transfer Instructions

X-85:
• MOV - Move byte or word to register or memory
• LDA - Load accumulator directly from memory
• PUSH, POP - Push word onto stack, pop word off stack

X-86:
• MOV - Move byte or word to register or memory
• IN, OUT - Input byte or word from port, output word to port
• LEA - Load effective address
• PUSH, POP - Push word onto stack, pop word off stack

X-85 assembly examples:
Ex1. MOV B, A (Move contents of A into B)
Ex2. LDA M1 (Load the contents of memory address M1 into the accumulator)
Ex3. PUSH 20 (Push 20 onto stack)

X-86 assembly examples:
Ex1. MOV B, A (Move contents of A into B)
Ex2. LEA M1 (Load the effective address of M1)
Ex3. PUSH 20 (Push 20 onto stack)

Logical Instructions

X-85:
• NOT - Logical NOT of byte or word (one's complement)
• ANA - Logical AND of byte or word
• ORA - Logical OR of byte or word
• XRA - Logical exclusive-OR of byte or word
• CMP - Compare

X-86:
• NOT - Logical NOT of byte or word (one's complement)
• AND - Logical AND of byte or word
• OR - Logical OR of byte or word
• XOR - Logical exclusive-OR of byte or word
• CMP - Compare

X-85 assembly example:
Ex1. ANA R1, R2 (Logical AND of the values stored in registers R1 and R2)

X-86 assembly example:
Ex1. AND R1, R2 (Logical AND of the values stored in registers R1 and R2)

Transfer Instructions

X-85:
• JMP - Jump
• JZ - Jump on zero (Z = 1)
• JC - Jump on carry (CY = 1)

X-86:
• JE (JZ) - Jump if equal (zero)

X-85 assembly example:
JZ when Z = 1

X-86 assembly example:
JMP

Processor Control Instructions

X-85:
• EI - Enable interrupt
• DI - Disable interrupt
• NOP - No operation
• HLT - Halt processor

X-86:
• IRET - Interrupt return
• ESC - Escape to external processor interface
• LOCK - Lock bus during next instruction
• NOP - No operation
• HLT - Halt processor

X-85 assembly example:
Ex1. EI if CPU utilization > 90%

X-86 assembly example:
Ex1. IRET if CPU utilization > 90%
