Assembly-Language Additional Resources


Assembly Language

Assembly language is a low-level programming language that allows people to write
programs at a mnemonic level instead of in binary form.

Every assembly language uses mnemonics to represent each specific instruction that a
particular microprocessor can recognize and execute.

In computer assembler (or assembly) language, a mnemonic is an abbreviation
for an operation. It is entered in the operation code field of each assembler program
instruction. For example, on an Intel microprocessor, MOV (MOVE) is the mnemonic for
the instruction that copies data from one register to another.

Inside the CPU

We have learned from the previous chapters that the CPU contains registers, which sit
at the top of the hierarchy in the CPU structure. Each of these registers has its own
function in Assembly language. Figure 2.17 presents the different registers in the CPU.

Figure 2.17. A CPU containing different registers.

General Purpose Registers

The 8086 CPU has 8 general purpose registers, each with its own name:

• AX - the accumulator register (divided into AH / AL).
• BX - the base address register (divided into BH / BL).
• CX - the count register (divided into CH / CL).
• DX - the data register (divided into DH / DL).
• SI - source index register.
• DI - destination index register.
• BP - base pointer.
• SP - stack pointer.

The main purpose of a register is to keep a number (variable). Each of the above
registers is 16 bits wide, so it can hold a value such as 0110010110100101b (in binary
form), which is 26021 in decimal (human) form.

Four general purpose registers (AX, BX, CX, DX) are each made of two separate 8-bit
registers. For example, if AX = 0110101011101110b, then AH = 01101010b and AL =
11101110b. AH refers to the high-order byte of the accumulator and AL refers to the
low-order byte. Figure 2.18 illustrates the bit placement.

Figure 2.18. The AX register showing its bit placement.

Therefore, when you modify either of the 8-bit registers, the 16-bit register is also
updated, and vice versa. The same is true for the other 3 registers (BX, CX, DX).
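
The relationship between AX, AH and AL can be sketched with a few instructions. This is
only an illustration, assuming emu8086 and the plain org 100h / ret program layout; it
uses the MOV instruction, which is covered later in this section, and the values 00h
and 12h are arbitrary choices. Single-stepping it in the emulator shows AX changing as
each half is written.

```asm
org 100h

mov ax, 0110101011101110b   ; AX = 6AEEh, so AH = 6Ah and AL = 0EEh
mov al, 00h                 ; writes the low byte only:  AX is now 6A00h
mov ah, 12h                 ; writes the high byte only: AX is now 1200h

ret                         ; return control to the operating system
```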

Because registers are located inside the CPU, they are much faster than the
memory.

Segment Registers

• CS - points at the segment containing the current program.
• DS - generally points at the segment where variables are defined.
• ES - extra segment register; it's up to the coder to define its usage.
• SS - points at the segment containing the stack.

Although it is possible to store any data in the segment registers, this is never a
good idea. The segment registers have a very special purpose: pointing at accessible
blocks of memory.

We will discuss segment registers further in the addressing mode chapter.

Special Purpose Registers

• IP - the instruction pointer.
• Flags Register - determines the current state of the processor.

The IP register always works together with the CS segment register, and it points to
the currently executing instruction.

The Flags Register is modified automatically by the CPU after mathematical operations.
This makes it possible to determine the type of the result and to test conditions for
transferring control to other parts of the program. Generally you cannot access these
registers directly.

Emu8086 IDE (Integrated Development Environment)

An IDE makes it convenient to write the source code and then assemble, debug and
execute the program in one place. In this course, we shall be learning about the
emu8086 IDE.

There are other IDEs where you can write the source code for assembly language,
but in my experience emu8086 provides a simpler, more user-friendly environment.
It has a much easier syntax than any of the major assemblers, but will still generate a
program that can be executed on any computer that runs 8086 machine code.

1. Download the installer from any source. In this case, I found it here:
   https://emu8086-microprocessor-emulator.en.softonic.com/download
2. After downloading, run the installer. Once it is installed successfully, the
   emu8086 icon should appear on your desktop.
3. If the program is not yet open, double-click the icon to launch it. The main
   emu8086 window should appear on your screen.
4. Click the “New” button to start writing your code. Click OK.
5. Saving, opening and closing a file (program) work the same way as in any word
   processor or text editor.
6. The emulate button is the most important part of the IDE because this is where
   you compile and execute the program.

This is it for now. We shall continue using the IDE later as we proceed with sample
programs.

MOV Instruction

One of the most common instructions used in an Assembly program is the MOV
instruction. It has the following features:

• Copies the second operand (source) to the first operand (destination). For
  example:

  mov AX, BX

  This instruction copies the data from BX (source) to AX (destination). After the
  execution, both registers will contain the same data.
• The source operand can be an immediate value, a general-purpose register or a
  memory location.
• The destination operand can be a general-purpose register or a memory location.
• Both operands must be the same size, which can be a byte or a word.

These types of operands are supported:

MOV SREG, memory        →  MOV DS,[BX]
MOV memory, REG         →  MOV [BX],AX
MOV REG, REG            →  MOV AX, DX
MOV memory, immediate   →  MOV [BX],3Fh
MOV REG, immediate      →  MOV DX, 5

SREG: DS, ES, SS, CS
REG: AX, BX, CX, DX, AH, AL, BL, BH, CH, CL, DH, DL, DI, SI, BP, SP
memory: [BX], [BX+SI+7], variable, etc.
immediate: 5, 3Fh, 10010011b, etc.

Other examples:

MOV AX,0B800h      ;set AX to hexadecimal value of B800h
MOV DS,AX          ;copy value of AX to DS
MOV CL,'A'         ;set CL to ASCII code 'A', it is 41h
MOV CH,01011111b   ;set CH to binary value
MOV BX,15Eh        ;set BX to 15Eh
MOV [BX],CX        ;copy contents of CX to memory at B800:015E
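
The first two lines above go through AX because some operand combinations are missing
from the list of supported types. The sketch below, assuming emu8086 and the plain
org 100h / ret layout, shows two combinations that the 8086 MOV instruction does not
accept (left as comments so the program still assembles) together with the usual
two-step replacement. The offsets 0000h and 0002h are arbitrary values chosen for
illustration.

```asm
org 100h

; mov ds, 0B800h    ; not accepted: an immediate cannot go straight into a segment register
mov ax, 0B800h      ; load the segment value into a general-purpose register first
mov ds, ax          ; then copy that register into DS

; mov [bx], [si]    ; not accepted: MOV has no memory-to-memory form
mov si, 0000h       ; source offset (arbitrary, for illustration)
mov bx, 0002h       ; destination offset (arbitrary, for illustration)
mov al, [si]        ; read the byte at DS:SI into AL
mov [bx], al        ; then write AL to the byte at DS:BX

ret                 ; return control to the operating system
```

In both cases a general-purpose register acts as the stepping stone between the two
operands.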

Variables

A variable is a memory location. Programmers use variables to store values. It is much
easier to use a variable named “var1” to store a value than to use the address
5A73:235B, especially when you have multiple variables.

Assembler supports two types of variables: BYTE and WORD.

Syntax for variable declaration in Assembly:

name DB value
name DW value

DB means Define Byte
DW means Define Word

name can be any letter or digit combination, though it should start with a letter.

value can be any numeric value in any supported numbering system.

Before going further, let us study the typical structure of an assembly language
program. I’m sure you are familiar with the C language. For simplicity, we will look at
the C language program structure and compare it with the assembly version.

C language                                Assembly Language

#include<stdio.h>   //header file         org 100h
                                          .model small   ;header file
char name[30];      //variable            .data          ;var declaration
                    //declaration
int main(void)      //main function       .code          ;main function
{
    //body of the program                                ;body of the program
    return 0;       //return from main    ret            ;return to the operating system
}

org 100h      a directive telling the assembler that the program is loaded at
              offset 100h (the default for COM programs).

.model small  a directive, comparable to a header file, that tells the assembler
              that you intend to use the small memory model: one code segment, one
              data segment and one stack segment.

.data         this section defines your data, values and initialization (the
              variable declaration section).

.code         marks the code segment. The source code follows. Let’s examine the
              code segment carefully:
• It is divided into four columns: labels, mnemonics, operands, and comments.
• Labels refer to the positions of variables and instructions, represented by the
  mnemonics.
• Operands are required by most assembly language instructions.
• Comments aid in remembering the purpose of various instructions.

Example:

Label   Mnemonic  Operand      Comment

        org       100h

        .data
var1    db        7            ;var1 with a value of 7
var2    dw        1234h        ;var2 with a value of 1234h
        .code

        mov       ax,@data     ;initialize DS to address
        mov       ds,ax        ;of data segment

        mov       al,var1      ;move value of var1 to al
        mov       bx,var2      ;move value of var2 to bx

        mov       ah,4ch       ;end
        int       21h

        ret

The Label Field

• Labels mark places in a program which other instructions and directives reference
• Labels in the code segment always end with a colon
• Labels in the data segment never end with a colon
• Labels can be from 1 to 31 characters long and may consist of letters, digits, and
  the special characters ? @ _ $ %
• If a period is used, it must be the first character
• Labels must not begin with a digit
• The assembler is case insensitive
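
As a small sketch of the colon rule, the fragment below follows the program layout used
in this section and assumes emu8086. The names count and start are hypothetical, chosen
only for illustration.

```asm
org 100h
.model small
.data
count   db 3            ; label in the data segment: no colon
.code
start:                  ; label in the code segment: ends with a colon
        mov ax, @data   ; initialize DS to the address
        mov ds, ax      ; of the data segment
        mov al, count   ; the instruction refers to the data label by name

        mov ah, 4ch     ; end the program
        int 21h
```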

Legal and Illegal Labels


• Examples of legal names
  • COUNTER1
  • @character
  • SUM_OF_DIGITS
  • $1000
  • DONE?
  • .TEST

• Examples of illegal names
  • TWO WORDS    contains a blank
  • 2abc         begins with a digit
  • A45.28       the period is not the first character
  • YOU&ME       contains an illegal character

The Mnemonic Field

• For an instruction, the operation field contains a symbolic operation code (opcode)
• The assembler translates a symbolic opcode into a machine language opcode
• Examples are: ADD, MOV, SUB
• In an assembler directive, the operation field contains a directive (pseudo-op)
• Pseudo-ops are not translated into machine code; they tell the assembler to do
  something

The Operand Field


• For an instruction, the operand field specifies the data that are to be acted on
  by the instruction. An instruction may have zero, one, or two operands:

  NOP            ;no operands -- does nothing
  INC AX         ;one operand -- adds 1 to the contents of AX
  ADD WORD1,2    ;two operands -- adds 2 to the contents
                 ; of memory word WORD1

• In a two-operand instruction, the first operand is the destination operand. The
  second operand is the source operand.

The Comment Field


• A semicolon marks the beginning of a comment field
• The assembler ignores anything typed after the semicolon on that line
• It is almost impossible to understand an assembly language program without good
  comments
• Good programming practice dictates a comment on almost every line

ret returns control from your program to the operating system once it has finished
executing.

Now that we have defined the structure of an Assembly program, we will create our
first program. The Hello World program is typically the simplest first example in any
programming language.

We will also use the MOV instruction from the previous lesson in this example.

Figure 2.19. Hello World Program using Assembly Language.
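
The listing of Figure 2.19 is not reproduced in this text. Put together from the lines
discussed in this section and from the side-by-side comparison with C, the program
reads approximately as follows. The comments are added here for orientation and are not
part of the original figure.

```asm
;Hello World Program
org 100h

.MODEL SMALL
.DATA
msg db "Hello World $"      ; the string to display, ended by $

.CODE
mov ax, @data               ; initialize DS to the address
mov ds, ax                  ; of the data segment

mov ah, 09h                 ; select the print-string function
mov dx, offset msg          ; DX = address of msg
int 21h                     ; display the string

mov ah, 4ch                 ; select the end-program function
int 21h

ret
```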

Executing the program will yield an output like this:


Figure 2.20. Hello World output.

Let’s look at the program itself:

msg db "Hello World $"

We declared a variable named msg, defined in bytes (DB), that holds the string
“Hello World $”. The dollar sign $ marks the end of the string for the print function
used below; it is not displayed itself. Without the dollar sign, the print function
would keep displaying whatever bytes follow msg in memory until it meets a $.

mov ax, @data
mov ds, ax

These two instructions, written at the start of the .code segment, load the address of
the data segment into DS by way of AX. They are needed so that the program can find
the variables declared in the .data segment; in this case, the variable msg.

mov ah, 09h
mov dx, offset msg
int 21h

These lines display the string “Hello World”. mov ah, 09h and int 21h are always
paired: together they call the output function, much like printf() in C.

mov dx, offset msg loads the address (offset) of the variable msg into the DX
register. With the address in DX, mov ah, 09h and int 21h print the string on the
screen.

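mov ah, 4ch
int 21h

This code ends the program and returns control to the operating system. Again, the two
instructions are paired: AH selects the function and int 21h calls it.

ret

Returns control to the caller. Because the preceding int 21h with AH = 4Ch has already
ended the program, this line is not reached here.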

Let’s see how this program compares with the same program written in C:

C Language                              Assembly Language

//Hello World Program                   ;Hello World Program
//Author: Agustin R. Veras Jr.          ;Author: Agustin R. Veras Jr.
//Date  : June 24, 2020                 ;Date  : June 24, 2020

#include <stdio.h>                      org 100h
                                        .MODEL SMALL
                                        .DATA
char msg[20]="Hello World";             msg db "Hello World $"

int main()                              .CODE
{                                       mov ax, @data
                                        mov ds, ax

                                        mov ah, 09h
    printf("%s", msg);                  mov dx, offset msg
                                        int 21h

                                        mov ah, 4ch
                                        int 21h

    return 0;                           ret
}

Now that you have an idea of how an Assembly program is written, we will continue with
variable declaration.

Let us see another example with the MOV instruction. This next example (the var1 and
var2 program listed in the Example above) illustrates variable declaration in
numerical form.

As you can see, we declared two (2) variables in this program: var1 and var2.

var1 has a value of 7 (decimal) in DB (define byte), while var2 has a value of
1234h (hexadecimal) in DW (define word) because it needs more than one byte.

mov al, var1
mov bx, var2

These lines copy the variable values into the registers AL and BX. In this case AL
will have a value of 7 and BX will have a value of 1234h.

Note that AL is used instead of AX because var1 is declared as a byte (DB), so it
needs an 8-bit register. var2 is declared as a word (DW) and its value of 1234h cannot
be represented in 8 bits, so the whole length of the BX register is used.

Note too that we did not display the contents of AL and BX. We could do that, but this
is sufficient for now.

We can also reserve space for variables. In our previous example, var1 db 7, we
allocated a byte and gave it a value. But sometimes we declare a variable whose value
is undefined, like var3 db ?. The question mark (?) tells the assembler to allocate an
uninitialized byte.
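
A minimal sketch of an uninitialized variable in use, following the program layout of
this section and assuming emu8086: var3 is reserved with ? and only receives a value
while the program runs.

```asm
org 100h
.model small
.data
var1 db 7           ; initialized byte
var3 db ?           ; reserved byte with no initial value
.code
mov ax, @data       ; initialize DS to the address
mov ds, ax          ; of the data segment

mov al, var1        ; AL = 7
mov var3, al        ; var3 receives its value at run time

mov ah, 4ch         ; end the program
int 21h
```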

Let’s have an example of multi-line text. Let’s create a program that will display a
name and an address.
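
The original listing for this example is not reproduced in this text. One possible
sketch, assuming emu8086 and the same print function used in the Hello World program,
is shown below. The name and address strings are hypothetical placeholders, and the
bytes 0Dh, 0Ah (carriage return and line feed) are what move the cursor to the start
of the next line.

```asm
org 100h
.model small
.data
line1 db "Juan Dela Cruz", 0Dh, 0Ah, "$"    ; hypothetical name, then a new line
line2 db "123 Sample Street $"              ; hypothetical address
.code
mov ax, @data           ; initialize DS to the address
mov ds, ax              ; of the data segment

mov ah, 09h             ; print-string function
mov dx, offset line1    ; DX = address of the first line
int 21h

mov ah, 09h             ; select the function again before the second call
mov dx, offset line2    ; DX = address of the second line
int 21h

mov ah, 4ch             ; end the program
int 21h
```

Each string still ends with its own $, so every call prints exactly one line.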

Note that all the programs we are creating at this time only display output; we have
not yet started data entry (inputting text).

LEARNING ACTIVITY

Create a program in Assembly Language that displays the following:
1. Your name
2. Your course
3. Complete birthday

Locating the values of registers

When a MOV instruction is executed, the destination’s value is changed. We can easily
track the values using the emulator of the emu8086 IDE. To do this, open your IDE and
type in the following.

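org 100h

.model small
.data

.code
mov ax, @data       ;initialize DS to address
mov ds, ax          ;of data segment

mov ax, 1234h       ;AX = 1234h
mov bx, 5678h       ;BX = 5678h
mov ax, bx          ;copy BX to AX, so AX = 5678h

mov ah, 4ch         ;end the program
int 21h

ret
end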

Make sure that you have copied the code exactly. You may copy and paste it into your
IDE if you wish.

Click the emulate icon.

If your program does not encounter an error, you will see the emulator window and
the source code window displayed on top of your source code screen:

Figure 2-21. Emulator and original source code

The emulator box contains the registers with their initial values, before the program
is executed. Clicking the single step icon is very helpful for tracking the values
of each register as the program is executed line by line. Clicking the run icon will
execute the whole program.

Since we are interested in determining the values of each register using the MOV
instruction, we shall be using the single step icon at this time.

Figure 2.22. Emulator box


The original source code box allows you to see which line of code is being executed.
When a line is highlighted in yellow, it means it will be executed next. Once it is
executed, you may refer to the emulator box to see the resulting values, or changes of
values, in the registers involved.

Figure 2-23. Original source code box.

Continue to click the single step icon until the line mov ax, 1234h is highlighted.
Then click the single step icon once more to execute that line.

Did you notice that the value of register AX is now 1234?

Click the single step icon again. What is the value of BX? Right, it is now 5678.

Click the single step icon again. What is now the value of AX? You should get a value
of 5678.

Therefore, the instructions

mov ax, 1234h
mov bx, 5678h
mov ax, bx

will leave a value of 5678h in the AX register.


Interrupts

Interrupts can be seen as a number of functions. These functions make programming
much easier: instead of writing code to print a character, you can simply call the
interrupt and it will do everything for you. There are also interrupt functions that
work with the disk drive and other hardware. We call such functions software
interrupts.

To make a software interrupt there is the INT instruction. It has a very simple
syntax: INT value, where value can be a number between 0 and 255 (or 0 to 0FFh).
Generally we will use hexadecimal numbers.

To specify a sub-function, the AH register should be set before calling the interrupt.
In general the AH register is used, but sometimes other registers may be in use.

Use INT 10h sub-function 0Eh to type a single character.

Figure 2-24. Printing characters using interrupts.


Figure 2-25. Prints AB on the screen.
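
The listing of Figure 2-24 is not reproduced in this text. A minimal sketch that would
print AB with INT 10h sub-function 0Eh, assuming emu8086 and the plain org 100h / ret
program layout, could look like this: AH selects the sub-function and AL carries the
character to print.

```asm
org 100h

mov ah, 0Eh     ; select sub-function 0Eh (print one character)
mov al, 'A'     ; character to print
int 10h         ; print it; the cursor moves one position to the right
mov al, 'B'     ; next character, AH still holds 0Eh
int 10h

ret             ; return control to the operating system
```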

LEARNING ACTIVITY

Create a program in assembly language that prints the characters
‘H’, ‘E’, ‘L’, ‘L’, ‘O’ using interrupts.
