ASM Irvine 32 Cheat Sheet (P1)


Assembly (x86)

Cheat Sheet (Part 1)

By: Zain Arshad Sandhu

This cheat sheet is based on the book Assembly Language for x86
Processors by Kip Irvine and on some other articles. It is meant to
give my fellows and readers a quick overview of Assembly Language.

© All rights reserved. Reproduction of this sheet is prohibited.
ASCII TABLE
Data Representation
Computers understand only machine language (0s and 1s).

Languages like Assembly, C/C++, Java and Python let humans
write instructions in a human-readable form,
which is then converted into machine code.
Most commonly, hexadecimal digits (0-F) are used to show
the content of computer memory.
Binary Integers
To work with a computer we have to talk
to it in its own language (0s and 1s).
Each ASCII character has its own bit pattern that
represents it in the computer.

Example : 2 is stored as the bit sequence 10 (one and zero).

MSB : Most Significant Bit - the bit on the left

LSB : Least Significant Bit - the bit on the right

Unsigned Integer
Non-negative numbers only, i.e. all numbers >= 0.
Signed Integer
Can be positive or negative; the most significant bit holds the sign (1 = negative, 0 = positive).

Storage Sizes in Assembly


Basic storage unit in x86 machines : BYTE (8 bits)
Hexadecimal Integers
The hexadecimal number system is used to represent machine
code and memory contents; each hex digit stands for 4 bits.
0001.0110.1010.0111.1001.0100 in hex is: 16A794h

ASCII String
String : Sequence of one or more characters.
Terminology: An ASCII string is stored in memory as a succession
of bytes containing ASCII codes (numeric codes).

Example : ABC123 is 41h, 42h, 43h, 31h, 32h, 33h in numeric code.

Boolean Operations
NOT ( ¬ or ~ ) : Reverses each bit
AND ( ∧ ) : Implements the logical AND expression
OR ( ∨ or + ) : Implements the logical OR expression
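The three operations above map directly to the x86 instructions NOT, AND and OR. The following is a minimal sketch, assuming the Irvine32 library is installed; the bit patterns are illustrative values chosen here, not taken from the sheet.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    mov   al, 11110000b
    and   al, 00111100b    ; AL = 00110000b (bits set in both operands)
    or    al, 00000011b    ; AL = 00110011b (bits set in either operand)
    not   al               ; AL = 11001100b (every bit reversed)
    movzx eax, al          ; widen AL so that EAX = 000000CCh
    call  DumpRegs         ; show the registers
    exit
main ENDP
END main
```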
Registers
Registers are high-speed storage locations directly inside
the CPU.

General Purpose Registers : Primarily used for arithmetic and
data movement.
Other registers, flags and linear data management :
See Kip Irvine, Assembly Language for x86 Processors
(edition 6, page 128)...
Label: Place marker for instructions and data

Data Label : Identifies the location of a variable

NOTE: One data label can refer to multiple data
items
i.e Array DWORD 1,2,3

Code Label : Marks a place in the area where instructions are written

A code label ends with the ( : ) character.
Example :
codeLabel:
instructions...
jmp codeLabel
The line jmp codeLabel transfers control back to codeLabel so that all
the instructions are performed again.

Zero operands : stc ; example: sets the carry flag (no operand)

One operand : inc ecx
The increment instruction adds 1 to ecx (one operand)
Two operands : MOV has two operands
Example: MOV eax,2

Three operands : imul can take three operands
Example : imul eax , ebx , 20
eax ( destination operand ) , ebx and 20 ( source operands )

imul eax , ebx , 20 is equivalent to eax = ebx * 20


Comments : Non-executable plain English that explains the
code written
1. Single-line comments
; comments here
2. Block comments
COMMENT !
This code defines comments...
This is another line of the definition of
comments
!
Another way is:
COMMENT &
This code defines comments...
This is another line of the definition of comments
&

First Look to the Assembly Code

INCLUDE Irvine32.inc
INCLUDE : A directive that copies in all the definitions from
Irvine32.inc

.CODE Directive : Marks the area where all executable code is written

PROC Directive : Identifies where a procedure (here main) starts

CALL : Calls a procedure. The Irvine32 output procedures display the
values held in the registers they take as input.
Example:
call WriteInt ; displays the signed integer in eax
call WriteString ; displays the null-terminated string whose offset is in edx

Exit : Halts the program

ENDP : Marks the end of the main procedure

END Directive : Last line of the program to be assembled.

.386 Directive : Identifies the minimum CPU required for the
program

.model flat , stdcall

.MODEL Directive
1. Identifies the segmentation model to be used for the program
2. Identifies the calling convention used for passing
parameters to procedures.
flat keyword : Tells the assembler to generate code for
protected mode
stdcall keyword : Enables the calling of MS-Windows
functions.

PROTO Directive : Declares the prototype for a procedure

ExitProcess : MS-Windows function that halts the
current program.

DumpRegs : Displays the registers' contents

INVOKE Directive : Assembler directive that calls a
procedure or function.

Template for ASM program
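A minimal program skeleton could look like the sketch below. It assumes the Irvine32 library is installed and on the include path; the variable name greeting and its text are illustrative only. Irvine32.inc already supplies the processor, .MODEL and stack settings, so they are not repeated here.

```asm
INCLUDE Irvine32.inc          ; library definitions (assumed to be installed)

.data
greeting BYTE "Hello, Assembly!",0    ; null-terminated string (hypothetical name)

.code
main PROC
    mov  edx, OFFSET greeting ; WriteString expects the string offset in EDX
    call WriteString
    call Crlf                 ; move to the next line
    exit                      ; Irvine32 macro that calls ExitProcess
main ENDP
END main
```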


Variable Initialization
Syntax: Variable_name Variable_Type Variable_Content
i.e
var WORD 5
myName BYTE "XYZ" ; name itself is a reserved word in MASM

Variable without Content:

Syntax: Variable_name Variable_Type ?

var BYTE ?

Note : ? leaves the variable uninitialized

Initializer : Integer constant or expression that must fit
within the storage capacity of the variable's type.

Example : value BYTE 256

The line above generates an error because
the maximum number a BYTE can store is 255.

var SBYTE -129

ERROR : The minimum value an SBYTE can store is -128.
Multiple Initializers :
A single variable name can label one or more
data items, i.e. an ARRAY
Syntax: variable_name Data_type C1, C2 , ..... , Cn

C1, C2 , ..... , Cn are values separated by the ( , ) operator.

i.e
array BYTE 10, 20 , 30 , 40

Note : The label array refers to the offset of the
first entry of the data list ( 10 in our case ).

Mixing data in different radixes

list1 and list2 have the same content.
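One pair of definitions with this property is sketched below. The specific values are an assumption chosen to match the names list1 and list2; each entry in list2 is the same byte as the entry above it, written in a different radix or as a character.

```asm
.data
list1 BYTE 10,  32,  41h, 00100010b   ; decimal, decimal, hex, binary
list2 BYTE 0Ah, 20h, 'A', 22h         ; the same four bytes: 0Ah, 20h, 41h, 22h
```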

String Definitions
Syntax :
String_Name Type "string here.." or
String_Name Type 'string here..'

i.e
myName BYTE "ABC...XYZ"

NULL-Terminated String: ends with a null byte (containing 0).

Example : myName BYTE "ABC...XYZ",0

Terms to know :
Each character of a string uses one byte of storage:
A, B, ....., Z each use 8 bits of storage.

Meaning
=> myName BYTE 'A','B','C',....,'X','Y','Z',0 is the same as:
myName BYTE "ABC...XYZ",0
=> myName[0]="A" myName[2]="C" and so on.
Multiple lines of string:

String BYTE "Do you know how to program?",0dh,0ah
BYTE "It is not programming that is tricky"
BYTE "But the Problem to be solved by Language"
BYTE "Good Luck!",0dh , 0ah ,0

0Dh : Carriage Return (CR)
0Ah : Line Feed (LF)
Used together, 0Dh and 0Ah (CRLF) end the current line and move
output to the start of the next line.

call Crlf has the same effect as printing 0Dh , 0Ah.

Line continuation character ( \ ): Joins two source lines
into a single statement.
Def_0f_Scientist_By_Me \
BYTE "You are the biggest scientist if you explore yourself",0

DUP operator
• DUP stands for Duplicate
• DUP allocates storage for multiple data elements of the same
type
• DUP is useful for declaring arrays and strings

Examples:
Terminology of BYTE 4 DUP("STACK") :

S T A C K each use a byte of storage, which means:

Each STACK word uses 5 bytes, so 4 copies of the word
use 20 bytes ( 4 copies * 5 bytes per copy ).

BYTE 20 DUP(0) takes 20 bytes of storage. How?

Size of list = 20 * size of data type = 20 * 1 byte = 20 bytes

For WORD data type : Size = 20 * 2 bytes = 40 bytes

For DWORD data type : Size = 20 * 4 bytes = 80 bytes
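The size arithmetic above can be checked with the SIZEOF operator. This is a small sketch assuming the Irvine32 library; the array names are hypothetical.

```asm
INCLUDE Irvine32.inc

.data
byteList  BYTE  20 DUP(0)     ; 20 elements * 1 byte
wordList  WORD  20 DUP(0)     ; 20 elements * 2 bytes
dwordList DWORD 20 DUP(0)     ; 20 elements * 4 bytes

.code
main PROC
    mov  eax, SIZEOF byteList     ; EAX = 20
    call WriteDec
    call Crlf
    mov  eax, SIZEOF wordList     ; EAX = 40
    call WriteDec
    call Crlf
    mov  eax, SIZEOF dwordList    ; EAX = 80
    call WriteDec
    call Crlf
    exit
main ENDP
END main
```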

Little-endian order (low-to-high storage of data) : The least
significant byte is stored at the lowest memory address.

num DWORD 12345678h ; 78h is stored at offset 0000, then
56h at 0001, 34h at 0002 and 12h at 0003.
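The byte order can be observed by reading the doubleword one byte at a time with BYTE PTR. A minimal sketch, assuming the Irvine32 library; the label num is an illustrative name (INT is an instruction mnemonic and cannot be used as a label).

```asm
INCLUDE Irvine32.inc

.data
num DWORD 12345678h

.code
main PROC
    mov  eax, 0
    mov  al, BYTE PTR num         ; AL = 78h (lowest address, least significant byte)
    mov  al, BYTE PTR [num+1]     ; AL = 56h
    mov  al, BYTE PTR [num+2]     ; AL = 34h
    mov  al, BYTE PTR [num+3]     ; AL = 12h (highest address, most significant byte)
    call DumpRegs                 ; EAX = 00000012h at this point
    exit
main ENDP
END main
```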
Declaring uninitialized data :

• The .DATA? directive declares uninitialized data efficiently.
• .DATA? reduces the size of the compiled program.

( = ) Symbol :
symbol_name = constant or expression
Size of Arrays and Strings:

arrayList BYTE 1,2,3,4

Explicitly : ArraySize = 4
Implicitly : ArraySize = ($ - arrayList)

Referencing and Dereferencing

A variable's content is accessed through the address of the variable.

.data
var BYTE 30
mov AL , var dereferences var, i.e. it loads the
data stored in var.
Suppose the address of var is 0x1000, then :

0x1000 is the offset of var ( another term for address )

[ ] for Dereferencing
i.e mov AL , [var] ; AL = 30, the byte stored at offset 0x1000

Assembly                    C/C++
MOV destination , source    destination = source
mov eax , 10                eax = 10

Left operand : destination ( mostly )
Right operand : source

Variants of the MOV instruction

MOV eax , ebx ; register to register
MOV var , eax ; register to memory
MOV eax , var ; memory to register
MOV var , 5 ; immediate to memory
MOV eax , 5 ; immediate to register
Rules for using the MOV instruction:
Destination and source must be the same size.
MOV EAX , BL ;ERROR
Destination and source cannot both be memory operands.
MOV var1 , var2 ;ERROR
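Because a memory-to-memory move is not allowed, the usual workaround is to go through a register of the same size. A small sketch, assuming the Irvine32 library and two hypothetical WORD variables:

```asm
INCLUDE Irvine32.inc

.data
var1 WORD 1234h
var2 WORD ?

.code
main PROC
    mov  ax, var1         ; memory -> register (both 16 bits)
    mov  var2, ax         ; register -> memory, so var2 = 1234h
    exit
main ENDP
END main
```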

MOVZX ( Move with zero-Extend)

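MOVZX copies the content of the source to the destination and
fills the remaining upper bits with zeros.

byteVal BYTE 10001111b

movzx ax , byteVal ; ax = 0000000010001111b

MOVSX ( Move with Sign-Extend )

MOVSX copies the source and fills the upper bits with the source's sign bit.
The source must be a register or memory operand, not an immediate value.

movsx ax , byteVal ; ax = 1111111110001111b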
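The difference between the two instructions shows most clearly with a 32-bit destination. A minimal sketch, assuming the Irvine32 library; byteVal uses the same bit pattern as above (8Fh), and 7Fh is an extra illustrative value with a clear sign bit.

```asm
INCLUDE Irvine32.inc

.data
byteVal BYTE 10001111b        ; 8Fh, sign bit set

.code
main PROC
    movzx eax, byteVal        ; EAX = 0000008Fh (upper bits filled with 0)
    movsx ebx, byteVal        ; EBX = FFFFFF8Fh (upper bits filled with the sign bit 1)
    mov   cl, 7Fh             ; sign bit clear
    movsx edx, cl             ; EDX = 0000007Fh (sign bit 0, so zeros are filled in)
    call  DumpRegs
    exit
main ENDP
END main
```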

XCHG instruction ( swap 2 values )

XCHG reg , reg


XCHG reg , mem
XCHG mem , reg
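XCHG does not accept two memory operands, so swapping two variables takes a register as a go-between. A sketch with two hypothetical WORD variables, assuming the Irvine32 library:

```asm
INCLUDE Irvine32.inc

.data
val1 WORD 1000h
val2 WORD 2000h

.code
main PROC
    mov  ax, val1         ; AX = 1000h
    xchg ax, val2         ; AX = 2000h, val2 = 1000h
    mov  val1, ax         ; val1 = 2000h, the two variables are now swapped
    exit
main ENDP
END main
```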

Basic Operational Instructions

mov - move a value to another location
add - add two values
sub - subtract a value from another
jmp - jump to a new location in the program
mul - multiply two values
call - call a procedure

INC inc ecx ; ecx = ecx + 1
DEC dec ecx ; ecx = ecx - 1
ADD add eax,ebx ; eax = eax + ebx
SUB sub eax , ebx ; eax = eax - ebx

NEG ( two's complement of a number / data )

mov eax , 5
var SDWORD -1
NEG reg NEG eax ; eax = -5
NEG mem NEG var ; var = 1

OFFSET Operator
Returns the distance of a variable, in bytes, from the beginning of its segment
PTR Operator
Used to access an operand with a size different from its declared size

CASE : There is a 32-bit (DWORD) array of 5 elements and the
low 16 bits of the 1st element are to be loaded into ax. How is this possible?

array dword 1,2,3,4,5

mov ax , array ; ERROR: instruction operands must be the same size

ERROR : Recall that the MOV rules do not allow
different sizes of source and destination.

Generally : ax can only hold a 16-bit integer.

Solution:
mov ax , WORD PTR array
Syntax:
instruction destination , size PTR variable_name
Similarly, an immediate value must fit the destination:
mov AL , 256 ;error : invalid operands
Example:
val byte 5
ADD EAX , DWORD ptr val ; reads 4 bytes starting at val, i.e. val plus the 3 bytes after it
TYPE Operator
Returns the size, in bytes, of a single element of a variable
according to its data type.

LENGTHOF Operator
Counts the number of elements in an array,
i.e. returns the length of the array

arr byte 1 , 2 ,3 ,4 ,5 ; LENGTHOF arr = 5
SIZEOF Operator
Returns the size in bytes of a variable/array/string as:
SIZEOF = TYPE * LENGTHOF
Example :
intArray BYTE 1,2,3
mov eax , SIZEOF intArray ;eax = 3

TYPE (in bytes) = 1 (size of BYTE data type )
LENGTHOF = 3
SIZEOF = 3*1 = 3

A. eax = 1 (type of byte)
B. eax = 4 (no. of elements)
C. eax = 4 (type * length)
D. eax = 2 (type of word)
E. eax = 4 (no. of elements)
F. eax = 8 (type * length)
G. eax = 5 (type * length)
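The answers A to G are consistent with a listing like the one sketched below. The declarations are an assumption made here for illustration (a 4-element BYTE array, a 4-element WORD array and a 5-character string); only the resulting values come from the sheet.

```asm
INCLUDE Irvine32.inc

.data
byteArr BYTE 1,2,3,4          ; hypothetical names and values
wordArr WORD 1,2,3,4
str1    BYTE "HELLO"

.code
main PROC
    mov  eax, TYPE byteArr        ; A. EAX = 1
    mov  eax, LENGTHOF byteArr    ; B. EAX = 4
    mov  eax, SIZEOF byteArr      ; C. EAX = 4 (1 * 4)
    mov  eax, TYPE wordArr        ; D. EAX = 2
    mov  eax, LENGTHOF wordArr    ; E. EAX = 4
    mov  eax, SIZEOF wordArr      ; F. EAX = 8 (2 * 4)
    mov  eax, SIZEOF str1         ; G. EAX = 5 (1 * 5)
    exit
main ENDP
END main
```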
LEA (Load Effective Address) Instruction:
Loads the calculated offset/address of a memory operand into the
specified register.

i.e lea eax, array

mov eax, OFFSET array is equivalent

Arrays Data Structure


An array is a sequence of elements of the same type.
i.e 1, 2, 3, 4, 5, 6 are elements of an array of integers.
In Assembly language :
name_of_Array Type_of_array val1,val2....,valn
i.e IntegerArray BYTE 1,2,3,4

Inside the memory for an 8-bit array:

OFFSETs element Access
0000 1 IntegerArray+0
0001 2 IntegerArray+1
0002 3 IntegerArray+2
0003 4 IntegerArray+3
Accessing Elements of Array

lea esi, IntegerArray

mov AL,[esi] ; 1st element
mov AL,[esi+1] ; 2nd element
mov AL,[esi+2] ; 3rd element
mov AL,[esi+3] ; 4th element
i.e IntegerArray WORD 1,2,3,4

Inside the memory for a 16-bit array:

OFFSETs element Access
0000 1 base+0
0002 2 base+2
0004 3 base+4
0006 4 base+6
Accessing Elements of Array

lea esi, IntegerArray ;base address

mov AX,[esi+0] ; 1st element
mov AX,[esi+2] ; 2nd element
mov AX,[esi+4] ; 3rd element
mov AX,[esi+6] ; 4th element

i.e IntegerArray DWORD 1,2,3,4

Inside the memory for a 32-bit array:

OFFSETs (hex) element Access
0000 1 base+0
0004 2 base+4
0008 3 base+8
000C 4 base+12
Accessing Elements of Array

lea esi, IntegerArray

mov EAX,[esi+0] ; 1st element
mov EAX,[esi+4] ; 2nd element
mov EAX,[esi+8] ; 3rd element
mov EAX,[esi+12] ; 4th element

Keep in mind, to get the next element of an array:
for an 8-bit array add 1 to the address register
for a 16-bit array add 2 to the address register
for a 32-bit array add 4 to the address register

Any 32-bit general-purpose register can hold the offset of the array.
Other techniques to access elements of an array:

1. Register as index
mov esi,0 ;Any general purpose reg

mov al,[IntegerArray+esi]
inc esi

2. Easy syntax as in C/C++ ( arrayName[index] )

Syntax: mov reg , arrayName[index]

i.e mov al,IntegerArray[esi]
inc esi
i.e mov al,IntegerArray[3] ; last element of a 4-element BYTE array

Note: unlike C/C++, the index is a byte offset, not an element number.

3. Scale Factor as Index ( index * TYPE )

mov esi,1 * TYPE IntegerArray ;esi = 1 * 1 byte
mov al,IntegerArray[esi]
Also
mov esi,0
mov al,IntegerArray[esi * TYPE IntegerArray]

For 16-bit and 32-bit arrays the formula is the same
because the TYPE operator automatically gives the
number of bytes per element.
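For a 32-bit array the same scale-factor formula looks like the sketch below. It assumes the Irvine32 library; the array name dwordArr and its values are illustrative.

```asm
INCLUDE Irvine32.inc

.data
dwordArr DWORD 100h, 200h, 300h, 400h

.code
main PROC
    mov  esi, 2                               ; element number 2 (third element)
    mov  eax, dwordArr[esi * TYPE dwordArr]   ; byte offset 2 * 4 = 8, EAX = 300h
    mov  esi, 3 * TYPE dwordArr               ; precomputed byte offset 12
    mov  ebx, dwordArr[esi]                   ; EBX = 400h
    call DumpRegs
    exit
main ENDP
END main
```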

Pointers
A variable that contains the address/offset of
another variable.
Used for dynamic memory allocation

ptrB and ptrW point to arrayB and arrayW
respectively when they are declared as:
ptrB DWORD OFFSET arrayB
ptrW DWORD OFFSET arrayW
Accessing data of arrays

mov esi,ptrB ;esi = 0000 (supposed)
mov al,[esi] ;al = 10
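Put together, a complete pointer example might look like this sketch. It assumes the Irvine32 library; the contents of arrayB and arrayW are assumptions, chosen so that the first byte is 10 as in the comment above.

```asm
INCLUDE Irvine32.inc

.data
arrayB BYTE 10, 20, 30, 40            ; assumed contents
arrayW WORD 1000h, 2000h, 3000h       ; assumed contents
ptrB   DWORD OFFSET arrayB            ; pointer to arrayB
ptrW   DWORD OFFSET arrayW            ; pointer to arrayW

.code
main PROC
    mov  esi, ptrB        ; ESI = address of arrayB
    mov  al, [esi]        ; AL = 10, the first byte of arrayB
    mov  esi, ptrW        ; ESI = address of arrayW
    mov  ax, [esi]        ; AX = 1000h, the first word of arrayW
    exit
main ENDP
END main
```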

TYPEDEF Operator

Creates user-defined data types.
Ideal for creating pointer variables.
Syntax:
Type_name TYPEDEF PTR Data_type
i.e
BytePtr TYPEDEF PTR BYTE ;pointer to BYTE
WordPtr TYPEDEF PTR WORD ;Pointer to Word
DWordPtr TYPEDEF PTR DWORD ;Pointer to DWord

Use Case:
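A possible use case, sketched under the assumption that the Irvine32 library is available: the BytePtr type defined above declares pointer variables, one left uninitialized and one pointing to a hypothetical array arrayB.

```asm
INCLUDE Irvine32.inc

BytePtr TYPEDEF PTR BYTE      ; pointer to BYTE

.data
arrayB BYTE 10, 20, 30, 40    ; assumed contents
ptr1   BytePtr ?              ; uninitialized pointer
ptr2   BytePtr arrayB         ; points to arrayB

.code
main PROC
    mov  esi, ptr2        ; ESI = address of arrayB
    mov  al, [esi]        ; AL = 10
    exit
main ENDP
END main
```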

JMP and LOOP instruction


The JMP instruction unconditionally skips the instructions that follow and
continues execution at the jump label. Conditional jumps ( JZ , JC , JS , JP etc. )
jump only when the CPU status flags ( ZF , CF , SF , PF etc. ) have the required values.

The LOOP instruction repeats a cycle of instructions based
on the ecx/cx register value: it decrements ecx and jumps to the label
while ecx is not 0 ( when ecx = 0 the loop terminates ).
JMP Syntax: JMP destination
i.e JMP labelXYZ

Use Case : Endless Loop ( infinite loop )

loopX :
......
.....
jmp loopX

LOOP Syntax: loop destination

mov ecx , 5 ; number of repetitions
label1:
....
....
loop label1

Use case: Print 1 to 5 elements
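One way to sketch this use case, assuming the Irvine32 library and reading "1 to 5" as the numbers 1 through 5: ECX counts the repetitions while EAX holds the number to print.

```asm
INCLUDE Irvine32.inc

.code
main PROC
    mov  ecx, 5           ; loop counter: 5 repetitions
    mov  eax, 1           ; first number to print
L1:
    call WriteDec         ; print EAX as an unsigned decimal
    call Crlf
    inc  eax              ; next number
    loop L1               ; ECX = ECX - 1, repeat while ECX is not 0
    exit
main ENDP
END main
```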



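Nested Loops : a loop within another loop

Both loops count with ecx, so the outer counter must be saved
before the inner loop starts and restored afterwards.

Basic Syntax:
mov ecx , outerCount
label1: ; outer loop
mov count , ecx ; save the outer loop counter
...
mov ecx , innerCount
label2: ; inner loop
...
loop label2
mov ecx , count ; restore the outer loop counter
loop label1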

Use Case: Each PAKISTAN with 2 ZindaBaad
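This use case could be sketched as follows, assuming the Irvine32 library. The outer count of 3 and the variable names are assumptions; the inner loop prints ZindaBaad twice for every PAKISTAN.

```asm
INCLUDE Irvine32.inc

.data
outerMsg BYTE "PAKISTAN",0
innerMsg BYTE "  ZindaBaad",0
count    DWORD ?              ; saves the outer loop counter

.code
main PROC
    mov  ecx, 3               ; outer loop count (assumed value)
L1:
    mov  count, ecx           ; save outer counter before ECX is reused
    mov  edx, OFFSET outerMsg
    call WriteString
    call Crlf
    mov  ecx, 2               ; inner loop count: 2 ZindaBaad per PAKISTAN
L2:
    mov  edx, OFFSET innerMsg
    call WriteString
    call Crlf
    loop L2                   ; repeat inner loop
    mov  ecx, count           ; restore outer counter
    loop L1                   ; repeat outer loop
    exit
main ENDP
END main
```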

.WHILE loop
Another technique for building a loop structure is the .WHILE directive.
Syntax:
.while(condition is true )
....
....
.endw
Use Case: Print PAKISTAN 5 TIMES
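A minimal sketch of this use case, assuming the Irvine32 library; the counter register EBX and the variable name country are choices made here for illustration.

```asm
INCLUDE Irvine32.inc

.data
country BYTE "PAKISTAN",0

.code
main PROC
    mov  ebx, 0               ; counter, updated manually inside the loop
    .WHILE ebx < 5
        mov  edx, OFFSET country
        call WriteString
        call Crlf
        inc  ebx              ; without this the condition never becomes false
    .ENDW
    exit
main ENDP
END main
```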

A .WHILE loop does not change the ecx register by itself, so if you
are using ecx in the condition, you have to update it in the loop body
so that the condition eventually becomes false and the loop terminates.
.while( destination < source )
.while( destination <= source )
.while( destination > source )
.while( destination >= source )
.while( destination != source )
.while( !destination ) all conditions are allowed
Equal-Sign Directive ( = ) : Associates a symbol name with an
integer expression.

Syntax: Symbol_name = expression

i.e lengthofArray = 5
i.e counter = 10 etc..

Where to define a ( = ) symbol :

A ( = ) symbol can be defined in both the .data and .code
sections.

Note : A symbol such as counter can be redefined anywhere in the program.


Current location counter ( $ ) :
Returns the offset associated with the current program
statement.
selfPtr DWORD $ ;contains the offset of selfPtr
Understanding:

array BYTE 1,2,3,4
arraySize = ( $ - array )

Let the current offset ( $ ) = 00406004
and the offset of array = 00406000

( $ - array ) = 00000004, so arraySize is 4

Keep in mind:
If you do not compute the size of the array immediately after the array
declaration, the size will be incorrect.

See Example:
array BYTE 1,2,3,4
count BYTE ?
arraySize = ( $ - array )
Assuming the offsets from the example above,
arraySize = 5, because count adds one byte between array and $.
Number of Array Elements : 32-bit and 16-bit
WordArray WORD 1,2,3,4
DwordArray DWORD 1,2,3,4

To get the number of elements of a 16-bit array
( the symbol must be defined immediately after the array ) :

arraySize = ($-WordArray) / 2
Why divide by 2?

( $ - WordArray ) is the size of the array in bytes,
and each element is 16 bits ( 2 bytes ) away from the next
in storage.

Elements       1    2    3    4
Offsets (hex)  0000 0002 0004 0006

If the offset of WordArray = 0x0000,
the current offset ( $ ) = 0x0008
then
arraySize = ( 0x0008 - 0x0000 ) / 2 = 8 / 2 = 4 elements
To get the number of elements of a 32-bit array:
arraySize = ($-DwordArray) / 4

Why divide by 4?

Elements       1    2    3    4
Offsets (hex)  0000 0004 0008 000C
If the offset of DwordArray = 0x0000,
the current offset ( $ ) = 0x0010 ( 10 hex = 16 decimal )
then arraySize = ( 0x0010 - 0x0000 ) / 4 = 16 / 4 = 4 elements
NOTE : I have tried to solve and debug the examples used in
this sheet with the help of the MASM assembler. BUT being
human I would say : "Mistakes are made and solved by humans".
Regards: Z.A Sandhu

THANKS
