Intel Architecture
References
www.intel.com/design/pentiumii/manuals/
Number Systems
Decimal-to-Hexadecimal:
420.625₁₀ = 420₁₀ + .625₁₀
420.625₁₀ = 1A4.A₁₆
4135₁₀ = 1027₁₆
625.625₁₀ = 271.A₁₆
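The conversions above can be checked with a short Python sketch. It assumes the usual hand method: repeated division by 16 for the integer part and repeated multiplication by 16 for the fraction. The function name `to_hex` and the `frac_digits` limit are illustrative choices, not part of the slides.

```python
def to_hex(value, frac_digits=4):
    """Convert a non-negative decimal number to a hexadecimal string."""
    digits = "0123456789ABCDEF"
    whole = int(value)
    frac = value - whole

    # Integer part: divide by 16 repeatedly, collect remainders in reverse.
    int_part = ""
    while True:
        whole, rem = divmod(whole, 16)
        int_part = digits[rem] + int_part
        if whole == 0:
            break

    # Fractional part: multiply by 16 repeatedly, collect the integer parts.
    frac_part = ""
    while frac > 0 and len(frac_part) < frac_digits:
        frac *= 16
        d = int(frac)
        frac_part += digits[d]
        frac -= d

    return int_part + ("." + frac_part if frac_part else "")


print(to_hex(420.625))  # 1A4.A
print(to_hex(4135))     # 1027
print(to_hex(625.625))  # 271.A
```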
Number Systems
One's complement of 0100 1100:
  1111 1111
- 0100 1100
  1011 0011
Two's complement (one's complement + 1):
  1011 0011
+ 0000 0001
  1011 0100
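A minimal Python sketch of the same two steps, assuming an 8-bit width as in the example. The helper names are hypothetical.

```python
def ones_complement(value, bits=8):
    """Subtract the value from all ones (1111 1111 for 8 bits)."""
    mask = (1 << bits) - 1
    return mask - value


def twos_complement(value, bits=8):
    """One's complement plus one, truncated to the given width."""
    mask = (1 << bits) - 1
    return (ones_complement(value, bits) + 1) & mask


def nibbles(value, bits=8):
    """Format a value as binary digits grouped in fours."""
    s = format(value, f"0{bits}b")
    return " ".join(s[i:i + 4] for i in range(0, len(s), 4))


x = 0b01001100
print(nibbles(ones_complement(x)))  # 1011 0011
print(nibbles(twos_complement(x)))  # 1011 0100
```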
The 80x86 MICROPROCESSOR
...ha?
Some buzz words
CISC – Complex Instruction Set Computers
• Refers to number and complexity of instructions
• Improvements included: Multiply and Divide
• The number of instructions increased from
• 45 on the 4004 to:
• 246 on the 8085
• over 20,000 instruction variations on the 8086 and 8088
• Pipelining
• Registers
Inside The 8088/8086…pipelining
• Pipelining
– Two ways to make CPU process information faster:
• Increase the working frequency – technology dependent
• Change the internal architecture of the CPU
[Figure: 8-bit register pairs AH/AL, BH/BL, CH/CL, DH/DL]
Overview
• Registers
– General purpose registers (8)
• Operands for logical and arithmetic operations
• Operands for address calculations
• Memory pointers
– Segment registers (6)
– EFLAGS register
– The instruction pointer register
• The stack
Inside The 8088/8086…registers
• Registers
– To store information temporarily
AX: 16-bit register
AH: 8-bit reg. (high byte), AL: 8-bit reg. (low byte)
Extended Register   Word Register   Bits 8-15   Bits 0-7
EAX                 AX              AH          AL
EBX                 BX              BH          BL
ECX                 CX              CH          CL
EDX                 DX              DH          DL
EBP                 BP
ESI                 SI
EDI                 DI
ESP                 SP
Bits 16-31 exist only in the extended (32-bit) register.
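To illustrate how the word and byte registers overlap the extended register, here is a small Python sketch. It treats a 32-bit value as EAX and extracts AX, AH, and AL with masks and shifts. The function name and the sample value are hypothetical.

```python
def split_register(eax):
    """Return (AX, AH, AL) for a 32-bit EAX value."""
    ax = eax & 0xFFFF        # bits 0-15
    ah = (ax >> 8) & 0xFF    # bits 8-15
    al = ax & 0xFF           # bits 0-7
    return ax, ah, al


ax, ah, al = split_register(0x12345678)
print(f"AX={ax:04X} AH={ah:02X} AL={al:02X}")  # AX=5678 AH=56 AL=78
```

The same masks apply to EBX, ECX, and EDX. EBP, ESI, EDI, and ESP have a 16-bit form but no separately named byte halves in the table above.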
General Registers I
• EAX – ‘Accumulator’
• accumulator for operands and results data
• usually used to store the return value of a procedure
• ECX – ‘Counter’
• counter for string and loop operations
[Figure: stack layout showing ESP, the current stack frame, EBP, the caller's stack frame, and the direction of stack growth]
Intel Assembly
Goal: to gain a knowledge of Intel 32-bit assembly instructions
References:
• M. Pietrek, “Under the Hood: Just Enough Assembly Language to Get By”, MSJ Article, February 1998, www.microsoft.com/msj
• M. Pietrek, “Under the Hood: Just Enough Assembly Language to Get By, Part II”, MSJ Article, June 1998, www.microsoft.com/msj
• Assembly Language
• mnemonics
• assembler
• High-Level Language
• Pascal, Basic, C
• compiler
Assembly Language Programming
What Does It Mean to Disassemble Code?
[Figure: Source Code → (Preprocessing & Compiling) → Assembly Code → (Assembly); DLLs]
Why is Disassembly Useful in Malware Analysis?
• It is not always desirable to execute malware: disassembly provides a static analysis.
• Logical address:
– Consists of a CS (code segment) and an IP (instruction pointer); the format is CS:IP
• Offset address
– IP contains the offset address
• Physical address
– generated by shifting the CS left one hex digit and then adding it to the IP
– the resulting 20-bit address is called the physical address
give me some numbers…ok
Program Segments…example
Suppose we have:
CS 2500
IP 95F3
• Logical address:
– Consists of a CS (code segment) and an IP (instruction pointer); the format is CS:IP, here 2500:95F3H
• Offset address
– IP contains the offset address, which is 95F3H
• Physical address
– generated by shifting the CS left one hex digit and then adding it to the IP
25000H + 95F3H = 2E5F3H
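The calculation can be sketched in Python. Shifting left one hex digit is a shift of 4 bits, and the result is kept to 20 bits. The function name is an illustrative choice.

```python
def physical_address(segment, offset):
    """8086 real-mode address: (segment shifted left 4 bits) + offset, 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


cs, ip = 0x2500, 0x95F3
print(f"{cs:04X}:{ip:04X} -> {physical_address(cs, ip):05X}")  # 2500:95F3 -> 2E5F3
```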
Program Segments
Data segment
Example:
Suppose DS:6826 = 48H and DS:6827 = 22H.
Show the contents of register BX after the instruction MOV BX,[6826]
Little-endian convention: the low address holds the low byte, so BL = 48H and BH = 22H (BX = 2248H)
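A short Python sketch of the same load, assuming memory is modelled as a dictionary from offset to byte. The `read_word` helper is hypothetical.

```python
def read_word(memory, offset):
    """Read a 16-bit little-endian word: low byte first, high byte second."""
    low = memory[offset]
    high = memory[offset + 1]
    return (high << 8) | low


memory = {0x6826: 0x48, 0x6827: 0x22}
bx = read_word(memory, 0x6826)
print(f"BX={bx:04X} BH={bx >> 8:02X} BL={bx & 0xFF:02X}")  # BX=2248 BH=22 BL=48
```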
Program Segments
Stack segment
Stack
A section of RAM used by the CPU to store information temporarily.
Flag Register and ADD instruction
• The zero flag is set (ZF=1) when the counter becomes zero (CX=0)
;PA = DS (shifted left) + DI + 20
;PA = DS (shifted left) + BX + DI + 8
;PA = SS (shifted left) + BP + SI + 29
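The three formulas above can be evaluated with a Python sketch. The register values below are hypothetical, chosen only to make the arithmetic concrete, and the displacements 20, 8, and 29 are read as decimal. The offset is assumed to wrap at 16 bits before the shifted segment is added.

```python
def physical_address(segment, *offset_parts):
    """Segment shifted left 4 bits plus a 16-bit effective offset."""
    effective = sum(offset_parts) & 0xFFFF       # offset wraps at 64KB
    return ((segment << 4) + effective) & 0xFFFFF


# Hypothetical register contents, not taken from the slides.
DS, SS = 0x4500, 0x2000
BX, BP, SI, DI = 0x2100, 0x7814, 0x1486, 0x8500

print(f"{physical_address(DS, DI, 20):05X}")      # 4D514
print(f"{physical_address(DS, BX, DI, 8):05X}")   # 4F608
print(f"{physical_address(SS, BP, SI, 29):05X}")  # 28CB7
```

Note that the BP-based form uses SS as its segment, while the BX- and DI-based forms use DS, matching the formulas above.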
Assembly Language Programming
Assembly Programming
• An assembly language instruction consists of four fields:
[label:] mnemonic [operands] [;comment]
• Label
• See rules
• Mnemonic, operands
• MOV AX, 6764
• Comment
• ; this is a sample program
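As a rough illustration of the four fields, the Python sketch below splits one source line into label, mnemonic, operands, and comment. It is a simplified assumption of how a line is laid out (label ends with a colon, operands are comma-separated, comment starts at the semicolon), not how a real assembler parses. The label `START` is a made-up example.

```python
def parse_line(line):
    """Split an assembly source line into its four fields."""
    code, _, comment = line.partition(";")   # comment starts at ';'
    tokens = code.strip().split(None, 1)

    label = ""
    if tokens and tokens[0].endswith(":"):   # optional label field
        label = tokens[0][:-1]
        tokens = tokens[1].split(None, 1) if len(tokens) > 1 else []

    mnemonic = tokens[0].upper() if tokens else ""
    operands = [op.strip() for op in tokens[1].split(",")] if len(tokens) > 1 else []
    return {"label": label, "mnemonic": mnemonic,
            "operands": operands, "comment": comment.strip()}


print(parse_line("START: MOV AX, 6764 ; this is a sample program"))
```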
Model Definition
The .MODEL directive selects the size of the memory model
• .MODEL MEDIUM
• Data must fit into 64KB
• Code can exceed 64KB
• .MODEL COMPACT
• Data can exceed 64KB
• Code cannot exceed 64KB
• .MODEL LARGE
• Data can exceed 64KB (but no single set of data should exceed 64KB)
• Code can exceed 64KB
• .MODEL HUGE
• Data can exceed 64KB (single data items, e.g. arrays, can exceed 64KB)
• Code can exceed 64KB
• .MODEL TINY
• Data must fit into 64KB
• Code must fit into 64KB
• Used with COM files
Segments
Segment definition:
The 8086/8088 CPU has four segment registers: CS, DS, SS, ES (the 80386 and later add FS and GS, for six in total)
Segments of a program:
.STACK ; marks the beginning of the stack segment
example:
.STACK 64 ;reserves 64 bytes of memory for the stack
PAGE [lines],[columns]
• Tells the printer how the listing should be printed
• Default mode is 66 lines per page with 80 characters per line
• The range for number of lines is 10 to 255 and for columns is 60 to 132
TITLE
• Prints the title of the program
• The text after the TITLE pseudo-instruction cannot be more than 60 ASCII characters
Control Transfer Instructions
• Conditional Jumps
• Short Jump
– On the 8086/8088, all conditional jumps are short jumps
– The address of the target must be within –128 to +127 bytes of the IP
– The conditional jump is a two-byte instruction:
• One byte is the opcode of the J condition
• The second byte is a signed displacement between 00 and FF
- 256 possible addresses:
- forward jump up to +127
- backward jump down to –128
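A minimal Python sketch of how the second byte could be computed. It assumes the displacement is measured from the IP after the two-byte jump has been fetched, that is, from the address of the next instruction. The function name and the sample addresses are hypothetical.

```python
def short_jump_displacement(jump_address, target_address):
    """Return the one-byte displacement for a two-byte short jump."""
    next_ip = jump_address + 2               # IP already points past the jump
    disp = target_address - next_ip
    if not -128 <= disp <= 127:
        raise ValueError(f"target out of short-jump range: {disp}")
    return disp & 0xFF                       # two's complement byte, 00 to FF


print(f"{short_jump_displacement(0x0010, 0x0020):02X}")  # 0E (forward)
print(f"{short_jump_displacement(0x0010, 0x0004):02X}")  # F2 (backward)
```

A backward jump yields a byte of 80H or above, which is the two's complement form of a negative displacement.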
Control Transfer Instructions
Ref: https://thestarman.pcministry.com/asm/2bytejumps.htm
Control Transfer Instructions
example1:
number 5₁₀ (101₂) will be 0000 0101
example2:
number 514₁₀ (10 0000 0010₂) will be 0000 0010 0000 0010
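The zero-padding in the two examples can be reproduced with a short Python sketch, assuming the first value is stored in a byte (8 bits) and the second in a word (16 bits). The helper name `as_bits` is illustrative.

```python
def as_bits(value, width):
    """Zero-pad a value to the given bit width, grouped in fours."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the given width")
    s = format(value, f"0{width}b")
    return " ".join(s[i:i + 4] for i in range(0, width, 4))


print(as_bits(5, 8))     # 0000 0101
print(as_bits(514, 16))  # 0000 0010 0000 0010
```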
Data Types and Data Definition
• Assembler data directives
• Accessing a structure member:
MOV [structVar.var1], 20 ;move 20 into var1 of structVar (mystruct)
Segment Directive
• The SEGMENT directive identifies the start of a memory segment and ENDS identifies the end of a segment when full-segment definitions are in use.
Syntax:
<logical-segment_name> SEGMENT
..
..
<logical-segment_name> ENDS
• E.g.:
mySegment SEGMENT
MOV AX,BX
..
..
mySegment ENDS
Assume Directive
• The ASSUME statement tells the assembler what names have been chosen for the code, data, extra, and stack segments.
• Without the ASSUME statement, the assembler assumes nothing and automatically uses a segment override prefix on all instructions that address memory data.
• The ASSUME statement is only used with full-segment definitions.
Syntax:
ASSUME <Physical-Segment>:<logical-segment_name>
E.g.:
mySegment SEGMENT
ASSUME CS:mySegment ;tells the assembler that CS refers to mySegment
MOV AX,BX
mySegment ENDS
Standard I/O
• DOS function calls are used for standard input/output in 8086 assembly language.
• To use a DOS function call in a DOS program:
1. Place the function number in AH (8-bit register) and any other required data in other registers.
2. Once everything is loaded, execute the INT 21H instruction to perform the task.
• After a DOS function executes, it may return results in specific registers.
• 01H: Read the Keyboard
– This function waits until a character is input from the keyboard.
– Returns the ASCII key code of the character in the AL register.
E.g.:
MOV AH,01H ;load DOS function number in AH
INT 21H ;access DOS
;returns with AL = ASCII key code
Standard I/O (CONT..)
• 02H: Write to Standard Output device
• COM files
• Smaller in size (max of 64KB)
• Does not have header block
• EXE files
• Unlimited size
• Do have a header block (512 bytes of memory, contains information such as size, address location in memory, stack address)
Converting from EXE to COM
2. Assemble
3. Link