Assembly Language Abbreviated
Donaldson
REGISTERS
GENERAL PURPOSE REGISTERS
Also called data registers, the general purpose registers are used for arithmetic and data movement. Each 16-bit register can also be addressed as two 8-bit halves, and each 32-bit register contains a 16-bit register as its lower half. For example, AX is a 16-bit register: its upper 8 bits are called AH and its lower 8 bits are called AL. Bit 0 in AL corresponds to bit 0 in AX, and bit 0 in AH corresponds to bit 8 in AX. The upper halves of the 32-bit registers do not have names. To put a 16-bit value into the upper half of a 32-bit register, first put the value in the lower half (such as AX) and then shift the bits left by 16 positions into the upper half of the register.
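The register layout described above can be modeled in a few lines of Python. This is only a sketch: the class name Reg32 and its attribute names are made up for illustration, and it imitates the bit layout of EAX/AX/AH/AL rather than any real CPU interface.

```python
class Reg32:
    """Toy model of EAX and its named sub-registers AX, AH, and AL."""

    def __init__(self, value=0):
        self.eax = value & 0xFFFFFFFF

    @property
    def ax(self):              # AX is the low 16 bits of EAX
        return self.eax & 0xFFFF

    @ax.setter
    def ax(self, value):       # writing AX leaves the upper half untouched
        self.eax = (self.eax & 0xFFFF0000) | (value & 0xFFFF)

    @property
    def ah(self):              # AH is bits 8-15 of AX
        return (self.eax >> 8) & 0xFF

    @property
    def al(self):              # AL is bits 0-7 of AX
        return self.eax & 0xFF

    def shl(self, count):      # shift left; bits leaving the register are lost
        self.eax = (self.eax << count) & 0xFFFFFFFF


r = Reg32()
r.ax = 0x1234    # put the value in the lower half (AX)
r.shl(16)        # shift it into the unnamed upper half
r.ax = 0xABCD    # AX can now be loaded with a different value
print(hex(r.eax), hex(r.ah), hex(r.al))
```

The last three statements follow the procedure from the text: load the lower half, shift left by 16, then reuse the lower half.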
SI (source index) SI takes its name from the string movement instructions, in which the source string is pointed to by the SI register.
DI (destination index) DI acts as the destination for string movement instructions.
BX (base) BX can hold the address of a procedure or variable. It can also be used for arithmetic and data movement.
CX (counter) CX acts as a counter for repeating or looping instructions. These instructions automatically repeat and decrement CX.
EBP - 32 bit BP. ESP - 32 bit SP. ESI - 32 bit SI. EDI - 32 bit DI.
EAX - 32 bit AX. EBX - 32 bit BX. ECX - 32 bit CX. EDX - 32 bit DX.
STATUS AND CONTROL REGISTERS
IP (instruction pointer) IP always contains the offset of the next instruction to be executed within the current code segment. IP and CS combine to form the complete address of the next instruction.
FLAGS FLAGS is a special register with individual bit positions assigned to show the status of the CPU or the results of arithmetic operations.
EFLAGS - 32 bit FLAGS. EIP - 32 bit IP.
SEGMENT REGISTERS
The segment registers are used as base locations for program instructions, data, and the stack. All references to memory involve a segment register used as a base location.
FLAGS
CONTROL FLAGS
Individual bits can be set in the FLAGS register to control the CPU operation.
DF (direction) DF affects block data transfer instructions such as MOVS, CMPS, and SCAS. The flag values are 1 = down (addresses decrement) and 0 = up (addresses increment).
CS (code segment) CS holds the base location of all executable instructions (code) in the program.
STATUS FLAGS
The Status flags reflect the outcomes of arithmetic and logical operations.
INDEX REGISTERS
Index registers contain the offsets of data and instructions.
Carry flag would equal 1. The flag values are 1 = carry and 0 = no carry.
SF (sign) SF is set when the result of an arithmetic or logical operation is negative. Because a negative number always has a 1 in the highest bit position, the Sign flag is always a copy of the destination's sign bit. The flag values are 1 = negative and 0 = positive.
ZF (zero) ZF is set when the result of an arithmetic or logical operation is zero. The flag is used primarily by jump and loop instructions to allow branching to a new location in a program based on the comparison of two values. The flag values are 1 = zero and 0 = not zero.
Auxiliary Carry Auxiliary Carry is set when an operation causes a carry (or borrow) from bit 3 to bit 4 of an operand. The flag values are 1 = carry and 0 = no carry.
DB allocates storage for one or more 8-bit values. DW allocates storage for one or more 16-bit values. DD allocates storage for one or more 32-bit doublewords.
char1       db 'A'
char2       db 'A'-10
signed1     db -128
signed2     db +127
unsigned1   db 255
val1        db ?
list        db 10, 20, 30, 40
Cstring     db "Good afternoon", 0
Pstring     db 14, "Good afternoon"
LongString  db "This is a long string that "
            db "takes more than one line."
            db 20 dup(0)
            db 4 dup("ABC")
            dw 65535
            dw 256, 258, 259
            dw list
            dd 100h
            dd subroutine1
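As a rough illustration of how much storage these directives reserve, the following Python sketch packs values the way DB, DW, and DD lay them out in memory. The helper names db, dw, and dd are hypothetical stand-ins, and the sketch assumes the x86 convention of storing the low-order byte first (little-endian).

```python
import struct


def db(*values):
    """One byte per value, like DB."""
    return b"".join(struct.pack("<B", v & 0xFF) for v in values)


def dw(*values):
    """Two bytes per value, low byte first, like DW."""
    return b"".join(struct.pack("<H", v & 0xFFFF) for v in values)


def dd(*values):
    """Four bytes per value, low byte first, like DD."""
    return b"".join(struct.pack("<I", v & 0xFFFFFFFF) for v in values)


print(db(10, 20, 30, 40).hex())   # four bytes
print(dw(256, 258, 259).hex())    # three words, six bytes
print(dd(0x100).hex())            # one doubleword, four bytes
print(len(db(0) * 20))            # like "db 20 dup(0)"
```

Masking each value before packing lets the same helper accept signed values such as -128, which is stored as the byte 80h.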
Parity Parity reflects the number of 1 bits in the result of an operation. If there is an even number of 1 bits, the parity is even; if there is an odd number of 1 bits, the parity is odd. This flag is used by the operating system to verify memory integrity and by communications software to verify the correct transmission of data.
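The status flags described above can be sketched for a single 8-bit addition. The function name add8_flags is hypothetical, and the sketch assumes the usual x86 convention that the Parity flag is 1 when the result has an even number of 1 bits.

```python
def add8_flags(a, b):
    """Add two 8-bit values and report the resulting status flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                     # carry out of bit 7
        "ZF": int(result == 0),                      # result is zero
        "SF": result >> 7,                           # copy of the sign bit
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),   # carry from bit 3 to bit 4
        "PF": int(bin(result).count("1") % 2 == 0),  # even number of 1 bits
    }
    return result, flags


print(add8_flags(0xFF, 0x01))
print(add8_flags(0x7F, 0x01))
```

Adding 1 to FFh wraps to zero and sets both Carry and Zero, while adding 1 to 7Fh produces 80h, which sets the Sign flag without a carry.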
SYMBOLIC CONSTANTS
Equal Sign (=) Known as the redefinable equate, the equal sign directive creates an absolute symbol by assigning the value of a numeric expression to a name. It allocates no storage; the assembler replaces every occurrence of the name with the value of the expression. A name can be redefined any number of times.
MICROSOFT ASSEMBLER (MASM)
ml /Zi hello.asm          ; assemble and link
ml /Zi /Fl /Fm hello.asm  ; assemble and link
cv hello                  ; debug
To link to a library file with MASM:
ml /Zi /Fl /Fm file.asm /link /co e:\masm611\lib\irvine
Zi = produces object file with debugging information
Fl = listing file
Fm = map file
Zm = necessary for masm511
Irvine.lib = library file name
Batch files asmTasm and asmMasm can be used to control the assembly and linking process.
prod   = 10 * 5
string = 'XY'
maxInt = 7fffh
count  = 5
EQU EQU assigns a symbolic name to a string or numeric constant. A symbol defined with EQU cannot be redefined later in the program. String equates may be enclosed in angle brackets (<>) to ensure their interpretation as string expressions.
count  equ 10 * 20
float1 equ <2.345>
TEXTEQU TEXTEQU creates what is called a text macro. You can assign a sequence of characters to a symbolic name and then use the name later in the program.
continueMsg textequ <"Continue (Y/N)?">
.data
prompt1 db continueMsg
.data
myString db "A string",0
.code
p1 textequ <offset myString>
mov bx,p1 ; bx = offset myString
p1 textequ <0>
mov si,p1 ; si = 0
add cl,al
add bx,1000h
add var1,ax
add dx,var1
add var1,10
add dword ptr memVal,ecx
SUB (subtract) SUB subtracts a source operand from a destination operand. The sizes of the two operands must match, and only one can be a memory operand. Inside the CPU, the source is first negated and then added to the destination.
INSTRUCTIONS
DATA TRANSFER
MOV (move data) MOV copies data from one operand to another. The sizes of both operands must be the same (for example, a 16-bit register can only be moved to a 16-bit memory location).
mov reg,reg   ; reg = register
mov mem,reg   ; mem = memory address
mov reg,mem
mov reg,immed ; immed = immediate data
mov mem,immed
mov ax,[si]
mov eax,ebx
mov cl,20h
mov si,offset var1
mov dl,'X'
mov ax,(40 * 50)
sub eax,12345h
sub cl,al
sub edx,eax
sub bx,1000h
sub var1,ax
sub dx,var1
sub var1,10
XADD (exchange and add) XADD, for the 80486, adds the source and destination operands and stores the sum in the destination. At the same time, the original value in the destination is moved to the source operand.
mov ax,1000h
mov bx,2000h
xadd ax,bx ; AX = 3000h, BX = 1000h
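The effect of XADD on its two operands can be checked with a small model. The function name xadd16 is hypothetical; it simply mirrors the description above for 16-bit operands and ignores the flags.

```python
def xadd16(dest, src):
    """Return (new_dest, new_src) after a 16-bit XADD dest,src."""
    total = (dest + src) & 0xFFFF   # the sum goes to the destination
    return total, dest              # the old destination goes to the source


ax, bx = 0x1000, 0x2000
ax, bx = xadd16(ax, bx)
print(hex(ax), hex(bx))
```

With AX = 1000h and BX = 2000h this reproduces the values given in the comment of the assembly example.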
XCHG (exchange data) XCHG exchanges the contents of two registers, or the contents of a register and a variable.
xchg reg,reg ; reg = register
xchg reg,mem ; mem = memory address
xchg mem,reg
xchg var1,bx ; exchange memory operand with register
MOVZX (move with zero-extend) MOVZX, for the 80386, moves an 8-bit or 16-bit operand into a larger 16-bit or 32-bit destination register. The unfilled bits in the destination register are cleared to zero.
JMP AND LOOP
JMP (jump) JMP tells the CPU to continue execution at a different location. The location must be identified by a label, which is translated by the assembler into an address. There are three formats:
SHORT: Jump to a label in the range -128 to +127 bytes from the current location (in the same code segment). An 8-bit signed value is added to IP.
NEAR PTR: Jump to a label anywhere in the current code segment. A 16-bit displacement is added to IP.
FAR PTR: Jump to a label in another segment. The label's segment address is moved to CS, and its offset is moved to IP.
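How the 8-bit signed value of a SHORT jump is applied can be sketched as follows. The function name short_jump_target is hypothetical, and the sketch assumes the displacement is relative to the IP value after the jump instruction itself has been fetched.

```python
def short_jump_target(ip_after_jump, displacement_byte):
    """Return the new IP after a SHORT jump with the given displacement byte."""
    # Interpret the byte as a signed value in the range -128 to +127.
    if displacement_byte & 0x80:
        displacement = displacement_byte - 0x100
    else:
        displacement = displacement_byte
    return (ip_after_jump + displacement) & 0xFFFF   # IP is 16 bits wide


# A two-byte jump at offset 0100h leaves IP at 0102h before the displacement is added.
print(hex(short_jump_target(0x0102, 0xFE)))   # displacement -2: jumps back to itself
print(hex(short_jump_target(0x0102, 0x7F)))   # displacement +127: farthest forward
```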
MOVSX (move with sign-extend) MOVSX, for the 80386, moves an 8-bit or 16-bit source operand into a larger destination register and sign-extends it: the upper bits of the destination are filled with copies of the source's sign bit.
ARITHMETIC
INC (increment) and DEC (decrement) INC and DEC add 1 to or subtract 1 from a single operand. The destination can be a register or memory operand. All status flags are affected except the Carry flag.
inc al
dec bx
inc membyte
dec byte ptr membyte
dec memword
inc word ptr memword
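The difference between the zero extension of MOVZX and the sign extension of MOVSX can be illustrated with a short model. The function names and the explicit bit-width parameters are assumptions made for the sketch; the real instructions take their sizes from the operands.

```python
def movzx(value, src_bits):
    """Zero-extend: keep the source bits, leave the upper bits at zero."""
    return value & ((1 << src_bits) - 1)


def movsx(value, src_bits, dest_bits):
    """Sign-extend: fill the upper bits with copies of the source's sign bit."""
    src_mask = (1 << src_bits) - 1
    dest_mask = (1 << dest_bits) - 1
    value &= src_mask
    if value >> (src_bits - 1):          # sign bit of the source is 1
        value |= dest_mask & ~src_mask   # set every bit above the source
    return value


print(hex(movzx(0x8F, 8)))       # 8Fh stays positive
print(hex(movsx(0x8F, 8, 16)))   # 8Fh is negative as a byte, so the upper byte becomes FFh
print(hex(movsx(0x7F, 8, 16)))   # 7Fh is positive, so the upper byte stays 00h
```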
LOOP (loop) LOOP is used to repeat a block of statements a specific number of times. CX is automatically used as a counter and is decremented each time the loop repeats.
mov cx,5   ; cx is the loop counter
start:
.
loop start ; jump to start
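The counting behavior of LOOP can be simulated to see how many times the body runs. The function name loop_iterations is hypothetical; the sketch assumes that LOOP decrements CX first and then jumps only if CX is not zero, which is why a starting value of 0 wraps around instead of skipping the loop.

```python
def loop_iterations(cx):
    """Count how many times a LOOP-terminated block runs for a starting CX."""
    cx &= 0xFFFF
    iterations = 0
    while True:
        iterations += 1          # the loop body runs once per pass
        cx = (cx - 1) & 0xFFFF   # LOOP decrements the 16-bit counter
        if cx == 0:              # and falls through once CX reaches zero
            return iterations


print(loop_iterations(5))
print(loop_iterations(0))
```

The second call shows why CX is normally checked before entering such a loop: starting at zero, the counter wraps and the body runs 65536 times.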
The Megabyte - The 8088 and 8086 CPUs can see a full megabyte of memory. They have 20 address pins and can pass a full 20-bit address (one megabyte) to the memory system. The address of a byte in a memory bank is just the number of that byte, starting from zero. The addresses in a megabyte of memory run from 00000H to 0FFFFFH. A megabyte of memory is some arrangement of memory chips within the computer, connected by an address bus of 20 lines.
MEMORY MODELS
Model    Description
Tiny     Code and data combined must be less than 64K. Creates a COM program.
Small    Code <= 64K, data <= 64K. One code segment, one data segment.
Medium   Data <= 64K, code any size. Multiple code segments, one data segment.
Compact  Code <= 64K, data any size. One code segment, multiple data segments.
Large    Code > 64K, data > 64K. Multiple code and data segments.
Huge     Same as the Large model, except that individual variables such as arrays may be larger than 64K.
Flat     No segments. 32-bit addresses are used for both code and data. Protected mode only.
OPERATORS
Operator                Description
.TYPE                   Returns a byte that defines the mode and scope of an expression. The result is bit mapped and is used to show whether a label or variable is program related, data related, undefined, or external in scope.
+, -, *, /              Addition, subtraction, multiplication, and division.
AND, OR, NOT            Bitwise operations on constant integers.
EQ, NE, LT, LE, GT, GE  Relational operators. The assembler returns a value of 0FFFFh when a relation is true or 0 when it is false.
HIGH                    Returns the high 8 bits of a constant expression.
HIGHWORD                Returns the high 16 bits of a 32-bit operand (MASM only).
LENGTH                  Returns the number of byte, word, dword, qword, or tenbyte elements in a variable. This is meaningful only if the variable is initialized with the DUP operator.
LOW                     Returns the low 8 bits of a constant expression.
LOWWORD                 Returns the low 16 bits of a 32-bit operand (MASM only).
MASK                    Returns a bit mask for the bit positions in a field within a variable. A bit mask preserves just the important bits, setting all others equal to zero. The variable must be defined with the RECORD directive.
MOD                     Modulus operator. Returns the integer remainder of a division operation.
OFFSET                  Returns the offset of a label or variable from the beginning of its segment.
PTR                     Specifies the size of an operand, particularly when its size is not clear from the context.
SEG                     Returns the segment value of an expression, whether it be a variable, a segment/group name, a label, or any other symbol.
SHORT                   Sets a label's attribute to SHORT. Often used in JMP instructions.
SIZE                    Returns the total number of bytes allocated for a variable. This is calculated as the LENGTH multiplied by the TYPE. The type of a near label is FFFFh, and the type of a far label is FFFEh.
Field (.)               The name following (.) identifies a field within a predefined structure by adding the offset of the field to the offset of the variable. The format is variable.field.
THIS                    Creates an operand of a specified type at the current program location. The type can be any of those used with the PTR operator or the LABEL directive.
TYPE                    Returns an integer that represents either the size of a variable or its type. For example, the TYPE of a word variable is 2.
WIDTH                   Returns the number of bits of a given field within a variable that has been declared with the RECORD directive.

SEGMENT:OFFSET
In real-mode x86 CPUs, memory addresses are composed of two parts: the segment address and the offset. The two are combined to produce the "real" (linear) address of the memory location by shifting the segment address one hex digit to the left (which is the same as multiplying it by 16) and then adding the offset to it. The address itself is often referred to using the notation segment:offset. A common way to refer to C8000h is C000:8000. To get the linear address, take C000, shift it one digit to the left to get C0000, and then add 8000 to get C8000. However, C800:0000 results in the same linear address.
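The SEGMENT:OFFSET calculation can be verified with a one-line function. The name linear_address is hypothetical, and masking the result to 20 bits is an assumption that models the 20 address lines of the 8086 and 8088.

```python
def linear_address(segment, offset):
    """Combine a real-mode segment and offset into a 20-bit linear address."""
    # Shifting left by 4 bits is one hex digit, the same as multiplying by 16.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(linear_address(0xC000, 0x8000)))
print(hex(linear_address(0xC800, 0x0000)))
```

Both calls give the same linear address, which shows that many different segment:offset pairs can name one byte of memory.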
DIRECTIVES
Directive  Description
RECORD     Defines a record type made up of named bit fields.
LABEL      Allows you to insert a label and give it a size attribute, without allocating any storage. Any of the standard size attributes can be used. Often used as an alias. A way to get around the assembler's requirement that the size attribute of a variable must match the other operand in an instruction. A typedef.
EVEN       Aligns the next instruction in the code segment to an even 16-bit offset. Inserts a null byte (0) before an array.
ENUM       Tasm only. Allows you to define a set of enumerated constants and automatically assign an integer value to each one. By default, the first constant is assigned a value of 0, the second a value of 1, the third a value of 2, etc.
.MODEL     Indicates memory model type and size.
.STACK     Sets aside the stated number of bytes of stack space for the program.
.DATA      Marks the beginning of the data segment, where variables are stored.
.CODE      Marks the beginning of the code segment, where executable instructions are located.
.8086      Enables assembly of 8086, 8087, and 8088 instructions. Disables assembly of instructions for the 80186 and later processors.
.186       Enables assembly of 80186 instructions and disables assembly of instructions for all later processors.
.286       Enables assembly of 80286 instructions and disables assembly of instructions for all later processors.
.386       Enables assembly of 80386 instructions and disables assembly of instructions for all later processors.
.486       Enables assembly of 80486 instructions and disables assembly of instructions for all later processors.
.586       Enables assembly of nonprivileged Pentium instructions.
.287       Enables assembly of floating point instructions for the 80287 math coprocessor.
.387       Enables assembly of floating point instructions for the 80387 math coprocessor.
.286P      Enables assembly of privileged mode instructions of the 80286 and disables assembly of privileged mode instructions for all later processors.
.386P      Enables assembly of privileged mode instructions of the 80386 and disables assembly of privileged mode instructions for all later processors.
.486P      Enables assembly of privileged mode instructions of the 80486 and disables assembly of privileged mode instructions for all later processors.
.586P      Enables assembly of privileged mode instructions of the 80586 and disables assembly of privileged mode instructions for all later processors.
EXTRN      Used to identify source names that exist outside the current source file.
PROC       Begin procedure.
ENDP       End of procedure.
END        End of program assembly.
PAGE       Set a page format for the listing file.
TITLE      Title of the listing file.
BYTE   Size is 8 bits
WORD   Size is 16 bits
DWORD  Size is 32 bits
FWORD  Size is 48 bits
QWORD  Size is 64 bits
TBYTE  Size is 10 bytes
DEBUG COMMANDS
? (Help) Press ? at the debug prompt to see a list of all commands.
A (Assemble) Assemble a program into machine language. If only the offset portion of an address is supplied, it is assumed to be an offset from CS. Command formats:
A
A address
Examples:
A
A 100
A DS:2000
Dump 128 bytes from last reference
Dump DS:0150 through 015A
Dump bytes at offsets 0-5 from SS
Dump 128 bytes at offset zero from segment 0915h
Dump offsets 0-200 from DS
Dump 20 bytes starting at offset 100h from DS
Example To begin entering hexadecimal or character data at DS:100 type E 100. Press the space bar to advance to the next byte, and press the Enter key to stop. To enter a string into memory starting at location CS:100 type E CS:100 "This is a string."
F (Fill) Fills a range of memory with a single value or list of values. The range must be specified as two offset addresses or segment-offset addresses. Command format:
F range list
Examples:
F 100 500,' '    Fill locations 100 through 500 with spaces
F 100 L 20 'A'   Fill 20h bytes with letter A, starting at 100
Load named file into memory at CS:0100
Load named file into memory at DS:0200
5 sectors drive C starting at sector 0Ah
2 sectors at CS:100 from A at sector 0
subroutine calls, the P command simply executes subroutines. Also, LOOP instructions and string primitive instructions (SCAS, LODS) are executed completely up to the instruction that follows them. Command formats:
P
P =address
P =address number
Examples:
P =200
P =150 6
P6
Execute a single instruction at CS:0200
Execute 6 instructions starting at CS:0150
Execute next 5 instructions
Disassemble the next 32 bytes
Disassemble 32 bytes at CS:0
Disassemble bytes CS:100 to CS:108
W (Write) Writes a block of memory to a file or to individual disk sectors. To write to a file, its name must first be initialized with the N (Name) command. The command format is identical to the L (Load) command format:
W address
W address drive firstsector number
Examples:
W           Write 20h bytes to file starting at CS:100
W 0         Write from location CS:0 to the file
W           Write named file from location CS:0100
W DS:0200   Write named file from location DS:0200
Q (Quit) Quits Debug and returns to DOS.
R (Register) Displays the contents of one register and allows it to be changed; displays all registers, the flags, and the next instruction to be executed; or displays the eight flag settings and allows any or all of them to be changed. Command formats:
R
R register
Examples:
R      Display contents of all registers
R IP   Display contents of IP and prompt for a new value
R CX   Display contents of CX register
RF     Display all flags and prompt for a new flag value
S (Search) Searches a range of addresses for a sequence of one or more bytes. Command format:
S range list
Examples:
S 100 1000 0D       Search DS:100-DS:1000 for 0Dh
S 100 1000 CD,20    Search for sequence CD 20
S 100 9FFF "COPY"   Search for word COPY
T (Trace) Executes one or more instructions starting at either the current CS:IP location or at an optional address. The contents of the registers are shown after each instruction is executed. Command formats:
T
T count
T =address count
Examples:
T           Trace the next instruction
T5          Trace the next five instructions
T =105 10   Trace 16 instructions starting at CS:105