CSE 232 Systems Programming: Lecture Notes #2
Advantages of coding in assembly language:
- Provides more control over handling particular hardware components
- May generate smaller, more compact executable modules
- Often results in faster execution

Disadvantages:
- Not portable
- More complex
- Requires understanding of hardware details (interfaces)
Assembler

An assembler does the following:
1. Generates machine instructions
   - evaluates the mnemonics to produce their machine code
   - evaluates the symbols, literals, and addresses to produce their equivalent machine addresses
   - converts the data constants into their machine representations
2. Processes pseudo-operations
2. Two Pass Assembler

A two-pass assembler performs two sequential scans over the source code:
Pass 1: symbols and literals are defined
Pass 2: object program is generated

Parsing: scanning through the program lines to pull out the op-codes and operands
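The parsing step can be illustrated with a minimal sketch. It assumes the statement layout used in the examples of these notes (an optional label ending in a colon, a mnemonic, and an optional operand); the function name parse_line is hypothetical and comment handling is left out.

```python
def parse_line(line):
    """Split one source statement into (label, mnemonic, operand).

    Assumed layout (from the examples in these notes):
        [LABEL:] MNEMONIC [OPERAND]
    Missing parts are returned as None.
    """
    line = line.strip()
    label = None
    head, colon, rest = line.partition(":")
    if colon:                                  # a label is present
        label, line = head.strip(), rest.strip()
    parts = line.split(None, 1)                # mnemonic, then the rest
    if not parts:
        return label, None, None
    mnemonic = parts[0].upper()
    # Remove blanks inside the operand so "LIST, X" becomes "LIST,X".
    operand = parts[1].replace(" ", "") if len(parts) > 1 else None
    return label, mnemonic, operand
```

For instance, parse_line("LOOP: ADD LIST, X") separates the label LOOP, the mnemonic ADD and the operand LIST,X.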
Data Structures:
- Location counter (LC): points to the next location where the code will be placed
- Op-code translation table: contains the symbolic instructions, their lengths, and their op-codes (or the subroutine to use for translation)
- Symbol table (ST): contains labels and their values
- String storage buffer (SSB): contains the ASCII characters of the strings
- Configuration table: contains a pointer to the string in the SSB and the offset where its value will be inserted in the object code
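One possible way to hold these data structures is sketched below. The class and method names are hypothetical, the SSB base address DC00 and the "^" terminator (5EH) are taken from the example tables in these notes, and the 3-byte instruction length is an assumption.

```python
from dataclasses import dataclass, field

SSB_BASE = 0xDC00      # assumed: address of the first SSB byte, as in the example
TERMINATOR = b"^"      # assumed: 5EH marks the end of each stored name

@dataclass
class AssemblerTables:
    lc: int = 0                                   # location counter
    # op-code table: mnemonic -> (opcode, length in bytes); length 3 is assumed
    optab: dict = field(default_factory=lambda: {"LDA": (0x01, 3), "ADD": (0x18, 3)})
    symtab: dict = field(default_factory=dict)    # label -> value
    ssb: bytearray = field(default_factory=bytearray)
    config: list = field(default_factory=list)    # (offset, SSB pointer) pairs

    def store_string(self, name):
        """Append a symbol name to the SSB and return its pointer."""
        pointer = SSB_BASE + len(self.ssb)
        self.ssb += name.encode("ascii") + TERMINATOR
        return pointer

    def mark_missing(self, offset, name):
        """Record that the value of `name` must be inserted at `offset` later."""
        self.config.append((offset, self.store_string(name)))

    def read_string(self, pointer):
        """Read a symbol name back from the SSB, up to its terminator."""
        start = pointer - SSB_BASE
        end = self.ssb.index(TERMINATOR, start)
        return self.ssb[start:end].decode("ascii")
```

Marking LIST at offset 0007 and then COUNT at offset 000A yields the SSB pointers DC00 and DC05, which agrees with the configuration table of the example.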
[Figure: Pass 1 produces the symbol table, the configuration table, the string storage buffer, and a partially configured object file, which are passed on to Pass 2.]
Op-code Table

Mnemonic  Addressing mode  Opcode
LDA       immediate        01
SUB       immediate        1D
COMP      immediate        29
LDX       immediate        05
ADD       indexed          18
TIX       direct           2C
JLT       direct           38
JGT       direct           34
RSUB      implied          4C
Value 0103
COUNT: WORD 6
       END

Symbol Table

Symbol  Address
LOOP    0106
LIST    0112
COUNT   0115
Configuration Table

Offset  SSB pointer for the symbol
0007    DC00
000A    DC05

SSB: 4CH 49H 53H 54H 5EH
Pass 1
- All symbols are identified and put in the ST
- All op-codes are translated
- Missing symbol values are marked
Flowchart of Pass 1:
1. LC = origin
2. Read the next statement and parse it
3. Comment? If yes, go to step 2
4. END? If yes, go to Pass 2
5. Pseudo-op? If yes, what kind?
   - EQU: enter the label in the ST
   - WORD/BYTE: if there is a label, enter it in the ST; place the constant in machine code; advance LC
   - RESW/RESB: if there is a label, enter it in the ST; advance LC by the number of bytes specified in the pseudo-op
6. Otherwise: if there is a label, enter it in the ST; call the translator; advance LC
7. Go to step 2
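The Pass 1 steps can be sketched as follows and tried on the program of Example 1. This is only an illustration under stated assumptions: every instruction is one opcode byte plus a 2-byte address, a word is 3 bytes, constants are read as hexadecimal (which is how the example's object code appears to encode WORD 200 as 00 02 00), the ",X" suffix does not alter the address field, and the configuration table stores symbol names directly instead of SSB pointers. The names pass1, translate and SOURCE are hypothetical.

```python
# Opcodes from the op-code table in these notes.
OPTAB = {"LDA": 0x01, "SUB": 0x1D, "COMP": 0x29, "LDX": 0x05, "ADD": 0x18,
         "TIX": 0x2C, "JLT": 0x38, "JGT": 0x34, "RSUB": 0x4C}
WORD_SIZE = 3  # assumed word length in bytes

def translate(op, operand, symtab, config, code):
    """Translator routine: write the opcode and whatever is known so far."""
    code.append(OPTAB[op])
    value = 0
    if operand is None:
        pass                                   # implied addressing, e.g. RSUB
    elif operand.startswith("#"):
        value = int(operand[1:], 16)           # immediate constant
    else:
        name = operand.split(",")[0]           # drop the ",X" index suffix
        if name in symtab:
            value = symtab[name]               # backward reference: known now
        else:
            config.append((len(code), name))   # forward reference: mark it
    code += value.to_bytes(2, "big")

def pass1(source, origin):
    lc = origin
    symtab, config, code = {}, [], bytearray()
    for label, op, operand in source:
        if op == "END":
            break
        if op == "EQU":
            symtab[label] = int(operand, 16)   # assumed: EQU takes a constant
            continue
        if label:
            symtab[label] = lc
        before = len(code)
        if op in ("WORD", "BYTE"):
            size = WORD_SIZE if op == "WORD" else 1
            code += int(operand, 16).to_bytes(size, "big")
        elif op in ("RESW", "RESB"):
            count = int(operand, 16)
            code += bytes(count * (WORD_SIZE if op == "RESW" else 1))
        else:
            translate(op, operand, symtab, config, code)
        lc += len(code) - before               # advance LC by the bytes emitted
    return symtab, config, bytes(code)

SOURCE = [(None, "LDA", "#0"), (None, "LDX", "#0"), ("LOOP", "ADD", "LIST,X"),
          (None, "TIX", "COUNT"), (None, "JLT", "LOOP"), (None, "RSUB", None),
          ("LIST", "WORD", "200"), ("COUNT", "WORD", "6"), (None, "END", None)]
```

Calling pass1(SOURCE, 0x0100) gives LOOP = 0106, LIST = 0112 and COUNT = 0115, and marks offsets 0007 and 000A for LIST and COUNT, in agreement with the symbol table and configuration table of the example.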
Translator Routine
1. Find the opcode and the number of bytes in the Op-code Table
2. Write the opcode in machine code
3. Write the data or address that is known at this time in machine code
4. Will more information be needed in Pass 2? If no, the routine is done

Figure 4. Flowchart of a translator routine

Pass 2
Fills in the addresses and data that were unknown during Pass 1.
Flowchart of Pass 2 (entered from Pass 1):
1. More lines in the Configuration Table? If no, done
2. Get the next line
3. Retrieve the name of the symbol from the SSB
4. Get the value of the symbol from the ST
5. Compute the location in memory where this value will be placed (starting address + offset)
6. Place the symbol value at this location
7. Go to step 1
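The Pass 2 loop can be sketched as below, using the tables of Example 1. Several things here are assumptions made for illustration: the SSB is modelled as a dictionary from pointer to symbol name, addresses are 2 bytes stored high byte first, and PARTIAL is the partially configured object code that Pass 1 would leave for an origin of 0100. The name pass2 is hypothetical.

```python
def pass2(memory, start, symtab, config, ssb):
    """Fill in the values that were marked as missing during Pass 1."""
    for offset, pointer in config:             # one line of the Configuration Table
        name = ssb[pointer]                    # retrieve the symbol name from the SSB
        value = symtab[name]                   # get its value from the ST
        location = start + offset              # starting address + offset
        memory[location:location + 2] = value.to_bytes(2, "big")
    return memory

START = 0x0100                                 # assumed origin of Example 1
SYMTAB = {"LOOP": 0x0106, "LIST": 0x0112, "COUNT": 0x0115}
CONFIG = [(0x0007, 0xDC00), (0x000A, 0xDC05)]
SSB = {0xDC00: "LIST", 0xDC05: "COUNT"}        # stand-in for the string buffer

# Partially configured object code: the operands of ADD and TIX are still 0000.
PARTIAL = bytes.fromhex("010000 050000 180000 2C0000 380106 4C0000 000200 000006")

memory = bytearray(0x0200)
memory[START:START + len(PARTIAL)] = PARTIAL
pass2(memory, START, SYMTAB, CONFIG, SSB)
```

After the call, the bytes at 0107-0108 hold 01 12 (the address of LIST) and the bytes at 010A-010B hold 01 15 (the address of COUNT).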
Figure 5. Second pass of a two-pass assembler.

Relocatable Code

Producing object code that can be placed in any area of memory.

Direct Address Table (DAT): contains the offset locations of all direct addresses in the program (e.g., the 8080 instructions that specify direct addresses are LDA, STA, all conditional jumps...). To relocate the program, the loader adds the loading point to all these locations.

[Figure: assembly language program -> Assembler -> machine language program and DAT]
Figure 6. Assembler output for a relocatable code.
Example 3: The following relocatable object code and DAT are generated for Example 1.

assembly language program   memory address   object code in memory
-------------------------   --------------   ---------------------
       START
       LDA   #0             0000-0002        01 00 00
       LDX   #0             0003-0005        05 00 00
LOOP:  ADD   LIST, X        0006-0008        18 00 12
       TIX   COUNT          0009-000B        2C 00 15
       JLT   LOOP           000C-000E        38 00 06
       RSUB                 000F-0011        4C 00 00
LIST:  WORD  200            0012-0014        00 02 00
COUNT: WORD  6              0015-0017        00 00 06

DAT
0007
000A
000D

Forward and backward references in the machine code are generated relative to address 0000. To relocate the code, the loader adds the new load point to the references in the machine code that are pointed to by the DAT.
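The loader's relocation step can be sketched with the object code and DAT of this example. The function name relocate is hypothetical, and the sketch assumes that each DAT entry points to a 2-byte address stored high byte first, as the object code above suggests.

```python
def relocate(code, dat, load_point):
    """Return a copy of the object code adjusted for the given load point."""
    image = bytearray(code)
    for offset in dat:                         # each DAT entry locates an address field
        address = int.from_bytes(image[offset:offset + 2], "big")
        relocated = (address + load_point) & 0xFFFF
        image[offset:offset + 2] = relocated.to_bytes(2, "big")
    return bytes(image)

# Object code generated relative to address 0000, and its DAT.
OBJECT = bytes.fromhex("010000 050000 180012 2C0015 380006 4C0000 000200 000006")
DAT = [0x0007, 0x000A, 0x000D]

loaded = relocate(OBJECT, DAT, 0x0100)
```

With a load point of 0100 the three address fields become 0112, 0115 and 0106, which are the addresses of LIST, COUNT and LOOP in the symbol table of Example 1. The immediate operands and the data words are not listed in the DAT, so they are left unchanged.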
One-Pass Assemblers

Two methods can be used:
- Eliminating forward references
  Either all labels used in forward references are defined in the source program before they are referenced, or forward references to data items are prohibited.
- Generating the object code in memory
  No object program is written out and no loader is needed. The program needs to be re-assembled every time.

Multi-Pass Assemblers

Make as many passes as needed to process the definitions of symbols.

Example 3:
A  EQU  B
B  EQU  C
C  DS   1

3 passes are required to find the address for A.
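The pass count in this example can be checked with a small sketch. It assumes that each pass scans the statements in source order and that an EQU can only be resolved if its operand already has a value at that moment; the function name multi_pass and the origin of 0 are hypothetical.

```python
def multi_pass(statements, origin=0):
    """Repeat passes over the definitions until every symbol has a value."""
    symtab = {}
    passes = 0
    while True:
        passes += 1
        lc = origin
        progress = False
        unresolved = False
        for label, op, operand in statements:
            if op == "EQU":
                if operand in symtab:
                    if label not in symtab:
                        symtab[label] = symtab[operand]
                        progress = True
                else:
                    unresolved = True          # operand not defined yet
            elif op == "DS":
                if label not in symtab:
                    symtab[label] = lc
                    progress = True
                lc += int(operand)             # reserve the storage
        if not unresolved:
            return symtab, passes
        if not progress:
            raise ValueError("undefined or circular symbol definition")

EXAMPLE = [("A", "EQU", "B"), ("B", "EQU", "C"), ("C", "DS", "1")]
symbols, passes = multi_pass(EXAMPLE)
```

The first pass defines only C, the second defines B, and the third defines A, so passes ends up as 3.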
Such references can also be resolved in two passes by entering the symbol definitions that involve forward references in the symbol table. The symbol table also indicates which symbols depend on the values of others.

Example 4:
A  EQU  B
B  EQU  D
C  EQU  D
D  DS   1
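One way to keep such dependency information is sketched below. It is an illustration rather than the method prescribed in these notes: a hypothetical waiting table records which symbols depend on a symbol that is still undefined, and as soon as that symbol receives a value, the value is propagated to its dependents. All definitions are then settled during the first scan, leaving the second pass free to generate the object program.

```python
def resolve_with_dependencies(statements, origin=0):
    """Resolve EQU chains in one scan by recording who waits on whom."""
    values = {}     # symbol -> value
    waiting = {}    # undefined symbol -> symbols that depend on it
    lc = origin

    def define(symbol, value):
        values[symbol] = value
        # Propagate the value to every symbol that was waiting for this one.
        for dependent in waiting.pop(symbol, []):
            define(dependent, value)

    for label, op, operand in statements:
        if op == "EQU":
            if operand in values:
                define(label, values[operand])
            else:
                waiting.setdefault(operand, []).append(label)
        elif op == "DS":
            define(label, lc)
            lc += int(operand)                 # reserve the storage
    return values, waiting

EXAMPLE4 = [("A", "EQU", "B"), ("B", "EQU", "D"),
            ("C", "EQU", "D"), ("D", "DS", "1")]
values, waiting = resolve_with_dependencies(EXAMPLE4, origin=0x0100)
```

When D is defined, B and C are resolved through the waiting table, and A follows through B. Any symbol left in waiting after the scan would indicate an undefined reference.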