DHC11 - A 68HC11 Disassembler

This page describes a multi-pass, code-seeking disassembler for the Motorola 68HC11
and other compatible processors, including the 6800, 6801, 6802, 6803, etc. It includes
a number of features to enhance the readability of traditional disassemblies. It has
been used by the author for various applications, including disassembling GM (including
Holden) vehicle ECMs.

Get latest version of DHC11

The disassembler is described in enough detail to enable anyone familiar with disassembler
and/or assembler concepts to begin using it immediately. There is a disassembler tutorial for
DHC11. The disassembler's output is designed to be assembled by the author's companion
macro assembler, ASHC11, but the output is flexible enough to be used by most good 68HC11
assemblers. The mnemonics chosen are not Motorola standard, but have been designed for
readability by a majority of programmers familiar with either the 68xx, Z80, or other
manufacturers' microcontrollers. The disassembler's output can be immediately assembled back
to the original binary image for verification of the disassembly process.

Features of the Disassembler.


Code Seeker: A code-seeking disassembler will "look for" instructions that modify the target CPU's PC. Target
addresses for CALL and BRANCH instructions are tagged as such. On each disassembly pass, new target addresses may
be found as more code is "discovered" from the CPU's address space. The CPU's address space (ie. the loaded
ROM/code image) is initially assumed to be data, and when a disassembly pass produces no more new target addresses
for code, the code seeker has finished.

Flexible Output: A number of options allow the output to be tailored, first for finding out the structure of an
unknown binary file, and later for producing an output that can be most easily commented and made ready for
re-assembly.

Few Limitations: There are very few limitations on the disassembler. It will disassemble up to 64 kbyte (65536 byte)
binary files, with the only real limitation being on the size of the user's symbol table: up to 8,192 labels of up to
128 characters each may be defined. Commands that declare labels are Label, Entry, Vectors, and Indirect.

Supporting Tools: The disassembler is a complete package, but a couple of tools round out its use. These include:

Tutorial for the DHC11 disassembler.
Motorola 68HC11 Macro Assembler - ASHC11.
Formatted Hex to/from Binary converter - BINCVT.

The next sections describe the disassembler.

Configuration File
Information about the code is stored in a configuration file, and controls how the disassembler operates. Labels, address
tables, and data names can all be stored here, and re-used in subsequent disassemblies. Each file line is made up of a
command with optional parameters. The following lists all supported commands, and each is further described below.
Case (upper or lower) is not significant except in label names if they are supplied. Optional parameters are shown in
square brackets [<optional>].
Command & Arguments                         Action
INput <file>                                Specify source file to disassemble.
OUTput <file>                               Specify where disassembly is written.
LOad <addr>                                 Source file will be loaded to address <addr>.
Entry <addr> [<name>]                       Provide a code entry point <addr> with optional label <name>.
Label <addr> <name>                         Assign a label <name> to address <addr>.
INDEXed <start> <end>                       Define address range where indexed addresses are calculated.
Addresses                                   Show addresses in disassembly.
OPcodes                                     Show opcodes in disassembly.
ASCii                                       Show byte data as ASCII strings.
Bytes <addr> <count> [<name>]               Define a byte table at <addr> of <count> length.
Words <addr> <count> [<name>]               Define a word table at <addr> of <count> length.
Indirect <addr> [<name>[ <here>]]           Define a pointer to an (indirect) address.
Vectors <addr> <count> [<name>[ <here>]]    Define a range of indirect addresses.
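
As a rough illustration of the "command with optional parameters" line format, the following Python sketch splits configuration lines into a command and its parameters. It is not DHC11's own parser: the function name parse_config_line, the sample file names, and the "$" hex notation for addresses in the sample are assumptions made for the example, and the page does not say how blank lines or comments are handled (blank lines are simply skipped here).

```python
def parse_config_line(line):
    """Split one configuration line into (command, [parameters]).

    Assumption: parameters are separated by whitespace.
    Returns None for a blank line.
    """
    parts = line.split()
    if not parts:
        return None
    # Command names are not case sensitive; label names keep their case.
    return parts[0].lower(), parts[1:]


# Hypothetical configuration text, for illustration only.
sample = """\
input  ecm.bin
OUTPUT ecm.asm
Entry  $E000 Reset
Label  $0039 StatusFlags
"""

parsed = [p for p in map(parse_config_line, sample.splitlines()) if p]
for command, params in parsed:
    print(command, params)
```

Lower-casing only the command keeps label names such as Reset intact, which matches the note that case matters in label names.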

Command Line Switches: A number of options are handled by supplying command line switches. Command line
switches override any matching options that are also specified in the configuration file. Only an abbreviated form of the
command line switch is required, eg. -a is enough to specify that the output should contain addresses. The minimum
number of characters required for each switch is indicated by the upper case characters in the following description (eg.
for OVerwrite, just -OV, or -ov, is required):

Switch          Effect
-INput=         Name of binary file to read.
-OUTput=        Name of disassembly file produced.
-OVerwrite      Forces output file to be overwritten if it already exists.
-LOad=          Specify hex start address to load binary file into memory.
-Addresses      Show addresses on left of each disassembled line.
-OPcodes        Show opcodes for instructions disassembled.
-ASCii          Show data byte ASCII equivalents.
-@              Use procedure local labels, ie. "@ labels".
-LColons        Use a colon (:) suffix on labels (default=TRUE).
-LPrefix=       Prefix string for labels (default=L).
-HPrefix=       Prefix string for hex constants (default=$).
-Bitimmediate   Display immediate bytes as a bit# [+bit# ..] mask.
-Defsperline=   Maximum number of db, or dw items per line (default 10).
-FILLminimum=   db count of same value to force the fill pseudo-op (default 10).
-FRagment       Decode a code Fragment, don't relocate it to high memory.
-Verbose        Show control file information as it's decoded.
-#              Calculate data addresses from probable IX/IY immediates.
A switch option can be negated with a "-" suffix, or asserted with a "+" suffix (the default), as in: -op- to turn the option
off. Switches requiring a parameter must use either an equals "=" or colon ":" separator, as in: LOAD=$c000 to define
the load address. Note also that DHC11 does not really need to use the initial "-" when defining a switch.
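
The abbreviation, suffix, and separator rules above can be sketched in Python as follows. This is only an illustration of the rules as described on this page, not DHC11's actual parser: the names SWITCHES, min_abbrev, and parse_switch are hypothetical, and the -@ and -# switches are left out for brevity.

```python
# Switch names as listed above; the leading upper-case run is the
# minimum abbreviation.
SWITCHES = ["INput", "OUTput", "OVerwrite", "LOad", "Addresses", "OPcodes",
            "ASCii", "LColons", "LPrefix", "HPrefix", "Bitimmediate",
            "Defsperline", "FILLminimum", "FRagment", "Verbose"]


def min_abbrev(name):
    """Leading upper-case run of a switch name, e.g. 'OVerwrite' -> 'ov'."""
    n = 0
    while n < len(name) and name[n].isupper():
        n += 1
    return name[:n].lower()


def parse_switch(arg):
    """Return (switch_name, value); value is True, False, or a string."""
    text = arg[1:] if arg.startswith("-") else arg   # leading "-" is optional
    value = True                                     # "+" is the default
    for i, ch in enumerate(text):
        if ch in "=:":                               # parameter separator
            text, value = text[:i], text[i + 1:]
            break
    else:
        if text.endswith(("+", "-")):                # assert / negate suffix
            value = text.endswith("+")
            text = text[:-1]
    key = text.lower()
    for name in SWITCHES:
        # Accept any prefix of the full name that is at least as long
        # as the minimum abbreviation.
        if name.lower().startswith(key) and key.startswith(min_abbrev(name)):
            return name, value
    raise ValueError(f"unknown or ambiguous switch: {arg!r}")


print(parse_switch("-a"))          # ('Addresses', True)
print(parse_switch("-op-"))        # ('OPcodes', False)
print(parse_switch("LOAD=$c000"))  # ('LOad', '$c000')
```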

Option and Switch Details


The following describes all commands (as they appear in the control file) and switches (as they appear on the command
line). Commands that have equivalent switches are shown with a leading "-".

File Specification: -OUTput=, -INput=, -OVerwrite, -LOad=

-OUTput=<filename>
Specifies the name of the file the disassembly will be written to. The disassembler first tests whether the file exists,
and exits without any action if it does. To overcome this situation, use the -OVerwrite option (described below) to
overwrite the old file.

-INput=<filename>
Specifies the input file. It is assumed to be in a BINARY format.

-OVerwrite
This tells the disassembler to overwrite the old output file (which results in the old file's contents being lost).

-LOad=<loadaddress>
Specifies the address the binary file image will be loaded into. If the binary image is too large, or the load address selected
causes the data to overflow, then an error message is generated and the disassembler aborts. Note that the load address is
not required, as the disassembler assumes the last word of the binary file will be at address $FFFE, as this is the
HC11's reset vector.
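
The default load address follows from simple arithmetic: if the last word of the image sits at $FFFE-$FFFF, the image ends at the top of the 64 kbyte address space. A small Python sketch, with hypothetical function names, assuming the overflow check is a plain range test:

```python
ADDRESS_SPACE = 0x10000   # 64 kbyte (65536 byte) address space


def default_load_address(image_size):
    """Load address that places the image's last word at $FFFE."""
    if not 0 < image_size <= ADDRESS_SPACE:
        raise ValueError("image must be between 1 and 65536 bytes")
    return ADDRESS_SPACE - image_size


def check_load(load_address, image_size):
    """Raise if the image would run past the top of memory."""
    if load_address + image_size > ADDRESS_SPACE:
        raise ValueError("image overflows the 64 kbyte address space")


print(f"${default_load_address(0x4000):04X}")   # 16 kbyte image -> $C000
print(f"${default_load_address(0x8000):04X}")   # 32 kbyte image -> $8000
```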

Disassembly display: -Addresses, -OPcodes

-Addresses
Displays the instruction/code address at each disassembly line, as in:
    D063            beq    LD071
    D065  LD065:    ldaA   LC008
    D068            cmpA   #$AA

-OPcodes
Displays the opcode bytes for each instruction (note: this does not display data bytes, which are already decoded):

    5F              clrB
    08              incX
    18 BC C0 06     cmpY   LC006

Combining the two options -A -OP produces:

    F091  12 2D 40 11     LF091:  brset  L002D, #%01000000, LF0A6
    F095  CE F3 17                ldX    #$F317
    F098  18 1F 00 10 0B          brclr  0, Y, #%00010000, LF0AC

Label Options: -@ (local labels), -LColons, -LPrefix

-@ (local labels)
Specifies that procedure local labels are to be used instead of the default label (described below). Below is an example of
code disassembled with the -@ switch. Note that the instruction at label @21 branches to a default style label (LE277).
Local labels are bounded by data or entry points that are the target of call instructions.

    @19     brset  L0039, #%0000010, @21      ; @19 and @20 are local labels
            bset   L0039, #%0000010
    @20     ldaA   LC682                      ; LC682 is an entry point
            staA   L00C5                      ; L00C5 is a data label
    @21     brset  L0001, #%0000010, LE277    ; local labels & an entry point

-LColons
Specifies that a colon (:) suffix is to be used on labels. Note that local labels are always shown without a colon. By
default colons are used.

-LPrefix=<labelPrefix>
Specifies the prefix string used for non-local labels automatically generated by the disassembler. The default prefix is "L",
so the label for address $1A2B would be shown as L1A2B. A prefix string of more than two characters may cause
undesirable indenting of the disassembly.

Constant Options : -HPrefix, -Bitimmediate

-HPrefix=<hexPrefixString>
Specifies the prefix string for hex constants. The default is $ and another possible prefix is 0x. The prefix you use may
depend on what your assembler will accept. Here's an example using HPrefix=0x, LPrefix=x and LColons-.

            adcA   #0x00      ; 0x is the hex prefix
            call   xEDDB      ; x is the label prefix
    xC134   ldX    #0xC69B    ; xC134 is label for this instruction
            staA   x00C1      ; x00C1 is a data byte label
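
The effect of the three formatting switches can be imitated with two small helpers. This Python sketch is purely illustrative; format_label and format_hex are hypothetical names, and the four-digit upper-case address form is inferred from the examples on this page.

```python
def format_label(addr, prefix="L", colon=True):
    """Auto-generated label for an address, e.g. L1A2B: (the defaults)."""
    return f"{prefix}{addr:04X}" + (":" if colon else "")


def format_hex(value, prefix="$", digits=4):
    """Hex constant with the chosen prefix and digit count."""
    return f"{prefix}{value:0{digits}X}"


print(format_label(0x1A2B))                          # L1A2B:
print(format_label(0xC134, prefix="x", colon=False)) # xC134
print(format_hex(0xC69B, prefix="0x"))               # 0xC69B
print(format_hex(0x00, prefix="0x", digits=2))       # 0x00
```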

-Bitimmediate
Display immediate bytes as either a bit mask or an inverted bit mask. Normally the immediate byte field used for
instructions such as ldaA is shown as a binary value %00100010. When this option is enabled, this value would be
shown as (bit5+bit1). If more than 4 bits in a mask are set then the inverted form of the mask is used as shown:

            bitB   #bit0               ; same as "bitB #$01"
            xorB   #(bit5+bit4)        ; same as "xorB #%00110000"
            andA   #~bit3              ; same as "andA #%11110111"
            brclr  L00C3, #$FF, @22    ; $FF still is used if all bits set
    LEEBF:  andB   #~(bit5+bit0)       ; same as "andB #%11011110"
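
The mask rules can be reproduced with a short Python sketch. It follows the rules as stated here (inverted form when more than 4 bits are set, $FF left as hex); the function name bit_mask is hypothetical, and showing a zero byte as $00 is an assumption, since the page does not cover that case.

```python
def bit_mask(value, hex_prefix="$"):
    """Render an immediate byte in -Bitimmediate style."""
    value &= 0xFF
    if value in (0x00, 0xFF):
        # $FF stays as hex per the page; $00 is an assumption.
        return f"{hex_prefix}{value:02X}"
    set_bits = [b for b in range(7, -1, -1) if value & (1 << b)]
    inverted = len(set_bits) > 4
    if inverted:
        # Name the cleared bits instead and prefix the result with "~".
        set_bits = [b for b in range(7, -1, -1) if not value & (1 << b)]
    text = "+".join(f"bit{b}" for b in set_bits)
    if len(set_bits) > 1:
        text = f"({text})"
    return "~" + text if inverted else text


for byte in (0x01, 0x22, 0x30, 0xF7, 0xDE, 0xFF):
    print(f"%{byte:08b} -> {bit_mask(byte)}")
```

Running it over the bytes used in the examples above gives bit0, (bit5+bit1), (bit5+bit4), ~bit3, ~(bit5+bit0), and $FF.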

Disassembly Control - Label, Entry, Indirect, Vectors


The following commands, all within the control file, tell the disassembler information about the binary image it will
process. The more information that can be determined and supplied here, the better the resulting disassembly will be.
Optional parameters are enclosed in square [ ] brackets.

Label <addr> <label>


This simply assigns a label to an address. No assumption is made about whether this address corresponds to data or code. Note
that all commands can use optional labels, so this command is generally not required.

Entry <addr> [<entrylabel>]


Tells the disassembler the location of starting points for code. Optionally, a label will be assigned to this entry point. The
code seeking algorithm scans memory for these locations, and automatically adds new entry points as branch and call
instructions are encountered. When no new entry points are added in a single pass, then the disassembler has completed
the code seek phase.
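
The multi-pass idea can be shown with a toy code seeker in Python. This is a sketch of the general technique, not DHC11's implementation: it knows only a handful of 68HC11 opcodes, the names OPCODES and seek_code are hypothetical, and the ten-byte image is made up for the example.

```python
# opcode: (length, target mode, falls through to next instruction)
OPCODES = {
    0x01: (1, None, True),     # NOP
    0x86: (2, None, True),     # LDAA #imm
    0x27: (2, "rel", True),    # BEQ rel
    0x8D: (2, "rel", True),    # BSR rel
    0xBD: (3, "ext", True),    # JSR ext
    0x20: (2, "rel", False),   # BRA rel
    0x7E: (3, "ext", False),   # JMP ext
    0x39: (1, None, False),    # RTS
}


def seek_code(image, load, entries):
    """Repeat passes until one adds no new entry points."""
    is_code = [False] * len(image)       # everything starts out as data
    entries = set(entries)
    end = load + len(image)
    while True:
        found = set()
        for entry in entries:
            pc = entry
            while load <= pc < end:
                op = image[pc - load]
                if op not in OPCODES:
                    break                # unknown opcode: stop this trace
                length, mode, falls_through = OPCODES[op]
                if pc + length > end:
                    break
                for i in range(length):
                    is_code[pc - load + i] = True
                if mode == "rel":        # signed offset from next instruction
                    off = image[pc - load + 1]
                    off = off - 256 if off >= 128 else off
                    found.add((pc + length + off) & 0xFFFF)
                elif mode == "ext":      # 16 bit big-endian address
                    found.add((image[pc - load + 1] << 8) | image[pc - load + 2])
                if not falls_through:
                    break
                pc += length
        new = found - entries
        if not new:
            return entries, is_code
        entries |= new


# C000: ldaA #1 / call C008 / jr C000 / (data byte) / C008: nop / ret
image = bytes.fromhex("8601BDC00820F9FF0139")
entries, is_code = seek_code(image, 0xC000, [0xC000])
print(sorted(f"{e:04X}" for e in entries))   # ['C000', 'C008']
print(is_code)                               # only the byte at C007 stays data
```

Re-scanning every entry point on each pass is wasteful but keeps the sketch close to the description: a pass that discovers no new targets ends the code seek.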

Indirect <addr> [<indirectlabel> [<herelabel>]]


An indirect address is a 16 bit quantity (ie. word, or two bytes) that is used by the processor to form a target (jump or
call) address. This is illustrated by the disassembler's output:

    <indirectlabel>:  ldS   #$01FF
                      .
                      .
    <addrlabel>:      dw    <indirectlabel>

The word at memory address <addrlabel> points to an address that is tagged as an entry point, and the optional
<indirectlabel> is the label for this address. And finally, <addrlabel> will be tagged as a data word. The ordering of
labels was chosen to maintain compatibility with existing disassemblers.

Vectors <addr> <count> [<labelbase>[ <herelabel>]]


The Vectors command describes a list of indirect data words, as would be produced by a jump table, or list of procedure
addresses. The number of data words (or vectors) is defined by <count>. The optional <labelbase>, if supplied, is used to
create a label, for each indirect address, of the form <labelbase>_NN, where NN starts from 00. The optional
<herelabel> is the label for the address (ie. <addr>) of the word table. Note that the final NN is one less than <count>.

    <labelbase>_00:  ldS   #$01FF
                     .
    <labelbase>_01:  ldX   #$1234
                     .
                     .
    <labelbase>_NN:  xorB  #%00110000
                     .

    <herelabel>:     dw    <labelbase>_00
                     dw    <labelbase>_01
                     .
                     dw    <labelbase>_NN
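
How a Vectors entry might expand into labelled targets can be sketched in Python. The 68HC11 stores 16 bit words high byte first, so each vector is read as a big-endian word. The function name vector_labels, the sample table, and the choice of a two-digit decimal NN are assumptions; the page only says that NN starts from 00.

```python
def vector_labels(image, load, addr, count, labelbase):
    """Return (target address, label) for each word of a vector table."""
    result = []
    for n in range(count):
        offset = addr - load + 2 * n
        # 68HC11 words are big-endian: high byte first.
        target = (image[offset] << 8) | image[offset + 1]
        result.append((target, f"{labelbase}_{n:02d}"))
    return result


# Hypothetical three-entry table loaded at $FFFA.
image = bytes.fromhex("E100E200E000")
for target, label in vector_labels(image, 0xFFFA, 0xFFFA, 3, "Vec"):
    print(f"{label}: ${target:04X}")
```

Each target address would then be treated as a new entry point for the code seeker, and the table itself as data words.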

Advanced Disassembly Commands - INDEXed, -# (calculate index addresses)

INDEXed <startaddr> <endaddr>


The INDEXed command specifies a range of addresses for which the disassembler will attempt to use the last known
value of the IX and IY registers in indexed address calculations.

-# (calculate index addresses)


Indexed data access calculations will only be made when the -# command line switch is supplied. The purpose of this
switch, and the INDEXed command, is to ensure that all data accesses are recorded as well as can be done.

Different Mnemonics
The following mnemonics differ from those specified by Motorola.
DHC11's Mnemonics            Motorola's   Function Performed
call                         JSR          Call
callr                        BSR          Call Relative (short call)
cmpD, cmpX, cmpY             CP?          Compare (16 bit register)
decX, decY, decS             DE?          Decrement (16 bit register)
di                           SEI          Disable Interrupts
ei                           CLI          Enable Interrupts
incX, incY, incS             IN?          Increment (16 bit register)
jr                           BRA          Jump Relative (short jump)
push, pushB, pushX, pushY    PSH?         Push on to stack
popA, popB, popX, popY       PUL?         Pop off stack
ret                          RTS          Return (from subroutine)
reti                         RTI          Return From Interrupt
xorA, xorB                   EOR?         eXclusive Or

As you can see, DHC11's mnemonics use, at most, one extra character, but this makes their meaning much clearer, and is
closer to a majority of other assembler syntaxes. In addition, the mnemonics are displayed in a mixed case that is
designed to highlight the registers used by the instruction. For example, LDAA, the Load A instruction, is displayed as ldaA
to emphasise that the A register is used in this lda instruction. The tAB and xgDY instructions are examples that use
two registers in the one mnemonic.

Last Updated 29th November 2001 (links)


This document is copyright © 2000, 2001, Tech Edge Pty. Ltd.


Author P. Gargano
