Delphi Internal Data Structures
J.R.
2012
Abstract
This document describes the types of executable file generated by the Embarcadero Delphi programming
environment, the specific data structures used in such executables, and how to parse these to allow proper
analysis of the code in a Delphi executable.
Contents
1 Introduction
1.1 History of Delphi
1.2 Existing Documentation and Sources
7 Code Samples
7.1 Brute Force Search for Virtual Method Table Structures
7.2 Explanation of Functions not Defined in the Listing
1 Introduction
Delphi is a popular Windows-based programming environment. Executables created by Delphi are compliant
with the PE specification but are organised internally in a specific way and contain several features not found
in executables created by other tools. Delphi was originally designed for “Rapid Application Development” at
a time when compilation speed was an issue. As a result, it makes excessive use of precompiled library code.
The Delphi linker always places the code from any library functions into the executable first, with the object
code produced from compiling the user’s source going in at the end.
The challenges of disassembling and running heuristics on a Delphi executable are as follows. Firstly, there
is generally a large amount of uninteresting library code preceding the interesting parts, and it is easy to exhaust
the available time or space budget simply analysing such library code. Code signatures help with
this, but there are so many libraries and versions of functions that building a complete set of such signatures
would be a Sisyphean task. Secondly, Delphi places large amounts of data in the code section, interleaving it
between functions. A basic linear-sweep disassembler, encountering such data, will disassemble it into garbage
and become desynchronised. Even more intelligent disassemblers such as IDA Pro occasionally have problems
recognising Delphi’s embedded data structures.
EP function has an individual CALL instruction for each library init function
Virtual Method Tables (VMTs) are referenced by a pointer to the first function entry, i.e. after all the prefix
fields
1 For instance, the construct ‘set of’ when applied to an enumerated type is equivalent to a bitfield in C.
2 http://docwiki.embarcadero.com/RADStudio/en/
3 http://www.freepascal.org/docs-html/rtl
4 http://sourceforge.net/projects/dede/develop
5 http://www.ggoossen.net/revendepro
6 http://www.the-interweb.com/serendipity/index.php?archives/3-Protecting-the-Oracle-A-proof-of-concept-for-a-Delphi-obfuscat
html
Old Era (Delphi 3 to Delphi 7)
code section is called ‘CODE’
init and finalize function addresses are placed in a package info table located before the EP function
RTTI records are more elaborate and always preceded by a pointer
DVCLAL and PACKAGEINFO resources introduced
Virtual Method Tables get a vmtSelf pointer as the first entry, and references to a VMT are pointers to
this entry, which occurs before the VMT prefix fields
New Era (Delphi 2007 onwards)
code section is called ‘.text’, and other section names conform to Microsoft convention
init functions and the entry point are in a separate ‘.itext’ section, although the package info table itself
is still in ‘.text’
RTTI has a lot of extra data, and the RTTI record section at the beginning of ‘.text’ is much bigger
InitTable has extra fields, including a list of unit names like those in the PACKAGEINFO resource
Virtual Method Tables and their associated sub-tables have many extra and extended fields
Additionally, the definitions of some internal structures (such as the Virtual Method Table) were altered
in the New executable. Embarcadero appear to be continuing to modify various structures which have been
virtually unaltered since Delphi 3: more modifications have been made in Delphi XE 2.
Table 2 shows the sections of a New executable. The first three sections are now named according to the
standard Microsoft convention, and two additional sections, ‘.itext’ and ‘.didata’, have been added.
Name     From        To          Size      VAddr       VSize     Characteristics
.text    0x00000400  0x00158C00  01411072  0x00401000  01411048  CNT_CODE MEM_EXECUTE MEM_READ
.itext   0x00158C00  0x0015A200  00005632  0x0055A000  00005244  CNT_CODE MEM_EXECUTE MEM_READ
.data    0x0015A200  0x0015F200  00020480  0x0055C000  00020328  CNT_INITIALIZED_DATA MEM_READ MEM_WRITE
.bss     0x0015F200  0x0015F200  00000000  0x00561000  00021428  MEM_READ MEM_WRITE
.idata   0x0015F200  0x00162C00  00014848  0x00567000  00014502  CNT_INITIALIZED_DATA MEM_READ MEM_WRITE
.didata  0x00162C00  0x00163600  00002560  0x0056B000  00002272  CNT_INITIALIZED_DATA MEM_READ MEM_WRITE
.tls     0x00163600  0x00163600  00000000  0x0056C000  00000076  MEM_READ MEM_WRITE
.rdata   0x00163600  0x00163800  00000512  0x0056D000  00000024  CNT_INITIALIZED_DATA MEM_READ
.reloc   0x00163800  0x00180E00  00120320  0x0056E000  00120100  CNT_INITIALIZED_DATA MEM_DISCARDABLE MEM_READ
.rsrc    0x00180E00  0x001B0200  00193536  0x0058C000  00193536  CNT_INITIALIZED_DATA MEM_READ
The differences in the entry point function of executables created by different versions of Delphi may be
observed in IDA Pro. Ancient executables have no package info table and call all the init functions individually
from the entry point, as can be seen in Figure 3.
Old executables have a package info table immediately preceding the entry point (the very end of it can be
seen in Figure 4). The virtual address of the package info table is passed to the InitExe function. The
package info table’s ‘unit entry table’ contains the addresses of the initialize and finalize functions for each unit.
If a unit has no initialize or finalize function, the corresponding unit entry table location contains zero. All the
initialize functions are called by InitExe() – a file infector which targeted Delphi executables could conceal itself
by placing a pointer to the virus code in a spare unit entry table slot.
New executable entry points look similar to those of Old executables, but the package info table no longer
precedes the entry point function, which is in a separate section called .itext (Figure 5).
CODE:0044395C public start
CODE:0044395C start proc near
CODE:0044395C push ebp
CODE:0044395D mov ebp, esp
CODE:0044395F add esp, -0Ch
CODE:00443962 call sub_403188
CODE:00443967 call sub_404470
CODE:0044396C call sub_4077B4
CODE:00443971 call sub_40E360
CODE:00443976 call sub_40E720
CODE:0044397B call sub_4157BC
CODE:00443980 call sub_41D02C
CODE:00443985 call sub_41EF88
CODE:0044398A call sub_430020
CODE:0044398F call sub_431DF0
CODE:00443994 call sub_4356A4
CODE:00443999 call sub_436774
CODE:0044399E call sub_437448
CODE:004439A3 mov eax, ds:dword_44556C
CODE:004439A8 add eax, 30h
CODE:004439AB mov edx, (offset aImagedit_hlp+4)
CODE:004439B0 call sub_403278
CODE:004439B5 mov ecx, offset unk_4456E0 ; Ancient-style CreateForm call
CODE:004439BA mov edx, offset VMT_TMainForm
CODE:004439BF mov eax, ds:dword_44556C
CODE:004439C4 call sub_414E34
CODE:004439C9 mov eax, ds:dword_44556C
CODE:004439CE call sub_414EC4
CODE:004439D3 call sub_403FB3
CODE:004439D8 mov esp, ebp
CODE:004439DA pop ebp
CODE:004439DB retn
CODE:004439DB start endp
3 Structure of a Delphi Executable
The Delphi compilation environment takes Pascal code written by the user and compiles it into one or more
‘units’. Then, the user code units are combined with code from a number of standard or custom library units
to make the final executable. Classically, library units are statically linked into the executable, but dynamic
linking is also available, where the library units’ code is in an external dynamic library file (Borland did not
use standard DLLs for this but created their own proprietary ‘BPL’). Dynamically-linked Delphi executables
appear to be quite rare, so this document will only consider statically-linked types.
As a general rule, library code always precedes user code in the final executable file, with the entry point
function usually being the last function in its section. Each unit may define an initialisation and finalisation
function, with the initialisation function being called when the unit is loaded and the finalisation function when
it is unloaded. The addresses of all such functions are stored in the package info table (where present). New-
style Delphi executables have a separate section (‘.itext’) just for initialisation functions and the entry point
function, but finalisation functions and the package info table data structure itself are still located in the main
.text section.
Compiled units (libraries and compiled user code) have the extension ‘.dcu’ and use a proprietary file format
which frequently changed between versions of Delphi (originally a deliberate measure on the part of Borland to
prevent third-party tools from being able to read the file format). There are a few utilities online that claim to
be able to parse some versions of this format, but none work consistently well on more than a few versions.
The TAttrData structure (Table 7) is also used in several places in new-style Delphi executables, usually
occurring as a field at the end of other structures.
In Delphi structures whose first field is named some variation of ‘length’, that field generally contains
the total size of the structure in bytes, including the field itself. This is true of TAttrData, and it is therefore
possible to skip over TAttrData without parsing every field. If the value of the Len field is 2, the
structure is empty; such empty TAttrData structures are very common.
struct TAttrData
{
    uint16_t Len;
    TAttrEntry AttrEntry[];
};
struct TAttrEntry
{
    uint32_t AttrType;
    uint32_t AttrCtor;
    uint16_t ArgLen;
    uint8_t ArgData[ArgLen];
};
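Because Len includes its own two bytes, a parser can step over a TAttrData block without decoding any of its entries. The following is a minimal sketch of that idea, assuming the structure has been read into a byte buffer and that values are little-endian; the helper names read_u16le and skip_attr_data are hypothetical and do not come from any Delphi source.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit value from an unaligned byte buffer. */
static uint16_t read_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper: given the offset of a TAttrData structure inside
 * buf, return the offset of the first byte after it, or 0 on error.
 * Len counts its own two bytes, so Len == 2 means "no attribute entries".
 */
static size_t skip_attr_data(const uint8_t *buf, size_t size, size_t off)
{
    if (off > size || size - off < 2)
        return 0;                       /* the Len field itself is truncated */
    uint16_t len = read_u16le(buf + off);
    if (len < 2 || len > size - off)
        return 0;                       /* implausible or truncated block */
    return off + len;
}
```

A return value of 0 can safely signal failure here because a valid result is always at least 2.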
3.2 Alignment
All Delphi data structures are aligned: the starting virtual address of each structure must be divisible by four (see footnote 7).
Padding bytes are used to ensure alignment, and it is therefore necessary to take these into account when
parsing a structure. Until recently, Delphi used set padding consisting of one, two, and three-byte ‘no-op’ x86
instruction sequences, as shown in Table 8. Recent versions of Delphi have started using zeroes as padding
instead.
7 NB: this is not always true of structures which are embedded inside a larger structure.
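When stepping from one structure to the next, the alignment rule amounts to rounding the current virtual address up to a multiple of four. A small sketch follows; the function names are illustrative only, and the arithmetic assumes the address is not within three bytes of the top of the 32-bit range.

```c
#include <stdint.h>

/* Round a virtual address up to the next multiple of four. */
static uint32_t align4(uint32_t va)
{
    return (va + 3u) & ~3u;
}

/* Number of padding bytes (0 to 3) between va and the next aligned address. */
static uint32_t padding_before_next(uint32_t va)
{
    return align4(va) - va;
}
```

A parser would skip padding_before_next(va) bytes after each structure, whether those bytes are no-op instruction sequences or zeroes.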
4 Code Section Data Structures
The main code section of a Delphi executable contains many data structures in addition to executable code.
These structures may be inserted virtually anywhere in the code, making it very difficult for a linear sweep
disassembler without knowledge of their location to avoid disassembling data into garbage. If, on the other
hand, the location and size of these structures can be determined in advance, the disassembler can skip over
them: but because most of the structures are complex and can vary in length, it is usually necessary to parse
them before this information is available.
1. If their virtual addresses are used as an argument to a function called from the entry point, e.g.
CreateForm or InitExe, the addresses may be obtained from there.
2. If they are referred to by a pointer in another structure, for example the VMT typeinfo subtable has
pointers to RTTI records, and VMTs themselves contain a pointer to their parent VMT.
3. If the structures always occur in the same place (for instance, the RTTI records at the beginning of the
main code section).
4. Via a brute-force search. This is not terribly efficient but may sometimes be the only possibility.
enum TOrdType
{
    otSByte,
    otUByte,
    otSWord,
    otUWord,
    otSLong,
    otULong
};
struct tkInteger
{
    TOrdType OrdType;
    int32_t MinValue;
    int32_t MaxValue;
    TAttrData AttrData; // *
};
struct tkFloat
{
    TFloatType FloatType;
    TAttrData FloatAttrData; // *
};
4.2.7 0x06 - tkSet
Type tkSet, in addition to its OrdType, contains a 32-bit integer pointing to the RTTI record for the type it
contains (e.g. a set of integers would contain a pointer to the tkInteger RTTI record).
struct tkSet
{
    TOrdType OrdType;
    uint32_t CompType;
    TAttrData SetAttrData; // *
};
struct TPropData
{
    uint16_t PropCount;
    TPropInfo PropList[PropCount];
};
struct TPropInfo
{
    uint32_t PropType; // pointer to RTTI record
    uint32_t GetProc;
    uint32_t SetProc;
    uint32_t StoredProc;
    uint32_t Index;
    uint32_t Default;
    uint16_t NameIndex;
    PASCAL_STRING Name;
};
struct TPropDataEx
{
    uint16_t PropCount;
    TPropInfoEx PropList[PropCount];
};
struct TPropInfoEx
{
    uint8_t Flags;
    uint32_t Info; // pointer to pointer to RTTI record
    TAttrData AttrData;
};
enum TMethodKind
{
    mkProcedure,
    mkFunction,
    mkConstructor,
    mkDestructor,
    mkClassProcedure,
    mkClassFunction,
    mkClassConstructor,
    mkClassDestructor,
    mkOperatorOverload,
    mkSafeProcedure,
    mkSafeFunction
};
enum TCallConv
{
    ccReg,
    ccCdecl,
    ccPascal,
    ccStdCall,
    ccSafeCall
};
struct ParamList
{
    TParamFlags Flags; // byte-sized field
    DELPHI_STRING ParamName;
    DELPHI_STRING TypeName;
};
struct TProcedureParam
{
    uint8_t Flags;
    uint32_t ParamType;
    PASCAL_STRING Name;
    TAttrData ParamAttrData;
};
struct TProcedureSignature
{
    uint8_t Flags;
    TCallConv CC;
    uint32_t ResultType; // PPTypeInfo
    uint8_t ParamCount;
    TProcedureParam Params[ParamCount];
};
struct tkMethod
{
    TMethodKind MethodKind;
    uint8_t ParamCount;
    ParamList Params[ParamCount];
    PASCAL_STRING ResultType; // only present if MethodKind = mkFunction
    uint32_t PPTypeInfo; // * only present if MethodKind = mkFunction
    TCallConv CC; // *
    uint32_t ParamTypeRefs[ParamCount]; // *
    uint32_t MethSig; // * pointer to a TProcedureSignature
    TAttrData MethAttrData;
};
struct TArrayTypeData
{
    uint32_t Size;
    uint32_t ElCount;
    uint32_t ElType; // pointer to a RTTI record
    uint8_t DimCount;
    uint32_t Dims[DimCount];
};
struct tkArray
{
    TArrayTypeData ArrayData;
    TAttrData ArrayAttrData;
};
struct TManagedField
{
    uint32_t TypeRef; // pointer to a RTTI record
    uint32_t FieldOffset;
};
struct TRecordTypeField
{
    uint32_t TypeRef;
    uint32_t FldOffset;
    uint8_t Flags;
    PASCAL_STRING FieldName;
    TAttrData AttrData;
};
struct tkRecord
{
    uint32_t RecSize;
    uint32_t ManagedFieldCount;
    TManagedField ManagedFields[ManagedFieldCount]; // only present if ManagedFieldCount > 0
    uint8_t NumOps;
    uint32_t RecOps[NumOps]; // only present if NumOps > 0
    uint32_t RecFldCnt;
    TRecordTypeField RecFields[RecFldCnt];
    TAttrData RecAttrData;
};
// IntfFlag bit values
#define ifHasGuid 0x1
#define ifDispInterface 0x2
#define ifDispatch 0x4
struct TGUID
{
    uint32_t D1;
    uint16_t D2;
    uint16_t D3;
    uint8_t D4[8];
};
struct tkInterface
{
    uint32_t IntfParent;
    uint8_t IntfFlags; // bitfield
    TGUID Guid;
    PASCAL_STRING IntfUnit;
    TIntfMethodTable IntfMethods;
    TAttrData IntfAttrData;
};
struct tkInt64
{
    int64_t MinInt64Value;
    int64_t MaxInt64Value;
    TAttrData Int64AttrData;
};
struct tkDynArray
{
    uint32_t elSize;
    uint32_t elType; // pointer to RTTI
    uint32_t varType;
    uint32_t elType2;
    PASCAL_STRING DynUnitName;
    uint32_t DynArrElType;
    TAttrData DynArrAttrData;
};
struct tkUString
{
    TAttrData AttrData; // *
};
struct tkClassRef
{
    uint32_t PPInstanceType;
    TAttrData ClassRefAttrData;
};
struct tkPointer
{
    uint32_t RefType;
    TAttrData PtrAttrData;
};
struct tkProcedure
{
    TProcedureSignature ProcSig;
    TAttrData ProcAttrData;
};
struct TProcedureSignature
{
    uint8_t Flags;
    TCallConv CC;
    uint32_t PPTypeInfo;
    uint8_t ParamCount;
    TProcedureParam Params[ParamCount];
};
struct TProcedureParam
{
    uint8_t Flags;
    uint32_t PPTypeInfo;
    PASCAL_STRING Name;
    TAttrData Attr;
};
There is usually such a jump table located at the beginning of the code section immediately following the
RTTI records (if these exist). Locating this jump table is a useful way to determine the maximum size of the
RTTI record area, since once the jump table has been located, the RTTI records have finished, and it may not
always be possible to parse all the RTTI records in a linear fashion to determine their exact size.
The Ancient-style VMT does not have the vmtSelf or vmtIntfTable entries, and is shown in the next listing.
struct OldVirtualMethodTablePrefix
{
    uint32_t vmtAutoTable; // -0x34 (-52)
    uint32_t vmtInitTable; // -0x30 (-48)
    uint32_t vmtTypeInfo; // -0x2C (-44)
    uint32_t vmtFieldTable; // -0x28 (-40)
    uint32_t vmtMethodTable; // -0x24 (-36)
    uint32_t vmtDynamicTable; // -0x20 (-32)
    uint32_t vmtClassName; // -0x1C (-28)
    uint32_t vmtInstanceSize; // -0x18 (-24)
    uint32_t vmtParent; // -0x14 (-20)
    uint32_t vmtSafeCallException; // -0x10 (-16)
    uint32_t vmtAfterConstruction; // -0x0C (-12)
    uint32_t vmtBeforeConstruction; // -0x08 (-08)
    uint32_t vmtDispatch; // -0x04 (-04)
};
8 For a brief explanation of the VMT fields, see http://pages.cs.wisc.edu/~rkennedy/vmt.
struct TInterfaceEntry
{
    TGUID Guid;
    uint32_t VTable;
    uint32_t IOffset;
    uint32_t ImplGetter;
};
struct TInterfaceTable
{
    uint32_t EntryCount;
    TInterfaceEntry Entries[EntryCount];
    uint32_t Intfs[EntryCount];
};
struct TFieldInfo
{
    uint32_t TypeInfo;
    uint32_t Offset;
};
struct TFieldTable // do not confuse with TVmtFieldTable!
{
    uint16_t X;
    uint32_t Size;
    uint32_t Count;
    TFieldInfo Fields[];
};
struct TVmtFieldExEntry
{
    uint8_t Flags;
    uint32_t TypeRef; // pointer to a RTTI record
    uint32_t Offset;
    PASCAL_STRING Name;
    TAttrData AttrData;
};
struct TVmtFieldEntry
{
    uint32_t FieldOffset;
    uint16_t TypeIndex; // index into TVmtFieldClassTab
    PASCAL_STRING Name;
};
struct TVmtFieldClassTab
{
    uint16_t Count;
    uint32_t ClassRef[Count];
};
struct TVmtFieldTable
{
    uint16_t Count;
    uint32_t ClassTab; // pointer to a TVmtFieldClassTab structure
    TVmtFieldEntry Entry[Count];
    uint16_t ExCount; // *
    TVmtFieldExEntry ExEntry[ExCount]; // *
};
struct TVmtMethodParam
{
    uint8_t Flags;
    uint32_t ParamType; // pointer to a RTTI record
    uint8_t ParOff; // specifies whether parameter is passed via a register
                    // or on the stack
    PASCAL_STRING Name;
    TAttrData AttrData;
};
struct TVmtMethodEntryTail
{
    uint8_t Version; // usually '3'
    TCallConv CC;
    uint32_t ResultType; // pointer to a RTTI record
    uint16_t ParOff; // amount of stack space needed for parameters
    uint8_t ParamCount;
    TVmtMethodParam Params[ParamCount];
};
struct TVmtMethodEntry
{
    uint16_t Len; // specifies entire length of entry in bytes
    uint32_t CodeAddress; // address of the actual function
    PASCAL_STRING MethodName;
    TVmtMethodEntryTail Tail; // present only if Len > 6 + sizeof(MethodName)
};
struct TVmtMethodExEntry
{
    uint32_t Entry; // pointer to a TVmtMethodEntry
    uint16_t Flags;
    int16_t VirtualIndex;
};
struct TVmtMethodTable
{
    uint16_t Count;
    TVmtMethodEntry Entry[Count];
    uint16_t ExCount; // *
    TVmtMethodExEntry ExEntry[ExCount];
};
A selector is looked up by locating its value in the Selectors array. The required address is at the
corresponding index in the Addrs array.
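The lookup can be sketched as a linear search over two parallel arrays. The layout used below (a 16-bit count, then Count 16-bit selectors, then Count 32-bit addresses, all little-endian) is an assumption made for illustration, and the function name lookup_dynamic_method is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Assumed layout: uint16 Count; int16 Selectors[Count]; uint32 Addrs[Count].
 * Returns the address paired with the selector, or 0 if the selector is
 * absent or the table is truncated.
 */
static uint32_t lookup_dynamic_method(const uint8_t *tbl, size_t size,
                                      int16_t selector)
{
    if (size < 2)
        return 0;
    size_t count = rd16(tbl);
    if (size < 2 + count * 6)
        return 0;
    const uint8_t *selectors = tbl + 2;
    const uint8_t *addrs = selectors + count * 2;
    for (size_t i = 0; i < count; i++) {
        if ((int16_t)rd16(selectors + i * 2) == selector)
            return rd32(addrs + i * 4);   /* same index in Addrs */
    }
    return 0;
}
```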
struct UnitEntryTable
{
    PackageUnitEntry entries[NumberOfUnits];
};
struct PackageUnitEntry
{
    uint32_t Init; // pointer to unit's initialisation function
    uint32_t FInit; // pointer to unit's finalisation function
};
struct TPackageTypeInfo
{
    uint32_t TypeCount;
    uint32_t PTypeTable; // pointer to type table (array of pointers)
    uint32_t UnitCount;
    uint32_t UnitNames; // pointer to a concatenation of PASCAL_STRING objects
};
struct PackageInfoTable
{
    uint32_t UnitCount;
    uint32_t UnitInfo; // pointer to a UnitEntryTable
    TPackageTypeInfo TypeInfo; // *
};
The type table is an array of pointers to RTTI type records. The reason for this table has not been
ascertained (the Embarcadero documentation just says “for internal use”) but it is likely that the table contains
a pointer to every RTTI record in the executable. If so, this would be extremely useful as it would remove the
need for linear parsing of the RTTI structures at the beginning of the main code section, but only in New-era
executables as this structure does not exist in previous versions.
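Since every non-zero entry in the unit entry table should point at an initialisation or finalisation function, one simple sanity check is to flag entries whose pointers fall outside the code. The sketch below assumes the entries have already been decoded into an array and that the caller supplies a single address range covering all code sections; the function names and the idea of counting "suspicious" units are illustrative, not part of any existing tool.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t Init;   /* VA of the unit's initialisation function, or 0 */
    uint32_t FInit;  /* VA of the unit's finalisation function, or 0 */
} PackageUnitEntry;

/* A zero pointer means "no function" and is never treated as suspicious. */
static int va_outside(uint32_t va, uint32_t lo, uint32_t hi)
{
    return va != 0 && (va < lo || va >= hi);
}

/*
 * Hypothetical check: count the units with a non-zero Init or FInit
 * pointer that lies outside [code_start, code_end).
 */
static size_t count_suspicious_units(const PackageUnitEntry *units, size_t n,
                                     uint32_t code_start, uint32_t code_end)
{
    size_t suspicious = 0;
    for (size_t i = 0; i < n; i++) {
        if (va_outside(units[i].Init, code_start, code_end) ||
            va_outside(units[i].FInit, code_start, code_end))
            suspicious++;
    }
    return suspicious;
}
```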
4.6 Strings
Delphi has a tendency to place small amounts of data in the code section adjacent to the function that uses the
data. This includes string data. Because these strings are usually Pascal strings, they may not be detected by
standard string-search techniques, or if they are detected the boundaries may not be detected properly. Such a
string is shown in Table 10 (note the function prologue sequence occurring immediately afterwards):
An alternative technique for brute-force searching for Delphi strings is possible because such strings, where
they occur as standalone data items not part of another structure, are usually preceded by the integer value
0xFFFFFFFF. Therefore it is quite simple to search for all the occurrences of 0xFFFFFFFF on an aligned
boundary in the code section and then determine whether the data following them looks as if it might be a
Pascal string in any of the three formats that Delphi uses (short length prefix, long length prefix, or Unicode).
Such a scan not only ensures that all strings are located, but also allows the space occupied by them to be
marked as data and skipped during code analysis. Examples of strings extracted via this method from a Delphi
code section are shown in Table 11.
0x005A1E14|0x001A1214(SHORTSTR 0036):"Dont<20>Know<20>How<20>to<20>Handle<20>Data<20>Type<20>0x"
0x005A2080|0x001A1480(SHORTSTR 0012):"Parser<20>Error"
0x005A23D8|0x001A17D8(SHORTSTR 0012):"H<85><C0><7C>f@<89>E<E8><C7>E<EC>"
0x005A2C60|0x001A2060(SHORTSTR 0012):"clientheight"
0x005A2C88|0x001A2088(SHORTSTR 0011):"clientwidth"
0x005A2F98|0x001A2398(SHORTSTR 0019):"can<20>not<20>read<20>memory"
0x005A4040|0x001A3440(SHORTSTR 0012):"<25>s<20><28>ver<2E><20><25>s<29>"
0x005A464C|0x001A3A4C(SHORTSTR 0074):"The<20>file<20>can<20>not<20>be<20>executed<20>or<20>has<20>exited<20>immediately (...)
Table 11: Brute Force String Extraction from a Delphi Code Section
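A minimal sketch of the marker-based scan follows. It handles only one of the three string formats, and it assumes a particular layout for that format: the 0xFFFFFFFF marker, a 32-bit little-endian length, the characters, and a terminating zero byte. It also assumes that the buffer starts on a four-byte boundary and treats only printable ASCII as plausible string content. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Returns the string length if a candidate long string starts at the
 * aligned offset off (marker, 32-bit length, printable bytes, NUL),
 * otherwise 0.
 */
static uint32_t long_string_at(const uint8_t *buf, size_t size, size_t off)
{
    if (off % 4 != 0 || size < 9 || off > size - 9)
        return 0;
    if (rd32(buf + off) != 0xFFFFFFFFu)
        return 0;
    uint32_t len = rd32(buf + off + 4);
    if (len == 0 || len > size - off - 9)
        return 0;
    for (uint32_t i = 0; i < len; i++) {
        uint8_t c = buf[off + 8 + i];
        if (c < 0x20 || c > 0x7E)
            return 0;                    /* not printable ASCII */
    }
    return buf[off + 8 + len] == 0 ? len : 0;
}

/* Count the candidate strings found at aligned offsets in the buffer. */
static size_t count_long_strings(const uint8_t *buf, size_t size)
{
    size_t found = 0;
    for (size_t off = 0; off + 9 <= size; off += 4)
        if (long_string_at(buf, size, off) != 0)
            found++;
    return found;
}
```

A real scanner would also record each hit's offset and length so that the region can be marked as data and skipped during code analysis.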
Enterprise 26 3D 4F 38 C2 82 37 B8 F3 24 42 03 17 9B 3A 83
Professional A2 8C DF 98 7B 3C 3A 79 26 71 3F 09 0F 2A 25 17
Personal 23 78 5D 23 B6 A5 F3 19 43 F3 40 02 26 D1 11 C7
5.2 PACKAGEINFO
The PACKAGEINFO resource is a simple data structure containing some information about the executable (its
type and which tool was used to create it) as well as a list of the names of all the units in the executable (both
the library units and the user code units). Unusually for Delphi, this data structure uses C-style zero-terminated
strings rather than Pascal-style strings with a length prefix.
struct TPkgName
{
    uint8_t HashCode;
    C_STRING Name;
};
struct TUnitName
{
    uint8_t Flags;
    uint8_t HashCode;
    C_STRING Name;
};
struct TPackageInfoHeader
{
    uint32_t Flags;
    uint32_t RequiresCount;
    TPkgName Requires[RequiresCount];
    uint32_t ContainsCount;
    TUnitName Contains[ContainsCount];
};
5.2.2 Flags
The main package flags are a 32-bit value interpreted as follows:
Bit Meaning
0 Build flag (Never build if 1, always build if 0)
1 Design-time only (Yes if 1, No if 0, must be opposite value to bit 2)
2 Run-time only (Yes if 1, No if 0, must be opposite value to bit 1)
3 Check for duplicates (Do not check if 1, Do check if 0)
4 - 25 Reserved
26 - 27 Producer: 0 = pre-V4 VCL, 1 = undefined, 2 = C++, 3 = Pascal (Delphi)
28 - 29 Reserved
30 - 31 Package Type: 0 = Executable, 1 = Package DLL, 2 = Library DLL, 3 = undefined
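The bit assignments in the table above translate directly into shifts and masks. The following sketch decodes the 32-bit value into a small structure; the structure and function names are invented for this example, and reserved bits are simply ignored.

```c
#include <stdint.h>

typedef struct {
    int never_build;        /* bit 0: 1 = never build, 0 = always build */
    int design_time_only;   /* bit 1 */
    int run_time_only;      /* bit 2 */
    int skip_dup_check;     /* bit 3: 1 = do not check for duplicates */
    unsigned producer;      /* bits 26-27: 0 = pre-V4 VCL, 2 = C++, 3 = Pascal */
    unsigned package_type;  /* bits 30-31: 0 = Executable, 1 = Package DLL,
                               2 = Library DLL, 3 = undefined */
} PackageFlags;

/* Split the main PACKAGEINFO flags value into its documented fields. */
static PackageFlags decode_package_flags(uint32_t flags)
{
    PackageFlags f;
    f.never_build      = (int)((flags >> 0) & 1u);
    f.design_time_only = (int)((flags >> 1) & 1u);
    f.run_time_only    = (int)((flags >> 2) & 1u);
    f.skip_dup_check   = (int)((flags >> 3) & 1u);
    f.producer         = (unsigned)((flags >> 26) & 3u);
    f.package_type     = (unsigned)((flags >> 30) & 3u);
    return f;
}
```

For instance, a value of 0xCC000001 would decode as producer 3 (Pascal), package type 3 (undefined), and the build flag set to "never".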
Each entry in the TUnitName array also contains an 8-bit Flags value. Only the lower 5 bits are used, with
the upper 3 bits being reserved. A “weak package” has a special definition (see footnote 9).
Bit Meaning
0 Set if this unit is the main unit
1 Set if this unit is a package unit
2 Set if this unit is a ‘weak package’ unit
3 Set if this unit is the original container for the weak packaged unit
4 Set if this unit was implicitly imported
5-7 Reserved
An example of raw PACKAGEINFO data may be seen in the hex dump in Table 16. The PACKAGEINFO
structure begins at offset 0x1D719C. The parsed data equivalent is shown in Table 17.
5.3 Forms
In Delphi, windows and dialog boxes are referred to as forms. Information describing the forms used by a
Delphi program is stored in the executable’s resource section as RCDATA. These are not stored in the usual
Windows resource format, but in a proprietary format specific to Delphi and C++ Builder (Delphi executables,
particularly New executables, may often contain standard Windows dialog box resources as well, but these are
not used for user-generated forms). Forms are designed by the programmer using the GUI, and have many
properties which may be changed or left at default values (the form properties are displayed in the window on
the left in Figure 1).
Delphi form resources start with the magic bytes “TPF0”, which are followed by two Pascal strings giving
the class name and the form name. These are followed immediately by the form data. Sub-objects of a given
form are defined in the same way. It appears that objects can have any number of nested sub-objects, which
complicates processing. Normally each form has its own TPF0 resource, which describes all the objects and
sub-objects on that form.
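Reading the start of a form resource can be sketched as follows: check the magic bytes, then read two byte-prefixed Pascal strings. This is an illustration only. It assumes that the class name follows the magic bytes directly, with no flag bytes in between, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy the Pascal string (length byte followed by characters) at *off
 * into out as a NUL-terminated C string and advance *off past it.
 * Returns 0 on success, -1 if the data is truncated.
 */
static int read_pascal_string(const uint8_t *buf, size_t size, size_t *off,
                              char out[256])
{
    if (*off >= size)
        return -1;
    size_t len = buf[*off];
    if (len > size - *off - 1)
        return -1;
    memcpy(out, buf + *off + 1, len);
    out[len] = '\0';
    *off += 1 + len;
    return 0;
}

/*
 * Parse "TPF0", the class name and the form name. On success, *data_off
 * is the offset of the first byte of form data.
 */
static int parse_form_header(const uint8_t *res, size_t size,
                             char class_name[256], char form_name[256],
                             size_t *data_off)
{
    size_t off = 4;
    if (size < 4 || memcmp(res, "TPF0", 4) != 0)
        return -1;
    if (read_pascal_string(res, size, &off, class_name) != 0)
        return -1;
    if (read_pascal_string(res, size, &off, form_name) != 0)
        return -1;
    *data_off = off;
    return 0;
}
```

Nested sub-objects would need a recursive or stack-based parser on top of this, since each one repeats the same class name and object name pair.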
The form data as set at design time is stored as a set of key-value pairs. The key name is a Pascal string
and is followed by a byte giving the data type, then the value data itself. The length and format of the value
data depend on the data type. Data types are defined in the Delphi runtime source file “Classes.pas” and are
given in Table 18. The most interesting data stored in the form resource is the name of the method (function)
associated with the form’s actions or with a subobject, such as a button. Such records are strings, and their
key names have the prefix ‘On’, examples being ‘OnCreate’, ‘OnClick’, ‘OnDoubleClick’ and ‘OnClose’. The
value associated with such a key is the name given to the method by the user. Delphi assigns default names
to many methods (for example, if you have a button called ‘Button1’ on a form, the OnClick method for that
button will by default be called ‘Button1Click’), but the user may change them. In order to locate the actual
address of the method’s function, it is necessary to parse the form’s Virtual Method Table using the techniques
described previously.
9 http://docs.embarcadero.com/products/rad_studio/delphiAndcpp2009/HelpUpdate2/EN/html/devcommon/compdirsweakpackaging_xml.html
Package Type: Undefined
Producer: Delphi
PackageFlags.Build: Never
PackageFlags.Design-time only: No
PackageFlags.Run-time only: No
PackageFlags.Check duplicate units: Yes
RequiresCount: 0
ContainsCount: 210
Type Name Description
0x00 vaNull ‘Null’ type, no extra data needed
0x01 vaList Zero-terminated list of values
0x02 vaInt8 Signed 8-bit integer (Delphi ‘shortint’)
0x03 vaInt16 16-bit signed little-endian integer
0x04 vaInt32 32-bit signed little-endian integer
0x05 vaExtended Borland proprietary 10-byte floating point
0x06 vaString Pascal string (byte prefix gives length)
0x07 vaIdent Alias for vaString
0x08 vaFalse Boolean false, no extra data needed
0x09 vaTrue Boolean true, no extra data needed
0x0A vaBinary Arbitrary binary data, 32-bit length prefix
0x0B vaSet List of strings
0x0C vaLString Long Pascal string (32-bit length prefix)
0x0D vaNil Delphi ‘Nil’, no extra data needed
0x0E vaCollection Nested set of values of unrelated type
0x0F vaSingle Single-precision floating point
0x10 vaCurrency Borland proprietary currency type
0x11 vaDate Borland proprietary date type
0x12 vaWString Unicode string
0x13 vaInt64 Signed 64-bit little-endian integer
0x14 vaUTF8String UTF8 string (treated as equivalent to vaLString)
0x15 vaDouble Double-precision floating point
5.3.1 Flags
Objects and subobjects may contain flag fields after the class and object name. These flags are related to the
way Delphi serialises objects. Most are one byte in length, but they must be accounted for when parsing as some
flags (e.g. ffChildPos) store further data after the flag byte. The following three flags are defined, using bits 0-2
of the flag byte.
OBJECT: TDialogDelayForm:DialogDelayForm
--->Left: 460
--->Top: 434
--->BorderIcons: (biSystemMenu)
--->BorderStyle: bsDialog
--->Caption: DialogDelayForm
--->ClientHeight: 333
--->ClientWidth: 450
--->Color: clBtnFace
--->Font.Charset: DEFAULT_CHARSET
--->Font.Color: clWindowText
--->Font.Height: 245
--->Font.Name: MS Sans Serif
--->Font.Style: [Empty]
--->OldCreateOrder: False
--->Scaled: False
--->OnCreate: FormCreate
--->PixelsPerInch: 96
--->TextHeight: 13
OBJECT: TImage:bgImage
--->Left: 253
--->Top: 0
--->Width: 450
--->Height: 333
--->Picture.Data: [30124 bytes of binary data]
7 Code Samples
7.1 Brute Force Search for Virtual Method Table Structures
int ScanCodeSectionForPotentialVMTs(const unsigned int CodeSectionOffset,
                                    const unsigned int CodeSectionSize)
{
    unsigned char *FileBase = (unsigned char *)GetMappedFileBase();
    unsigned int index, testaddr, testaddraddr;
    PVirtualMethodTablePrefix pvmt;
    // need to go backwards from the end of the code section
This is part of an experimental C program which has been successfully used to locate VMT structures in
Delphi executables where there are no calls to CreateForm() in the entry point function and therefore no easy
way to find any VMTs which may be present. It is given the offset and size of a chunk of file (the code section,
but not necessarily), and then scans backwards through it for structures which look like a VMT, stopping once
a valid VMT is found. Validity is determined by checking that the VMT’s fields are either zero or valid virtual
addresses, and that the vmtClassName field points to a Pascal string. This is a relatively simple test, but seems
to produce good results.
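The validity test described above can be illustrated with a self-contained sketch. This is not the author's code: the Region type and the functions below are stand-ins invented for this example, the candidate's pointer-valued fields are assumed to have been decoded already (with non-pointer fields such as vmtInstanceSize left out by the caller), and a class name is assumed to be a byte-prefixed string of printable, non-space ASCII.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;  /* mapped bytes of the region being scanned */
    uint32_t base_va;     /* virtual address of data[0] */
    uint32_t size;        /* number of mapped bytes */
} Region;

static int va_in_region(const Region *r, uint32_t va)
{
    return va >= r->base_va && va - r->base_va < r->size;
}

/* Length byte followed by that many printable, non-space ASCII bytes. */
static int va_is_pascal_string(const Region *r, uint32_t va)
{
    if (!va_in_region(r, va))
        return 0;
    uint32_t off = va - r->base_va;
    uint32_t len = r->data[off];
    if (len == 0 || len > r->size - off - 1)
        return 0;
    for (uint32_t i = 1; i <= len; i++) {
        uint8_t c = r->data[off + i];
        if (c < 0x21 || c > 0x7E)
            return 0;
    }
    return 1;
}

/*
 * Every pointer field must be zero or a valid address, and the class
 * name pointer must refer to a plausible Pascal string.
 */
static int looks_like_vmt(const Region *r, const uint32_t *ptr_fields,
                          size_t n, uint32_t class_name_va)
{
    for (size_t i = 0; i < n; i++)
        if (ptr_fields[i] != 0 && !va_in_region(r, ptr_fields[i]))
            return 0;
    return va_is_pascal_string(r, class_name_va);
}
```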
7.2 Explanation of Functions not Defined in the Listing
The definitions of the following functions have been omitted from the listing for space reasons. The function
GetMappedFileBase() returns a pointer to the memory-mapped PE file. The function IsVirtualAddressValid()
checks that the specified virtual address is located within the PE image, and likewise
IsVirtualAddressInCodeSection() determines whether the virtual address is within the code section. The function
DoesVirtualAddressPointToPascalString() determines whether a given virtual address points to a Pascal
string object. VirtualAddressToFileOffset() and FileOffsetToVirtualAddress() convert between file
offsets and virtual addresses using previously-stored data from the section table (the virtual address validation
functions rely on the same data).
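The omitted conversion functions are not reproduced here, but the underlying arithmetic can be sketched. The example below uses its own names and signatures, which are assumptions rather than the original ones, and fills a small section array with the ‘.text’ and ‘.itext’ rows from Table 2.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t raw_offset;  /* file offset of the section data ("From") */
    uint32_t raw_size;    /* bytes stored in the file ("Size") */
    uint32_t vaddr;       /* virtual address of the section ("VAddr") */
} Section;

/* The .text and .itext rows of Table 2 (sizes are decimal in the table). */
static const Section kSections[] = {
    { 0x00000400u, 1411072u, 0x00401000u },
    { 0x00158C00u,    5632u, 0x0055A000u },
};

/* Returns 0 and stores the file offset, or -1 if va is in no section. */
static int va_to_file_offset(const Section *s, size_t n, uint32_t va,
                             uint32_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (va >= s[i].vaddr && va - s[i].vaddr < s[i].raw_size) {
            *out = s[i].raw_offset + (va - s[i].vaddr);
            return 0;
        }
    }
    return -1;
}

/* Returns 0 and stores the virtual address, or -1 if off is in no section. */
static int file_offset_to_va(const Section *s, size_t n, uint32_t off,
                             uint32_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (off >= s[i].raw_offset && off - s[i].raw_offset < s[i].raw_size) {
            *out = s[i].vaddr + (off - s[i].raw_offset);
            return 0;
        }
    }
    return -1;
}
```

Only the bytes actually stored in the file are mapped, so an address in a section with no raw data (such as ‘.bss’) has no file offset and the lookup fails.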