The operating system (OS) loader decodes the object file and creates a memory
image when an executable object or binary file needs to be executed. Apart
from image creation in operating systems, the loader is also used to extract
important file information in some of the more complex machine-code manipulation
tools, such as disassemblers, decompilers, debuggers, binary translators and
tracers/profilers. Traditional loaders like the ones used in the OS can only
understand one type of binary file format (BFF). The ideal model is to create
a generic loader which is capable of understanding several BFFs.
A binary file has a set of attributes which describe its environment; namely,
the machine architecture M in which it runs, the operating system OS for
that machine, and its binary file format. The notation we use to describe
the environment is the tuple (M, OS, BFF).
Traditionally, we need to write a decoder for every BFF we want to manipulate,
i.e., for n different object file formats or n (M, OS, BFF) tuples,
we need to write n different loaders. The idea of a (generic) retargetable
loader (RL) is to eliminate the effort needed to create different loaders.
The RL is designed to be independent of any single format, so that it can
understand a wide range of different binary file formats.
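
As a rough illustration of the first decision a multi-format loader has to
make, the sketch below guesses the BFF of a file from its leading bytes. It is
not the grammar-driven method developed in this thesis; the function name
identify_format is hypothetical, and only the three formats used for testing
are recognised. The signatures checked are those of the published formats:
"\x7fELF" for ELF, "MZ" for DOS EXE, and "NE" at the offset stored in the
32-bit little-endian field at 0x3C of the MZ header.

```python
import struct


def identify_format(image: bytes) -> str:
    """Guess the binary file format of `image` from its signature bytes.

    Hypothetical helper for illustration only; a real loader would also
    validate the rest of the header before trusting the result.
    """
    if image[:4] == b"\x7fELF":
        return "ELF"
    if image[:2] == b"MZ":
        # New-style executables keep a DOS stub; the field at 0x3C holds
        # the file offset of the secondary (here: NE) header.
        if len(image) >= 0x40:
            (new_header_offset,) = struct.unpack_from("<I", image, 0x3C)
            if image[new_header_offset:new_header_offset + 2] == b"NE":
                return "NE"
        return "EXE"
    raise ValueError("unrecognised binary file format")
```

A hard-coded dispatcher like this still needs one hand-written decoder per
format behind it, which is the effort a specification-driven RL aims to remove.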
This thesis looks into the different approaches for developing an RL and
develops a prototype of such a tool. The approach taken is based on BFF
specifications. Specifications are unambiguous and trouble-free, which makes
them ideal for developing an RL based on a BFF grammar. There are a few differences
between the grammars used in traditional programming languages and the grammar
used for BFFs. In an object file, parts or regions of the file are inter-related.
Addresses and segment sizes are usually controlled by definitions found in
the file header and their values are determined only at run-time. Hence,
a BFF grammar must be able to re-reference information that was defined earlier
in the file.
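
The need to re-reference earlier definitions can be sketched with a toy
example. The format, the SPEC table and the decode function below are all
hypothetical; they are neither a real BFF nor the grammar developed in this
thesis. The point is only that the last field cannot be decoded until the
values of two header fields have been read from the file.

```python
import struct

# Hypothetical toy format: each entry is (field name, kind, references).
# The "bytes" field refers back to two fields decoded earlier.
SPEC = [
    ("magic", "2s", None),
    ("nsegs", "<H", None),       # number of segments
    ("seg_size", "<H", None),    # size of each segment in bytes
    ("segments", "bytes", ("nsegs", "seg_size")),
]


def decode(spec, data):
    """Decode `data` according to `spec`, resolving back-references."""
    env = {}   # values decoded so far, available to later fields
    pos = 0
    for name, kind, refs in spec:
        if kind == "bytes":
            # Count and size are only known once the header has been read.
            count, size = (env[r] for r in refs)
            items = []
            for _ in range(count):
                items.append(data[pos:pos + size])
                pos += size
            env[name] = items
        else:
            (env[name],) = struct.unpack_from(kind, data, pos)
            pos += struct.calcsize(kind)
    return env
```

For example, decoding the bytes b"TF" followed by the little-endian values 2
and 3 and the payload b"abcdef" yields two segments of three bytes each. A
context-free grammar of the kind used for programming languages has no direct
way to express this dependency on previously decoded values.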
The simple retargetable loader (SRL) is a first attempt to develop an RL
with a simple BFF grammar developed by the author. Three environments,
(x86, DOS, EXE), (x86, Windows, NE) and (SPARC, Solaris, ELF), were used as the
basis for testing the SRL. The three environments give a good coverage of
different BFFs currently in use by OSs for RISC and CISC machines. Overall,
the structure of ELF is complex, while EXE is simple and NE is in between.