6 Testing

The idea of a Retargetable Loader (RL) is appealing, but hard to implement. The Simple Retargetable Loader (SRL) is an attempt to demonstrate the benefit of using an RL to build a binary manipulation tool. It is a scaled-down version of the RL and is implemented in C. The SRL is limited in that its grammar is simple and contains a small number of constructs. As described in the previous chapter, the BFF grammar for the SRL was constructed using three different environments as a basis: (x86, DOS, EXE), (x86, Windows, NE) and (Sparc, Solaris, ELF). ELF (on a RISC architecture) is the most complex BFF of the three, EXE (CISC) is the simplest, and NE (CISC) lies somewhere in between. Nevertheless, the SRL's BFF grammar was developed with generality in mind.

When implementing an RL, one must consider to what extent the RL takes part in decoding the binary file. Does it decode the whole file and rewrite it in another representation, or does it simply load the whole file into memory? How much detail is interpreted by the RL? In the case of the SRL, the primary functions are producing source code that represents the structures of the object file, and loading the image.



6.1 Simple Retargetable Loader (SRL)




Fig. 11 describes the contents the SRL produces. The object structures are the type definitions for the various regions of the binary object file for (M, OS, BFF). The loading routine contains the initialisation of the object structures and the loading of the object image into memory. The object structures and the loading routine are implemented in C as .H and .C files respectively.

The input to the SRL is the object specification: a description of the binary object file for an environment (M, OS, BFF), written in the SRL's grammar. The specifications for (x86, DOS, EXE), (x86, Windows, NE) and (Sparc, Solaris, ELF) are used as inputs to the SRL, and the corresponding sets of .H and .C interface files are produced. The SRL outputs for the (x86, DOS, EXE) specification are listed in Fig. 12 and Fig. 13. Surprisingly, the specification for the Windows NE turned out to be larger than that for ELF, even though ELF is supposed to be the most complex of the three. Perhaps if the grammar had more constructs, finer details could be captured, and more of the ELF structure would then be visible. But is this all part of the loader? Does the loader need to examine, identify and understand all the different regions of the object file? How much of the file's structure the loader should be exposed to is left open for discussion.

To examine the usability of the SRL outputs, the loading files (.H and .C) produced from the (x86, DOS, EXE) specification were integrated with the DCC compiler. The loading module of the Intel 286 DCC compiler was replaced with the corresponding SRL outputs. With a few minor changes to the loading files, the DCC was rebuilt using them. The behaviour of the two versions of the DCC was the same, which demonstrates the correctness of the SRL outputs. The source code for the SRL can be found in Appendix 3.


6.2 Interface

The interface routines produced by the SRL are very simple. The .C file merely provides a loading module for setting up an image in memory (Fig. 13). This is adequate for the DOS EXE format, which is extremely simple, but for other BFFs one would like interface functions for accessing the different regions of the binary file. For example, the Windows NE BFF has a number of tables: the imported-name table, the segment table, the module reference table, etc. The structure of the segment table in the specification is:

DEFINITION seg_table ADDRESS (sh_segToff + sho_off) 
  seg_table_ent ARRAY sh_segTent
    ste_logSectoff SIZE 16
    ste_size SIZE 16 
    ste_flag SIZE 16
    ste_minsize SIZE 16
  END seg_table_ent
END seg_table

The SRL creates the structure for the segment table in the .H file and sets up a pointer to the beginning of the table in the image. The SRL generates no routines for accessing this structure. A programmer who wants to access a particular entry in the table must hand-code the manipulation of the structure directly. A desirable feature for an RL would be to generate a set of interface routines automatically, eliminating the need for the programmer to hand-code this access.
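As an illustration of the kind of interface routine that could be generated for the segment table, the following is a minimal sketch and not output of the SRL. The field names come from the seg_table_ent specification above; the names seg_table_get, read_le16 and SEG_TABLE_ENT_BYTES are hypothetical, and the sketch assumes the four 16-bit fields are stored little-endian and packed back to back in the image.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded form of one segment-table entry. Field names follow the
   seg_table_ent specification; each field is 16 bits wide. */
typedef struct {
    uint16_t ste_logSectoff;
    uint16_t ste_size;
    uint16_t ste_flag;
    uint16_t ste_minsize;
} seg_table_ent;

/* Assumption: four packed 16-bit fields, so 8 bytes per entry. */
enum { SEG_TABLE_ENT_BYTES = 8 };

/* Read a 16-bit little-endian value from raw image bytes. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical accessor: decode entry `index` of the table that starts
   at `table` and holds `count` entries (sh_segTent in the specification).
   Returns 0 on success, -1 if an argument is invalid. */
int seg_table_get(const unsigned char *table, size_t count,
                  size_t index, seg_table_ent *out)
{
    const unsigned char *p;

    if (table == NULL || out == NULL || index >= count)
        return -1;

    p = table + index * SEG_TABLE_ENT_BYTES;
    out->ste_logSectoff = read_le16(p);
    out->ste_size       = read_le16(p + 2);
    out->ste_flag       = read_le16(p + 4);
    out->ste_minsize    = read_le16(p + 6);
    return 0;
}
```

A caller would pass the pointer the SRL sets up to the start of the table, together with the entry count taken from the header, and receive a bounds-checked copy of one entry instead of indexing into the image by hand.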

/* This file is generated by the BFF generator using the grammar in "dosexe.txt" */ 

#ifndef _LOAD_H_
#define _LOAD_H_ 

#ifdef __MSDOS__
 #define INT int
 #define LONG unsigned long
#else
 #define INT short int
 #define LONG unsigned int
#endif /* __MSDOS__ */

typedef unsigned char byte;
typedef short int16; 

#define LH(p) ((int16)((byte*)(p))[0]+((int16)((byte*)(p))[1]<<8))  

typedef struct {
 byte h_sigLo;
 byte h_sigHi;
 INT h_lastPageSize;
 INT h_numPages;
 INT h_numReloc;
 INT h_numParaHeader;
 INT h_minAlloc;
 INT h_maxAlloc;
 INT h_initSS;
 INT h_initSP;
 INT h_checkSum;
 INT h_initIP;
 INT h_initCS;
 INT h_relcTabOffset;
 INT h_overlayNum;
} headerT; 

typedef struct {
 headerT *header;
 byte* section;
 char* filename;
 LONG imagesize;
 byte* image;
} BFF; 

extern BFF* aBFF; 

int LoadImage(char* filename);

#endif /* _LOAD_H_ */


Fig. 12  .H file generated by the SRL using the (x86, DOS, EXE) specification




/* This file is generated by the BFF generator using the grammar in "dosexe.txt" */ 

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "loader.h"  

BFF *aBFF;  


int LoadImage(char* filename) {
 FILE *fp;
 LONG imageaddress; 

 if ((fp=fopen(filename, "rb"))==NULL) { 
   printf("cannot open file ");
   return 0;
 }
 aBFF = (BFF *)malloc(sizeof(BFF));
 aBFF->header = (headerT *)malloc(sizeof(headerT));
 if (fread(aBFF->header, sizeof(headerT), 1, fp) != 1) {
   printf("cannot read file ");
   return 0;
 }
 aBFF->imagesize = LH(&aBFF->header->h_numPages) * 512 -  
       LH(&aBFF->header->h_numParaHeader) * 16 - (512 -  
       LH(&aBFF->header->h_lastPageSize)); 

 aBFF->image = (byte *)malloc(aBFF->imagesize);
 fseek(fp, (long)LH(&aBFF->header->h_numParaHeader) * 16, SEEK_SET);
 if (fread(aBFF->image, (size_t)aBFF->imagesize, 1, fp) != 1) {
   printf("error reading image ");
   return 0;
 }
 imageaddress = LH(&aBFF->header->h_numParaHeader) * 16; 

 aBFF->section = aBFF->image + LH(&aBFF->header->h_initIP) + 16 - imageaddress;

 aBFF->filename = (char*)malloc(sizeof(char)*(strlen(filename)+1));
 strcpy(aBFF->filename, filename);
 fclose(fp);
 return 1;

} /* LoadImage */ 


Fig. 13  .C file generated by the SRL using the (x86, DOS, EXE) specification
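The image-size arithmetic in Fig. 13 can be checked by hand with a small sketch. The function name exe_image_size and its parameter names are hypothetical; the formula itself mirrors the one in the listing, taking the header fields h_numPages, h_numParaHeader and h_lastPageSize as already-decoded 16-bit values. Like the listing, the sketch does not special-case a last-page size of 0 and does not guard against malformed headers.

```c
#include <stdint.h>

/* Image size in bytes, mirroring the arithmetic of Fig. 13:
   pages are 512 bytes, header paragraphs are 16 bytes, and the
   unused tail of the last page is subtracted.
   Hypothetical helper; a last-page size of 0 is not special-cased. */
uint32_t exe_image_size(uint16_t num_pages,
                        uint16_t num_para_header,
                        uint16_t last_page_size)
{
    uint32_t file_bytes   = (uint32_t)num_pages * 512u
                            - (512u - last_page_size);
    uint32_t header_bytes = (uint32_t)num_para_header * 16u;

    return file_bytes - header_bytes;
}
```

For example, a file of 3 pages whose last page holds 100 bytes is 2 * 512 + 100 = 1124 bytes long; with a 32-paragraph (512-byte) header, the image occupies the remaining 612 bytes, which is what exe_image_size(3, 32, 100) computes.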