Hey guys, today I’m going to explain a vital file format called the PE file format. Understanding this format is essential, because debuggers and reverse engineering tools all rely on this structure. There are lots of documents that introduce it, and in this post I gather the essential information from some of them.
PE (Portable Executable) is the format of Windows binaries such as EXE and DLL files. The best overview of this structure is shown below.
What is the DOS MZ header?
This header is the first thing checked to decide whether a file is a valid PE file. All valid PE files start with the characters MZ. You can see the structure of the DOS MZ header below.
The size of this structure is 64 bytes. Two of its members are essential: e_magic and e_lfanew. e_magic contains the characters MZ, and e_lfanew is the file offset of the PE header. By using this member you can jump directly to the beginning of the PE header.
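To make these two members concrete, here is a minimal Python sketch that reads them from raw bytes. The function name read_dos_header and the synthetic sample buffer are made up for illustration; only the 64-byte size, the MZ magic and the position of e_lfanew (offset 0x3C, the last four bytes of the header) come from the format itself.

```python
import struct

DOS_HEADER_SIZE = 64      # size of the DOS MZ header
E_LFANEW_OFFSET = 0x3C    # e_lfanew occupies the last 4 bytes of the header


def read_dos_header(data: bytes):
    """Return (e_magic, e_lfanew) taken from the first 64 bytes of a file."""
    if len(data) < DOS_HEADER_SIZE:
        raise ValueError("file is too small to hold a DOS MZ header")
    e_magic = data[0:2]   # b"MZ" for a valid file
    # e_lfanew is read here as an unsigned little-endian 32-bit value.
    (e_lfanew,) = struct.unpack_from("<I", data, E_LFANEW_OFFSET)
    return e_magic, e_lfanew


# Synthetic header for illustration: "MZ", zero padding, e_lfanew = 0x80.
sample = b"MZ" + b"\x00" * 58 + struct.pack("<I", 0x80)
magic, pe_offset = read_dos_header(sample)
print(magic, hex(pe_offset))  # prints: b'MZ' 0x80
```

For a real file you would pass the bytes returned by open(path, "rb").read() instead of the synthetic sample.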
What is the DOS stub?
The DOS stub is a small DOS program. If the file is run under DOS instead of Windows, the stub prints a message such as "This program cannot be run in DOS mode" and exits.
The PE header is a structure named IMAGE_NT_HEADERS. It has three members.
Signature: This field marks the beginning of the PE header. It holds the four bytes "PE\0\0".
FileHeader: This field is an IMAGE_FILE_HEADER structure, which is shown below.
NumberOfSections indicates how many sections the file contains.
Characteristics is a set of flags that describe the type of your file (EXE, DLL, etc.).
For instance, you can see NumberOfSections in the PEview tool as shown below.
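The same fields can also be read with a few lines of code. The sketch below assumes you already have e_lfanew from the DOS header; the function name read_file_header and the returned dictionary keys are hypothetical, while the 20-byte layout and the two flag values are those of IMAGE_FILE_HEADER in winnt.h.

```python
import struct

# Two Characteristics flags from winnt.h
IMAGE_FILE_EXECUTABLE_IMAGE = 0x0002
IMAGE_FILE_DLL = 0x2000


def read_file_header(data: bytes, e_lfanew: int) -> dict:
    """Parse the Signature and the 20-byte FileHeader that follows it."""
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found at e_lfanew")
    # Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
    # NumberOfSymbols, SizeOfOptionalHeader, Characteristics
    (machine, number_of_sections, _timestamp, _symbol_ptr, _symbol_count,
     size_of_optional_header, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, e_lfanew + 4)
    return {
        "Machine": machine,
        "NumberOfSections": number_of_sections,
        "SizeOfOptionalHeader": size_of_optional_header,
        "Characteristics": characteristics,
        "is_executable": bool(characteristics & IMAGE_FILE_EXECUTABLE_IMAGE),
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
    }
```

Note that a DLL normally has both flags set, so is_dll is the one to check when you want to tell an EXE from a DLL.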
OptionalHeader: This field is a structure with several important members.
AddressOfEntryPoint is a relative virtual address (RVA) that points to the start of the program. If you add this value to ImageBase, you get the address that OllyDbg shows when it loads the program.
ImageBase is the address at which the loader prefers to load the program.
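As a quick worked example with made-up numbers: if ImageBase is 0x00400000 and AddressOfEntryPoint is 0x1000, the debugger stops at 0x00401000, provided the image was loaded at its preferred base. A minimal sketch of the same calculation on raw bytes follows. The function name entry_point_va is hypothetical; the offsets are those of the standard PE32 and PE32+ optional headers.

```python
import struct

PE32_MAGIC = 0x10B       # 32-bit optional header
PE32_PLUS_MAGIC = 0x20B  # 64-bit optional header


def entry_point_va(data: bytes, e_lfanew: int) -> int:
    """Return ImageBase + AddressOfEntryPoint for a PE32 or PE32+ image."""
    # The optional header follows the 4-byte Signature and 20-byte FileHeader.
    opt = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    if magic == PE32_MAGIC:
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    elif magic == PE32_PLUS_MAGIC:
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    else:
        raise ValueError("unknown optional header magic")
    return image_base + entry_rva
```

If the loader could not use the preferred ImageBase (for example because of ASLR), the real address in the debugger will differ by the relocation delta.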
Also, you can press ALT+M in OllyDbg to see all the sections of a PE file.
The DataDirectory member is essential. It is shown below.
The size of this structure is 128 bytes, and it is the last member of the OptionalHeader.
DataDirectory is an array of IMAGE_DATA_DIRECTORY structures, which is shown below.
In other words, DataDirectory is an array of 16 entries. Each entry is the structure shown below, which holds the RVA and the size of one table.
You can see this structure in the PEview tool.
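If you prefer to list the entries yourself, a rough sketch could look like the following. The function name read_data_directories and the short labels in DIRECTORY_NAMES are my own choices; the offsets of NumberOfRvaAndSizes (92 for PE32, 108 for PE32+) and the 8-byte entry layout come from the optional header definition.

```python
import struct

# Short labels for the 16 standard entries, in index order.
DIRECTORY_NAMES = [
    "Export", "Import", "Resource", "Exception", "Security", "BaseReloc",
    "Debug", "Architecture", "GlobalPtr", "TLS", "LoadConfig", "BoundImport",
    "IAT", "DelayImport", "ComDescriptor", "Reserved",
]


def read_data_directories(data: bytes, e_lfanew: int):
    """Return a list of (name, rva, size) tuples from the DataDirectory."""
    opt = e_lfanew + 4 + 20                  # start of the optional header
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:                       # PE32
        count_offset = 92
    elif magic == 0x20B:                     # PE32+
        count_offset = 108
    else:
        raise ValueError("unknown optional header magic")
    (count,) = struct.unpack_from("<I", data, opt + count_offset)
    count = min(count, len(DIRECTORY_NAMES))  # never read past 16 entries
    table = opt + count_offset + 4            # array starts right after the count
    entries = []
    for i in range(count):
        rva, size = struct.unpack_from("<II", data, table + 8 * i)
        entries.append((DIRECTORY_NAMES[i], rva, size))
    return entries
```

An entry whose RVA and size are both zero simply means the file does not have that table.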
Two of these tables are vital: the Export Table and the Import Table.
A DLL is an implementation of API functions. A program is largely a set of API calls whose implementations live in DLLs. Functions can be imported or exported. When a DLL exports a function, other modules can call it. When a module imports a function from a DLL, the function is implemented in that DLL.
A DLL can export a function in two ways: by name and by ordinal number. An ordinal is a 16-bit number that uniquely identifies a function within a DLL.
How does the loader find the address of an API in a DLL?
The Export Table entry of the DataDirectory, which was shown before, holds an RVA (relative virtual address) that points to the IMAGE_EXPORT_DIRECTORY structure.
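NumberOfFunctions: It contains the number of functions that are exported by the DLL.
NumberOfNames: It specifies the number of functions that are exported by name.
AddressOfFunctions: This is an RVA that points to the Export Address Table (EAT).
AddressOfNames: This is an RVA that points to the Export Name Table (ENT).
AddressOfNameOrdinals: This is an RVA that points to the Export Ordinal Table (EOT).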
The EAT is shown below.
AddressOfFunctions holds the address of the EAT.
AddressOfNames holds the address of the ENT.
NumberOfNames is the number of functions that are exported by name.
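To tie the three tables together, here is a simplified sketch of the lookup. It works on plain Python lists that stand in for the ENT, EOT and EAT after they have been read from the file. The function names and the sample values are made up, and the Base member of IMAGE_EXPORT_DIRECTORY (the first ordinal number) is not covered in the text above. The idea: find the name in the ENT, take the entry at the same position in the EOT, and use that value as an index into the EAT.

```python
def resolve_export_by_name(wanted, names, name_ordinals, functions):
    """Look up a function RVA by name.

    names         -- ENT: exported names, already decoded to str
    name_ordinals -- EOT: one 16-bit EAT index per name, same order as names
    functions     -- EAT: function RVAs
    """
    # Linear scan for clarity; the ENT is sorted, so a real implementation
    # can use a binary search instead.
    for i, name in enumerate(names):
        if name == wanted:
            eat_index = name_ordinals[i]
            return functions[eat_index]
    return None


def resolve_export_by_ordinal(ordinal, base, functions):
    """Look up a function RVA by ordinal; base is the directory's Base field."""
    eat_index = ordinal - base
    if 0 <= eat_index < len(functions):
        return functions[eat_index]
    return None


# Made-up tables: three exported functions, two of them exported by name.
eat = [0x1100, 0x1200, 0x1300]
ent = ["Alpha", "Gamma"]
eot = [0, 2]
print(hex(resolve_export_by_name("Gamma", ent, eot, eat)))  # prints: 0x1300
```

The returned value is an RVA, so the loader still adds the DLL's load address to it to get the final address of the API.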
The section table contains information about the sections of a file. It is an array of IMAGE_SECTION_HEADER structures. A PE file has several sections, and each section has its own section header. All of these headers are stored in the section table.
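A short sketch of walking this array is given below. The function name read_section_headers and the dictionary keys are hypothetical; the 40-byte entry layout is that of IMAGE_SECTION_HEADER, and the table is assumed to start right after the optional header.

```python
import struct

SECTION_HEADER_SIZE = 40  # size of one IMAGE_SECTION_HEADER


def read_section_headers(data: bytes, e_lfanew: int):
    """Return one dictionary per section header in the section table."""
    # NumberOfSections and SizeOfOptionalHeader live in the FileHeader.
    (number_of_sections,) = struct.unpack_from("<H", data, e_lfanew + 6)
    (size_of_optional_header,) = struct.unpack_from("<H", data, e_lfanew + 20)
    table = e_lfanew + 4 + 20 + size_of_optional_header
    sections = []
    for i in range(number_of_sections):
        # Name, VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData,
        # PointerToRelocations, PointerToLinenumbers, NumberOfRelocations,
        # NumberOfLinenumbers, Characteristics
        (name, virtual_size, virtual_address, raw_size, raw_pointer,
         _reloc_ptr, _line_ptr, _reloc_count, _line_count,
         characteristics) = struct.unpack_from(
            "<8sIIIIIIHHI", data, table + SECTION_HEADER_SIZE * i)
        sections.append({
            "Name": name.rstrip(b"\x00").decode("ascii", errors="replace"),
            "VirtualSize": virtual_size,
            "VirtualAddress": virtual_address,
            "SizeOfRawData": raw_size,
            "PointerToRawData": raw_pointer,
            "Characteristics": characteristics,
        })
    return sections
```

VirtualAddress tells you where the section lands in memory, and PointerToRawData tells you where its bytes sit in the file. Together they are what you need to turn an RVA into a file offset.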
How do the PE loader and PE parsers check the validity of a PE file?
You can easily write simple code to check the validity of a PE file.
You can expand the code below to write a simple PE parser.
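A minimal sketch of such a check in Python could look like this. The function name is_valid_pe is hypothetical, and the check covers only the two signatures described in this post: MZ at the start of the file and PE\0\0 at the offset stored in e_lfanew. A real loader performs many more checks.

```python
import struct


def is_valid_pe(data: bytes) -> bool:
    """Check the MZ magic, then follow e_lfanew and check the PE signature."""
    if len(data) < 64 or data[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if e_lfanew + 4 > len(data):
        return False  # e_lfanew points outside the file
    return data[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"
```

To try it on a real file, read the file in binary mode, for example with open(path, "rb").read(), and pass the bytes to is_valid_pe.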