PE stands for Portable Executable, the native file format of Win32. The format is the same across all Win32 platforms. All Win32 executables use it, including Control Panel applets (.CPL), 32-bit DLLs, COM files, .NET executables and NT kernel-mode drivers. Note that VxDs (virtual device drivers) and 16-bit DLLs do not use the PE file format.
PE file format:
The general layout of a PE file is shown in the figure. A PE file mainly consists of the DOS header, the PE header and the sections. Let us look at each part.
DOS MZ header: A PE file starts with a 64-byte DOS header. Its structure (IMAGE_DOS_HEADER) is defined in winnt.h or windows.inc and has 19 members, of which two, e_magic and e_lfanew, matter most.
e_magic: holds the bytes 4Dh, 5Ah, which mark a valid DOS header ("MZ" stands for Mark Zbikowski, one of the MS-DOS designers).
e_lfanew: holds the file offset of the PE header, which begins with the PE signature.
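As a minimal sketch of how these two fields can be read without any third-party module, the snippet below uses Python's standard struct module. The function name read_dos_header and the hand-built 64-byte buffer are illustrative assumptions, not part of any library or of a real file.

```python
import struct

def read_dos_header(data):
    """Return (e_magic, e_lfanew) from the first 64 bytes of a file."""
    if len(data) < 64:
        raise ValueError("too short for an IMAGE_DOS_HEADER")
    # e_magic is a 2-byte WORD at offset 0x00; e_lfanew is a 4-byte LONG at 0x3C.
    e_magic = struct.unpack_from("<H", data, 0x00)[0]
    e_lfanew = struct.unpack_from("<l", data, 0x3C)[0]
    if e_magic != 0x5A4D:  # bytes 4Dh 5Ah ("MZ") read as a little-endian WORD
        raise ValueError("missing MZ signature")
    return e_magic, e_lfanew

# Synthetic header for illustration: "MZ", zero padding, then e_lfanew = 0xE8.
fake_header = b"MZ" + b"\x00" * 58 + struct.pack("<l", 0xE8)
magic, lfanew = read_dos_header(fake_header)
print(hex(magic), hex(lfanew))
```

For a real file, the same function could be fed the result of open(path, "rb").read(64).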
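DOS Stub: A small, valid DOS program that follows the DOS header. When the file is run on an operating system that does not recognize the PE format (such as MS-DOS), the stub is executed instead; it typically just prints a message saying that the program cannot be run in DOS mode.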
PE Header: Its structure is IMAGE_NT_HEADERS, which contains all the fields the PE loader needs. It is defined in windows.inc and has 3 main parts:
Signature: a 32-bit value (DWORD) containing the bytes 50h, 45h, 00h, 00h ("PE" followed by two terminating zeroes).
FileHeader: the next 20 bytes of the PE file (IMAGE_FILE_HEADER). It holds information about the physical layout and properties of the file, e.g. the number of sections.
OptionalHeader: the next 224 bytes in a 32-bit (PE32) file, or 240 bytes in a 64-bit (PE32+) file (IMAGE_OPTIONAL_HEADER). It describes the logical layout inside the PE file, e.g. AddressOfEntryPoint, ImageBase, SectionAlignment, FileAlignment, SizeOfImage, SizeOfHeaders and DataDirectory. DataDirectory is an array of 16 IMAGE_DATA_DIRECTORY structures occupying the last 128 bytes; each entry gives the location and size of a table such as the import table, the export table or the import address table (IAT).
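The three parts can be walked with plain struct calls. The sketch below is illustrative only: parse_nt_headers is a hypothetical helper name, and the buffer is synthetic (made-up values for a PE32 file with a 224-byte optional header). It also shows that the section table starts right after the optional header, at e_lfanew + 4 + 20 + SizeOfOptionalHeader.

```python
import struct

FILE_HEADER_FORMAT = "<HHIIIHH"  # the 7 IMAGE_FILE_HEADER fields, 20 bytes

def parse_nt_headers(data, e_lfanew):
    """Parse the signature and IMAGE_FILE_HEADER found at e_lfanew."""
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    fields = struct.unpack_from(FILE_HEADER_FORMAT, data, e_lfanew + 4)
    machine, n_sections, _stamp, _symtab, _nsyms, opt_size, characteristics = fields
    # The optional header starts right after the 20-byte file header.
    opt_offset = e_lfanew + 4 + struct.calcsize(FILE_HEADER_FORMAT)
    opt_magic = struct.unpack_from("<H", data, opt_offset)[0]
    return {
        "Machine": machine,
        "NumberOfSections": n_sections,
        "SizeOfOptionalHeader": opt_size,
        "Characteristics": characteristics,
        "is_pe32_plus": opt_magic == 0x20B,  # 0x10B is PE32, 0x20B is PE32+
        "section_table_offset": opt_offset + opt_size,
    }

# Synthetic buffer for illustration only (values are made up for the demo).
e_lfanew = 0x80
blob = (b"\x00" * e_lfanew + b"PE\x00\x00"
        + struct.pack(FILE_HEADER_FORMAT, 0x14C, 3, 0, 0, 0, 224, 0x102)
        + struct.pack("<H", 0x10B) + b"\x00" * 222)
info = parse_nt_headers(blob, e_lfanew)
print(info["NumberOfSections"], hex(info["section_table_offset"]))
```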
SectionTable: An array of IMAGE_SECTION_HEADER structures, one per section. The number of sections varies from file to file, and each section has its own attributes. Code or data with common attributes is placed in the same section. Some common sections are:
Code section: .text contains the executable instructions
Data sections: .bss -> uninitialized variables; .rdata -> read-only variables; .data -> initialized variables
Export data: .edata -> names and addresses of exported functions, as well as the Export Directory
Import data: .idata -> names of imported functions and of the DLLs that provide them, as well as the Import Descriptors and the IAT.
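Each section table entry is a fixed 40-byte record, so it can be decoded with one struct format. The following is a sketch under stated assumptions: parse_section_table is a hypothetical helper, the two entries are synthetic, and the flag names are the standard IMAGE_SCN_* constants with the prefix dropped for brevity.

```python
import struct

SECTION_FORMAT = "<8sIIIIIIHHI"  # IMAGE_SECTION_HEADER, 40 bytes per entry

# A few IMAGE_SCN_* characteristics bits (prefix omitted).
SECTION_FLAGS = {
    0x00000020: "CNT_CODE",
    0x00000040: "CNT_INITIALIZED_DATA",
    0x00000080: "CNT_UNINITIALIZED_DATA",
    0x20000000: "MEM_EXECUTE",
    0x40000000: "MEM_READ",
    0x80000000: "MEM_WRITE",
}

def parse_section_table(data, offset, count):
    """Return a list of (name, virtual_address, raw_pointer, flag_names)."""
    sections = []
    for i in range(count):
        entry = struct.unpack_from(SECTION_FORMAT, data, offset + i * 40)
        name = entry[0].rstrip(b"\x00").decode("ascii", "replace")
        virtual_address, raw_pointer, characteristics = entry[2], entry[4], entry[9]
        flags = [label for bit, label in sorted(SECTION_FLAGS.items())
                 if characteristics & bit]
        sections.append((name, virtual_address, raw_pointer, flags))
    return sections

# Two synthetic entries for illustration (values chosen for the demo).
table = (struct.pack(SECTION_FORMAT, b".text", 0xA770, 0x1000, 0xA800, 0x600,
                     0, 0, 0, 0, 0x60000020)
         + struct.pack(SECTION_FORMAT, b".data", 0x200, 0xC000, 0x200, 0xAE00,
                       0, 0, 0, 0, 0xC0000040))
for row in parse_section_table(table, 0, 2):
    print(row)
```

A code section normally carries CNT_CODE together with MEM_EXECUTE and MEM_READ, while a writable data section carries CNT_INITIALIZED_DATA with MEM_READ and MEM_WRITE.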
When a PE file is loaded into memory, the following steps occur:
- When we run a PE file, the PE loader first examines the DOS MZ header for the offset of the PE header. If the offset is found, the loader skips to the PE header; a DOS system, which knows nothing about PE, runs the DOS stub instead.
- The PE loader then checks the validity of the PE header. If it is valid, the loader goes to the end of the PE header, where the section table begins.
- The PE loader reads the information about the sections from the section table and maps those sections into memory using file mapping. It also gives each section the attributes specified in the section table.
- After the PE file is mapped into memory, the PE loader deals with the logical parts of the PE file, such as the import table.
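Because sections sit at one position in the file and another in memory, addresses stored in the headers (RVAs, relative to the image base) have to be translated before they can be used as file offsets. The sketch below illustrates that translation; rva_to_offset is a hypothetical helper, and the single section tuple is an assumed example, not read from a real file.

```python
def rva_to_offset(rva, sections):
    """Translate an RVA into a file offset.

    `sections` holds tuples of
    (VirtualAddress, VirtualSize, PointerToRawData, SizeOfRawData).
    Returns None when the RVA is not backed by bytes in the file.
    """
    for virt_addr, virt_size, raw_ptr, raw_size in sections:
        span = max(virt_size, raw_size)
        if virt_addr <= rva < virt_addr + span:
            delta = rva - virt_addr
            if delta >= raw_size:
                return None  # zero-filled tail that exists only in memory
            return raw_ptr + delta
    return None

# Assumed example: one code section mapped at RVA 0x1000, stored at file offset 0x600.
sections = [(0x1000, 0xA770, 0x600, 0xA800)]
print(hex(rva_to_offset(0x3570, sections)))
```

With these numbers an entry point at RVA 0x3570 corresponds to file offset 0x3570 - 0x1000 + 0x600.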
[Do Some Practical]:
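PE Headers: Basic PE information can be retrieved by reading the various header attributes.
Using the pefile module in Python, we can display header information like this:
>>> import pefile
>>> pe = pefile.PE("sample.exe")  # placeholder path (this step is missing in the original); use your own file
>>> print(pe.DOS_HEADER)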
0x0 0x0 e_magic: 0x5A4D
0x2 0x2 e_cblp: 0x90
0x4 0x4 e_cp: 0x3
0x6 0x6 e_crlc: 0x0
0x8 0x8 e_cparhdr: 0x4
0xA 0xA e_minalloc: 0x0
0xC 0xC e_maxalloc: 0xFFFF
0xE 0xE e_ss: 0x0
0x10 0x10 e_sp: 0xB8
0x12 0x12 e_csum: 0x0
0x14 0x14 e_ip: 0x0
0x16 0x16 e_cs: 0x0
0x18 0x18 e_lfarlc: 0x40
0x1A 0x1A e_ovno: 0x0
0x1C 0x1C e_res:
0x24 0x24 e_oemid: 0x0
0x26 0x26 e_oeminfo: 0x0
0x28 0x28 e_res2:
0x3C 0x3C e_lfanew: 0xE8
Similarly, we can display NT_HEADERS, OPTIONAL_HEADER and FILE_HEADER:
>>> print(pe.NT_HEADERS)
>>> print(pe.OPTIONAL_HEADER)
>>> print(pe.FILE_HEADER)
0xEC 0x0 Machine: 0x8664
0xEE 0x2 NumberOfSections: 0x6
0xF0 0x4 TimeDateStamp: 0x4A5BC9B3 [Mon Jul 13 23:56:35 2009 UTC]
0xF4 0x8 PointerToSymbolTable: 0x0
0xF8 0xC NumberOfSymbols: 0x0
0xFC 0x10 SizeOfOptionalHeader: 0xF0
0xFE 0x12 Characteristics: 0x22
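The raw numbers in this dump become easier to read once decoded. The following sketch interprets Machine, TimeDateStamp and Characteristics using only the standard library; describe_file_header is a hypothetical helper, and the lookup tables list just a few of the documented constants (IMAGE_FILE_ prefix omitted).

```python
from datetime import datetime, timezone

MACHINE_NAMES = {0x014C: "x86 (I386)", 0x8664: "x64 (AMD64)"}

# A few IMAGE_FILE_* characteristics bits (prefix omitted).
FILE_FLAGS = {
    0x0001: "RELOCS_STRIPPED",
    0x0002: "EXECUTABLE_IMAGE",
    0x0020: "LARGE_ADDRESS_AWARE",
    0x0100: "32BIT_MACHINE",
    0x2000: "DLL",
}

def describe_file_header(machine, timestamp, characteristics):
    """Turn three raw IMAGE_FILE_HEADER values into readable text."""
    # TimeDateStamp counts seconds since 1970-01-01 00:00:00 UTC.
    built = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    flags = [name for bit, name in sorted(FILE_FLAGS.items())
             if characteristics & bit]
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "built": built.strftime("%Y-%m-%d %H:%M:%S"),
        "flags": flags,
    }

# Values taken from the FILE_HEADER dump above.
summary = describe_file_header(0x8664, 0x4A5BC9B3, 0x22)
print(summary)
```

Here the machine type 0x8664 and a SizeOfOptionalHeader of 0xF0 (240 bytes) both indicate a 64-bit (PE32+) image.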
We can also read specific fields:
>>> print("e_magic value = %s" % hex(pe.DOS_HEADER.e_magic))
e_magic value = 0x5a4d
>>> print("e_lfanew value = %s" % hex(pe.DOS_HEADER.e_lfanew))
e_lfanew value = 0xe8
>>> print("Address of Entry Point = %s" % hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint))
Address of Entry Point = 0x3570
>>> print("Numbers of Sections = %s" % hex(pe.FILE_HEADER.NumberOfSections))
Numbers of Sections = 0x6
>>> print("Number of Data_Directories= %d" % pe.OPTIONAL_HEADER.NumberOfRvaAndSizes)
Number of Data_Directories= 16
The same object gives access to the section headers and their attributes:
>>> for section in pe.sections:
...     print("\t" + section.Name.decode().rstrip("\x00"))
To display an individual section, index into the list:
>>> print(pe.sections[0])
0x1F0 0x0 Name: .text
0x1F8 0x8 Misc: 0xA770
0x1F8 0x8 Misc_PhysicalAddress: 0xA770
0x1F8 0x8 Misc_VirtualSize: 0xA770
0x1FC 0xC VirtualAddress: 0x1000
0x200 0x10 SizeOfRawData: 0xA800
0x204 0x14 PointerToRawData: 0x600
0x208 0x18 PointerToRelocations: 0x0
0x20C 0x1C PointerToLinenumbers: 0x0
0x210 0x20 NumberOfRelocations: 0x0
0x212 0x22 NumberOfLinenumbers: 0x0
0x214 0x24 Characteristics: 0x60000020
Similarly, we can retrieve the DATA_DIRECTORY entries:
>>> for d_dir in pe.OPTIONAL_HEADER.DATA_DIRECTORY:
...     print(d_dir)
We can dump all attributes with a single command:
>>> print(pe.dump_info())
Modify Address Of Entry Point:
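We can change the AddressOfEntryPoint field to the OEP (Original Entry Point). This helps when manually unpacking a packed file, where the entry point must be set back to the OEP.
>>> import pefile
>>> pe = pefile.PE("packed.exe")  # placeholder path; this step is missing in the original
>>> print("Address_Of_Entry_Point = %s" % hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint))
Address_Of_Entry_Point = 0x36201
>>> pe.OPTIONAL_HEADER.AddressOfEntryPoint = 0x1000  # placeholder RVA; use the OEP you found
>>> pe.write(filename="unpacked.exe")  # placeholder output name
>>> print("Address_Of_Entry_Point = %s" % hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint))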
Inject Code at Entry Point:
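We can inject code (possibly malicious) into the executable at the entry point, so that our code executes when the application runs. For the injection we use the set_bytes_at_offset() method, which returns True or False depending on whether the operation succeeded.
>>> import pefile
>>> pe = pefile.PE("target.exe")  # placeholder path; this step is missing in the original
>>> x = pe.OPTIONAL_HEADER.AddressOfEntryPoint
>>> print("Address %s" % hex(x))
>>> shellcode = (b"\xfc\xe8\x89\x00\x00\x00\x60\x89\xe5\x31\xd2\x64\x8b\x52\x30"
...              # ... the rest of the 195-byte payload is truncated in the source
...              )
>>> pe.set_bytes_at_offset(x, shellcode)
>>> print("write %d bytes at offset(Entry Point) %s" % (len(shellcode), hex(x)))
write 195 bytes at offset(Entry Point) 0x36201
set_bytes_at_offset() takes two parameters, offset and data. offset is the file offset at which the data is written, and data holds the bytes to copy there. Note that AddressOfEntryPoint is an RVA rather than a file offset; pefile's get_offset_from_rva() converts an RVA into the corresponding file offset.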
Differentiate between DLL and EXE:
There are many GUI tools for analyzing PE files, such as Resource Hacker, PE Explorer, PEBrowse, Dependency Walker, PEiD and PEview. We can tell an EXE from a DLL by looking at the Characteristics field of the FILE_HEADER: the IMAGE_FILE_DLL flag (0x2000) is set for a DLL. An easy way is to load the application into PE Explorer and check whether the Characteristics field marks it as a DLL or an EXE.
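The same check can be scripted. Below is a minimal sketch that tests the IMAGE_FILE_DLL bit (0x2000) directly from raw bytes using the standard library; is_dll and make_fake_pe are hypothetical helpers, and the headers they build are synthetic stand-ins rather than loadable files.

```python
import struct

IMAGE_FILE_DLL = 0x2000  # bit in IMAGE_FILE_HEADER.Characteristics

def is_dll(data):
    """Return True if the PE image in `data` has the IMAGE_FILE_DLL flag set."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    e_lfanew = struct.unpack_from("<l", data, 0x3C)[0]
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # Characteristics is the last WORD of the 20-byte file header,
    # i.e. 4 (signature) + 18 bytes past e_lfanew.
    characteristics = struct.unpack_from("<H", data, e_lfanew + 22)[0]
    return bool(characteristics & IMAGE_FILE_DLL)

def make_fake_pe(characteristics):
    """Build a minimal synthetic header (not a loadable file) for the demo."""
    dos = b"MZ" + b"\x00" * 58 + struct.pack("<l", 64)
    file_header = struct.pack("<HHIIIHH", 0x14C, 0, 0, 0, 0, 0, characteristics)
    return dos + b"PE\x00\x00" + file_header

print(is_dll(make_fake_pe(0x2102)))  # DLL bit set
print(is_dll(make_fake_pe(0x0102)))  # DLL bit clear
```

For a real file, pass in the bytes read with open(path, "rb").read(). When pefile is available, its is_dll() and is_exe() methods on the PE object perform an equivalent check.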