
How is dumpbin able to read the export table when it appears at a file offset larger than the file itself?

I'm writing a little PE reader, so I run dumpbin alongside my test application to confirm that the values are being read correctly. Everything is working so far, except for the export table.

The file I'm testing with is a DLL. My application reads the file in as a byte array, which gets passed to my PE reader class. The values align with those output by dumpbin, including the RVA and size of the export data directory.

        E000 [     362] RVA [size] of Export Directory

The problem is, the byte array's size is only 42,496. As you can probably imagine, when my PE reader attempts to read at E000 (57,344), I get an IndexOutOfRangeException. dumpbin, however, has no such problem and reads the export directory just fine. And yes, the entire file is indeed being read into the byte array.

How is this possible?

David Brown asked Oct 15 '09 01:10



1 Answer

A PE file is divided into "sections", and each section has its own virtual address and its own offset in the file. The file as a whole is not a contiguous memory image; only each individual section is.

First you will have to read the section headers and build a map of their layout. Then you will be able to translate section-relative offsets into file offsets.
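Reading the section headers out of the byte array might look roughly like the sketch below. The field offsets follow the published PE/COFF layout; the `SectionInfo` struct and the names `rd16`, `rd32` and `read_sections` are hypothetical helpers made up for this illustration, not part of any library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: only the fields needed to translate RVAs. */
typedef struct {
    char     name[9];              /* 8-byte section name plus a NUL */
    uint32_t virtual_size;
    uint32_t virtual_address;      /* RVA of the section */
    uint32_t size_of_raw_data;
    uint32_t pointer_to_raw_data;  /* file offset of the section */
} SectionInfo;

/* PE fields are little-endian; read them byte by byte. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fills out[] with up to max section records.
   Returns the number stored, or -1 if the headers look malformed. */
int read_sections(const uint8_t *file, size_t len, SectionInfo *out, int max)
{
    if (len < 0x40 || file[0] != 'M' || file[1] != 'Z')
        return -1;

    uint32_t pe = rd32(file + 0x3C);            /* e_lfanew */
    if (pe > len || len - pe < 24)
        return -1;
    if (memcmp(file + pe, "PE\0\0", 4) != 0)
        return -1;

    uint16_t count    = rd16(file + pe + 6);    /* NumberOfSections */
    uint16_t opt_size = rd16(file + pe + 20);   /* SizeOfOptionalHeader */
    size_t   table    = (size_t)pe + 24 + opt_size;

    int n = 0;
    for (uint16_t i = 0; i < count && n < max; i++) {
        size_t off = table + (size_t)i * 40;    /* each header is 40 bytes */
        if (off > len || len - off < 40)
            return -1;
        const uint8_t *h = file + off;
        memcpy(out[n].name, h, 8);
        out[n].name[8] = '\0';
        out[n].virtual_size        = rd32(h + 8);
        out[n].virtual_address     = rd32(h + 12);
        out[n].size_of_raw_data    = rd32(h + 16);
        out[n].pointer_to_raw_data = rd32(h + 20);
        n++;
    }
    return n;
}
```

The bounds checks matter here for the same reason the question arose: every offset taken from the file has to be validated against the length of the buffer before it is used.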

As an aside, consider looking at OllyDbg, which is a freeware debugger and disassembler for Windows. It may help you test your own software, and might serve the very purpose you are trying to fill by "rolling your own."

Example from dumpbin /all output:

SECTION HEADER #1
   .text name
    BC14 virtual size
    1000 virtual address (00401000 to 0040CC13)
    BE00 size of raw data
     400 file pointer to raw data (00000400 to 0000C1FF)
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
60000020 flags
         Code
         Execute Read

In this case, my .text section begins at RVA 1000, and its raw data covers RVAs 1000 through CDFF (BE00 bytes). The file pointer to this section is 400, so I can translate any RVA in the range 1000-CDFF to a file pointer by subtracting C00 (that is, 1000 - 400). (All numeric values are hexadecimal.)

Whenever you encounter an "RVA" (Relative Virtual Address), you resolve it to a file offset (or an index into your byte array), using this method:

  1. Determine to which section the RVA belongs. Each section covers the RVAs from its virtual address up to, but not including, its virtual address plus its size. Sections do not overlap.
  2. Subtract the section virtual address from the RVA -- this gives you the offset relative to the section.
  3. Add the section's PointerToRawData to the offset you obtain in step (2). This is the file offset corresponding to the RVA.
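The three steps above can be sketched as a small lookup function. This is only an illustration: the `SectionRange` struct and the name `rva_to_offset` are hypothetical, and the sketch uses the size of raw data as the section's extent, so an RVA that lies inside the virtual size but past the raw data is reported as having no file offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record holding the three header fields the lookup needs. */
typedef struct {
    uint32_t virtual_address;      /* RVA where the section starts */
    uint32_t size_of_raw_data;     /* bytes of the section present in the file */
    uint32_t pointer_to_raw_data;  /* file offset of those bytes */
} SectionRange;

/* Returns 1 and stores the file offset in *offset when rva falls inside
   the raw data of one of the n sections; returns 0 otherwise. */
int rva_to_offset(const SectionRange *sections, size_t n,
                  uint32_t rva, uint32_t *offset)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t start = sections[i].virtual_address;

        /* Step 1: does the RVA belong to this section? */
        if (rva >= start && rva - start < sections[i].size_of_raw_data) {
            /* Step 2: offset relative to the start of the section. */
            uint32_t relative = rva - start;
            /* Step 3: add the section's PointerToRawData. */
            *offset = sections[i].pointer_to_raw_data + relative;
            return 1;
        }
    }
    return 0;
}
```

With the .text values from the dumpbin output above (virtual address 1000, size of raw data BE00, file pointer 400), RVA 1000 maps to file offset 400 and RVA 2345 maps to 1745. The returned offset should still be checked against the length of the byte array before indexing into it.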

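Another approach is to let the OS map the file as an image: create the file mapping with CreateFileMapping() using the SEC_IMAGE flag, then map a view of it with MapViewOfFileEx(). (Setting FILE_MAP_EXECUTE in the dwDesiredAccess argument only makes the view executable; by itself it still yields a flat view of the file.) When the file is mapped as an image, the OS parses the section headers from the PE file and places the contents of the sections at their locations relative to the "module base."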

The module base is the base address at which the PE header is loaded. When a DLL is loaded with one of the LoadLibrary() functions, the base can be obtained from the lpBaseOfDll member of the MODULEINFO structure filled in by GetModuleInformation().

When using MapViewOfFileEx(), the module base is simply the return value from MapViewOfFileEx().

When the module has been loaded in one of these ways, resolving an RVA to a normal pointer value is a matter of:

  1. Store the module base address in a char *
  2. Add the RVA to the char *
  3. Cast the char * to the actual datatype and dereference that.
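In C, those three steps might be sketched as follows. The function name `read_u32_at_rva` is hypothetical, and the sketch copies the value out with `memcpy` instead of dereferencing a cast pointer, a substitution made here to sidestep alignment and aliasing concerns. It applies only to a module the OS has mapped as an image, not to the raw file bytes.

```c
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit value located at module_base + rva.
   module_base is assumed to point at an image mapped by the OS. */
uint32_t read_u32_at_rva(const void *module_base, uint32_t rva)
{
    /* Step 1: hold the module base in a char pointer. */
    const char *base = (const char *)module_base;

    /* Step 2: add the RVA to it. */
    const char *p = base + rva;

    /* Step 3: interpret the bytes as the actual data type. */
    uint32_t value;
    memcpy(&value, p, sizeof value);
    return value;
}
```

For a structure such as the export directory, the same pointer arithmetic applies; only the type in the last step changes.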

A drawback of letting the OS map the file, as in these approaches, is that if you are using the tool to investigate a suspect file and are not sure whether its developer has taken strange liberties with the section headers, you may miss valuable information by letting the OS handle this part of the parsing.

Heath Hunnicutt answered Sep 27 '22 17:09
