
Why don't small changes in code affect exe file size?

I'm curious: sometimes I make changes in my code, recompile, then copy my exe or dll file over the old version, and Windows tells me that the date of the file changed but the size stayed exactly the same. Why is that?

As an example, I tested with the following console application:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication4
{
    class Program
    {
        static void Main(string[] args)
        {
            int a = 1;
            int b = 2;
            Console.WriteLine(a + b);
        }
    }
}

This produced an exe file of 5120 bytes (Visual Studio 2012, Debug build). Then, I changed the code to this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication4
{
    class Program
    {
        static void Main(string[] args)
        {
            int a = 1;
            int b = 2;
            int c = 3;
            Console.WriteLine(a + b + c);
        }
    }
}

The size of the exe is exactly the same.

I looked at the disassembly, which shows a difference in the IL code, so the change was not simply optimized away:

First version:

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       15 (0xf)
  .maxstack  2
  .locals init (int32 V_0,
           int32 V_1)
  IL_0000:  nop
  IL_0001:  ldc.i4.1
  IL_0002:  stloc.0
  IL_0003:  ldc.i4.2
  IL_0004:  stloc.1
  IL_0005:  ldloc.0
  IL_0006:  ldloc.1
  IL_0007:  add
  IL_0008:  call       void [mscorlib]System.Console::WriteLine(int32)
  IL_000d:  nop
  IL_000e:  ret
} // end of method Program::Main

Second version:

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       19 (0x13)
  .maxstack  2
  .locals init ([0] int32 a,
           [1] int32 b,
           [2] int32 c)
  IL_0000:  nop
  IL_0001:  ldc.i4.1
  IL_0002:  stloc.0
  IL_0003:  ldc.i4.2
  IL_0004:  stloc.1
  IL_0005:  ldc.i4.3
  IL_0006:  stloc.2
  IL_0007:  ldloc.0
  IL_0008:  ldloc.1
  IL_0009:  add
  IL_000a:  ldloc.2
  IL_000b:  add
  IL_000c:  call       void [mscorlib]System.Console::WriteLine(int32)
  IL_0011:  nop
  IL_0012:  ret
} // end of method Program::Main

If the code is physically bigger, how can the files be exactly the same size? Is this just some random chance? It happens to me a lot (when making small changes to the code)...

vesan asked Jan 21 '15 04:01



2 Answers

From https://msdn.microsoft.com/en-us/library/ms809762.aspx:

DWORD FileAlignment

In the PE file, the raw data that comprises each section is guaranteed to start at a multiple of this value. The default value is 0x200 bytes, probably to ensure that sections always start at the beginning of a disk sector (which are also 0x200 bytes in length). This field is equivalent to the segment/resource alignment size in NE files. Unlike NE files, PE files typically don't have hundreds of sections, so the space wasted by aligning the file sections is almost always very small.

EDIT: Also, all section sizes on disk are rounded up (padded) to a multiple of FileAlignment. From http://www.openwatcom.org/ftp/devel/docs/pecoff.pdf:

SizeOfRawData

Size of the section (object file) or size of the initialized data on disk (image files). For executable image, this must be a multiple of FileAlignment from the optional header. If this is less than VirtualSize the remainder of the section is zero filled. Because this field is rounded while the VirtualSize field is not it is possible for this to be greater than VirtualSize as well. When a section contains only uninitialized data, this field should be 0.
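To make the rounding concrete, here is a minimal Python sketch. Python is used only for illustration; the helper name align_up and the payload sizes are hypothetical and not measured from the exe in the question. It assumes the default FileAlignment of 0x200 quoted above. The 4 extra bytes of IL in the question (code size 15 versus 19) change the file size only if they push a section's data past a 0x200 boundary. Note also that the 5120 bytes reported in the question is exactly 10 * 0x200.

```python
def align_up(size: int, alignment: int) -> int:
    """Round size up to the next multiple of alignment (alignment > 0)."""
    return (size + alignment - 1) // alignment * alignment


FILE_ALIGNMENT = 0x200  # default value quoted above (512 bytes)

# Hypothetical raw payload of one section before and after a small edit
# that adds 4 bytes of IL, as in the question (15 -> 19 bytes of code).
before = 1000
after = before + 4

print(align_up(before, FILE_ALIGNMENT))  # 1024
print(align_up(after, FILE_ALIGNMENT))   # 1024, same size on disk
print(align_up(1025, FILE_ALIGNMENT))    # 1536, one byte over the boundary
```

The padded size only jumps when the payload crosses a multiple of the alignment, which is why most small edits leave the file size unchanged.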

I presume the last section is padded as well, so that neither the linker code that emits sections nor the loader code that reads them needs a special case for the last section's size. Trimming the last section would be a fairly pointless optimization anyway: disk sectors, and the file system's larger clusters on top of them, have internal fragmentation that would eat up any such "saving" most of the time.

Fizz answered Oct 14 '22 12:10


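Executable files contain a number of sections. Each of these sections is aligned in the file to, if I remember correctly, 512 bytes, so a small change in the size of the code is absorbed by the padding unless it crosses a 512-byte boundary.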
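Rather than relying on memory, the alignment of a particular file can be read from its headers. The following is a rough Python sketch, not a full PE parser: the function name read_pe_layout is hypothetical, and it assumes a well-formed PE image with the standard layout (DOS header with the PE offset at 0x3C, 20-byte COFF header, FileAlignment at offset 36 of the optional header, 40-byte section headers). It does no bounds checking beyond the two signatures.

```python
import struct


def read_pe_layout(data: bytes):
    """Return (file_alignment, [(name, virtual_size, raw_size, raw_offset), ...])."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: offset of the PE signature, stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")

    # The 20-byte COFF file header follows the 4-byte signature.
    coff = pe_offset + 4
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)

    # FileAlignment is at offset 36 of the optional header (PE32 and PE32+).
    opt = coff + 20
    (file_alignment,) = struct.unpack_from("<I", data, opt + 36)

    # The section table starts right after the optional header.
    table = opt + opt_size
    sections = []
    for i in range(num_sections):
        entry = table + 40 * i  # each section header is 40 bytes
        name = data[entry:entry + 8].rstrip(b"\0").decode("ascii", "replace")
        # VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData
        virtual_size, _rva, raw_size, raw_offset = struct.unpack_from(
            "<4I", data, entry + 8
        )
        sections.append((name, virtual_size, raw_size, raw_offset))
    return file_alignment, sections
```

Passing the bytes of both builds of the exe to read_pe_layout should show each section's raw_size as a multiple of file_alignment, with only virtual_size differing between the two versions if the change stayed within the padding.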

Chris Jester-Young answered Oct 14 '22 13:10