
PDB file larger on the second compile and then stays the same size

Tags:

c#

.net

csc

Using the following simple file:

using System;

public class Program{
    [STAThread]
    public static void Main(string[] args){
        Console.WriteLine("Boo");
    }
}

And then using the following command:

csc /target:exe /debug:pdbonly HelloWorld.cs 

If the PDB does not already exist when this command runs, the resulting PDB file is 12KB. If the PDB file already exists, the new file is 14KB.

Microsoft (R) Visual C# Compiler version 4.0.30319.17929 .NET 4.5 
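
The behaviour described above can be reproduced with a small script. This is only a sketch: it assumes `csc` is on the PATH and that the output is named `HelloWorld.pdb`, and the helper names `pdb_sizes_over_builds` and `csc_build` are made up for illustration.

```python
import os
import subprocess


def pdb_sizes_over_builds(build, pdb_path, runs=3):
    """Delete any existing PDB, run build() `runs` times, and
    return the PDB size in bytes after each run."""
    if os.path.exists(pdb_path):
        os.remove(pdb_path)  # start from a clean state
    sizes = []
    for _ in range(runs):
        build()
        sizes.append(os.path.getsize(pdb_path))
    return sizes


def csc_build():
    """Run the command from the question (assumes csc is on the PATH)."""
    subprocess.run(
        ["csc", "/target:exe", "/debug:pdbonly", "HelloWorld.cs"],
        check=True,
    )


# Example call, to be run in the folder containing HelloWorld.cs:
#   print(pdb_sizes_over_builds(csc_build, "HelloWorld.pdb"))
```

If the sizes reported in the question hold, the first entry of the returned list should be smaller than the rest, and the later entries should be equal.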

Anyone have any ideas what would explain this?

UPDATE:

  1. I do not see this with .NET 3.5 or, according to the comments, with .NET 4.
  2. Using pdb2xml (http://blogs.msdn.com/b/jmstall/archive/2005/08/25/sample-pdb2xml.aspx), I cannot see any difference between the small and the larger one.
REA_ANDREW asked Mar 06 '13 15:03



1 Answer

My answer is simple, but may not be entirely accurate. Let's run a debugger tool on both PDB files:

[image: PDB]

The only difference is the PdbAge field. This means the PDB file is not recreated on each compilation. The existing file is modified instead, which is why its size changes.
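
The age can also be read straight from the file, without a debugger. The sketch below assumes the MSF 7.00 container layout as documented by the LLVM project (a 32-byte magic, six little-endian 32-bit header fields, a stream directory, and the PDB info stream at index 1 holding version, signature and age). The function name `read_pdb_age` is made up for illustration, and whether this is the same value the debugger tool labels PdbAge is an assumption.

```python
import struct

# 32-byte signature at the start of an MSF 7.00 file (assumed layout).
MSF7_MAGIC = b"Microsoft C/C++ MSF 7.00\r\n\x1aDS\x00\x00\x00"


def _read_blocks(data, block_ids, block_size, length):
    """Concatenate the given blocks and trim the result to `length` bytes."""
    raw = b"".join(data[b * block_size:(b + 1) * block_size] for b in block_ids)
    return raw[:length]


def read_pdb_age(data):
    """Return the Age field of the PDB info stream (stream index 1)."""
    if data[:32] != MSF7_MAGIC:
        raise ValueError("not an MSF 7.00 PDB file")
    # Header fields after the magic: block size, free block map block,
    # block count, directory size in bytes, unknown, block map address.
    block_size, _fpm, _blocks, dir_bytes, _unk, map_addr = struct.unpack_from(
        "<6I", data, 32)
    # The block map lists the blocks that hold the stream directory.
    # Simplification: the list is assumed to fit in a single block.
    dir_block_count = -(-dir_bytes // block_size)  # ceiling division
    dir_blocks = struct.unpack_from(
        "<%dI" % dir_block_count, data, map_addr * block_size)
    directory = _read_blocks(data, dir_blocks, block_size, dir_bytes)

    (num_streams,) = struct.unpack_from("<I", directory, 0)
    sizes = struct.unpack_from("<%dI" % num_streams, directory, 4)
    pos = 4 + 4 * num_streams  # block lists follow the size table
    for index, size in enumerate(sizes):
        if size == 0xFFFFFFFF:  # marker for a deleted stream
            size = 0
        count = -(-size // block_size)
        blocks = struct.unpack_from("<%dI" % count, directory, pos)
        pos += 4 * count
        if index == 1:
            stream = _read_blocks(data, blocks, block_size, size)
            _version, _signature, age = struct.unpack_from("<3I", stream, 0)
            return age
    raise ValueError("PDB info stream (index 1) not found")
```

Calling `read_pdb_age(open("HelloWorld.pdb", "rb").read())` after each compile should show whether the age changes between the first and second build.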

My guess is confirmed in this article. Quote:

One of the most important motivations for the change in format was to allow incremental linking of debug versions of programs, a change first introduced in Visual C++ version 2.0.

Another question is what exactly changes in this file. The most detailed explanation of the file format I have found is in the book "Undocumented Windows 2000 Secrets: A Programmer's Cookbook" by Sven B. Schreiber. The key passage is:

An even greater benefit of the PDB format becomes apparent when updating an existing PDB file. Inserting data into a file with a sequential structure usually means reshuffling large portions of the contents. The PDB file's random-access structure borrowed from file systems allows addition and deletion of data with minimal effort, just as files can be modified with ease on a file system media. Only the stream directory has to be reshuffled when a stream grows or shrinks across a page boundary. This important property facilitates incremental updating of PDB files.

He explains that not all of the data in the file is in use at any given moment. Some byte ranges are simply filled with zeros until the file is modified during the next compilation.

So I can't tell exactly what has changed in the PDB file apart from a GUID and the Age number. You can dig deeper by reading that book. Good luck!

UPDATE (15/03/2013):

I spent some more time comparing the files. Opening them in a hex editor, I see differences in the header: the page size is 512 bytes (value 200h at offset +20h), and the page count differs: 120 and 124 (078h and 07Ch respectively). In my screenshots the smaller file is on the left. The difference in file size is exactly 2048 bytes, which means the compiler adds 4 pages of data on the second compile. Then I went through all the other differences. The first three quarters of the file contain only small diffs, usually a few bytes each. But at offset 2600h we see:

[image: Diff]

Look! The string /LinkInfo./names./src/files/c:\Windows\microsoft.net\framework\v4.0.30319\helloworld.cs has been cut off and now contains inconsistent information.

Looking further on, I found this string in full in the second (bigger) file:

[image: Diff2]

This information has now been placed into free space (see the zeros on the left side). My guess is that the old pages (with the corrupted string) were marked as unused space.
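
To locate such a string without scrolling through a hex editor, something like the following could be used. It is a minimal sketch: the function name `find_offsets` is made up, and it assumes the path is stored as plain single-byte text, as the hex view suggests.

```python
def find_offsets(data, needle, page_size=0x200):
    """Return (offset, page_index) for every occurrence of `needle`
    in `data`. page_size defaults to the 512 bytes seen in the header."""
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append((pos, pos // page_size))
        pos = data.find(needle, pos + 1)
    return hits


# Example call (file name assumed):
#   data = open("HelloWorld.pdb", "rb").read()
#   print(find_offsets(data, b"helloworld.cs"))
```

Running it on both files and comparing the page indexes shows whether the full path moved to a different page in the bigger file.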

At the end of the file I found exactly 2048 bytes of new data, all zeros, starting at 2E00h (11776 in decimal) and ending at 35F8h (13816 in decimal). Recall that the size of the first file was exactly 11776 bytes.
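
The arithmetic and the all-zeros observation can be checked mechanically. This is a sketch under the assumption that both files have been read into memory as bytes. The name `describe_growth` is hypothetical, and the 512-byte page size is the value read from the header above.

```python
def describe_growth(small, big, page_size=0x200):
    """Summarise how `big` extends past the length of `small`.
    Both arguments are the raw contents of the two PDB files."""
    extra = len(big) - len(small)
    tail = big[len(small):]  # bytes that exist only in the bigger file
    return {
        "extra_bytes": extra,
        "extra_pages": extra // page_size,
        "tail_start": hex(len(small)),
        "tail_all_zero": extra > 0 and tail.count(0) == len(tail),
    }
```

For files of 11776 and 13824 bytes this gives 2048 extra bytes, which is 4 pages of 512 bytes, with the tail starting at 0x2e00.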

In conclusion: I think the bigger file doesn't contain any new information. But I still can't say why the compiler added 4 empty pages to the end of the program database file. I suspect only the compiler's developers know.

Anthony answered Oct 07 '22 18:10