
Are there hidden expenses hardcoding data as code in C#?

Tags:

c#

I wrote a tool that generates a C# source file containing the results of a processor-intensive computation (precomputing the result cuts my application's startup time by roughly 15 minutes). The result is a byte[] that looks roughly like this:

public static byte[] precomputedData = new byte[]{
    0x01,0x02,0x03,0x04,0x05,0x06,0x07,
    0x08,0x09,0x0A,0x0B,0x0C,0x0D,0x0E,
    0x0F,0x10, .... continue for 8000 more lines....

I'm wondering, though: are there hidden costs to hardcoding the result in code rather than writing it to a binary file? Specifically, I fear C# may be storing two copies of the data in RAM: one copy containing all the instructions/code to build precomputedData (for reflection purposes) and another copy containing the actual result of building precomputedData. Is this accurate? Or is the choice between hardcoding it and storing it in a binary file purely a matter of preference?

Mr. Smith asked Jan 19 '14 07:01


1 Answer

I wrote a small test program and looked at the IL code. For a static field with an initializer, the compiler generates a static constructor that initializes the field. For arrays, there is a really nice explanation in the blog post "How C# Array Initializers Work" by Bart De Smet.

.method private hidebysig specialname rtspecialname static 
void .cctor () cil managed 
{
    // Method begins at RVA 0x2b50
    // Code size 24 (0x18)
    .maxstack 8

    IL_0000: ldc.i4.s 10
    IL_0002: newarr [mscorlib]System.Int32
    IL_0007: dup
    IL_0008: ldtoken field valuetype
        '<PrivateImplementationDetails>{0BB3CD0B-D585-49BF-8408-2CCB3FA63A32}'/'__StaticArrayInitTypeSize=40'
        '<PrivateImplementationDetails>{0BB3CD0B-D585-49BF-8408-2CCB3FA63A32}'::'$$method0x6000013-1'
    IL_000d: call void
        [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class
        [mscorlib]System.Array, valuetype [mscorlib]System.RuntimeFieldHandle)
    IL_0012: stsfld int32[] Hgx.Test.Program::myarray
    IL_0017: ret
} // end of method Program::.cctor

The <PrivateImplementationDetails>...$$method field is defined as

.field assembly static valuetype 
    '<PrivateImplementationDetails>{0BB3CD0B-D585-49BF-8408-2CCB3FA63A32}'/'__StaticArrayInitTypeSize=40'
    '$$method0x6000013-1' at I_00002b28

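As far as I can tell, and according to the post I mentioned earlier, this means that the array is initialized directly from data stored in the PE image: the static constructor allocates the array and InitializeArray fills it from that data, so no per-element initialization code is generated.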
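For reference, a declaration along the following lines would compile to a static constructor like the one shown above. This is only a sketch: the element values and the class name ArrayInitSketch are made up for illustration, since the original test program is not reproduced here. Only the element type (int), the length (10) and the field name myarray are taken from the IL.

```csharp
public static class ArrayInitSketch
{
    // Hypothetical contents: the IL above only reveals that the array holds
    // ten Int32 values (40 bytes of initialization data), not what they are.
    // For an initializer like this, the compiler stores the raw element data
    // in a <PrivateImplementationDetails> field and emits a call to
    // RuntimeHelpers.InitializeArray in the static constructor.
    public static int[] myarray = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
}
```

Inspecting the compiled assembly with a tool such as ildasm or ILSpy should show the same newarr / ldtoken / InitializeArray pattern, although the generated names will differ.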


Whether or not this is a good practice is a different question.

The advantage of hardcoding it as you do is that the data is very easy to load, and it is hard to tamper with because it is embedded in the assembly, which can be signed. The disadvantage, of course, is that any change to the precomputed data means recompiling the project.

I would probably use a binary file to store the data unless I were really worried about someone changing the precomputed data with malicious intent. Loading is only slightly more complex than having it as a static field, and I would value the ability to change the data without recompiling.
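To give an idea of how little code the binary file approach needs, here is a minimal sketch. The class name PrecomputedDataStore and its method names are hypothetical; File.WriteAllBytes and File.ReadAllBytes are the standard System.IO calls.

```csharp
using System.IO;

public static class PrecomputedDataStore
{
    // Called by the generator tool: writes the computed bytes to a file
    // instead of emitting a C# source file.
    public static void Save(string path, byte[] data)
    {
        File.WriteAllBytes(path, data);
    }

    // Called by the application at startup: reads the whole file into a
    // newly allocated byte[].
    public static byte[] Load(string path)
    {
        return File.ReadAllBytes(path);
    }
}
```

The application would then do something like `byte[] precomputedData = PrecomputedDataStore.Load("precomputed.bin");` at startup, where the file name is just a placeholder. Unlike the hardcoded array, nothing here protects the file against modification.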

Dirk answered Oct 31 '22 10:10