
Is there a best practice when a type should be boxed?

In C#, there are structs and classes. Structs are usually (though not always) stack-allocated, and classes are always heap-allocated. Class instances therefore put pressure on the GC and are considered "slower" than structs. Microsoft has a best-practice guide on when to use structs over classes. It says to consider a struct if:

  • It logically represents a single value, similar to primitive types (int, double, etc.).
  • It has an instance size under 16 bytes.
  • It is immutable.
  • It will not have to be boxed frequently.

In C#, using struct instances that are larger than 16 bytes is generally said to perform worse than garbage collected class instances (dynamically allocated).

When does a boxed instance (which is heap-allocated) perform better, in terms of speed, than a non-boxed equivalent instance (which is stack-allocated)? Is there any best practice about when we should dynamically allocate (on the heap) instead of sticking to the default stack allocation?

asked Aug 11 '17 12:08 by Noel Widmer



1 Answer

TL;DR: start with no boxing, then profile.


Stack Allocation vs Boxed Allocation

This is perhaps more clear cut:

  • Stick to the stack,
  • Unless the value is big enough that it would overflow the stack.
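
As a minimal sketch of that rule, assuming a hypothetical buffer size (BIG) and hypothetical function names that are not part of the original answer: a small value is simply returned on the stack, while a buffer too large for a typical stack is allocated on the heap. Going through vec! avoids building the array on the stack first, which Box::new([0u8; BIG]) may do in unoptimized builds.

```rust
/// Hypothetical size for illustration: 16 MiB, larger than the usual
/// 8 MiB main-thread stack on Linux and the 2 MiB default for spawned threads.
const BIG: usize = 16 * 1024 * 1024;

/// Small values stay on the stack: no allocation, no pointer to chase.
fn small_value() -> [u8; 16] {
    [0u8; 16]
}

/// A value this large would risk overflowing the stack, so it is boxed.
/// `vec![0u8; BIG]` allocates zeroed heap memory directly, and
/// `into_boxed_slice` hands it back as a `Box<[u8]>`.
fn big_buffer() -> Box<[u8]> {
    vec![0u8; BIG].into_boxed_slice()
}
```

Calling big_buffer() returns a zero-filled heap buffer of BIG bytes; the box itself is only a pointer and a length on the stack.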

While semantically writing fn foo() -> Bar implies moving Bar from the callee frame to the caller frame, in practice you are more likely to end up with the equivalent of a fn foo(__result: *mut Bar) signature, where the caller allocates space on its stack and passes a pointer to the callee.
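
The lowering described above can be approximated by hand. This is only an illustration, not what the compiler literally emits: Bar's fields, foo_lowered, and call_lowered are hypothetical names, and MaybeUninit stands in for the raw return slot.

```rust
use std::mem::MaybeUninit;

#[derive(Debug, PartialEq)]
struct Bar {
    data: [u64; 32],
}

/// What the source code says: return `Bar` by value.
fn foo() -> Bar {
    Bar { data: [7; 32] }
}

/// Hand-written approximation of the lowered form: the caller owns the
/// slot, and the callee writes the result into it through a reference.
fn foo_lowered(result: &mut MaybeUninit<Bar>) {
    result.write(Bar { data: [7; 32] });
}

/// The caller reserves space in its own frame, then passes it down.
fn call_lowered() -> Bar {
    let mut slot = MaybeUninit::<Bar>::uninit();
    foo_lowered(&mut slot);
    // SAFETY: `foo_lowered` always initializes the slot before returning.
    unsafe { slot.assume_init() }
}
```

Both foo() and call_lowered() produce the same value; the second form just makes the caller-provided return slot explicit.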

This may not always be sufficient to avoid copying, as some patterns may prevent writing directly in the return slot:

fn defeat_copy_elision() -> WithDrop {
    let one = side_effectful();
    if side_effectful_too() {
        one
    } else {
        side_effects_hurt()
    }
}

Here, there is no magic:

  • if the compiler uses the return slot for one, then when the condition evaluates to false it has to move one out of the slot, instantiate the new WithDrop in it, and finally destroy one,
  • if the compiler instantiates one elsewhere in the current stack frame and then has to return it, it has to perform a copy into the return slot.

If the type didn't need Drop, there would be no issue.
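
To make the ordering constraint concrete, here is a runnable variant of the snippet above. Everything beyond the function names is an assumption added for illustration: WithDrop carries a label, the condition is passed in as a flag, and a thread-local LOG records construction and destruction. It shows only the observable order of events, not whether the compiler actually copies.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical event log, used only to observe ordering.
    static LOG: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

fn log(event: &str) {
    LOG.with(|l| l.borrow_mut().push(event.to_string()));
}

struct WithDrop(&'static str);

impl Drop for WithDrop {
    fn drop(&mut self) {
        log(&format!("drop {}", self.0));
    }
}

fn side_effectful() -> WithDrop {
    log("make one");
    WithDrop("one")
}

fn side_effectful_too(flag: bool) -> bool {
    flag
}

fn side_effects_hurt() -> WithDrop {
    log("make other");
    WithDrop("other")
}

fn defeat_copy_elision(flag: bool) -> WithDrop {
    let one = side_effectful();
    if side_effectful_too(flag) {
        one
    } else {
        // `one` is still alive here and is dropped only after the
        // replacement value has been constructed.
        side_effects_hurt()
    }
}

/// Runs the function once and returns the recorded events.
fn events(flag: bool) -> Vec<String> {
    LOG.with(|l| l.borrow_mut().clear());
    drop(defeat_copy_elision(flag));
    LOG.with(|l| l.borrow().clone())
}
```

With the flag set to false, one must outlive the construction of the second value, so both cannot occupy the same return slot at the same time; that is the conflict the two bullets describe.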

Despite these oddball cases, I advise sticking to the stack if possible unless profiling reveals a place where it'd be beneficial to box.


Inline Member or Boxed Member

This case is much more complicated:

  • the size of the struct/enum is affected, thus CPU cache behavior is affected:

    • less frequently used big variants are a good candidate for boxing (or boxing parts of them),
    • less frequently accessed big members are a good candidate for boxing.
  • at the same time, there are costs for boxing:

    • a type with a boxed member cannot be Copy, and it implicitly needs Drop (which, as seen above, disables some optimizations),
    • allocating/freeing memory has unbounded latency [1],
    • accessing boxed memory introduces data-dependency: you cannot know which cache line to request before knowing the address.
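
The size effect in the first group of bullets can be checked with std::mem::size_of. The two enums below are hypothetical and exist only for this sketch; exact sizes depend on the target, so only relative comparisons are made.

```rust
use std::mem::size_of;

/// Every value is as large as the biggest variant, even when the
/// big variant is rare.
#[allow(dead_code)]
enum Inline {
    Small(u32),
    Big([u8; 1024]),
}

/// Boxing the rare big variant shrinks every value to roughly a pointer
/// plus a tag, at the cost of an allocation and an extra indirection
/// whenever `Big` is used.
#[allow(dead_code)]
enum Boxed {
    Small(u32),
    Big(Box<[u8; 1024]>),
}

/// Returns (size of Inline, size of Boxed) in bytes.
fn report_sizes() -> (usize, usize) {
    (size_of::<Inline>(), size_of::<Boxed>())
}
```

A Vec<Boxed> therefore packs many more elements per cache line than a Vec<Inline>, which helps code that mostly sees Small and hurts code that frequently dereferences Big. Whether that trade pays off is what profiling has to decide.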

As a result, this is a very fine balancing act. Boxing or unboxing a member may improve the performance of some parts of the codebase while decreasing the performance of others.

There is definitely no one-size-fits-all answer.

Thus, once again, I advise avoiding boxing until profiling reveals a place where it'd be beneficial to box.

[1] Consider that on Linux, any memory allocation for which there is no spare memory in the process may require a system call. If the OS has no spare memory either, that call may trigger the OOM killer to kill a process, whose memory is then salvaged and made available. A simple malloc(1) may easily require milliseconds.

answered Oct 06 '22 14:10 by Matthieu M.