
Why would one ever use the "in" parameter modifier in C#?

Tags:

c#

c#-7.2

So, I (think I) understand what the in parameter modifier does. But what it does appears to be quite redundant.

Usually, I'd think that the only reason to use a ref would be to modify the calling variable, which is explicitly forbidden by in. So passing by in reference seems logically equivalent to passing by value.

Is there some sort of performance advantage? It was my belief that on the back-end side of things, a ref parameter must at least copy the physical address of the variable, which should be the same size as any typical object reference.

So, then is the advantage just in larger structs, or is there some behind-the-scenes compiler optimization that makes it attractive elsewhere? If the latter, why shouldn't I make every parameter an in?

asked Oct 15 '18 15:10 by Travis Reed




2 Answers

in was introduced to the C# language in version 7.2.

in is actually a ref readonly. Generally speaking, there is only one use case where in can be helpful: high performance apps dealing with lots of large readonly structs.

Assuming you have:

    readonly struct VeryLarge
    {
        public readonly long Value1;
        public readonly long Value2;

        // Placeholder body so the sample compiles.
        public long Compute() => Value1 + Value2;
        // etc.
    }

and

    void Process(in VeryLarge value) { }

In that case, the VeryLarge struct is passed by reference, and no defensive copies are created when the struct is used inside the Process method (e.g. when calling value.Compute()). The compiler enforces the struct's immutability.
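Here is a minimal compilable sketch of that scenario. The constructor, the body of Compute, and the InParameterSketch class are assumptions added for illustration; only the struct's fields and the Process signature come from the snippets above.

```csharp
using System;

// Same shape as the VeryLarge struct above; the constructor and the
// body of Compute are illustrative assumptions.
public readonly struct VeryLarge
{
    public readonly long Value1;
    public readonly long Value2;

    public VeryLarge(long value1, long value2)
    {
        Value1 = value1;
        Value2 = value2;
    }

    public long Compute() => Value1 + Value2;
}

public static class InParameterSketch
{
    // 'value' is a read-only reference to the caller's variable.
    public static long Process(in VeryLarge value)
    {
        // value = default;       // would not compile: 'value' is read-only
        return value.Compute();   // no defensive copy, because the struct is readonly
    }

    public static long Run()
    {
        var large = new VeryLarge(40, 2);
        return Process(in large); // the 'in' keyword at the call site is optional
    }
}
```

Calling InParameterSketch.Run() returns 42 here. With only two long fields this struct is not actually large; any benefit would only show up with much bigger structs.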

Note that passing a non-readonly struct with the in modifier causes the compiler to create a defensive copy each time the Process method above calls one of the struct's methods or accesses one of its properties, which will negatively affect performance!
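The defensive copy can be made visible with a small sketch. The Counter struct and the DefensiveCopySketch class are hypothetical names invented for this illustration. Because Counter is not readonly and Increment mutates it, each call through the in parameter runs on a hidden copy, so the caller's variable never changes.

```csharp
using System;

// Hypothetical mutable struct (deliberately NOT readonly).
public struct Counter
{
    public int Count;

    // Mutates 'this', so the compiler cannot call it directly
    // on a read-only reference.
    public int Increment() => ++Count;
}

public static class DefensiveCopySketch
{
    public static int IncrementTwice(in Counter counter)
    {
        counter.Increment();          // runs on a hidden copy of 'counter'
        return counter.Increment();   // runs on a second, fresh copy
    }

    public static (int returned, int countAfter) Run()
    {
        var counter = new Counter { Count = 5 };
        int returned = IncrementTwice(in counter);
        return (returned, counter.Count);
    }
}
```

Run() yields (6, 5): both calls started from the original value 5, and the caller's counter is untouched. Each of those calls paid for a full copy of the struct, which is the cost the note above warns about.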

There is a really good MSDN blog entry which I recommend reading carefully.

If you would like more historical background on how in was introduced, you could read this discussion in the C# language's GitHub repository.

In general, most developers agree that introducing in could be seen as a mistake. It's a rather exotic language feature and can only be useful in high-perf edge cases.

answered Oct 06 '22 22:10 by dymanoid


passing by in reference seems logically equivalent to passing by value.

Correct.

Is there some sort of performance advantage?

Yes.

It was my belief that on the back-end side of things, a ref parameter must at least copy the physical address of the variable, which should be the same size as any typical object reference.

There is not a requirement that a reference to an object and a reference to a variable both be the same size, and there is not a requirement that either is the size of a machine word, but yes, in practice both are 32 bits on 32-bit machines and 64 bits on 64-bit machines.

What you think the "physical address" has to do with it is unclear to me. On Windows, user-mode code uses virtual addresses, not physical addresses. I am curious to know under what possible circumstances you would imagine that a physical address is meaningful in a C# program.

There is also not a requirement that a reference of any kind be implemented as the virtual address of the storage. References could be opaque handles into GC tables in a conforming implementation of the CLI specification.

is the advantage just in larger structs?

Decreasing the cost of passing larger structs is the motivating scenario for the feature.

Note that there is no guarantee that in makes any program actually faster, and it can make programs slower. All questions about performance must be answered by empirical research. There are very few optimizations that are always wins; this is not an "always win" optimization.
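A rough way to start measuring is sketched below. The Big struct and the InBenchmarkSketch class are hypothetical, and a Stopwatch loop like this is only a crude first look: the jitter may inline both methods and erase any difference, so a serious measurement would use a proper benchmarking harness and a release build.

```csharp
using System;
using System.Diagnostics;

// Hypothetical 64-byte readonly struct used only for this measurement sketch.
public readonly struct Big
{
    public readonly long A, B, C, D, E, F, G, H;

    public Big(long seed)
    {
        A = seed;     B = seed + 1; C = seed + 2; D = seed + 3;
        E = seed + 4; F = seed + 5; G = seed + 6; H = seed + 7;
    }
}

public static class InBenchmarkSketch
{
    public static long SumByValue(Big b) => b.A + b.H;   // copies the struct
    public static long SumByIn(in Big b) => b.A + b.H;   // passes a reference

    public static (long byValueTicks, long byInTicks, bool sameResult) Measure(int iterations)
    {
        var big = new Big(1);
        long totalByValue = 0, totalByIn = 0;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) totalByValue += SumByValue(big);
        sw.Stop();
        long byValueTicks = sw.ElapsedTicks;

        sw.Restart();
        for (int i = 0; i < iterations; i++) totalByIn += SumByIn(in big);
        sw.Stop();
        long byInTicks = sw.ElapsedTicks;

        // Both variants must compute the same totals; only the timing may differ.
        return (byValueTicks, byInTicks, totalByValue == totalByIn);
    }
}
```

Compare byValueTicks and byInTicks over several runs with a large iteration count on your own hardware; the numbers are not predictable in advance, which is the point.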

is there some behind-the-scenes compiler optimization that makes it attractive elsewhere?

The compiler and runtime are permitted to make any optimization they choose if doing so does not violate the rules of the C# specification. To my knowledge there is no such optimization yet for in parameters, but that does not preclude such optimizations in the future.

why shouldn't I make every parameter an in?

Well, suppose you changed an int parameter into an in int parameter. What costs are imposed?

  • the call site now requires a variable rather than a value
  • the variable cannot be enregistered. The jitter's carefully-tuned register allocation scheme just got a wrench thrown into it.
  • the code at the call site is larger because it must take a ref to the variable and put that on the stack, whereas before it could simply push the value onto the call stack
  • larger code means that some short jump instructions may have now become long jump instructions, so again, the code is now larger. This has knock-on effects on all kinds of things. Caches get filled up sooner, the jitter has more work to do, the jitter may choose to not do certain optimizations on larger code sizes, and so on.
  • at the callee site, we've turned access to a value on the stack (or register) into an indirection through a pointer. Now, that pointer is highly likely to be in the cache, but still, we've now turned a one-instruction access to the value into a two-instruction access.
  • And so on.
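The call-site differences in the list above look like this in source. CallSiteSketch and its methods are hypothetical names; the comments restate the costs described above rather than anything this code measures.

```csharp
using System;

public static class CallSiteSketch
{
    public static int ByValue(int x) => x + 1;

    // The callee reads x through a reference instead of directly.
    public static int ByIn(in int x) => x + 1;

    public static int Run()
    {
        int local = 41;

        int a = ByValue(local);   // the value itself is passed
        int b = ByIn(in local);   // a reference to 'local' is passed,
                                  // so 'local' needs an addressable home
        int c = ByIn(41);         // no variable here: the compiler creates a
                                  // hidden temporary and passes a reference to it

        return a + b + c;
    }
}
```

Run() returns 126 either way; the results are identical and only the generated code differs, which is why the cost is easy to overlook.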

Suppose it's a double and you change it to an in double. Again, now the variable cannot be enregistered into a high-performance floating point register. This not only has performance implications, it can also change program behaviour! C# is permitted to do float arithmetic in higher-than-64-bit precision and typically does so only if the floats can be enregistered.

This is not a free optimization. You have to measure its performance against the alternatives. Your best bet is to simply not make large structs in the first place, as the design guidelines suggest.

answered Oct 06 '22 22:10 by Eric Lippert