I understand that this is an implementation detail. I'm actually curious what that implementation detail is in Microsoft's CLR.
Now, bear with me as I did not study CS in college, so I might have missed out on some fundamental principles.
But my understanding of the "stack" and the "heap" as implemented in the CLR as it stands today is, I think, solid. I'm not going to make some inaccurate umbrella statement such as "value types are stored on the stack," for example. But, in most common scenarios -- plain vanilla local variables, of value type, either passed as parameters or declared within the method and not contained inside a closure -- value type variables are stored on the stack (again, in Microsoft's CLR).
I guess what I'm unsure of is where ref value type parameters come in.
Originally what I was thinking was that, if the call stack looks like this (left = bottom):
A() -> B() -> C()
...then a local variable declared within the scope of A and passed as a ref parameter to B could still be stored on the stack--couldn't it? B would simply need the memory location where that local variable was stored within A's frame (forgive me if that isn't the right terminology; I think it's clear what I mean, anyway).
I realized this couldn't be strictly true, though, when it occurred to me that I could do this:
    delegate void RefAction<T>(ref T arg);

    void A()
    {
        int x = 100;
        RefAction<int> b = B;

        // This is a non-blocking call; A will return immediately
        // after this.
        b.BeginInvoke(ref x, C, null);
    }

    void B(ref int arg)
    {
        // Putting a sleep here to ensure that A has exited by the time
        // the next line gets executed.
        Thread.Sleep(1000);

        // Where is arg stored right now? The "x" variable
        // from the "A" method should be out of scope... but its value
        // must somehow be known here for this code to make any sense.
        arg += 1;
    }

    void C(IAsyncResult result)
    {
        var asyncResult = (AsyncResult)result;
        var action = (RefAction<int>)asyncResult.AsyncDelegate;
        int output = 0;

        // This variable originally came from A... but then
        // A returned, it got updated by B, and now it's still here.
        action.EndInvoke(ref output, result);

        // ...and this prints "101" as expected (?).
        Console.WriteLine(output);
    }
So in the example above, where is x (in A's scope) stored? And how does this work? Is it boxed? If not, is it subject to garbage collection now, despite being a value type? Or can the memory immediately be reclaimed?
I apologize for the long-winded question. But even if the answer is quite simple, maybe this will be informative to others who find themselves wondering the same thing in the future.
As for why async methods don't support out parameters (or ref parameters): that's a limitation of the CLR. We chose to implement async methods in a similar way to iterator methods, i.e. through the compiler transforming the method into a state-machine object.
ref is a C# keyword used to pass arguments by reference: any change made to the argument inside the method is reflected in the caller's variable when control returns to the calling method. A property cannot be passed as a ref argument.
out parameters are not allowed in asynchronous methods or in iterator methods. A method can have more than one out parameter. At the call site, an out argument can be declared inline.
A reference parameter is a reference to a memory location of a variable. When you pass parameters by reference, unlike value parameters, a new storage location is not created for these parameters. The reference parameters represent the same memory location as the actual parameters that are supplied to the method.
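As a minimal sketch of that aliasing in an ordinary synchronous call (the names RefAliasDemo, Increment and Run are made up for illustration), the callee below writes straight through to the caller's local:

```csharp
using System;

static class RefAliasDemo
{
    // arg is an alias for the caller's variable, not a copy of its value.
    static void Increment(ref int arg) => arg += 1;

    public static int Run()
    {
        int x = 100;
        Increment(ref x); // writes through to x in this method's frame
        return x;
    }

    static void Main() => Console.WriteLine(Run()); // prints 101
}
```

Because the caller's frame is guaranteed to be alive for the whole duration of a synchronous call, nothing here needs to leave the stack.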
I don't believe that when you use BeginInvoke() and EndInvoke() with ref or out arguments you are truly passing the variables by ref. The fact that we have to call EndInvoke() with a ref parameter as well should be a clue to this.
Let's change your example to demonstrate the behavior I describe:
    void A()
    {
        int x = 100;
        int z = 400;
        RefAction<int> b = B;

        //b.BeginInvoke(ref x, C, null);
        var ar = b.BeginInvoke(ref x, null, null);
        b.EndInvoke(ref z, ar);

        Console.WriteLine(x); // outputs '100'
        Console.WriteLine(z); // outputs '101'
    }
If you examine the output now, you will see that the value of x is actually unchanged. But z does now contain the updated value.
I suspect that the compiler alters the semantics of passing variables by ref when you use the asynchronous Begin/EndInvoke methods.
After taking a look at the IL produced by this code, it appears that ref arguments to BeginInvoke() are still passed by ref. While Reflector doesn't show the IL for this method, I suspect that it simply doesn't pass along the parameter as a ref argument, but instead creates a separate variable behind the scenes to pass to B(). When you then call EndInvoke() you must supply a ref argument again to retrieve the value from the async state. It's likely that such arguments are actually stored as part of (or in conjunction with) the IAsyncResult object which is needed to ultimately retrieve their values.
Let's think about why the behavior likely works this way. When you make an async call to a method, you are doing so on a separate thread. This thread has its own stack and so cannot use the typical mechanism of aliasing ref/out variables. However, in order to get any returned values from an async method, you need to eventually call EndInvoke() to complete the operation and retrieve these values. Moreover, the call to EndInvoke() could just as easily occur on a completely different thread than the original call to BeginInvoke() or the actual body of the method. Clearly the call stack is not a good place to store such data - especially since the thread used for the async call could be re-purposed for a different method once the async operation completes. As a result, some mechanism other than the stack is needed to "marshal" the return value and out/ref arguments from the method being called back to the site where they will ultimately be used.
I believe this mechanism (in the Microsoft .NET implementation) is the IAsyncResult object. In fact, if you examine the IAsyncResult object in the debugger, you will notice that in the non-public members there exists _replyMsg, which contains a Properties collection. This collection contains elements like __OutArgs and __Return whose data appear to reflect their namesakes.
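The copy-in/copy-out behavior suspected above can be imitated without any CLR internals. The following is only a sketch under assumptions: PendingCall is a hypothetical stand-in for the async-result object, not the CLR's actual type, and Task.Run replaces the delegate BeginInvoke() call (which, as far as I know, is not supported on .NET Core and later). The value is copied onto the heap when the call starts and copied back out into whatever variable is supplied at the end:

```csharp
using System;
using System.Threading.Tasks;

delegate void RefAction<T>(ref T arg);

// Hypothetical stand-in for the async-result object; NOT the CLR's real type.
sealed class PendingCall<T>
{
    private readonly Task<T> _task;

    public PendingCall(RefAction<T> target, T initialValue)
    {
        // Copy-in: the value is captured on the heap, so the worker thread
        // never holds a pointer into the caller's stack frame.
        _task = Task.Run(() =>
        {
            T copy = initialValue;
            target(ref copy); // the callee aliases the private copy
            return copy;
        });
    }

    // Copy-out: the result lands in whichever variable is supplied here.
    public void End(ref T arg) => arg = _task.Result;
}

static class CopyInCopyOutDemo
{
    static void B(ref int arg) => arg += 1;

    public static (int x, int z) Run()
    {
        int x = 100;
        int z = 400;
        var call = new PendingCall<int>(B, x);
        call.End(ref z);
        return (x, z);
    }

    static void Main()
    {
        var (x, z) = Run();
        Console.WriteLine(x); // prints 100
        Console.WriteLine(z); // prints 101
    }
}
```

This mirrors the x/z experiment earlier in this answer: the variable handed over at the start is never touched again, and only the variable supplied at the end receives the updated value.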
EDIT: Here's a theory about the async delegate design that occurs to me. It seems likely that the signatures of BeginInvoke() and EndInvoke() were chosen to be as similar as possible to each other to avoid confusion and improve clarity. The BeginInvoke() method doesn't actually need to accept ref/out arguments - since it only needs their value ... not their identity (as it's never going to assign anything back to them). However it would be really odd (for example) to have a BeginInvoke() call that takes an int and an EndInvoke() call that takes a ref int. Now, it's possible that there are technical reasons why begin/end calls should have identical signatures - but I think that the benefits of clarity and symmetry are sufficient to validate such a design.
All of this is, of course, an implementation detail of the CLR and C# compiler and could change in the future. It is interesting, however, that there is the possibility for confusion - if you expect that the original variable passed to BeginInvoke() will actually be modified. It also underscores the importance of calling EndInvoke() to complete an async operation.
Perhaps someone from the C# team (if they see this question) could offer more insight into the details and design choices behind this functionality.