I'm trying to build what I call a measurement unit system by wrapping double in a struct. I have C# structs like Meter, Second, Degree, etc. My original idea was that once the compiler has inlined everything, performance would be the same as if plain double were used.
My explicit and implicit operators are simple and straightforward, and the compiler does actually inline them, yet the code with Meter and Second is 10 times slower than the same code using double.
My question is: why can't the C# compiler make the code using Second as optimal as the code using double if it inlines everything anyway?
Second is defined as follows:
struct Second
{
    double _value; // no more fields.

    public static Second operator +(Second left, Second right)
    {
        // The double result goes through the implicit conversion below.
        return left._value + right._value;
    }

    public static implicit operator Second(double value)
    {
        // This seems to be faster than having constructor :)
        return new Second { _value = value };
    }

    // plenty of similar operators
}
Update:
I didn't ask if struct fits here. It does.
I didn't ask if code is going to be inlined. JIT does inline it.
I checked the assembly instructions emitted at runtime. They were different for code like this:
var x = new double();
for (var i = 0; i < 1000000; i++)
{
x = x + 2;
// Many other simple operator calls here
}
and like this:
var x = new Second();
for (var i = 0; i < 1000000; i++)
{
x = x + 2;
// Many other simple operator calls here
}
There were no call instructions in the disassembly, so the operations were in fact inlined. Yet the difference is significant: performance tests show that using Second is about 10 times slower than using double.
So my questions are (attention!): why is the JIT-generated IA64 code different for the cases above? What can be done to make the struct run as fast as double? There seems to be no theoretical difference between double and Second, so what is the underlying reason for the difference I saw?
This is my opinion; please write a comment if you disagree instead of silently downvoting.
The C# compiler doesn't inline it. The JIT compiler might, but that is not deterministic from our point of view, because the JIT's behavior is not straightforward.
In the case of double, no operators are actually invoked: the operands are added right on the stack using the add opcode. In your case the method op_Addition is invoked, plus three struct copies to and from the stack.
To optimize it, start by replacing struct with class. That will at least minimize the number of copies.
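The claim that a user-defined + is a method call at the IL level can be checked with reflection. The following is a minimal sketch, not the asker's actual code: the Second struct is reduced to one field, and OperatorInspection is a hypothetical helper name. It lists the static special-name methods that the C# compiler generates for the operators.

```csharp
using System;
using System.Reflection;

public struct Second
{
    private readonly double _value;

    public Second(double value) { _value = value; }

    public double Value { get { return _value; } }

    // Compiled to a static method named op_Addition.
    public static Second operator +(Second left, Second right)
    {
        return new Second(left._value + right._value);
    }

    // Compiled to a static method named op_Implicit.
    public static implicit operator Second(double value)
    {
        return new Second(value);
    }
}

// Hypothetical helper, introduced only for this illustration.
public static class OperatorInspection
{
    // Returns the names of the public static special-name methods on Second,
    // which is how the compiler represents user-defined operators.
    public static string[] OperatorMethodNames()
    {
        MethodInfo[] methods =
            typeof(Second).GetMethods(BindingFlags.Public | BindingFlags.Static);
        MethodInfo[] operators = Array.FindAll(methods, m => m.IsSpecialName);
        return Array.ConvertAll<MethodInfo, string>(operators, m => m.Name);
    }
}
```

Printing `string.Join(", ", OperatorInspection.OperatorMethodNames())` should show op_Addition and op_Implicit (order not guaranteed). Whether the JIT later inlines those calls is a separate question that reflection cannot answer.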
The C# compiler doesn't inline anything - the JIT might do that, but isn't obliged to. It should still be plenty fast though. I would probably remove the implicit conversion in the + operator though (see the constructor usage below), since it is one more operator to look through:
private readonly double _value;
public double Value { get { return _value; } }
public Second(double value) { this._value = value; }
public static Second operator +(Second left, Second right) {
return new Second(left._value + right._value);
}
public static implicit operator Second(double value) {
return new Second(value);
}
JIT inlining is limited to specific scenarios. Will this code satisfy them? Hard to tell, but it should work and be fast enough for most scenarios. The problem with + is that there is an IL opcode for adding doubles, and it does almost no work, whereas your code is calling a few static methods and a constructor; there is always going to be some overhead, even when inlined.
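To see how much overhead remains on a given runtime, the two loops from the question can be timed side by side. This is a rough sketch under several assumptions: AddBenchmark and its method names are made up for illustration, the struct is the constructor-based variant from this answer, and a single Stopwatch run after one warm-up call is only a coarse measurement. Results depend heavily on the JIT version, on a Release build, and on running without a debugger attached.

```csharp
using System;
using System.Diagnostics;

public struct Second
{
    private readonly double _value;

    public Second(double value) { _value = value; }

    public double Value { get { return _value; } }

    public static Second operator +(Second left, Second right)
    {
        return new Second(left._value + right._value);
    }

    public static implicit operator Second(double value)
    {
        return new Second(value);
    }
}

// Hypothetical harness, introduced only for this illustration.
public static class AddBenchmark
{
    // Baseline: plain double addition.
    public static double SumDoubles(int iterations)
    {
        double x = 0;
        for (int i = 0; i < iterations; i++)
        {
            x = x + 2.0;
        }
        return x;
    }

    // Same loop through the wrapper struct and its operators.
    public static double SumSeconds(int iterations)
    {
        Second x = new Second(0);
        for (int i = 0; i < iterations; i++)
        {
            x = x + 2.0; // implicit double -> Second, then operator +
        }
        return x.Value;
    }

    // Runs the body once to trigger JIT compilation, then times a second run.
    public static TimeSpan Time(Func<int, double> body, int iterations)
    {
        body(iterations);
        Stopwatch stopwatch = Stopwatch.StartNew();
        body(iterations);
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

A caller could compare `AddBenchmark.Time(AddBenchmark.SumDoubles, 1000000)` with `AddBenchmark.Time(AddBenchmark.SumSeconds, 1000000)`. Both methods return their sums so the loops have an observable result and are less likely to be optimized away.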