Before you react from the gut, as I did initially, please read the whole question. I know public fields make you feel dirty, I know we've all been burned before, and I know it's not "good style", but: are public fields ever OK?
I'm working on a fairly large-scale engineering application that creates and works with an in-memory model of a structure (anything from a high-rise building to a bridge to a shed; it doesn't matter). There is a TON of geometric analysis and calculation involved in this project. To support it, the model is composed of many tiny immutable, read-only structs representing things like points, line segments, etc. Some of the values of these structs (such as the coordinates of the points) are accessed tens or hundreds of millions of times during a typical program execution. Because of the complexity of the models and the volume of calculation, performance is absolutely critical.
I feel that we're doing everything we can to optimize our algorithms, performance-test to find bottlenecks, use the right data structures, and so on. I don't think this is a case of premature optimization. Performance tests show order-of-magnitude (at least) boosts when accessing fields directly rather than through a property on the object. Given this information, and the fact that we can also expose the same information as properties to support data binding and other situations... is this OK? Remember: read-only fields on immutable structs. Can anyone think of a reason I'm going to regret this?
Here's a sample test app:
using System;
using System.Diagnostics;

struct Point
{
    public Point(double x, double y, double z)
    {
        _x = x;
        _y = y;
        _z = z;
    }

    public readonly double _x;
    public readonly double _y;
    public readonly double _z;

    public double X { get { return _x; } }
    public double Y { get { return _y; } }
    public double Z { get { return _z; } }
}

class Program
{
    static void Main(string[] args)
    {
        const int loopCount = 10000000;
        var point = new Point(12.0, 123.5, 0.123);
        var sw = new Stopwatch();
        double x, y, z;
        double calculatedValue;

        sw.Start();
        for (int i = 0; i < loopCount; i++)
        {
            x = point._x;
            y = point._y;
            z = point._z;
            calculatedValue = point._x * point._y / point._z;
        }
        sw.Stop();
        double fieldTime = sw.ElapsedMilliseconds;
        Console.WriteLine("Direct field access: " + fieldTime);

        sw.Reset();
        sw.Start();
        for (int i = 0; i < loopCount; i++)
        {
            x = point.X;
            y = point.Y;
            z = point.Z;
            calculatedValue = point.X * point.Y / point.Z;
        }
        sw.Stop();
        double propertyTime = sw.ElapsedMilliseconds;
        Console.WriteLine("Property access: " + propertyTime);

        double totalDiff = propertyTime - fieldTime;
        Console.WriteLine("Total difference: " + totalDiff);
        double averageDiff = totalDiff / loopCount;
        Console.WriteLine("Average difference: " + averageDiff);
        Console.ReadLine();
    }
}
result:
Direct field access: 3262
Property access: 24248
Total difference: 20986
Average difference: 0.00020986
It's only 21 seconds, but why not?
Fields should be declared private unless there is a good reason for not doing so. One of the guiding principles of lasting value in programming is "Minimize ripple effects by keeping secrets." When a field is private, the caller cannot get inappropriate direct access to the field.
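To illustrate that guideline (a hypothetical sketch; the Temperature type and its members are invented for this example): keeping the field private behind a property means the stored representation can change later without breaking any caller.

```csharp
// Hypothetical example: the backing field is the "secret."
public class Temperature
{
    private double _celsius; // private representation; callers never depend on it

    // The stable public contract.
    public double Celsius
    {
        get { return _celsius; }
        set { _celsius = value; }
    }

    // A derived view can be added later, or the storage switched
    // (say, to Kelvin), without any change to code that uses Celsius.
    public double Fahrenheit
    {
        get { return _celsius * 9.0 / 5.0 + 32.0; }
    }
}
```

If _celsius had been a public field instead, every caller reading it directly would break the moment the internal representation changed.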
Your test isn't really being fair to the property-based versions. The JIT is smart enough to inline simple properties so that they have a runtime performance equivalent to that of direct field access, but it doesn't seem smart enough (today) to detect when the properties access constant values.
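One way to see that inlining is what closes the gap (a sketch; the NoInlining attribute from System.Runtime.CompilerServices is the standard way to suppress JIT inlining, and the XNoInline property here is made up for the comparison):

```csharp
using System.Runtime.CompilerServices;

struct Point
{
    private readonly double _x;

    public Point(double x) { _x = x; }

    // A trivial getter like this is normally inlined by the JIT,
    // making it as cheap as reading the field directly.
    public double X { get { return _x; } }

    // Forcing a real call exposes the overhead the JIT
    // would otherwise remove.
    public double XNoInline
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        get { return _x; }
    }
}
```

Timing tight loops over X versus XNoInline in a release build (run outside the debugger) should show XNoInline measurably slower, while X matches direct field access.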
In your example, the entire loop body of the field access version is optimized away, becoming just:
for (int i = 0; i < loopCount; i++)
00000025  xor   eax,eax
00000027  inc   eax
00000028  cmp   eax,989680h
0000002d  jl    00000027
}
whereas the second version actually performs the floating-point division on each iteration:
for (int i = 0; i < loopCount; i++)
00000094  xor   eax,eax
00000096  fld   dword ptr ds:[01300210h]
0000009c  fdiv  qword ptr ds:[01300218h]
000000a2  fstp  st(0)
000000a4  inc   eax
000000a5  cmp   eax,989680h
000000aa  jl    00000096
}
Making just two small changes to make your application more realistic renders the two operations practically identical in performance.
First, randomize the input values so that they aren't constants and the JIT isn't smart enough to remove the division entirely.
Change from:
Point point = new Point(12.0, 123.5, 0.123);
to:
Random r = new Random(); Point point = new Point(r.NextDouble(), r.NextDouble(), r.NextDouble());
Secondly, ensure that the results of each loop iteration are used somewhere:
1. Before each loop, set calculatedValue = 0 so both measurements start from the same state.
2. After each loop, call Console.WriteLine(calculatedValue.ToString()) to make sure the result is "used" and the compiler doesn't optimize it away.
3. Finally, change the body of the loop from "calculatedValue = ..." to "calculatedValue += ..." so that every iteration's result is used.
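Putting the changes together, the modified benchmark might look like this (a sketch; it keeps the question's type and variable names, trimmed to the two timed loops):

```csharp
using System;
using System.Diagnostics;

struct Point
{
    public readonly double _x, _y, _z;
    public Point(double x, double y, double z) { _x = x; _y = y; _z = z; }
    public double X { get { return _x; } }
    public double Y { get { return _y; } }
    public double Z { get { return _z; } }
}

class Program
{
    static void Main()
    {
        const int loopCount = 10000000;
        // Change 1: randomized inputs, so the operands aren't compile-time
        // constants and the JIT can't fold the arithmetic away.
        Random r = new Random();
        Point point = new Point(r.NextDouble(), r.NextDouble(), r.NextDouble());
        var sw = new Stopwatch();

        // Change 2: both runs start from calculatedValue = 0.
        double calculatedValue = 0;
        sw.Start();
        for (int i = 0; i < loopCount; i++)
        {
            // Change 3: accumulate with += so no iteration is dead code.
            calculatedValue += point._x * point._y / point._z;
        }
        sw.Stop();
        Console.WriteLine(calculatedValue);   // "use" the result
        Console.WriteLine("Direct field access: " + sw.ElapsedMilliseconds);

        calculatedValue = 0;
        sw.Reset();
        sw.Start();
        for (int i = 0; i < loopCount; i++)
        {
            calculatedValue += point.X * point.Y / point.Z;
        }
        sw.Stop();
        Console.WriteLine(calculatedValue);
        Console.WriteLine("Property access: " + sw.ElapsedMilliseconds);
    }
}
```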
On my machine, these changes (with a release build) yield the following results:
Direct field access: 133
Property access: 133
Total difference: 0
Average difference: 0
Just as we expect, the x86 for each of these modified loops is identical (except for the loop address):
000000dd  xor   eax,eax
000000df  fld   qword ptr [esp+20h]
000000e3  fmul  qword ptr [esp+28h]
000000e7  fdiv  qword ptr [esp+30h]
000000eb  fstp  st(0)
000000ed  inc   eax
000000ee  cmp   eax,989680h
000000f3  jl    000000DF

(This loop address is the only difference.)