I have the following line of code:
*det_ptr++ = (float)(dx*dy - 0.81*dxy*dxy);
where dx, dy, and dxy are floats.
The Apple LLVM 3.0 compiler generates the following assembly for it:
+0x250 vldr.32 s0, [r13, #+140]
+0x254 vldr.32 s1, [r13, #+136]
+0x258 vmul.f32 s0, s0, s1
+0x25c vcvt.f64.f32 d16, s0 <-------------- cast from float to double
+0x260 vldr.32 s0, [r13, #+132]
+0x264 vcvt.f64.f32 d17, s0 <-------------- cast from float to double
+0x268 vldr.64 d18, [r13, #+16]
+0x26c vmul.f64 d17, d18, d17
+0x270 vldr.32 s0, [r13, #+132]
+0x274 vcvt.f64.f32 d19, s0 <-------------- cast from float to double
+0x278 vmul.f64 d17, d17, d19
+0x27c vsub.f64 d16, d16, d17
+0x280 vcvt.f32.f64 s0, d16
+0x284 ldr r0, [sp, #+104]
+0x286 adds r1, r0, #4 ; 0x4
+0x288 str r1, [sp, #+104]
+0x28a vstr.32 s0, [r0]
Is there any way to prevent these casts?
The way in which you wrote your program requires those casts. 0.81 is a double-precision literal, so dxy must be promoted to double before the multiplication takes place, and dx*dy must be promoted before the subtraction. The fact that you cast the final result back to float doesn't matter; the C standard is perfectly clear that those terms are evaluated in double precision regardless.
To prevent the promotion to double, use a single-precision literal instead (by adding the f suffix):
*det_ptr++ = dx*dy - 0.81f*dxy*dxy;
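If you want to see the two forms side by side, here is a minimal sketch. The function names det_double_literal and det_float_literal are made up for illustration and are not from your code, and the static_assert lines assume a C11 compiler. The first function should compile to the float/double conversions shown in the question, while the second should stay in single precision on a target with a single-precision FPU.

```c
#include <assert.h>  /* static_assert (C11) */

/* An unsuffixed floating constant has type double; the f suffix makes it float. */
static_assert(sizeof(0.81) == sizeof(double), "0.81 is a double constant");
static_assert(sizeof(0.81f) == sizeof(float), "0.81f is a float constant");

/* Hypothetical helper: same expression as in the question.
 * 0.81 is a double, so dxy and the product dx*dy are converted
 * to double, and the result is converted back to float. */
float det_double_literal(float dx, float dy, float dxy)
{
    return (float)(dx*dy - 0.81*dxy*dxy);
}

/* Hypothetical helper: same expression with a float constant.
 * Every operand has type float, so no conversion to double is required. */
float det_float_literal(float dx, float dy, float dxy)
{
    return dx*dy - 0.81f*dxy*dxy;
}
```

Note that the two versions are not guaranteed to return bit-identical values: the first rounds to float once at the end, while the second rounds after each single-precision operation, so the results can differ in the last few bits.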