
Why is the `float` function slower than multiplying by 1.0?

I understand that this could be argued as a non-issue, but I write software for HPC environments, so this 3.5x speed increase actually makes a difference.

    In [1]: %timeit 10 / float(98765)
    1000000 loops, best of 3: 313 ns per loop

    In [2]: %timeit 10 / (98765 * 1.0)
    10000000 loops, best of 3: 80.6 ns per loop

I used dis to have a look at the code, and I assume float() will be slower as it requires a function call (unfortunately I couldn't dis.dis(float) to see what it's actually doing).

I guess a second question would be when should I use float(n) and when should I use n * 1.0?

Jason P asked Apr 10 '14 09:04



1 Answer

Because the peephole optimizer precomputes the result of that multiplication at compile time.

    import dis
    dis.dis(compile("10 / float(98765)", "<string>", "eval"))
      1           0 LOAD_CONST               0 (10)
                  3 LOAD_NAME                0 (float)
                  6 LOAD_CONST               1 (98765)
                  9 CALL_FUNCTION            1
                 12 BINARY_DIVIDE
                 13 RETURN_VALUE

    dis.dis(compile("10 / (98765 * 1.0)", "<string>", "eval"))
      1           0 LOAD_CONST               0 (10)
                  3 LOAD_CONST               3 (98765.0)
                  6 BINARY_DIVIDE
                  7 RETURN_VALUE

It stores the result of `98765 * 1.0` in the bytecode as a constant value. So it just has to load that constant and divide, whereas in the first case it has to look up `float` and call it at run time.
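The difference between the folded constant and the run-time call can also be checked without reading the disassembly. The following is a minimal sketch, assuming a recent CPython 3 rather than the Python 2.7 used above; the helper name `describe` and the variable name `n` are made up for illustration, and constant folding is a CPython implementation detail, so the exact contents of `co_consts` can vary between versions.

```python
def describe(source):
    """Compile an expression; return the names it looks up and its constants."""
    code = compile(source, "<string>", "eval")
    return code.co_names, code.co_consts


def has_folded_float(consts, value=98765.0):
    """True if the constants include the float produced by folding 98765 * 1.0."""
    return any(isinstance(c, float) and c == value for c in consts)


# Both operands are literals: the product can be folded at compile time.
literal_names, literal_consts = describe("98765 * 1.0")

# One operand is a name (hypothetical variable n): nothing to fold.
variable_names, variable_consts = describe("n * 1.0")

# A call: float has to be looked up and called at run time.
call_names, call_consts = describe("float(98765)")

print("literal :", literal_names, literal_consts)
print("variable:", variable_names, variable_consts)
print("call    :", call_names, call_consts)
```

Only the literal product needs no name lookup at all. As soon as one operand is a variable, the multiplication happens at run time, just like the call to `float`.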

We can see that even more clearly like this

    print compile("10 / (98765 * 1.0)", "<string>", "eval").co_consts
    # (10, 98765, 1.0, 98765.0)

Since the value is precomputed at compile time, the second expression is faster.
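Because the speedup comes from folding two literals, the timings in the question may not carry over to code where the value is held in a variable. A rough way to measure that case is sketched below, assuming the standard `timeit` module; the variable name `n` and the loop count are illustrative choices, and no particular result is claimed here.

```python
import timeit

# Assumption: the value lives in a variable, so the compiler has no
# literal product to fold ahead of time.
SETUP = "n = 98765"
LOOPS = 100000  # illustrative loop count

call_seconds = timeit.timeit("10 / float(n)", setup=SETUP, number=LOOPS)
multiply_seconds = timeit.timeit("10 / (n * 1.0)", setup=SETUP, number=LOOPS)

print("float(n): %.1f ns per loop" % (call_seconds / LOOPS * 1e9))
print("n * 1.0 : %.1f ns per loop" % (multiply_seconds / LOOPS * 1e9))
```

Measuring on the interpreter and hardware actually used is the safer basis for choosing between the two spellings.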

Edit: As pointed out by Davidmh in the comments,

> And the reason why it is not also optimising away the division is because its behaviour depends on flags, like `from __future__ import division` and also because of `-Q` flag.

Quoting the comment from the actual peephole optimizer code for Python 2.7.9,

    /* Cannot fold this operation statically since
       the result can depend on the run-time presence
       of the -Qnew flag */
thefourtheye answered Oct 06 '22 00:10