I am currently comparing two loop calculations in Python 3 and C. For Python, I have:
# Python3
import time

t1 = time.process_time()
a = 100234555
b = 22333335
c = 341500
for i in range(1, 10000000001):
    a = a - (b % 2)
    b = b - (c % 2)
print("Sum is", a + b)
t2 = time.process_time()
print(t2 - t1, "Seconds")
Then in C, I do the same thing:
#include <stdio.h>

int main() {
    long long a = 100234555;
    long long b = 22333335;
    long long c = 341500;
    for (long long i = 1; i <= 10000000000; i++) {
        a = a - (b % 2);
        b = b - (c % 2);
    }
    printf("Sum is %lld\n", a + b);
    return 0;
}
I timed both programs. The Python version takes around 3500 seconds, while the C version (including compilation and execution) takes only around 0.3 seconds.
I am wondering how there can be such a big difference in timing. Both were run on a server with 100 GB of RAM and plenty of processing power.
It is partly because Python bytecode is executed by an interpreter program rather than directly by the CPU, but most of the overhead comes from the memory allocation and deallocation caused by the immutability of integers, which is a consequence of Python's object model, not of interpretation itself.
What's going on is that your C code can change the value of the numbers in place, but in Python numbers are immutable, meaning they never change. So every time you compute a new value, Python has to create a new int object and later destroy the old ints once they are no longer referenced. That is much slower than simply modifying a single memory location.
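For example, here is a minimal sketch (assuming CPython) showing that subtracting from an integer rebinds the name to a brand-new object rather than modifying the old one in place:

a = 100234555
print(id(a))      # identity of the current int object
a = a - 1         # creates a new int object and rebinds the name
print(id(a))      # a different identity; the old object is now garbage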
There is also the possibility that your C compiler is being clever and, through a chain of optimisations, reasons that it can remove your for loop entirely while producing exactly the same result as if the loop had actually run. If that had happened in your example I would expect the code to run even faster than it did, but the compiler is certainly allowed to do it.
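In this particular loop the reduction is easy to follow by hand, and it is essentially what the compiler's constant propagation works out: c is even, so b % 2 stays 1 on every iteration and a simply loses 1 ten billion times. A quick sketch of that closed form:

# c % 2 == 0, so b never changes; b stays odd, so a decreases by 1 per iteration
a, b, c = 100234555, 22333335, 341500
iterations = 10_000_000_000
print("Sum is", a - iterations + b)   # Sum is -9877432110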
Python has no such smart compiler. It can't do anything so grand and clever; it simply isn't designed to optimise the code, because doing so reliably is very hard in a dynamically typed language (although the fact that Python is strongly typed does make it somewhat of a possibility).
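You can see this by disassembling the loop with the standard dis module (a rough illustration, not from the original question): the body is compiled to generic stack-machine bytecode that the interpreter must dispatch anew on every one of the ten billion iterations, with no attempt to fold the loop away:

import dis

def loop():
    a, b, c = 100234555, 22333335, 341500
    for i in range(1, 10000000001):
        a = a - (b % 2)
        b = b - (c % 2)
    return a + b

# Prints the LOAD/STORE and arithmetic bytecode instructions run on each iteration
dis.dis(loop)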
As dmuir noticed, the code can be simplified drastically if the compiler propagates the constants correctly. For example, clang -O1 compiles the C code down to this (cf. https://gcc.godbolt.org/z/1ZH8Rm):
main:                                   # @main
        push    rax
        movabs  rsi, -9877432110
        mov     edi, offset .L.str
        xor     eax, eax
        call    printf
        xor     eax, eax
        pop     rcx
        ret
.L.str:
        .asciz  "Sum is %lld\n"
gcc -O1 produces essentially similar code.
Since this boils down to a single call to printf, the explanation seems to be that the optimiser removed the loop entirely, so the 0.3 seconds you measured for C is essentially just compilation time, not ten billion iterations.
Testing the C version with optimisations disabled (-O0) produces this output:
$ time (clang -O0 -o loop10g loop10g.c && ./loop10g)
Sum is -9877432110
real 4m15.352s
user 3m47.232s
sys 0m3.252s
Unoptimized C is still much faster than Python: roughly 255 seconds versus more than 3500 seconds.
The Python code is interpreted as bytecode operating on a stack of dynamically typed values: a slowdown factor of 10 to 20 relative to compiled C is typical. Furthermore, Python's integer arithmetic automatically switches to arbitrary-precision (bignum) mode for large values, which may be happening here, although in that case the penalty would usually be even higher.
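As a rough sanity check of that slowdown (a sketch, not part of the timings above), you can time a much smaller iteration count in Python and extrapolate, since the per-iteration cost is roughly constant:

import time

def run(n):
    a, b, c = 100234555, 22333335, 341500
    t1 = time.process_time()
    for i in range(1, n + 1):
        a = a - (b % 2)
        b = b - (c % 2)
    t2 = time.process_time()
    return t2 - t1

elapsed = run(10_000_000)                     # 10 million iterations
print("Estimated time for 10 billion:", elapsed * 1000, "seconds")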