Performance effect of using print statements in Python script

I have a Python script that processes a huge text file (around 4 million lines) and writes the data into two separate files.

For debugging, I have added a print statement that outputs a string for every line. How bad could this be from a performance perspective?

If it is going to be very bad, I can remove the debugging line.

Edit

It turns out that a print statement for every line of a file with 4 million lines increases the running time far too much.

Sudar asked Nov 08 '12 11:11


2 Answers

I tried this in a very simple script just for fun, and the difference is quite staggering:

In large.py:

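# Python 2 script (it uses xrange and the print statement).
# The with block closes target.txt when the loop finishes.
with open('target.txt', 'w') as target:
    for item in xrange(4000000):
        target.write(str(item) + '\n')
        print item  # the per-line debugging output being measured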

Timing it:

[gp@imdev1 /tmp]$ time python large.py
real    1m51.690s
user    0m10.531s
sys     0m6.129s

[gp@imdev1 /tmp]$ ls -lah target.txt
-rw-rw-r--. 1 gp gp 30M Nov  8 16:06 target.txt

Now running the same with "print" commented out:

[gp@imdev1 /tmp]$ time python large.py
real    0m2.584s
user    0m2.536s
sys     0m0.040s
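
In the first run, real time (1m51s) is far larger than user plus sys (about 17s), which suggests the script spent most of its time waiting for the output to be displayed rather than computing. A rough way to repeat the comparison from inside Python 3, instead of with the shell's time command, is sketched below. The names write_numbers and compare are hypothetical, not part of the script above, and the ratio you get will depend heavily on where standard output goes (terminal, pipe, or file).

```python
import os
import sys
import tempfile
import time


def write_numbers(path, count, debug_stream=None):
    """Write `count` numbered lines to `path` and return elapsed seconds.

    If `debug_stream` is given, every item is also printed to it, which
    mimics the per-line debugging print in large.py.
    """
    start = time.perf_counter()
    with open(path, 'w') as target:
        for item in range(count):
            target.write(str(item) + '\n')
            if debug_stream is not None:
                print(item, file=debug_stream)
    return time.perf_counter() - start


def compare(count, debug_stream=None):
    """Return (seconds_without_print, seconds_with_print) for `count` lines."""
    if debug_stream is None:
        debug_stream = sys.stdout  # assumption: debug output goes to stdout
    path = os.path.join(tempfile.mkdtemp(), 'target.txt')
    quiet = write_numbers(path, count)
    noisy = write_numbers(path, count, debug_stream)
    return quiet, noisy
```

Calling compare(200000) in a terminal and then again with standard output redirected to a file should show how much of the slowdown comes from the terminal itself.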

GSP answered Oct 15 '22 06:10


Yes, it affects performance. I wrote a small program to demonstrate:

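import time

start_time = time.time()
# 100 * 100 * 100 = 1,000,000 calls to print
for i in range(100):
    for j in range(100):
        for k in range(100):
            print(i, j, k)
# Elapsed wall-clock time in seconds
print(time.time() - start_time)
input()  # wait for Enter before exiting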

The measured time was 160.2812204496765 seconds. Then I replaced the print statement with pass. The results were shocking: the measured time without print was 0.26517701148986816 seconds.
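
If some debugging output is still wanted, one possible middle ground is to report progress only every N lines instead of on every line, so the number of print calls drops from millions to a handful. This is only a minimal sketch: process_lines, report_to_stderr and the report_every parameter are hypothetical names, and the line.strip() call is a placeholder for the real per-line work.

```python
import sys


def report_to_stderr(message):
    """Default reporter: write the progress message to standard error."""
    print(message, file=sys.stderr)


def process_lines(lines, report_every=100000, report=report_to_stderr):
    """Process each line and report progress every `report_every` lines.

    Returns the number of lines processed. Passing report_every=0
    disables the progress messages entirely.
    """
    count = 0
    for count, line in enumerate(lines, start=1):
        _ = line.strip()  # placeholder for the real per-line processing
        if report_every and count % report_every == 0:
            report('processed %d lines' % count)
    return count
```

With the default report_every=100000, a file of 4 million lines would produce 40 progress messages rather than 4 million.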

Kshitij Joshi answered Oct 15 '22 06:10