
Python line read size in bytes

I'm working on Linux and would like to display the percentage of a file that has been parsed. After reading a bit, I decided the most accurate way would be to get the total size of the file in bytes, then add up the size in bytes of each line after reading it.

This is my simplified dummy code:

import os
import sys

if __name__ == '__main__':
    # myfile is the path of the file being parsed
    read_bytes = 0
    total_file_size = os.path.getsize(myfile)

    with open(myfile, 'r') as input_file:
        for line in input_file:
            read_bytes += sys.getsizeof(line)

            print "do my stuff"

    print total_file_size
    print read_bytes

Output is:

193794194

203979278

Obviously something being counted for each line is inflating the total. I've also tried:

read_bytes += sys.getsizeof(line) - sys.getsizeof('\n')

And output is:

193794194

193309190

I must be missing something.

gmarco asked May 14 '26 13:05



1 Answer

Use len() instead of sys.getsizeof().

sys.getsizeof() returns the number of bytes the interpreter uses to hold the object, which includes per-object overhead, not just the length of its contents:

>>> len('asdf')
4
>>> import sys
>>> sys.getsizeof('asdf')
37

In addition, if you are running the program on Windows, you should open the file in binary mode so that line endings are not translated:

open(myfile, 'rb')
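
Putting the two suggestions together, here is a minimal sketch, assuming Python 3. The helper name count_line_bytes and the throwaway sample file are made up for illustration and are not from the question.

```python
import os
import tempfile


def count_line_bytes(path):
    """Sum len() of each line read in binary mode (hypothetical helper)."""
    read_bytes = 0
    # 'rb' yields raw bytes, so len(line) is the number of bytes on disk,
    # line terminator included, with no newline translation.
    with open(path, 'rb') as input_file:
        for line in input_file:
            read_bytes += len(line)
    return read_bytes


# Build a small throwaway file to compare against os.path.getsize().
with tempfile.NamedTemporaryFile('wb', suffix='.txt', delete=False) as tmp:
    tmp.write(b'first line\nsecond line\r\nlast line without newline')
    sample_path = tmp.name

total_file_size = os.path.getsize(sample_path)
read_bytes = count_line_bytes(sample_path)
os.remove(sample_path)

print(total_file_size, read_bytes)
```

With the file opened in binary mode, the running total of len(line) should match os.path.getsize() once the last line has been read, so read_bytes / total_file_size can be used directly as the progress fraction.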

NOTE

With file.tell(), you don't need to calculate the current position yourself.
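
A rough sketch of the tell() approach, assuming Python 3 and binary mode. The generator name parse_with_progress and the sample data are hypothetical. It uses readline() instead of iterating over the file object, because in Python 2 the file iterator reads ahead and tell() would then run past the current line.

```python
import os
import tempfile


def parse_with_progress(path):
    """Yield (line, percent_done) pairs (hypothetical helper)."""
    total_file_size = os.path.getsize(path)
    with open(path, 'rb') as input_file:
        # iter(readline, b'') stops at end of file, where readline()
        # returns an empty bytes object.
        for line in iter(input_file.readline, b''):
            # tell() is the byte offset just past the line that was read.
            percent = 100.0 * input_file.tell() / total_file_size
            yield line, percent


# Throwaway sample file: four lines of five bytes each.
with tempfile.NamedTemporaryFile('wb', delete=False) as tmp:
    tmp.write(b'aaaa\nbbbb\ncccc\ndddd\n')
    sample_path = tmp.name

progress = [percent for _line, percent in parse_with_progress(sample_path)]
os.remove(sample_path)

print(progress)
```

This avoids keeping a separate byte counter; the file object already knows its offset.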

falsetru answered May 16 '26 02:05



