I'm working on Linux and would like to display the percentage of a file that has been parsed. After reading a bit, I decided the most accurate way to do that would be to get the total size (in bytes) of the file I'm parsing, then add up the size (in bytes) of each line after reading it.
This is my dummy simplified code.
import os
import sys

if __name__ == '__main__':
    read_bytes = 0
    total_file_size = os.path.getsize(myfile)
    with open(myfile, 'r') as input_file:
        for line in input_file:
            read_bytes += sys.getsizeof(line)
            print "do my stuff"
    print total_file_size
    print read_bytes
Output is:
193794194
203979278
Obviously something is being counted for each line that inflates the total. I've tried subtracting the size of a newline:
read_bytes += sys.getsizeof(line) - sys.getsizeof('\n')
And output is:
193794194
193309190
I must be missing something.
Use len instead of sys.getsizeof():
sys.getsizeof() returns the number of bytes the interpreter uses to hold that object, including the object's own overhead, not the length of the string's contents.
>>> len('asdf')
4
>>> import sys
>>> sys.getsizeof('asdf')
37
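In addition to that, if you are running the program on Windows, you should open the file in binary mode. In text mode, '\r\n' line endings are translated to '\n', so len(line) would undercount the bytes on disk.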
open(myfile, 'rb')
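Putting the two suggestions together, here is a minimal sketch of the progress calculation. It is written for Python 3 (an assumption; the question uses Python 2 print statements), and the function name count_bytes_with_progress and the report_every parameter are hypothetical, chosen only for illustration.

```python
import os


def count_bytes_with_progress(path, report_every=1000):
    """Read `path` line by line and return (bytes_read, total_size).

    Hypothetical helper: prints a percentage every `report_every` lines.
    """
    total_size = os.path.getsize(path)
    bytes_read = 0
    # Binary mode: each line is a bytes object, so len() is its on-disk size.
    with open(path, 'rb') as input_file:
        for line_number, line in enumerate(input_file, 1):
            bytes_read += len(line)
            # ... parse `line` here ...
            if line_number % report_every == 0 and total_size:
                print('%.1f%% parsed' % (100.0 * bytes_read / total_size))
    return bytes_read, total_size
```

Because every byte of the file belongs to exactly one line, the two returned values should be equal once the loop finishes.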
NOTE
Using file.tell(), you don't need to track the current position yourself.
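A rough sketch of the tell() approach follows, assuming Python 3 and a file opened in binary mode. In Python 2, iterating over a file object uses a read-ahead buffer, so tell() inside a for loop may report a position further ahead than the line just read; calling readline() in a loop avoids that. The function name percent_after_each_line is hypothetical.

```python
import os


def percent_after_each_line(path):
    """Return the percentage of `path` consumed after each line, via tell()."""
    total_size = os.path.getsize(path)
    percentages = []
    with open(path, 'rb') as input_file:
        for line in input_file:
            # tell() reports the byte offset just past the line that was read.
            percentages.append(100.0 * input_file.tell() / total_size)
    return percentages
```

For a 6-byte file containing two 3-byte lines, this returns [50.0, 100.0].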