I am trying to split a large XML file into smaller chunks. I write to the output file and then check its size to see if it has passed a threshold, but I don't think the getsize() method is working as expected.
What would be a good way to get the size of a file that is changing in size?
I've done something like this...
import string
import os
f1 = open('VSERVICE.xml', 'r')
f2 = open('split.xml', 'w')
for line in f1:
    if str(line) == '</Service>\n':
        break
    else:
        f2.write(line)
    size = os.path.getsize('split.xml')
    print('size = ' + str(size))
Running this prints 0 as the file size for about 80 iterations and then 4176. Does Python store the output in a buffer before actually writing it to the file?
File size is different from file position. For example:
os.path.getsize('sample.txt')
This returns the file's size on disk, in bytes.
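Because writes are buffered, getsize() only reflects data that has actually been flushed out of Python's buffer. A minimal sketch of that effect (assuming a scratch file named 'sample.txt'):
import os
with open('sample.txt', 'w') as f:
    f.write('a few bytes\n')
    print(os.path.getsize('sample.txt'))  # likely 0: the data is still in Python's buffer
    f.flush()                             # push the buffer out to the OS
    print(os.path.getsize('sample.txt'))  # now reflects the bytes written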
But
f = open('sample.txt')
print(f.readline())
f.tell()
Here f.tell() returns the current position in the file, i.e. where the next read or write will take place. Since it is aware of the buffering, it should be accurate as long as you are simply appending to the output file.
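As a sketch of how that could apply to the splitting loop in the question (the threshold value and the tell()-based check are my own illustration, not part of the original code):
THRESHOLD = 1024 * 1024  # hypothetical chunk size of 1 MB

with open('VSERVICE.xml', 'r') as f1, open('split.xml', 'w') as f2:
    for line in f1:
        if line == '</Service>\n':
            break
        f2.write(line)
        # f2.tell() reports where the next write will go, so it counts
        # everything written so far, whether or not it has been flushed.
        if f2.tell() >= THRESHOLD:
            print('split.xml reached ' + str(f2.tell()) + ' bytes')
            break
Checking f2.tell() avoids calling os.path.getsize() on a file whose buffered contents have not reached the disk yet.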