The Python docs on file.read() state that "An empty string is returned when EOF is encountered immediately."
The documentation further states:
"Note that this method may call the underlying C function fread() more than once in an effort to acquire as close to size bytes as possible. Also note that when in non-blocking mode, less data than was requested may be returned, even if no size parameter was given."
I believe Guido has made his view on not adding f.eof() PERFECTLY CLEAR, so I need to do this the Python way!
What is not clear to ME, however, is whether receiving fewer bytes than requested from a read, but more than zero, is a definitive test that you have reached EOF.
i.e.:
with open(filename, 'rb') as f:
    while True:
        s = f.read(size)
        l = len(s)
        if l == 0:
            break  # it is clear that this is EOF...
        if l < size:
            break  # ? Is receiving less than the request EOF???
Is it a potential error to break if you have received fewer bytes than requested in a call to file.read(size)?
You are not thinking with your snake skin on... Python is not C.
First, a review:
f.read(n) attempts to read n bytes and in no case more than n bytes;
If a file read method is at EOF, it returns ''. The same type of EOF test is used in the other "file-like" objects such as StringIO, socket.makefile, etc. A return of less than n bytes from f.read(n) is most assuredly NOT a dispositive test for EOF! While that code may work 99.99% of the time, it is the times it does not work that would be very frustrating to find. Plus, it is bad Python form. The only use for n in this case is to put an upper limit on the size of the return.
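To make that concrete, here is a minimal sketch (written for Python 3, and assuming a platform where os.pipe behaves as it does on typical POSIX systems) of a read that comes back short while more data is still on its way. The function name short_read_demo is made up for illustration, and a pipe is only a stand-in for any source that delivers data in pieces.

```python
import os

def short_read_demo():
    """Return the chunks seen when reading a pipe that is fed in two pieces."""
    r_fd, w_fd = os.pipe()
    chunks = []
    # buffering=0 gives a raw file object, so each read() is one OS-level read
    with os.fdopen(r_fd, 'rb', 0) as reader:
        os.write(w_fd, b'abc')
        chunks.append(reader.read(10))  # fewer than 10 bytes, but NOT at EOF
        os.write(w_fd, b'def')
        os.close(w_fd)                  # closing the writer is what produces EOF
        chunks.append(reader.read(10))  # the rest of the data, again a short read
        chunks.append(reader.read(10))  # empty result: the real EOF
    return chunks

print(short_read_demo())
```

A loop that broke on the first short read would stop after 'abc' and never see 'def'. Only the empty result marks the end.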
What are some of the reasons the Python file-like methods return less than n bytes?
Exactly n bytes may cause a break between logical multi-byte characters (such as \r\n in text mode and, I think, a multi-byte character in Unicode) or some underlying data structure not known to you;
I would rewrite your code in this manner:
with open(filename, 'rb') as f:
    while True:
        s = f.read(max_size)
        if not s:
            break
        # process the data in s...
Or, write a generator:
def blocks(infile, bufsize=1024):
    """Yield successive chunks of at most bufsize bytes until EOF."""
    while True:
        try:
            data = infile.read(bufsize)
            if data:
                yield data
            else:
                break  # empty read: EOF
        except IOError as e:
            # e.errno and e.strerror work on both Python 2.6+ and Python 3
            print("I/O error({0}): {1}".format(e.errno, e.strerror))
            break

with open('somefile', 'rb') as f:
    for block in blocks(f, 2**16):
        # process a block that COULD be up to 65,536 bytes long
        pass
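If the processing really does need fixed-size pieces, one option is to keep reading until the requested count has been collected and to treat only an empty result as EOF. The sketch below assumes a blocking file-like object opened in binary mode. read_exactly and Dribble are hypothetical names invented here: Dribble just simulates a source that hands back at most 2 bytes per call.

```python
import io

def read_exactly(infile, n):
    """Read until n bytes are collected or infile signals EOF with an empty read."""
    parts = []
    remaining = n
    while remaining > 0:
        data = infile.read(remaining)
        if not data:  # empty result: the only reliable EOF signal
            break
        parts.append(data)
        remaining -= len(data)
    return b''.join(parts)


class Dribble(io.BytesIO):
    """Hypothetical slow source: returns at most 2 bytes per read() call."""
    def read(self, size=-1):
        if size is None or size < 0:
            size = 2
        return io.BytesIO.read(self, min(size, 2))


src = Dribble(b'abcdefgh')
print(read_exactly(src, 5))  # five bytes, assembled from several short reads
print(read_exactly(src, 5))  # only three bytes remain before EOF
```

With this helper, a result shorter than n does mean EOF, because the helper itself retried every short read before giving up.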
Here's what my C compiler's documentation says for the fread() function:
size_t fread(
    void *buffer,
    size_t size,
    size_t count,
    FILE *stream
);
fread returns the number of full items actually read, which may be less than count if an error occurs or if the end of the file is encountered before reaching count.
So it looks like getting less than size means either an error has occurred or EOF has been reached -- so breaking out of the loop would be the correct thing to do.