Python binary EOF

1 Answer

To quote the documentation:

file.read([size])

Read at most size bytes from the file (less if the read hits EOF before obtaining size bytes). If the size argument is negative or omitted, read all data until EOF is reached. The bytes are returned as a string object. An empty string is returned when EOF is encountered immediately. (For certain files, like ttys, it makes sense to continue reading after an EOF is hit.) Note that this method may call the underlying C function fread() more than once in an effort to acquire as close to size bytes as possible. Also note that when in non-blocking mode, less data than was requested may be returned, even if no size parameter was given.

That means (for a regular file):

f.read(1) will return a bytes object containing either 1 byte, or 0 bytes if EOF was reached.
f.read(2) will return a bytes object containing either 2 bytes, or 1 byte if EOF is reached after the first byte, or 0 bytes if EOF is encountered immediately.
...
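
As a quick check of the behaviour listed above, here is a minimal sketch that uses an in-memory io.BytesIO buffer as a stand-in for a regular file opened in 'rb' mode. The three-byte payload is made up for illustration:

```python
import io

# In-memory stand-in for a 3-byte file opened with open(..., 'rb')
f = io.BytesIO(b"abc")

first = f.read(2)   # two bytes are available
second = f.read(2)  # only one byte is left before EOF
third = f.read(2)   # already at EOF, so the result is empty

print(len(first), len(second), len(third))
```

The empty result from the last call is what the loops in this answer test for.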

If you want to read your file one byte at a time, you will have to read(1) in a loop and test for "emptiness" of the result:

# From answer by @Daniel
with open(filename, 'rb') as f:
    while True:
        b = f.read(1)
        if not b:
            # eof
            break
        do_something(b)

If you want to read your file in chunks of, say, 50 bytes at a time, you will have to read(50) in a loop:

with open(filename, 'rb') as f:
    while True:
        b = f.read(50)
        if not b:
            # eof
            break
        do_something(b) # <- be prepared to handle a last chunk of length < 50
                        #    if the file length *is not* a multiple of 50
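
The same loop can also be written with the two-argument form of the built-in iter(), which keeps calling a function until it returns a sentinel value. This is only a sketch of an alternative spelling: the helper name read_chunks is hypothetical, and an io.BytesIO buffer stands in for a real file:

```python
import io
from functools import partial

def read_chunks(f, size=50):
    """Return an iterator over chunks of at most `size` bytes, ending at EOF."""
    # iter(callable, sentinel) calls f.read(size) repeatedly and stops
    # as soon as it returns the sentinel b"" (i.e. EOF on a regular file)
    return iter(partial(f.read, size), b"")

# Stand-in for a 120-byte binary file
chunks = list(read_chunks(io.BytesIO(b"x" * 120), size=50))
lengths = [len(c) for c in chunks]
print(lengths)
```

As with the explicit loop, the last chunk may be shorter than the requested size.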

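In fact, you may even break one iteration sooner: for a regular file, a chunk shorter than the requested 50 bytes means EOF was reached, so there is no need for another read():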

with open(filename, 'rb') as f:
    while True:
        b = f.read(50)
        do_something(b) # <- be prepared to handle a last chunk of size 0
                        #    if the file length *is* a multiple of 50
                        #    (incl. 0 byte-length file!)
                        #    and be prepared to handle a last chunk of length < 50
                        #    if the file length *is not* a multiple of 50
        if len(b) < 50:
            break

Concerning the other part of your question:

Why does the container [..] contain [..] a whole bunch of them [bytes]?

Referring to that code:

for x in file:  
   i=i+1  
   print(x)

To quote the documentation again:

A file object is its own iterator, [..]. When a file is used as an iterator, typically in a for loop (for example, for line in f: print line.strip()), the next() method is called repeatedly. This method returns the next input line, or raises StopIteration when EOF is hit when the file is open for reading (behavior is undefined when the file is open for writing).

The code above reads a binary file line by line, that is, it stops at each occurrence of the EOL character (\n). Usually, that leads to chunks of varying length, as most binary files contain occurrences of that byte at arbitrary positions.

I wouldn't encourage you to read a binary file that way. Please prefer a solution based on read(size).
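
To illustrate the difference, here is a small sketch comparing the two approaches on the same made-up 10-byte payload, again with io.BytesIO standing in for a file opened in 'rb' mode:

```python
import io

# Arbitrary binary data that happens to contain three b"\n" bytes
data = b"\x00\x01\n\x02\x03\x04\x05\n\n\x06"

# Iterating over the file object splits after every b"\n" byte
by_line = list(io.BytesIO(data))

# Reading fixed-size chunks ignores the content entirely
f = io.BytesIO(data)
by_chunk = []
while True:
    b = f.read(4)
    if not b:
        # eof
        break
    by_chunk.append(b)

print([len(x) for x in by_line])
print([len(x) for x in by_chunk])
```

The line-based iteration yields pieces whose lengths depend on where \n bytes happen to fall, while read(4) yields predictable chunks with only the last one possibly shorter.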

119

answered Nov 15 '22 20:11

Sylvain Leroux

Tags:

python

binary

eof

mekkanizer
