I noticed some strange behavior today playing around with next() and readline(). It seems that both functions produce the same results (which is what I expect). However, when I mix them, I get a ValueError. Here's what I did:
>>> f = open("text.txt", 'r')
>>> f.readline()
'line 0\n'
>>> f.readline()
'line 1\n'
>>> f.readline()
'line 2\n'
>>> f.next()
'line 3\n'
>>> f.next()
'line 4\n'
>>> f.readline()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Mixing iteration and read methods would lose data
>>>
>>> f = open("text.txt", 'r')
>>> f.next()
'line 0\n'
>>> f.next()
'line 1\n'
>>> f.next()
'line 2\n'
>>> f.readline()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Mixing iteration and read methods would lose data
So the overall question here is: what's going on under the hood that causes this error?
Some related questions that might get answered along the way, but that I would like answered if not:
What's the difference between next() and readline()?
When I do for f in file: which function am I calling (and does it matter)?
Why can I call next() after readline(), but not the other way around?
Thanks in advance,
I don't think it matters, but in case this is version dependent, I'm on Python 2.7.6 for Windows.
The readline method reads one line from the file and returns it as a string. The string returned by readline will contain the newline character at the end.
Python's file method next() is used when a file is used as an iterator: typically in a for loop, the next() method is called repeatedly. It returns the next input line, or raises StopIteration when EOF is hit.
The readline() method returns one line from the file each time it is called. The readlines() method returns all the lines in the file as a list, where each element is one line of the file.
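As a minimal sketch of the three behaviors just described: the example below uses an in-memory io.StringIO object in place of a real file (an assumption, made only so the snippet is self-contained) and the built-in next(), which calls the object's iterator method. The function name demo_read_methods is hypothetical.

```python
import io

def demo_read_methods():
    # io.StringIO stands in for a real text file (assumption for a self-contained example).
    text = u"line 0\nline 1\nline 2\n"

    f = io.StringIO(text)
    first = f.readline()       # one line, trailing newline included
    rest = f.readlines()       # all remaining lines, as a list
    at_eof = f.readline()      # readline() signals EOF with an empty string

    g = io.StringIO(text)
    via_next = next(g)         # same first line, obtained through the iterator protocol
    next(g)
    next(g)
    try:
        next(g)                # no lines left
        hit_eof = False
    except StopIteration:      # the iterator signals EOF with an exception instead
        hit_eof = True

    return first, rest, at_eof, via_next, hit_eof
```

Note the two different end-of-file signals: readline() returns an empty string, while the iterator raises StopIteration.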
You simply discarded the first line with fp.readline(), which explains the "skip". readline() returns an empty string at end of file, which explains the exception (mostly; you may also want to account for lines that contain only a newline). This is a simple line-by-line read of the file, which is best handled with a for loop.
According to Python's documentation (emphasis mine):
A file object is its own iterator, for example iter(f) returns f (unless f is closed). When a file is used as an iterator, typically in a for loop (for example, for line in f: print line.strip()), the next() method is called repeatedly. This method returns the next input line, or raises StopIteration when EOF is hit when the file is open for reading (behavior is undefined when the file is open for writing). In order to make a for loop the most efficient way of looping over the lines of a file (a very common operation), the next() method uses a hidden read-ahead buffer. As a consequence of using a read-ahead buffer, combining next() with other file methods (like readline()) does not work right. However, using seek() to reposition the file to an absolute position will flush the read-ahead buffer.
The next method reads more than is needed, for efficiency reasons. This breaks readline.
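To make the read-ahead behavior concrete, here is a toy model of it. This is not CPython's actual implementation: the class name ReadAheadFile, the CHUNK size, and the internal attribute names are all hypothetical, and the error message is simply copied from the traceback in the question.

```python
import io

class ReadAheadFile(object):
    """Toy model of a file with a hidden read-ahead buffer (not CPython's real code)."""

    CHUNK = 8192  # hypothetical read-ahead size

    def __init__(self, raw):
        self._raw = raw   # underlying file-like object
        self._buf = ""    # data already read from raw but not yet returned

    def __iter__(self):
        return self

    def next(self):
        # Refill in large chunks until the buffer holds a full line or raw hits EOF.
        while "\n" not in self._buf:
            chunk = self._raw.read(self.CHUNK)
            if not chunk:
                break
            self._buf += chunk
        if not self._buf:
            raise StopIteration
        line, sep, self._buf = self._buf.partition("\n")
        return line + sep

    __next__ = next  # Python 3 spelling of the iterator method

    def readline(self):
        # readline() reads straight from raw, whose position is already past
        # anything sitting in the buffer, so buffered lines would be skipped.
        if self._buf:
            raise ValueError("Mixing iteration and read methods would lose data")
        return self._raw.readline()

    def seek(self, pos):
        # Repositioning throws the read-ahead buffer away.
        self._buf = ""
        self._raw.seek(pos)
```

With f = ReadAheadFile(io.StringIO(u"line 0\nline 1\nline 2\n")), calling f.readline() and then f.next() works, because the buffer is empty when readline() runs. A second f.readline() then raises ValueError, because next() has already pulled 'line 2\n' into the buffer. Calling f.seek(0) clears the buffer and makes readline() usable again.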
So the answers are:
next is faster due to read-ahead.
for s in f: uses next.
Before next is called, readline uses the standard slow read on the file, so there is no problem.
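One way to avoid the error is to stick to a single reading style for a given file object. A minimal sketch, assuming you are willing to give up the read-ahead speedup: the two-argument form of the built-in iter() calls f.readline repeatedly until it returns the empty string. The helper names lines_via_readline and read_header_then_body are hypothetical.

```python
import io

def lines_via_readline(f):
    """Iterate over f using only readline(), never the file's own iterator."""
    # iter(callable, sentinel) stops when readline() returns '' at EOF.
    return iter(f.readline, u"")

def read_header_then_body(f):
    """Read one header line, then the remaining lines, with readline() throughout."""
    header = f.readline()
    body = list(lines_via_readline(f))
    return header, body
```

Because every line comes from readline(), a loop such as for line in lines_via_readline(f): can be freely interleaved with further readline() calls on the same file.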