I am using Python 3.3.0 on 64-bit Windows.
I have a text file as shown below (see the bottom for a download link on MediaFire):
hello
-data1:blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah
-data2:blah blah blah blah blah blah blah blah blah blah blah
-data3: Empty
-data4: Empty
I'm trying to navigate around the file, so I use .tell() to figure out what my position is. However, when reading through the lines of the file as shown below, I get a very strange result:
    f = open("test.txt")
    while True:
        a = f.readline()
        print("{} {}".format(repr(a), f.tell()))
        if a == "":
            break
The result:
'hello\n' 7
'\n' 9
'-data1:blah blah blah blah blah blah blah blah blah blah blah blah blah blah bl
ah blah\n' 18446744073709551714
'\n' 99
'\n' 101
'-data2:blah blah blah blah blah blah blah blah blah blah blah\n' 164
'-data3: Empty\n' 179
'\n' 181
'-data4: Empty' 194
'' 194
What's with the 18446744073709551714 for the 3rd line? Though it looks like an impossible value, f.seek(18446744073709551714) is accepted and apparently does bring me to the end of the 3rd line. I can't figure out why.
EDIT:
Opening in binary mode gives no problems with tell():
    f = open("test.txt", "rb")
    while True:
        a = f.readline()
        print("{} {}".format(repr(a), f.tell()))
        if a == b"":
            break
The result:
b'hello\r\n' 7
b'\r\n' 9
b'-data1:blah blah blah blah blah blah blah blah blah blah blah blah blah blah b
lah blah\r\n' 97
b'\r\n' 99
b'\r\n' 101
b'-data2:blah blah blah blah blah blah blah blah blah blah blah\r\n' 164
b'-data3: Empty\r\n' 179
b'\r\n' 181
b'-data4: Empty' 194
b'' 194
The test.txt text file is downloadable here, just a tiny 194 bytes: http://www.mediafire.com/?1wm4lujb2j48y23
It's a documented behaviour caused by UNIX-style line endings:
file.tell()
Return the file's current position, like stdio's ftell().
Note: On Windows, tell() can return illegal values (after an fgets()) when reading files with Unix-style line-endings. Use binary mode ('rb') to circumvent this problem.
The above documentation is taken from the Python 2.7.4 documentation. The documentation for Python 3 changed a bit, since there is now a hierarchy of classes that handle I/O, and I can't find this note in it. Your test shows that the behaviour didn't change anyway. Also, the source code for Python 3.3 has an "XXX Windows support below is likely incomplete" comment before the function called by tell.
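The binary-mode workaround from the note can be sketched as follows. This is only an illustration, not code from the question: the helper name line_offsets and the temporary sample file are made up here, and the sample content is a shortened stand-in for test.txt. In binary mode tell() reports plain byte offsets, so they can be recorded before each line and decoded separately.

```python
import os
import tempfile


def line_offsets(path):
    """Return (byte_offset, raw_line) pairs, reading the file in binary mode."""
    offsets = []
    with open(path, "rb") as f:
        while True:
            pos = f.tell()        # plain byte offset, because the mode is "rb"
            line = f.readline()
            if not line:          # b"" means end of file
                break
            offsets.append((pos, line))
    return offsets


# Hypothetical sample file with Windows line endings (stand-in for test.txt)
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"hello\r\n\r\n-data3: Empty\r\n")

index = line_offsets(sample_path)
for pos, raw in index:
    # Decode each line yourself once the offsets are known
    print(pos, repr(raw.decode("ascii")))

os.remove(sample_path)
```

The recorded offsets can later be passed to seek() on a file opened with "rb" to jump straight to a given line.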
There is an issue in python bug tracker related to this, and the final comment by Catalin Iacob is:
I tried to reproduce this, picked a file on my disk and indeed I got a negative number, but that file has Unix line endings. This is documented at http://docs.python.org/2/library/stdtypes.html#file.tell so probably there's nothing to do then.
As for Armin's report in msg180145, even though it's not intuitive, this matches ftell's behavior on Windows, as documented in the Remarks section of http://msdn.microsoft.com/en-us/library/0ys3hc0b%28v=vs.100%29.aspx. The tell() method on fileobjects is explicitly documented as matching ftell behavior: "Return the file’s current position, like stdio‘s ftell()". So even though it's not intuitive at all, it's probably better to leave it as is. tell() returns the intuitive non zero position when opening with 'a' on Python3 and on Python 2.7 when using io.open so it's fixed for the future anyway.
So it seems like a "wontfix" bug. Someone should probably open an issue (I commented on the existing one), because this fact is not mentioned at all in the Python 3 documentation.
According to Antoine Pitrou, Python 3 doesn't use ftell() at all, hence this seems to be a different bug. Also, the bug is not reproducible in Python 3.2.3 and was probably introduced when fixing this issue (at least, it's the only change I can find to the implementation of tell() between 3.2.3 and 3.3).
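A quick arithmetic check on the value from the question supports the idea that it is not a plain ftell()-style offset: it does not even fit in an unsigned 64-bit integer. The split below is only an observation about the number itself. What the bits above the low 64 mean is an assumption not confirmed by anything quoted here, and the variable names are made up for illustration.

```python
reported = 18446744073709551714   # value printed by tell() in text mode
binary_offset = 97                # value printed by tell() in binary mode

low_64 = reported & (2**64 - 1)   # the part that fits in 64 bits
high_bits = reported >> 64        # whatever sits above the low 64 bits

print(reported > 2**64 - 1)       # True: too large for a 64-bit file offset
print(low_64, high_bits)          # 98 1
print(low_64 - binary_offset)     # 1
```

So the low part is close to, but not equal to, the byte offset seen in binary mode, and something extra is stored above it.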
Last edit: According to the io module documentation, for files opened in text mode the tell method does not return the number of bytes since the beginning of a file. The returned value is an "opaque number", which means that the only way you can use it is to pass it to seek to get back to that position. Other operations aren't meaningful. The fact that until Python 3.2.3 the value returned was what you'd expect was only an implementation detail.
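Treating the value as an opaque bookmark looks roughly like this. It is a minimal sketch under assumptions: the temporary file and its content are invented for the example, and the only thing done with the result of tell() is handing it back to seek() on the same file object.

```python
import os
import tempfile

# Hypothetical sample file with Windows line endings
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w", newline="") as f:
    f.write("hello\r\n-data1: blah\r\n-data2: blah\r\n")

with open(sample_path) as f:      # text mode, as in the question
    f.readline()                  # skip the first line
    bookmark = f.tell()           # opaque: do not print, compare or add to it
    first_read = f.readline()
    f.seek(bookmark)              # the one supported use of the bookmark
    second_read = f.readline()

print(first_read == second_read)
os.remove(sample_path)
```

If real byte offsets are needed, for arithmetic or for slicing the file, opening it in binary mode remains the safer choice.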
Note that the information in this section of the documentation is simply wrong and, hopefully, it will be fixed in the future.