
Seeking from end of file throwing unsupported exception

I have this code snippet, and I'm trying to seek backwards from the end of a file using Python:

f = open(r'D:\SGStat.txt', 'a')
f.seek(0, 2)
f.seek(-3, 2)

This throws the following exception while running:

f.seek(-3,2)
io.UnsupportedOperation: can't do nonzero end-relative seeks

Am I missing something here?

seriousgeek asked Feb 03 '14 17:02


3 Answers

From the documentation for Python 3.2 and up:

In text files (those opened without a b in the mode string), only seeks relative to the beginning of the file are allowed (the exception being seeking to the very file end with seek(0, 2)).

This is because text files do not have a 1-to-1 correspondence between encoded bytes and the characters they represent, so seek can't tell where to jump to in the file to move by a certain number of characters.
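To make the mismatch concrete, here is a minimal sketch that compares character counts with UTF-8 byte counts. The helper name `char_and_byte_counts` is made up for illustration; it is not part of any library.

```python
def char_and_byte_counts(text, encoding="utf-8"):
    """Return (number of characters, number of encoded bytes) for text."""
    return len(text), len(text.encode(encoding))


# ASCII text is one byte per character; other text generally is not.
for sample in ["abc", "naïve", "日本語"]:
    chars, size = char_and_byte_counts(sample)
    print(f"{sample!r}: {chars} characters, {size} bytes")
```

Because the two counts diverge as soon as non-ASCII text appears, an offset of "3 from the end" means different things depending on whether you count characters or bytes.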

If your program is okay with working in terms of raw bytes, you can change your program to read:

f = open(r'D:\SGStat.txt', 'ab')
f.seek(-3, 2)

Note the b in the mode string, for a binary file. (Also note the removal of the redundant f.seek(0, 2) call.)

However, you should be aware that adding the b flag when you are reading or writing text can have unintended consequences (with multibyte encoding for example), and in fact changes the type of data read or written.
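As a rough illustration of that caveat, the sketch below reads the last few bytes of a file in binary mode. The helper `tail_bytes` and the sample file are hypothetical, and the file is assumed to be UTF-8 encoded.

```python
import os
import tempfile


def tail_bytes(path, n):
    """Return the last n bytes of a file (the whole file if it is shorter)."""
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)
        # Clamp the offset so we never seek before the start of the file.
        f.seek(-min(n, size), os.SEEK_END)
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("naïve")  # the "ï" takes two bytes in UTF-8

    data = tail_bytes(path, 3)
    print(type(data), data)  # bytes, not str
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        print("the last 3 bytes start in the middle of a character")
```

The end-relative seek is allowed here, but the result is a `bytes` object, and a byte offset can land inside a multibyte character.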

jonrsharpe answered Sep 29 '22 03:09


The existing answers do answer the question, but provide no solution.

From readthedocs:

If the file is opened in text mode (without b), only offsets returned by tell() are legal. Use of other offsets causes undefined behavior.

This is supported by the documentation, which says that:

In text files (those opened without a b in the mode string), only seeks relative to the beginning of the file [os.SEEK_SET] are allowed...

This means if you have this code from old Python:

f.seek(-1, 1)   # seek -1 from current position 

it would look like this in Python 3:

f.seek(f.tell() - 1, os.SEEK_SET)   # os.SEEK_SET == 0 
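For comparison, here is a sketch of the one pattern the quoted documentation does allow in text mode: seeking back to an offset that tell() itself returned earlier. The function name `read_last_line` is made up for this example, and the file is assumed to be UTF-8.

```python
import os
import tempfile


def read_last_line(path):
    """Return the last line of a text file using only tell() offsets."""
    with open(path, "r", encoding="utf-8") as f:
        last_start = f.tell()
        while True:
            start = f.tell()  # remember where this line begins
            line = f.readline()
            if not line:
                break
            last_start = start
        # Legal in text mode: the offset came from tell().
        f.seek(last_start)
        return f.readline()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lines.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("one\ntwo\nthrée\n")
    print(repr(read_last_line(path)))
```

This scans the whole file, so it trades speed for staying within documented behavior. Note that it uses readline() rather than iterating over the file, since tell() is disabled during iteration.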

Solution

Putting this information together, we can achieve the goal of the OP:

import os

f.seek(0, os.SEEK_END)              # seek to end of file; f.seek(0, 2) is legal
f.seek(f.tell() - 3, os.SEEK_SET)   # go backwards 3 bytes
Eric Lindsey answered Sep 29 '22 03:09


Eric Lindsey's answer does not work because UTF-8 files can have more than one byte per character. Worse, for those of us who speak English as a first language and work with English-only files, it might work just long enough to get out into production code and really break things.
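A small sketch of that failure mode, assuming CPython and a UTF-8 file. Seeking to an offset that tell() did not return is not documented behavior, so the exact outcome may differ elsewhere. The helper `seek_back_one_then_read` is hypothetical.

```python
import os
import tempfile


def seek_back_one_then_read(path):
    """Apply the tell() - 1 approach in text mode.

    Returns the text that was read, or None if decoding failed.
    """
    with open(path, "r", encoding="utf-8") as f:
        f.seek(0, os.SEEK_END)
        f.seek(f.tell() - 1, os.SEEK_SET)
        try:
            return f.read()
        except UnicodeDecodeError:
            return None


with tempfile.TemporaryDirectory() as tmp:
    ascii_path = os.path.join(tmp, "ascii.txt")
    euro_path = os.path.join(tmp, "euro.txt")
    with open(ascii_path, "w", encoding="utf-8") as f:
        f.write("abc")
    with open(euro_path, "w", encoding="utf-8") as f:
        f.write("abc€")  # the euro sign is three bytes in UTF-8

    print(seek_back_one_then_read(ascii_path))  # works on ASCII-only text
    print(seek_back_one_then_read(euro_path))   # lands inside the euro sign
```

The ASCII-only file behaves as hoped, which is exactly why the problem can go unnoticed until non-ASCII data shows up.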


The following answer is based on undefined behavior

... but it does work for now for UTF-8 in Python 3.7.

You can seek backwards through a file in text mode as long as you correctly handle the UnicodeDecodeError caused by seeking to a byte that is not the start of a UTF-8 character. Since we are seeking backwards, we can simply step back one more byte at a time until we find the start of a character.

The result of f.tell() is still the byte position in the file for UTF-8 files, at least for now. So an f.seek() to an invalid offset will raise a UnicodeDecodeError when you subsequently f.read(), and this can be corrected by calling f.seek() again with a different offset. At least this works for now.

E.g., seeking to the beginning of a line (just after the \n):

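import os

# Start one position before the current one, clamped to the start of the file.
pos = max(f.tell() - 1, 0)
f.seek(pos, os.SEEK_SET)
while pos > 0:
    try:
        character = f.read(1)
        if character == '\n':
            # The read consumed the newline, so we are now at the start of the line.
            break
    except UnicodeDecodeError:
        # Landed in the middle of a multi-byte character; keep stepping back.
        pass
    pos -= 1
    f.seek(pos, os.SEEK_SET)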
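If relying on undefined behavior is a concern, the same "step back to the start of a character" idea can be sketched in binary mode, where arbitrary seeks are permitted. This is only an illustration, not part of the approach above: it assumes the file is UTF-8, the helper name `last_chars_utf8` is made up, and newlines are not translated because the file is opened in binary mode.

```python
import os
import tempfile


def last_chars_utf8(path, n_chars):
    """Return the last n_chars characters of a UTF-8 file via binary seeks."""
    with open(path, "rb") as f:
        pos = f.seek(0, os.SEEK_END)
        found = 0
        while pos > 0 and found < n_chars:
            pos -= 1
            f.seek(pos, os.SEEK_SET)
            byte = f.read(1)[0]
            # UTF-8 continuation bytes look like 0b10xxxxxx;
            # any other byte starts a new character.
            if (byte & 0xC0) != 0x80:
                found += 1
        f.seek(pos, os.SEEK_SET)
        return f.read().decode("utf-8")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("abc€é")
    print(last_chars_utf8(path, 2))
```

Reading one byte at a time is slow for large offsets, but the buffered file object keeps the cost tolerable for short tails.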
Philip Couling answered Sep 29 '22 01:09