I've come across this strange behavior regarding opening a file in append mode and then attempting to seek to the start of the file.
The code should be self-explanatory: in the second open, I expect to be able to write a string to the beginning of the file and then have f.tell() return 5 (the number of bytes written at the beginning of the file).
The thing is that in Python 2.6.6 and 2.7.6 the final assert fires but, surprisingly, it works in Python 3.3.2.
# Create new file, write after the 100th byte.
f = open("test", "w+b")
f.seek(100, 0)
f.write(b"end")
assert f.tell() == 100 + len("end")
f.close()
# Open file for writing without truncating it, then attempt to write at the
# start of the file. On Python 2 the write does not land there.
f = open("test", "a+b")
f.seek(0, 0)
f.write(b"start")
assert f.tell() == len("start")
f.close()
So I made a C program that does the equivalent. It actually behaves like the Python 2.x versions:
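#include <stdio.h>

int main(void) {
    /* Create the file and write one byte at offset 100. */
    FILE *f = fopen("tt", "w+b");
    fseek(f, 100, SEEK_SET);
    fwrite("x", 1, 1, f);
    fclose(f);

    /* Reopen in append mode, seek to the start, write one byte. */
    f = fopen("tt", "a+b");
    fseek(f, 0, SEEK_SET);
    fwrite("y", 1, 1, f);
    printf("%ld\n", ftell(f));
    fclose(f);
    return 0;
}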
This prints 102, and I consider this canonical (I've looked at the strace -eopen,close,lseek output as well, but am none the wiser).
So my question is: What kind of embarrassingly basic knowledge do I not have?
And why does Python 3 behave differently?
By the way, I'm on Linux.
The behaviour is one enforced by your OS.
The Python 2 open() call actually opens files the same way your C code does, with a little more wrapping. Both ask the OS to open the file in a given mode. In append mode your OS then explicitly limits where writes can go: every write lands at the end of the file, so you cannot overwrite data that was there before you opened the file.
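In Python 3, I/O has been overhauled: open() now returns a buffered object from the new io module that tracks its own position, which is why f.tell() reports 5 after the write there. You get the same object in Python 2 by using the io.open() function. As far as I can tell, the file is still opened in append mode underneath, so this changes the position that is reported, not where the data ends up.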
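It is worth checking where the bytes actually end up, and not only what tell() says. The following is a minimal sketch for Python 3 that repeats the steps from the question and then inspects the file; the helper name probe_append_mode is made up for illustration, and the result I describe assumes a system with the append behaviour discussed here (such as Linux).

```python
import os
import tempfile


def probe_append_mode(path):
    """Return (position reported by tell(), final file size, last 5 bytes)."""
    # Same setup as in the question: a 100-byte gap followed by b"end".
    with open(path, "w+b") as f:
        f.seek(100, 0)
        f.write(b"end")

    # Reopen in append mode, seek to the start, and write.
    with open(path, "a+b") as f:
        f.seek(0, 0)
        f.write(b"start")
        reported = f.tell()  # position as tracked by the buffered io object

    # Look at the file itself to see where b"start" was written.
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(-5, os.SEEK_END)
        tail = f.read()
    return reported, size, tail


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(probe_append_mode(os.path.join(tmp, "test")))
```

On Linux I would expect tell() to report 5 while the file grows to 108 bytes with b"start" at the very end, i.e. the write was still appended.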
The open() function documentation certainly warns you about OS behaviour where seeking has no effect on writes when using 'a':
[...] 'a' for appending (which on some Unix systems means that all writes append to the end of the file regardless of the current seek position).
On Linux the open(2) man page is explicit about just that behaviour:
O_APPEND
The file is opened in append mode. Before each write(2), the file offset is positioned at the end of the file, as if with lseek(2).
I.e. when trying to write, the file position is moved to the end of the file; you are only ever allowed to append.
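The O_APPEND rule can be seen without any buffering layer by going through the os module directly. This is only a sketch under the assumption of a POSIX system such as Linux; write_after_seek is a hypothetical helper that seeks to offset 0, writes one byte, and returns the file contents.

```python
import os
import tempfile


def write_after_seek(path, extra_flags):
    """Seek to offset 0, write b"y", and return the resulting file contents."""
    # Start from a known three-byte file.
    with open(path, "wb") as f:
        f.write(b"xxx")

    fd = os.open(path, os.O_WRONLY | extra_flags)
    try:
        os.lseek(fd, 0, os.SEEK_SET)  # move the offset to the start
        os.write(fd, b"y")            # with O_APPEND the OS moves it to the end first
    finally:
        os.close(fd)

    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tt")
        print(write_after_seek(path, 0))            # plain write-only
        print(write_after_seek(path, os.O_APPEND))  # append mode
```

Without O_APPEND the byte overwrites the start of the file; with it, the seek is effectively ignored for the write and the byte is added at the end.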
I'm looking to find you a list of OSes that behave in that way; it looks like Microsoft Windows does something similar. The .NET FileMode enumeration documentation states:
Append
[...] Trying to seek to a position before the end of the file throws an IOException exception, and any attempt to read fails and throws a NotSupportedException exception.
In any case, if you want to both write to and read from a file (not just append), but not truncate the file when opening, use the 'r+' mode. This opens the file for both reading and writing. You can append by seeking to the end, but at the same time you are free to seek to any point in the file.
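As a rough sketch of that approach, here is the example from the question reworked to use 'r+b' for the second open. The function name edit_in_place is made up for illustration; the rest uses only the built-in open().

```python
import os
import tempfile


def edit_in_place(path):
    """Return (tell() after writing at the start, first 5 bytes, final size)."""
    # Same setup as in the question: a 100-byte gap followed by b"end".
    with open(path, "w+b") as f:
        f.seek(100, 0)
        f.write(b"end")

    # 'r+b' opens for reading and writing without truncating the file.
    with open(path, "r+b") as f:
        f.seek(0, 0)
        f.write(b"start")          # overwrites the first 5 bytes
        pos = f.tell()
        f.seek(0, os.SEEK_END)     # appending is just a seek to the end
        f.write(b"tail")

    with open(path, "rb") as f:
        head = f.read(5)
    return pos, head, os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(edit_in_place(os.path.join(tmp, "test")))
```

Here the write at the start really does land at the start, and the file only grows by the four bytes that were written after seeking to the end. Note that 'r+' requires the file to exist already.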