I have code like this:
f1 = open('file1', 'a')
f2 = open('file1', 'a')
f1.write('Test line 1\n')
f2.write('Test line 2\n')
f1.write('Test line 3\n')
f2.write('Test line 4\n')
When this code is run with the standard Python 2.7 interpreter, the file contains four lines, as expected. However, when I run it under PyPy, the file contains only two lines.
Could someone explain how Python and PyPy differ when working with files in append mode?
UPDATE: The problem no longer occurs in PyPy 2.3.
Append mode adds information to an existing file, placing the pointer at the end. If a file does not exist, append mode creates the file. Note: The key difference between write and append modes is that append does not clear a file's contents.
With r+, the position is initially at the beginning of the file, but reading the whole file moves it to the end, which lets you append. With a+, the position is initially at the end. If you ever need to reread the entire file, you can return to the start by calling f.seek(0).
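As a small illustration of the positions described above, the following sketch records where the file position sits right after opening with r+ and a+. It assumes Python 3 and uses binary mode so that tell() reports plain byte offsets; the scratch file name is made up for the example.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "positions.txt")
with open(path, "wb") as f:
    f.write(b"first line\n")      # 11 bytes of existing content

with open(path, "rb+") as f:
    start_r_plus = f.tell()       # position right after opening with r+
    f.read()                      # reading everything moves the position to the end
    end_r_plus = f.tell()

with open(path, "ab+") as f:
    start_a_plus = f.tell()       # position right after opening with a+
    f.seek(0)                     # rewind to reread from the start
    reread = f.read()

print(start_r_plus, end_r_plus, start_a_plus, reread)
```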
The best practice for writing to, appending to, and reading from text files in Python is using the with keyword. Breakdown: You first start off with the with keyword. Next, you open the text file.
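A minimal sketch of appending with the with keyword, assuming Python 3; the file name is hypothetical. Each with block closes (and therefore flushes) the file on exit, so the second append lands after the first.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")  # hypothetical file name

# "a" creates the file if it is missing and never truncates existing content.
with open(path, "a") as f:
    f.write("first\n")
with open(path, "a") as f:
    f.write("second\n")

# Read the file back to check that both appends were kept, in order.
with open(path) as f:
    lines = f.read().splitlines()

print(lines)
```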
The behavior differs because the two interpreters implement file I/O differently.
CPython implements its file I/O on top of the fopen, fread and fwrite functions from stdio.h and works with FILE * streams.
PyPy, on the other hand, implements its file I/O on top of the POSIX open, write and read functions and works with int file descriptors.
Compare these two programs in C:
#include <stdio.h>
int main() {
    /* Two independent stdio streams on the same file, both in append mode. */
    FILE *a = fopen("file1", "a");
    FILE *b = fopen("file1", "a");
    fwrite("Test line 1\n", 12, 1, a);
    fflush(a);
    fwrite("Test line 2\n", 12, 1, b);
    fflush(b);
    fwrite("Test line 3\n", 12, 1, a);
    fflush(a);
    fwrite("Test line 4\n", 12, 1, b);
    fclose(a);
    fclose(b); /* fclose flushes the last buffered write */
    return 0;
}
and
#include <fcntl.h>
#include <unistd.h>
int main() {
    /* O_CREAT requires a third argument with the permission bits. */
    int a = open("file1", O_CREAT | O_WRONLY | O_APPEND, 0644);
    int b = open("file1", O_CREAT | O_WRONLY | O_APPEND, 0644);
    write(a, "Test line 1\n", 12);
    write(b, "Test line 2\n", 12);
    write(a, "Test line 3\n", 12);
    write(b, "Test line 4\n", 12);
    close(a);
    close(b);
    return 0;
}
More information on the difference between open and fopen can be found in the answers to this question.
UPDATE:
After inspecting the PyPy codebase some more, it seems that for some reason it doesn't use the O_APPEND flag for "a" mode, only O_WRONLY | O_CREAT. That is the real reason why, in PyPy, you need to seek to the end of the file after each write call, as J.F. Sebastian mentioned in another answer. I guess a bug should be filed in the PyPy bug tracker, since the O_APPEND flag is available on both Windows and Unix. So what PyPy currently does looks like this:
#include <fcntl.h>
#include <unistd.h>
int main() {
    /* No O_APPEND: each descriptor keeps its own offset, starting at 0. */
    int a = open("file1", O_CREAT | O_WRONLY, 0644);
    int b = open("file1", O_CREAT | O_WRONLY, 0644);
    write(a, "Test line 1\n", 12);
    write(b, "Test line 2\n", 12);
    write(a, "Test line 3\n", 12);
    write(b, "Test line 4\n", 12);
    close(a);
    close(b);
    return 0;
}
Without the O_APPEND flag, this should reproduce the PyPy behavior.
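The same comparison can be sketched in Python with the low-level os module, which exposes the POSIX calls directly. This is only an illustration, assuming Python 3 on a POSIX system; interleaved_writes is a helper name invented for this example, and each call works on a fresh temporary file.

```python
import os
import tempfile

def interleaved_writes(extra_flags):
    """Write four lines through two descriptors and return the file contents."""
    path = os.path.join(tempfile.mkdtemp(), "file1")
    flags = os.O_CREAT | os.O_WRONLY | extra_flags
    a = os.open(path, flags, 0o644)
    b = os.open(path, flags, 0o644)
    try:
        os.write(a, b"Test line 1\n")
        os.write(b, b"Test line 2\n")
        os.write(a, b"Test line 3\n")
        os.write(b, b"Test line 4\n")
    finally:
        os.close(a)
        os.close(b)
    with open(path, "rb") as f:
        return f.read()

with_append = interleaved_writes(os.O_APPEND)   # every write goes to the end
without_append = interleaved_writes(0)          # the descriptors overwrite each other

print(with_append)
print(without_append)
```

With O_APPEND all four lines survive; without it, both descriptors start at offset 0 and the second one overwrites what the first wrote, leaving only two lines, which matches what the question reports for PyPy.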
On POSIX systems:
O_APPEND
If set, the file offset shall be set to the end of the file prior to each write.
This means that if a file is opened in "append" mode, then whenever its buffer is flushed, the content goes to the end of the file.
Python 2, Python 3 and Jython all respect that on my machine. In your case, the content is smaller than the file buffer, so nothing is written until the buffers are flushed; as a result, the file on disk contains all writes from one file object followed by all writes from the other.
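The effect of buffering can be made visible with a short sketch. It assumes CPython 3 with default buffering and an interpreter that opens "a" mode with O_APPEND; the files are closed explicitly so that the flush order is deterministic.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "file1")

f1 = open(path, "a")   # default buffering: small writes stay in memory
f2 = open(path, "a")
f1.write("Test line 1\n")
f2.write("Test line 2\n")
f1.write("Test line 3\n")
f2.write("Test line 4\n")
f1.close()             # f1's buffer reaches the file first...
f2.close()             # ...then f2's buffer is appended after it

with open(path) as f:
    contents = f.read()

print(contents)
```

All four lines are present, but grouped by file object rather than in the order the write calls were made.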
It is easier to understand if the files are line-buffered:
from __future__ import with_statement
filename = 'file1'
with open(filename, 'wb', 0) as file:
    pass # truncate the file
f1 = open(filename, 'a', 1)
f2 = open(filename, 'a', 1)
f1.write('f1 1\n')
f2.write('f2 aa\n')
f1.write('f1 222\n')
f2.write('f2 bbbb\n')
f1.write('f1 333\n')
f2.write('f2 cc\n')
On an interpreter that honors O_APPEND, the file then contains:
f1 1
f2 aa
f1 222
f2 bbbb
f1 333
f2 cc
The Python documentation does not mandate such behaviour. It just mentions:
... 'a' for appending (which on some Unix systems means that all writes append to the end of the file regardless of the current seek position) (emphasis mine)
PyPy produces the following output in both unbuffered and line-buffered mode:
f2 aaff2 bbbf1f2 cc
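That jumbled line is what you get when each file object keeps its own private offset and overwrites whatever is already at that position. The following pure-Python model (no real files involved; simulate is a name made up for this sketch) replays the six writes under that assumption.

```python
def simulate(writes):
    """Replay (handle, data) writes where every handle keeps a private offset."""
    content = bytearray()
    offsets = {}
    for handle, data in writes:
        pos = offsets.get(handle, 0)            # each handle starts at offset 0
        if pos > len(content):                  # pad a gap with zero bytes
            content.extend(b"\0" * (pos - len(content)))
        content[pos:pos + len(data)] = data     # overwrite in place, grow if needed
        offsets[handle] = pos + len(data)
    return bytes(content)

result = simulate([
    ("f1", b"f1 1\n"), ("f2", b"f2 aa\n"),
    ("f1", b"f1 222\n"), ("f2", b"f2 bbbb\n"),
    ("f1", b"f1 333\n"), ("f2", b"f2 cc\n"),
])

print(result)
```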
Manually moving the file position to the end before each write fixes the PyPy output:
from __future__ import with_statement
import os
filename = 'file1'
with open(filename, 'wb', 0) as file:
    pass # truncate the file
f1 = open(filename, 'a', 1)
f2 = open(filename, 'a', 1)
f1.write('f1 1\n')
f2.seek(0, os.SEEK_END)
f2.write('f2 aa\n')
f1.seek(0, os.SEEK_END)
f1.write('f1 222\n')
f2.seek(0, os.SEEK_END)
f2.write('f2 bbbb\n')
f1.seek(0, os.SEEK_END)
f1.write('f1 333\n')
f2.seek(0, os.SEEK_END)
f2.write('f2 cc\n')
If the file is fully buffered, then add .flush() after each .write().
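For fully buffered files, the seek, write and flush steps can be wrapped in a small helper. This is a sketch assuming Python 3; append_line is a hypothetical name, not part of any library.

```python
import os
import tempfile

def append_line(f, line):
    """Seek to the end, write one line, and flush it to the file immediately."""
    f.seek(0, os.SEEK_END)   # only needed where "a" mode does not use O_APPEND
    f.write(line)
    f.flush()                # push the buffered data out before the next writer runs

path = os.path.join(tempfile.mkdtemp(), "file1")
f1 = open(path, "a")         # default (full) buffering
f2 = open(path, "a")
append_line(f1, "Test line 1\n")
append_line(f2, "Test line 2\n")
append_line(f1, "Test line 3\n")
append_line(f2, "Test line 4\n")
f1.close()
f2.close()

with open(path) as f:
    lines = f.read().splitlines()

print(lines)
```

Because every write is flushed before the other file object writes, the lines end up in the file in call order.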
It is probably not a good idea to write to the same file using more than one file object at once.
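One way to avoid the problem entirely is to open the file once and hand the same file object to every piece of code that needs to write. A minimal sketch, assuming Python 3; the two writer functions are hypothetical stand-ins for the code that previously used f1 and f2.

```python
import os
import tempfile

def first_writer(log, number):
    # Stand-in for the code that previously wrote through f1.
    log.write("Test line %d\n" % number)

def second_writer(log, number):
    # Stand-in for the code that previously wrote through f2.
    log.write("Test line %d\n" % number)

path = os.path.join(tempfile.mkdtemp(), "file1")

# A single file object means a single buffer and a single file position.
with open(path, "a") as shared:
    first_writer(shared, 1)
    second_writer(shared, 2)
    first_writer(shared, 3)
    second_writer(shared, 4)

with open(path) as f:
    lines = f.read().splitlines()

print(lines)
```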