I wanted to create a text file containing a number of "pages" and log the byte offset of each page in a separate file. To do that, I printed strings to the main output file and counted bytes using bytes_written += file.write(str). However, the byte offset was often wrong.
I switched to bytes_written += os.write(fd, bytes(str, 'UTF-8')) and it works now. What is the difference between write() and os.write()? Or is the difference in the return value simply due to my manual conversion of the string to UTF-8?
The os module is part of Python's standard library and provides a portable way of using operating-system-dependent functionality. The os.write() function writes a bytestring to the given file descriptor and returns the number of bytes actually written.
Let us see how to write bytes to a file in Python. First, open a file in binary write mode and then specify the contents to write in the form of bytes. Next, use the write function to write the byte contents to a binary file.
Python file method read() reads at most size bytes from the file. If the read hits EOF before obtaining size bytes, then it reads only available bytes.
What is the difference between write() and os.write()?
It's analogous to the difference between the C functions fwrite(3) and write(2).
The latter is a thin wrapper around an OS-level system call, whereas the former is part of the standard C library, which does some additional buffering, and ultimately calls the latter when it actually needs to write its buffered data to a file descriptor.
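The same layering can be observed from Python. The following is a minimal sketch, assuming CPython's buffered file objects and an explicitly chosen buffer size of 8192 bytes; the file names are hypothetical. Data written through a file object stays in the user-space buffer until it is flushed, while os.write() hands the bytes to the operating system immediately.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    buffered_path = os.path.join(tmp, "buffered.bin")    # hypothetical name
    unbuffered_path = os.path.join(tmp, "direct.bin")    # hypothetical name

    # File object: the 5 bytes sit in the 8192-byte buffer until flush().
    f = open(buffered_path, "wb", buffering=8192)
    f.write(b"hello")
    size_before_flush = os.path.getsize(buffered_path)
    f.flush()
    size_after_flush = os.path.getsize(buffered_path)
    f.close()

    # File descriptor: os.write() issues the write call right away.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_BINARY", 0)
    fd = os.open(unbuffered_path, flags, 0o644)
    os.write(fd, b"hello")
    size_after_os_write = os.path.getsize(unbuffered_path)
    os.close(fd)

print(size_before_flush, size_after_flush, size_after_os_write)
```

The buffering explains when data reaches the file, but not the wrong byte counts in the question; those come from the text-mode return value discussed further down.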
Python 3.x adds some additional logic to a file object's write() method which does automatic character-encoding conversion for Python str objects, whereas Python 2.x does not.
Or is the difference in the return value simply due to my manual conversion of the string to UTF-8?
In Python 3.x, the difference is more related to the way in which you opened the file.
If you opened the file in binary mode, e.g. f = open(filename, 'wb'), then f.write() expects a bytes object, and will return the number of bytes written.
If, instead, you opened the file in text mode, e.g. f = open(filename, 'w'), then f.write() expects a str object, and will return the number of characters written, which for multi-byte encodings such as UTF-8 may not match the number of bytes written.
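To make the mismatch concrete, here is a small sketch that writes the same string in both modes and compares the return values with the size on disk. The sample string and file names are made up for illustration; newline="" is passed in text mode so that newline translation on Windows cannot add further bytes.

```python
import os
import tempfile

text = "héllo wörld €"  # 13 characters, 17 bytes when encoded as UTF-8

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "text.txt")      # hypothetical name
    bin_path = os.path.join(tmp, "binary.txt")     # hypothetical name

    # Text mode: write() takes a str and returns the number of characters.
    with open(text_path, "w", encoding="utf-8", newline="") as f:
        chars_written = f.write(text)

    # Binary mode: write() takes bytes and returns the number of bytes.
    with open(bin_path, "wb") as f:
        bytes_written = f.write(text.encode("utf-8"))

    size_on_disk = os.path.getsize(text_path)

print(chars_written, bytes_written, size_on_disk)  # 13 17 17
```

Both files end up 17 bytes long; only the text-mode return value (13) disagrees, which is why offsets accumulated from it drift whenever a page contains non-ASCII characters.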
Note that the os.write() function always expects a bytes object, regardless of whether or not the O_BINARY flag (which exists only on Windows) was used when calling os.open().
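For the page-offset use case from the question, one possible approach is sketched below. The page contents and the file name are invented for illustration. Because os.write() may write fewer bytes than requested, the sketch loops until each page is fully written, and it adds O_BINARY where the platform defines it so that newlines are not translated.

```python
import os
import tempfile

pages = ["first page\n", "zweite Seite: äöü\n", "third page\n"]  # sample data

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pages.txt")  # hypothetical name
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags, 0o644)

    offsets = []        # byte offset at which each page starts
    bytes_written = 0
    try:
        for page in pages:
            offsets.append(bytes_written)
            data = page.encode("utf-8")
            # os.write() returns how many bytes it actually wrote.
            while data:
                n = os.write(fd, data)
                bytes_written += n
                data = data[n:]
    finally:
        os.close(fd)

    size_on_disk = os.path.getsize(path)

print(offsets, bytes_written, size_on_disk)  # [0, 11, 32] 43 43
```

The same offsets could be obtained with a file object opened in 'wb' mode by adding len(data) for each encoded page, which keeps the benefit of buffering.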