
Python 3: write method vs. os.write number of bytes returned

I wanted to create a text file containing a number of "pages" and log the byte offset of each page in a separate file. To do that, I wrote strings to the main output file and counted bytes using bytes_written += file.write(str). However, the byte offset was often wrong.

I switched to bytes_written += os.write(fd, bytes(str, 'UTF-8')) and it works now. What is the difference between write() and os.write()? Or is the difference in the return value simply due to my manual conversion of the string to UTF-8?

Hinton asked Jun 28 '16 19:06





1 Answer

What is the difference between write() and os.write()?

It's analogous to the difference between the C functions fwrite(3) and write(2).

The latter is a thin wrapper around an OS-level system call, whereas the former is part of the standard C library, which does some additional buffering, and ultimately calls the latter when it actually needs to write its buffered data to a file descriptor.
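The buffering described above can be observed from Python as well. The following is a minimal sketch, assuming CPython's default buffered binary file object and a write much smaller than the buffer; the function name buffered_write_demo and the temporary file name are made up for illustration.

```python
import os
import tempfile


def buffered_write_demo():
    """Return (write() result, size on disk before flush, size on disk after flush)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "buffered.bin")  # hypothetical file name
        with open(path, "wb") as f:
            # write() reports the bytes it accepted into its buffer
            returned = f.write(b"hello")
            # nothing has reached the file yet, because the data is still buffered
            size_before_flush = os.path.getsize(path)
            # flush() hands the buffered data to the OS-level write
            f.flush()
            size_after_flush = os.path.getsize(path)
    return returned, size_before_flush, size_after_flush


print(buffered_write_demo())
```

With a small write like this one, the size on disk is expected to stay at zero until flush() is called, even though write() has already returned the byte count.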

Python 3.x adds further logic to the write() method of a file object opened in text mode: it automatically encodes Python str objects using the file's character encoding, whereas Python 2.x does not.

Or is the difference in the return value simply due to my manual conversion of the string to UTF-8?

In Python 3.x, the difference is more related to the way in which you opened the file.

If you opened the file in binary mode, e.g. f = open(filename, 'wb') then f.write() expects a bytes object, and will return the number of bytes written.

If, instead, you opened the file in text mode, e.g. f = open(filename, 'w') then f.write() expects a str object, and will return the number of characters written, which for multi-byte encodings such as UTF-8 may not match the number of bytes written.
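To make the two cases concrete, here is a small sketch that writes the same string once in text mode and once in binary mode and compares the return values with the size on disk. The function name compare_write_returns and the sample string are illustrative assumptions; newline="" is passed so that newline translation on Windows does not change the byte count.

```python
import os
import tempfile


def compare_write_returns(text):
    """Return (text-mode result, file size, binary-mode result, file size)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")  # hypothetical file name

        # Text mode: write() takes a str and returns the number of characters.
        with open(path, "w", encoding="utf-8", newline="") as f:
            chars = f.write(text)
        size_after_text = os.path.getsize(path)

        # Binary mode: write() takes bytes and returns the number of bytes.
        with open(path, "wb") as f:
            nbytes = f.write(text.encode("utf-8"))
        size_after_binary = os.path.getsize(path)

    return chars, size_after_text, nbytes, size_after_binary


# "ï" and "é" each take two bytes in UTF-8, so 11 characters become 13 bytes.
print(compare_write_returns("naïve café\n"))  # (11, 13, 13, 13)
```

Only the text-mode return value differs from the size on disk, which matches the wrong offsets described in the question.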

Note that the os.write() function always expects a bytes object and returns the number of bytes actually written, regardless of whether or not the O_BINARY flag was used when calling os.open().
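For the page-offset use case in the question, one possible approach is sketched below: encode each page explicitly and add up what os.write() reports. The helper name write_pages is hypothetical, and the loop around os.write() is a precaution on the assumption that a single call may write fewer bytes than requested.

```python
import os
import tempfile


def write_pages(path, pages):
    """Write each page as UTF-8 and return the byte offset where each page starts."""
    offsets = []
    bytes_written = 0
    # O_BINARY only exists on Windows; fall back to 0 elsewhere.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    try:
        for page in pages:
            offsets.append(bytes_written)
            remaining = memoryview(page.encode("utf-8"))
            # Keep writing until the whole page has been accepted.
            while len(remaining) > 0:
                n = os.write(fd, remaining)
                remaining = remaining[n:]
                bytes_written += n
    finally:
        os.close(fd)
    return offsets


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "pages.txt")  # hypothetical file name
    demo_offsets = write_pages(demo_path, ["abc\n", "é\n", "x\n"])
    demo_size = os.path.getsize(demo_path)

print(demo_offsets, demo_size)  # [0, 4, 7] 9
```

Opening the file with open(filename, 'wb') and writing encoded bytes should give the same offsets, since in binary mode f.write() also counts bytes.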

Aya answered Oct 12 '22 09:10
