
When to open file in binary mode (b)?

Tags: python, file-io

I noticed that the docs always open a CSV file with ‘wb’. Why the ‘b’? I know b stands for binary mode, but when should you use binary mode? (I’d guess a CSV file is not binary.) If relevant, I’m writing the results of an arcpy.da.SearchCursor() query to the CSV.

EDIT: I just noticed that, according to this answer, wb+ is used for writing a binary file. What does including the + do?

Celeritas asked Jul 17 '15 19:07




2 Answers

Use 'b' mode to read or write binary data as is, without any transformations such as converting newlines to or from platform-specific values, or decoding and encoding text with a character encoding.
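To make the difference concrete, here is a minimal Python 3 sketch (the question predates it, so treat the version as an assumption). The helper name `binary_vs_text` and the sample bytes are made up for illustration. It writes some bytes, then reads them back once in binary mode and once in default text mode:

```python
import os
import tempfile


def binary_vs_text(raw: bytes):
    """Write raw bytes, then read them back in binary mode and in text mode."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # 'wb': bytes go to disk exactly as given.
        with open(path, "wb") as f:
            f.write(raw)
        # 'rb': bytes come back exactly as stored.
        with open(path, "rb") as f:
            as_bytes = f.read()
        # 'r': bytes are decoded to str and, with the default newline=None,
        # '\r\n' and '\r' are translated to '\n' on reading.
        with open(path, "r", encoding="ascii") as f:
            as_text = f.read()
    finally:
        os.remove(path)
    return as_bytes, as_text


as_bytes, as_text = binary_vs_text(b"one\r\ntwo\n")
print(as_bytes)  # b'one\r\ntwo\n'
print(as_text == "one\ntwo\n")  # True
```

In binary mode the '\r\n' survives untouched; in text mode the same file comes back as a str with the line ending normalized.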

The csv module is special. CSV data is text, so you would expect to use text mode, but the csv module terminates rows with '\r\n' by default on all platforms and always recognizes both '\r' and '\n' as newlines. If you write through a file opened in text mode with newline translation enabled, each '\n' is expanded again on Windows (where os.linesep == '\r\n') and you end up with '\r\r\n', a corrupted line ending. That is why the Python 2 docs say that you must use binary mode. In Python 3, text mode is used, but you should pass newline='' to disable universal newlines mode. You also want to disable universal newlines if you need to preserve newline characters (such as '\r') embedded in fields.
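A small Python 3 sketch of the effect described above. The helper `write_rows` and the sample rows are hypothetical, and passing newline='\r\n' is only a way to imitate the Windows text-mode translation on any platform, not something you would do in real code:

```python
import csv
import os
import tempfile


def write_rows(rows, newline):
    """Write rows with csv.writer and return the raw bytes that reach the disk."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", newline=newline, encoding="utf-8") as f:
            csv.writer(f).writerows(rows)
        # Read back in binary mode so no newline translation hides the result.
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


rows = [["id", "name"], ["1", "Ada"]]

# Recommended in Python 3: newline='' leaves the csv module's '\r\n' alone.
good = write_rows(rows, newline="")
print(good)  # b'id,name\r\n1,Ada\r\n'

# Imitating Windows text mode: each '\n' is expanded to '\r\n' again.
bad = write_rows(rows, newline="\r\n")
print(bad)  # b'id,name\r\r\n1,Ada\r\r\n'
```

With newline='' the row terminator written by the csv module reaches the file unchanged, which is the same result Python 2 gets from 'wb'.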

jfs answered Oct 24 '22 22:10


By default, open() uses text mode, which may convert '\n' characters to a platform-specific representation on writing and back on reading.

On Windows, this changes line breaks from '\n' to '\r\n' on writing, which can cause problems when the CSV file is opened in other applications or on other platforms.

Thus, when the data must be written exactly as given, you should append 'b' to the mode value to open the file in binary mode, which improves portability. In Python 2, on systems that don’t have this distinction, adding the 'b' has no effect.

Note: 'w+' truncates the file.

Modes 'r+', 'w+' and 'a+' open the file for updating (reading and writing).

As detailed here: https://docs.python.org/2/library/functions.html#open
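As a rough illustration of what the + adds, the Python 3 sketch below runs the three update modes against a temporary file. The function name `demo_update_modes` and the sample contents are invented for this example:

```python
import os
import tempfile


def demo_update_modes():
    """Show how 'r+', 'a+' and 'w+' treat a file's existing contents."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"0123456789")

        # 'r+b': read and write; existing content is kept and the
        # position starts at the beginning, so writes overwrite in place.
        with open(path, "r+b") as f:
            f.write(b"AB")
            f.seek(0)
            after_r_plus = f.read()

        # 'a+b': read and write; existing content is kept and writes
        # are appended at the end.
        with open(path, "a+b") as f:
            f.write(b"!")
            f.seek(0)
            after_a_plus = f.read()

        # 'w+b': read and write, but the file is truncated on open.
        with open(path, "w+b") as f:
            before_write = f.read()
            f.write(b"new")
            f.seek(0)
            after_w_plus = f.read()
    finally:
        os.remove(path)
    return after_r_plus, after_a_plus, before_write, after_w_plus


print(demo_update_modes())
# (b'AB23456789', b'AB23456789!', b'', b'new')
```

So 'wb+' in the question opens the file in binary mode for both writing and reading, after first emptying it.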

CodePick answered Oct 24 '22 23:10