Python Write Replaces "\n" With "\r\n" in Windows

After looking into my question here, I found that it was caused by a simpler problem.

When I write "\n" to a file, I expect to read in "\n" from the file. This is not always the case in Windows.

In [1]: with open("out", "w") as file:
   ...:     file.write("\n")
   ...:

In [2]: with open("out", "r") as file:
   ...:     s = file.read()
   ...:

In [3]: s  # I expect "\n" and I get it
Out[3]: '\n'

In [4]: with open("out", "rb") as file:
   ...:     b = file.read()
   ...:

In [5]: b  # I expect b"\n"... Uh-oh
Out[5]: b'\r\n'

In [6]: with open("out", "wb") as file:
   ...:     file.write(b"\n")
   ...:

In [7]: with open("out", "r") as file:
   ...:     s = file.read()
   ...:

In [8]: s  # I expect "\n" and I get it
Out[8]: '\n'

In [9]: with open("out", "rb") as file:
   ...:     b = file.read()
   ...:

In [10]: b  # I expect b"\n" and I get it
Out[10]: b'\n'

In a more organized way:

| Method of Writing | Method of Reading | "\n" Turns Into |
|-------------------|-------------------|-----------------|
| "w"               | "r"               | "\n"            |
| "w"               | "rb"              | b"\r\n"         |
| "wb"              | "r"               | "\n"            |
| "wb"              | "rb"              | b"\n"           |

When I try this on my Linux virtual machine, every combination returns "\n". How can I get the same behavior on Windows?

Edit: This is especially problematic with the pandas library, which appears to write DataFrames to csv with "w" and read csvs with "rb". See the question linked at the top for an example of this.

fon01234 asked Nov 20 '17 03:11



1 Answer

Since you are using Python 3, you're in luck. When you open the file for writing, just specify newline='\n' to ensure that it writes '\n' instead of the system default, which is \r\n on Windows. From the docs:

When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
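As a minimal sketch of the behavior the docs describe, the snippet below writes the same text with two explicit newline settings and then inspects the raw bytes. The helper names write_text and read_bytes and the temporary file path are illustrative assumptions, not part of the question. Passing newline='\r\n' reproduces the Windows default on any platform, so the difference is visible even on Linux.

```python
import os
import tempfile


def write_text(path, text, newline=None):
    """Write text in text mode using the given newline setting (hypothetical helper)."""
    with open(path, "w", newline=newline) as f:
        f.write(text)


def read_bytes(path):
    """Return the exact bytes stored in the file, with no translation."""
    with open(path, "rb") as f:
        return f.read()


# A throwaway file in a temporary directory (assumed location for this sketch).
path = os.path.join(tempfile.mkdtemp(), "out")

# newline="\n": no translation, so "\n" is stored as a single LF byte.
write_text(path, "a\nb\n", newline="\n")
lf_bytes = read_bytes(path)

# newline="\r\n": every "\n" written is translated to CRLF,
# which is what the default does on Windows.
write_text(path, "a\nb\n", newline="\r\n")
crlf_bytes = read_bytes(path)

print(lf_bytes)
print(crlf_bytes)
```

With newline='\n' the file holds only LF bytes regardless of the operating system, which is the behavior the question asks for.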

The reason you think you are only "sometimes" seeing the two-character output is that binary mode performs no conversion at all, so it is the only mode that shows you what is really stored. bytes objects are simply displayed as ASCII wherever possible for your convenience. Don't think of them as real strings until they have been decoded. The binary output you show is the true contents of the file in all your examples.
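To make the point about bytes concrete, here is a small sketch that looks at the two bytes from Out[5] directly. The variable names are illustrative. It shows that the ASCII-looking display is only a representation of two integer byte values, and that decoding by itself does not translate newlines; that translation happens only in the text-mode file layer.

```python
# The bytes that a default text-mode write of "\n" leaves on disk on Windows.
raw = b"\r\n"

# A bytes object is a sequence of integers; the repr merely shows them as ASCII.
byte_values = list(raw)

# Decoding turns bytes into str but performs no newline translation.
decoded = raw.decode("ascii")

print(byte_values)
print(repr(decoded))
```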

When you open the file for reading in the default text mode, the newline parameter works similarly to how it does for writing. By default, every \r\n (and every lone \r) in the file is converted to \n after the characters are decoded. This is very convenient when your code travels between OSes but your files do not, because the exact same code can rely only on \n. If your files travel too, you should stick to the relatively portable newline='\n', at least for output.
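The read side can be sketched the same way. The snippet below writes CRLF line endings in binary mode, then reads the file back once with the default setting and once with newline='', which turns translation off. The file name and variable names are assumptions made for the example.

```python
import os
import tempfile

# A throwaway file containing Windows-style line endings, written byte for byte.
path = os.path.join(tempfile.mkdtemp(), "crlf.txt")
with open(path, "wb") as f:
    f.write(b"one\r\ntwo\r\n")

# Default text mode (newline=None): "\r\n" is translated to "\n" on read.
with open(path, "r") as f:
    translated = f.read()

# newline="": line endings are returned to the caller untranslated.
with open(path, "r", newline="") as f:
    untranslated = f.read()

print(repr(translated))
print(repr(untranslated))
```

This is why the "r" rows in the question's table always show "\n": the translation on read hides whatever line ending is actually in the file.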

Mad Physicist answered Oct 03 '22 15:10