Pandas adds "\r" to csv file

Question

This boils down to a simpler problem here

I have a pandas dataframe that looks like this:

In [1]: df
Out[1]:
   0        1
0  a  A\nB\nC
1  a  D\nE\nF
2  b  A\nB\nC

When I write it to a csv file then read it back, I expect to have the same dataframe. This is not the case:

In [2]: df.to_csv("out.csv")

In [3]: df = pd.read_csv("out.csv", index_col=0)

In [4]: df
Out[4]:
   0            1
0  a  A\r\nB\r\nC
1  a  D\r\nE\r\nF
2  b  A\r\nB\r\nC

A \r character is added before each \n. Writing and reading it again, the same thing happens:

In [5]: df.to_csv("out.csv")

In [6]: df = pd.read_csv("out.csv", index_col=0)

In [7]: df
Out[7]:
   0                1
0  a  A\r\r\nB\r\r\nC
1  a  D\r\r\nE\r\r\nF
2  b  A\r\r\nB\r\r\nC

How can I stop pandas from adding a \r character?

Edits:
Yes, I am on Windows.

pd.read_csv(pd.compat.StringIO(df.to_csv(index=False))) gives me the same dataframe, so the problem seems to be writing to a file
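
For reference, here is a minimal sketch of that in-memory check. It uses io.StringIO from the standard library because pd.compat.StringIO is not available in recent pandas releases. The frame and its column names ("key", "text") are made up for illustration and are not the original data.

```python
import io

import pandas as pd

# Illustrative frame with embedded newlines (column names are assumptions).
df = pd.DataFrame(
    {"key": ["a", "a", "b"], "text": ["A\nB\nC", "D\nE\nF", "A\nB\nC"]}
)

# Serialise to a string and parse it back without touching the file system.
csv_text = df.to_csv(index=False)
round_tripped = pd.read_csv(io.StringIO(csv_text))

# The multi-line cells should come back unchanged.
print(list(round_tripped["text"]) == list(df["text"]))
```

Because no file is opened in text mode here, no newline translation takes place, which is consistent with the problem only appearing when writing to disk.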

Passing an open file object in binary mode like this:

with open("out.csv", "wb") as file:
    df.to_csv(file)

results in:

TypeError                                 Traceback (most recent call last)
<ipython-input-20-f31d52fb2ce3> in <module>()
      1 with open("out.csv", "wb") as file:
----> 2     df.to_csv(file)
      3

C:\Program Files\Anaconda3\lib\site-packages\pandas\core\frame.py in to_csv(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, compression, quoting, quotechar, line_terminator, chunksize, tupleize_cols, date_format, doublequote, escapechar, decimal, **kwds)
   1342                                      doublequote=doublequote,
   1343                                      escapechar=escapechar, decimal=decimal)
-> 1344         formatter.save()
   1345
   1346         if path_or_buf is None:

C:\Program Files\Anaconda3\lib\site-packages\pandas\formats\format.py in save(self)
   1549
   1550             else:
-> 1551                 self._save()
   1552
   1553         finally:

C:\Program Files\Anaconda3\lib\site-packages\pandas\formats\format.py in _save(self)
   1636     def _save(self):
   1637
-> 1638         self._save_header()
   1639
   1640         nrows = len(self.data_index)

C:\Program Files\Anaconda3\lib\site-packages\pandas\formats\format.py in _save_header(self)
   1632
   1633         # write out the index label line
-> 1634         writer.writerow(encoded_labels)
   1635
   1636     def _save(self):

TypeError: a bytes-like object is required, not 'str'

Using regular write mode does not help:

In [1]: with open("out.csv", "w") as file:
   ...:     df.to_csv(file)
   ...:

In [2]: df = pd.read_csv("out.csv")

In [3]: df
Out[3]:
   Unnamed: 0  0            1
0           0  a  A\r\nB\r\nC
1           1  a  D\r\nE\r\nF
2           2  b  A\r\nB\r\nC

My python version is Python 3.5.2 :: Anaconda 4.2.0 (64-bit)

I have determined that the problem is with pandas.read_csv and not pandas.to_csv

In [1]: df
Out[1]:
   0        1
0  a  A\nB\nC
1  a  D\nE\nF
2  b  A\nB\nC

In [2]: df.to_csv("out.csv")

In [3]: with open("out.csv", "r") as file:
    ...:     s = file.read()
    ...:

In [4]: s  # Only to_csv has been used, no \r's!
Out[4]: ',0,1\n0,a,"A\nB\nC"\n1,a,"D\nE\nF"\n2,b,"A\nB\nC"\n'

In [5]: pd.read_csv("out.csv")  # Now the \r's come in
Out[5]:
   Unnamed: 0  0            1
0           0  a  A\r\nB\r\nC
1           1  a  D\r\nE\r\nF
2           2  b  A\r\nB\r\nC

miginside · Accepted Answer

As some have already said in the comments above and on the post you referenced, this is a typical Windows issue when serializing newlines. The issue has also been reported on the pandas-dev GitHub as #17365.
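
To see why reading the file back with a plain open(..., "r") shows no \r even though the file contains them, here is a small sketch using only the standard library. Passing newline="\r\n" is an assumption made so the example behaves the same on any platform: it mimics what default text mode does on Windows. The file name demo.txt is arbitrary.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# newline="\r\n" mimics default text mode on Windows:
# every "\n" written is translated to "\r\n".
with open(path, "w", newline="\r\n") as f:
    f.write("A\nB\n")

# Binary mode shows the bytes that are really on disk.
with open(path, "rb") as f:
    raw = f.read()

# Default text mode translates "\r\n" back to "\n" on read, hiding the "\r".
with open(path, "r") as f:
    text = f.read()

print(raw)
print(repr(text))
```

So the \r can already be in the file after writing, while a text-mode read hides it. A reader that does not apply the same translation will see the \r.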

Fortunately, on Python 3 you can specify the newline when opening the file:

with open("out.csv", mode='w', newline='\n') as f:
    df.to_csv(f, sep=",", line_terminator='\n', encoding='utf-8')
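
A round-trip check along these lines might look like the sketch below. It is a variation on the snippet above, not the exact same call. It opens the file with newline="" (which, like newline="\n", disables newline translation on write) and leaves out the line-terminator keyword, since that keyword was renamed from line_terminator to lineterminator in newer pandas versions. The frame, its column names, and the temporary path are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Illustrative frame with embedded newlines (column names are assumptions).
df = pd.DataFrame(
    {"key": ["a", "a", "b"], "text": ["A\nB\nC", "D\nE\nF", "A\nB\nC"]}
)
path = os.path.join(tempfile.mkdtemp(), "out.csv")

# newline="" stops Python from translating "\n" when writing.
with open(path, "w", newline="", encoding="utf-8") as f:
    df.to_csv(f, index=False)

# Read the file back and compare the multi-line cells.
back = pd.read_csv(path)
print(list(back["text"]) == list(df["text"]))
```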

Tags:

python

python-3.x

pandas

csv

fon01234
