How to remove carriage return in a dataframe

Question

I have a dataframe with columns named id, country_name, location and total_deaths. During data cleaning, I came across a value in one row that has '\r' attached. Once cleaning is complete, I store the resulting dataframe in a destination.csv file. Because that row has \r attached, it always creates a new row.

id                               29
location            Uttar Pradesh
country_name                  India
total_deaths                     20

I want to remove \r. I tried df.replace({'\r': ''}, regex=True). It isn't working for me.

Is there any other solution? Can somebody help?

Edit:

In the above process, I am iterating over df to see whether \r is present. If it is, I need to replace it. Here row.replace() or row.str.strip() doesn't seem to work, or I could be doing it the wrong way.

I don't want to specify the column name or row number when using replace(), because I can't be certain that only the 'location' column will contain \r. Please find the code below.

import re

count = 0
for row_index, row in df.iterrows():
    if re.search(r"\r", str(row)):
        print(type(row))              # Return type is pandas.Series
        row.replace({r'\r': ''}, regex=True)
        print(row)
        count += 1

jezrael · Accepted Answer

Another solution is to use str.strip:

df['29'] = df['29'].str.strip('\r')
print(df)
             id             29
0      location  Uttar Pradesh
1  country_name          India
2  total_deaths             20
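
One caveat worth checking with str.strip: it takes a set of characters to remove, not a regex. The sketch below uses made-up sample values (not the asker's data) to compare the non-raw '\r' with the raw r'\r', assuming the column holds a real carriage-return character.

```python
import pandas as pd

# Hypothetical sample data: the first value ends in a real carriage return.
s = pd.Series(['Uttar Pradesh\r', 'Bihar'])

# '\r' (non-raw) is the carriage-return character itself, so only it is stripped.
stripped = s.str.strip('\r')

# r'\r' is the two characters backslash and 'r'. strip treats them as a set,
# so the carriage return stays and the trailing 'r' of 'Bihar' is removed.
raw_stripped = s.str.strip(r'\r')

print(stripped.tolist())
print(raw_stripped.tolist())
```

The raw form only matters where a regex engine interprets the pattern, as replace with regex=True does.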

If you want to use replace, add the r prefix and one \:

print(df.replace({r'\r': ''}, regex=True))
             id             29
0      location  Uttar Pradesh
1  country_name          India
2  total_deaths             20

In replace you can also restrict the replacement to a specific column:

print(df)
               id               29
0        location  Uttar Pradesh
1    country_name            India
2  total_deaths
               20

print(df.replace({'29': {r'\r': ''}}, regex=True))
               id             29
0        location  Uttar Pradesh
1    country_name          India
2  total_deaths
             20

print(df.replace({r'\r': ''}, regex=True))
             id             29
0      location  Uttar Pradesh
1  country_name          India
2  total_deaths             20
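
A possible reason the attempt in the question appeared not to work is that replace returns a new DataFrame rather than modifying df in place. The following is a minimal sketch on a hypothetical frame (the values are invented for illustration) showing the difference between calling replace and assigning its result.

```python
import pandas as pd

# Hypothetical frame: any text column may contain a carriage return.
df = pd.DataFrame({
    'location': ['Uttar Pradesh\r', 'Orissa'],
    'country_name': ['India', 'India\r'],
    'total_deaths': [20, 69],
})

# replace returns a new DataFrame; without assignment df is unchanged.
df.replace({r'\r': ''}, regex=True)
still_dirty = df['location'].str.contains('\r', regex=False).any()

# Assign the result back to keep the cleaned values.
df = df.replace({r'\r': ''}, regex=True)
now_clean = not df['location'].str.contains('\r', regex=False).any()

print(still_dirty, now_clean)
```

No column name is needed here: the regex replacement is applied to every text column and leaves numeric columns as they are.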

EDIT by comment:

import pandas as pd

df = pd.read_csv('data_source_test.csv')
print(df)
   id country_name           location  total_deaths
0   1        India          New Delhi           354
1   2        India         Tamil Nadu            48
2   3        India          Karnataka             0
3   4        India      Andra Pradesh            32
4   5        India              Assam           679
5   6        India             Kerala           128
6   7        India             Punjab             0
7   8        India      Mumbai, Thane             1
8   9        India  Uttar Pradesh
            20
9  10        India             Orissa            69

print(df.replace({r'\r': ''}, regex=True))
   id country_name       location  total_deaths
0   1        India      New Delhi           354
1   2        India     Tamil Nadu            48
2   3        India      Karnataka             0
3   4        India  Andra Pradesh            32
4   5        India          Assam           679
5   6        India         Kerala           128
6   7        India         Punjab             0
7   8        India  Mumbai, Thane             1
8   9        India  Uttar Pradesh            20
9  10        India         Orissa            69
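
The loop in the question also counts the affected rows. If that count is still wanted, it can be computed without iterrows. This is only a sketch on a hypothetical two-row frame; has_cr and count are illustrative names, not part of the original code.

```python
import pandas as pd

# Hypothetical sample: one row carries a carriage return in 'location'.
df = pd.DataFrame({
    'id': [9, 10],
    'location': ['Uttar Pradesh\r', 'Orissa'],
    'total_deaths': [20, 69],
})

# Cast every column to str, test each cell for a literal carriage return,
# then flag rows where any column matched.
has_cr = df.astype(str).apply(
    lambda col: col.str.contains('\r', regex=False)
).any(axis=1)

count = int(has_cr.sum())  # number of rows containing a carriage return
print(count)
```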

If you need to replace only in the location column:

df['location'] = df.location.str.replace(r'\r', '', regex=True)
print(df)
   id country_name       location  total_deaths
0   1        India      New Delhi           354
1   2        India     Tamil Nadu            48
2   3        India      Karnataka             0
3   4        India  Andra Pradesh            32
4   5        India          Assam           679
5   6        India         Kerala           128
6   7        India         Punjab             0
7   8        India  Mumbai, Thane             1
8   9        India  Uttar Pradesh            20
9  10        India         Orissa            69
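
To check the effect on the CSV output mentioned in the question, one option is to compare the number of lines before and after cleaning. The sketch below uses invented sample data and relies on to_csv returning the CSV text as a string when no path is given.

```python
import pandas as pd

# Hypothetical sample: the first location ends in a carriage return.
df = pd.DataFrame({
    'location': ['Uttar Pradesh\r', 'Orissa'],
    'total_deaths': [20, 69],
})

# With no path argument, to_csv returns the CSV text instead of writing a file.
dirty_csv = df.to_csv(index=False)
clean_csv = df.replace({r'\r': ''}, regex=True).to_csv(index=False)

# splitlines treats the stray carriage return as a line break, which is
# how the extra "row" shows up in the uncleaned output.
dirty_lines = dirty_csv.splitlines()
clean_lines = clean_csv.splitlines()

print(len(dirty_lines), len(clean_lines))
```

The cleaned text has one header line plus one line per row, while the uncleaned text has one line more.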

EdChum · Answer

Use str.replace; you need to escape the sequence so that it is treated as a carriage return rather than the literal \r:

In [15]:
df['29'] = df['29'].str.replace(r'\r', '', regex=True)
df

Out[15]:
             id             29
0      location  Uttar Pradesh
1  country_name          India
2  total_deaths             20
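
Because the default of the regex parameter of str.replace has changed between pandas versions, it may be safer to pass it explicitly. The sketch below, on a hypothetical Series, shows two equivalent ways to remove a real carriage return.

```python
import pandas as pd

# Hypothetical values mirroring the column shown above.
s = pd.Series(['Uttar Pradesh\r', 'India', '20'])

# r'\r' is handed to the regex engine, which reads it as a carriage return.
via_regex = s.str.replace(r'\r', '', regex=True)

# '\r' is the carriage-return character itself, matched literally.
via_literal = s.str.replace('\r', '', regex=False)

print(via_regex.tolist())
print(via_literal.tolist())
```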

Tags: python, replace, pandas, carriage-return, data-cleaning

Saranya Krishnamurthy
