
Remove all quotes within values in Pandas

I want to remove all double quotes within all columns and all values in a dataframe. So if I have a value such as

potatoes are "great"

I want to return

potatoes are great

DataFrame.replace() lets me do this if I know the entire value I'm changing, but is there a way to remove individual characters?

Satvik Beri, asked Jan 31 '14 22:01



2 Answers

You can do this on each Series/column using str.replace:

In [11]: s = pd.Series(['potatoes are "great"', 'they are'])

In [12]: s
Out[12]: 
0    potatoes are "great"
1                they are
dtype: object

In [13]: s.str.replace('"', '')
Out[13]: 
0    potatoes are great
1              they are
dtype: object

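I would be wary of doing this across the entire DataFrame, because the .str accessor only works on string data: on a numeric column it raises an AttributeError, and non-string values in an object column come back as NaN. However, if the columns hold strings, you could iterate over each column: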

for i, col in enumerate(df.columns):
    df.iloc[:, i] = df.iloc[:, i].str.replace('"', '')
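To make the loop above safe on a frame that also has numeric columns, one option is to skip every column that does not hold text. This is only a sketch: the helper name strip_quotes_in_string_columns and the sample columns are made up for illustration, and it assumes that any object-dtype column contains nothing but strings.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def strip_quotes_in_string_columns(df):
    """Return a copy of df with double quotes removed from text columns only."""
    out = df.copy()
    for col in out.columns:
        # Only touch columns that hold text; numeric columns are left alone.
        # Assumption: object-dtype columns contain only strings.
        if is_object_dtype(out[col]) or is_string_dtype(out[col]):
            # regex=False treats '"' as a literal character, not a pattern.
            out[col] = out[col].str.replace('"', '', regex=False)
    return out


# Hypothetical sample data: one text column, one integer column.
df = pd.DataFrame({
    "note": ['potatoes are "great"', 'they are'],
    "qty": [1, 2],
})
cleaned = strip_quotes_in_string_columns(df)
```

Working on a copy keeps the original frame intact, so you can compare before and after.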

If you were sure every item was a string, you could use applymap (deprecated since pandas 2.1 in favor of DataFrame.map):

df.applymap(lambda x: x.replace('"', ''))
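If you are not sure every item is a string, the element-wise approach can still work with a type check in the function. The sketch below is an illustration rather than part of the original answer: strip_quotes and the sample frame are hypothetical, and the hasattr check is just one way to cope with applymap being renamed to map in newer pandas.

```python
import pandas as pd


def strip_quotes(value):
    # Leave non-strings (numbers, None, NaN) untouched.
    if isinstance(value, str):
        return value.replace('"', '')
    return value


# Hypothetical sample data mixing text and integers.
df = pd.DataFrame({
    "note": ['potatoes are "great"', 'they are'],
    "qty": [1, 2],
})

# DataFrame.map exists from pandas 2.1; older versions only have applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap
cleaned = elementwise(strip_quotes)
```

This calls a Python function once per cell, so on large frames it will likely be slower than the column-wise str.replace.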
Andy Hayden, answered Oct 23 '22 04:10


Use DataFrame.apply() and Series.str.replace():

import numpy as np
import pandas as pd
import random

a = np.array(["".join(random.sample('abcde"', 3)) for i in range(100)]).reshape(10, 10)
df = pd.DataFrame(a)
df.apply(lambda s:s.str.replace('"', ""))

To process only the string (object-dtype) columns:

df.loc[:, df.dtypes == object].apply(lambda s: s.str.replace('"', ""))
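Another way to leave non-string columns alone, offered here as a sketch and not as part of the answer above, is the DataFrame.replace() method the question mentions. With regex=True it substitutes inside string values instead of matching whole values. The sample frame is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: one text column, one integer column.
df = pd.DataFrame({
    "note": ['potatoes are "great"', 'they are'],
    "qty": [1, 2],
})

# With regex=True the pattern '"' is substituted wherever it occurs
# inside a string value; numeric values are not affected.
cleaned = df.replace('"', '', regex=True)
```

Like the other approaches, this returns a new DataFrame, so assign the result back if you want to keep it.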
HYRY, answered Oct 23 '22 03:10