I want to remove all double quotes within all columns and all values in a dataframe. So if I have a value such as
potatoes are "great"
I want to return
potatoes are great
DataFrame.replace() lets me do this if I know the entire value I'm changing, but is there a way to remove individual characters?
Using the pandas str.replace() function to remove quotation marks from a DataFrame in Python.
Call str.replace(old, new) on a string with the quote character '"' as old and an empty string "" as new to remove all quotes from the string.
In JavaScript, use the String.replaceAll() method to remove all double quotes from a string, e.g. str.replaceAll('"', ''). The replaceAll() method returns a new string with all double quotes removed.
Use str.replace(old, new) with old as "'" and new as "" to remove all single quotes from the string.
You can do this on each Series/column using str.replace:
In [11]: s = pd.Series(['potatoes are "great"', 'they are'])
In [12]: s
Out[12]:
0 potatoes are "great"
1 they are
dtype: object
In [13]: s.str.replace('"', '')
Out[13]:
0 potatoes are great
1 they are
dtype: object
I would be wary of doing this across the entire DataFrame: the .str accessor raises an AttributeError on non-string columns and turns non-string values in object columns into NaN. You could, however, iterate over the string (object) columns:
for col in df.select_dtypes(include="object").columns:
    df[col] = df[col].str.replace('"', '')
If you were sure every item was a string, you could use applymap (deprecated since pandas 2.1, where it was renamed to DataFrame.map):
df.applymap(lambda x: x.replace('"', ''))
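If you are not sure every item is a string, one option is to guard the replacement with an isinstance check. The following is only a sketch: strip_quotes is a hypothetical helper name (not part of pandas), and the sample frame is made up for illustration. It skips numeric columns entirely and leaves non-string values in other columns as they are.

```python
import pandas as pd


def strip_quotes(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy of df with double quotes
    removed from string values; everything else is left unchanged."""
    out = df.copy()
    for col in out.columns:
        # Numeric columns cannot contain quote characters, so skip them
        if pd.api.types.is_numeric_dtype(out[col]):
            continue
        # Only call str.replace on values that really are Python strings
        out[col] = out[col].map(
            lambda v: v.replace('"', '') if isinstance(v, str) else v
        )
    return out


# Illustrative data (an assumption, not from the question)
df = pd.DataFrame({
    "food": ['potatoes are "great"', 'they are'],
    "mixed": ['"a"', 3],
    "count": [1, 2],
})
cleaned = strip_quotes(df)
print(cleaned)
```

Going value by value is slower than the vectorised .str.replace, so this is probably only worth it when a column genuinely mixes strings with other types.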
Use DataFrame.apply() and Series.str.replace():
import numpy as np
import pandas as pd
import random
a = np.array(["".join(random.sample('abcde"', 3)) for i in range(100)]).reshape(10, 10)
df = pd.DataFrame(a)
df.apply(lambda s:s.str.replace('"', ""))
If you only want to change the string columns (df.ix was removed in pandas 1.0, so use df.loc):
df.loc[:, df.dtypes == object].apply(lambda s: s.str.replace('"', ""))
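Since the question mentions DataFrame.replace(), it may also be worth noting that it can match inside strings when you pass regex=True, rather than only replacing whole values. A minimal sketch, using a small made-up frame (the column names are assumptions for illustration):

```python
import pandas as pd

# Illustrative data: one string column and one numeric column
df = pd.DataFrame({
    "food": ['potatoes are "great"', 'they are'],
    "count": [1, 2],
})

# With regex=True the pattern is searched for inside each string value,
# so every double quote is removed; non-string values are not matched
cleaned = df.replace('"', '', regex=True)
print(cleaned)
```

This returns a new DataFrame and leaves df itself untouched, so assign the result back if you want to keep it.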