I would like a simple method to delete the part of a string after a specified character inside a DataFrame. Here is a simplified example:
df:
   obs         a  b  c  d
0    1   1-23-12  1  2  3
1    2  12-23-13  4  5  5
2    3  21-23-14  4  5  5
I would like to remove everything in column a from the first - sign onward. My expected output is:
newdf:
   obs   a  b  c  d
0    1   1  1  2  3
1    2  12  4  5  5
2    3  21  4  5  5
To remove characters from a column in a pandas DataFrame, use the replace() method with a regular expression. For example, the regex [ab] matches any single character that is a or b.
Replace a substring using the replace() method: you can replace a substring in a pandas DataFrame column with DataFrame.replace(). By default this method looks for an exact match of the whole string and replaces it with the specified value. Pass regex=True to replace a substring.
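As a minimal sketch of the regex approach applied to the question's data (the DataFrame below is rebuilt by hand from the example, and the pattern `-.*` is an assumption about what "after the first - sign" should mean), replace() with regex=True can drop the first hyphen and everything after it:

```python
import pandas as pd

# Sample data rebuilt from the question's example.
df = pd.DataFrame({
    "obs": [1, 2, 3],
    "a": ["1-23-12", "12-23-13", "21-23-14"],
    "b": [1, 4, 4],
    "c": [2, 5, 5],
    "d": [3, 5, 5],
})

# "-.*" matches the first hyphen and everything after it;
# replacing that match with "" keeps only the leading part.
df["a"] = df["a"].replace(r"-.*", "", regex=True)

result = df["a"].tolist()
print(result)
```

Values that contain no hyphen are left unchanged, because the pattern does not match them.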
str.lstrip() removes whitespace from the left side of a string, str.rstrip() removes it from the right side, and str.strip() removes it from both sides.
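A short illustration of the three strip methods through the pandas .str accessor (the sample Series is made up for this sketch and is not part of the question's data):

```python
import pandas as pd

# Hypothetical sample values with padding on different sides.
s = pd.Series(["  left", "right  ", "  both  "])

left = s.str.lstrip().tolist()   # strips leading whitespace only
right = s.str.rstrip().tolist()  # strips trailing whitespace only
both = s.str.strip().tolist()    # strips both ends

print(left, right, both)
```

These methods only trim the ends of each string, so on their own they do not remove the text after a hyphen.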
You can reformat the values by passing a reformatting function into the apply method as follows:
from io import StringIO  # Python 3; on Python 2 this was `from StringIO import StringIO`
import pandas as pd
data = """obs a b c d
1 1-23-12 1 2 3
2 12-23-13 4 5 5
3 21-23-14 4 5 5"""
# Build dataframe from whitespace-separated data
df = pd.read_table(StringIO(data), sep=r'\s+')
# Reformat values for column a using an unnamed lambda function:
# split on '-' and keep only the first piece
df['a'] = df['a'].apply(lambda x: x.split('-')[0])
This gives you your desired result:
   obs   a  b  c  d
0    1   1  1  2  3
1    2  12  4  5  5
2    3  21  4  5  5
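As an alternative sketch, not part of the answer above, the same split can be done without apply by using the vectorized .str accessor. The DataFrame here is rebuilt by hand from the question's example:

```python
import pandas as pd

# Sample data rebuilt from the question's example.
df = pd.DataFrame({
    "obs": [1, 2, 3],
    "a": ["1-23-12", "12-23-13", "21-23-14"],
    "b": [1, 4, 4],
    "c": [2, 5, 5],
    "d": [3, 5, 5],
})

# Split each value once on the first "-" and keep the part before it.
df["a"] = df["a"].str.split("-", n=1).str[0]

first_parts = df["a"].tolist()
print(df)
```

Unlike the lambda version, this form passes missing values (NaN) through instead of raising an error, which may matter on real data.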