Pandas delete parts of string after specified character inside a dataframe

I would like a simple method to delete the part of a string that follows a specified character inside a dataframe. Here is a simplified example:

df:

   obs         a  b  c  d
0    1   1-23-12  1  2  3
1    2  12-23-13  4  5  5
2    3  21-23-14  4  5  5

I would like to remove the parts in the a column after the first - sign, my expected output is:

newdf:

   obs   a  b  c  d
0    1   1  1  2  3
1    2  12  4  5  5
2    3  21  4  5  5
jonas asked May 20 '14 18:05



1 Answer

You can reformat the values by passing a function to the column's apply method as follows:

from io import StringIO
import pandas as pd

data = """obs  a  b  c  d
1   1-23-12  1  2  3
2  12-23-13  4  5  5
3  21-23-14  4  5  5"""

# Build the dataframe from the whitespace-separated text
df = pd.read_csv(StringIO(data), sep=r'\s+')

# Keep only the part of each value in column a before the first '-'
df['a'] = df['a'].apply(lambda x: x.split('-')[0])

This gives you your desired result:

   obs   a  b  c  d
0    1   1  1  2  3
1    2  12  4  5  5
2    3  21  4  5  5
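
As an alternative to apply, the same split can be written with pandas' vectorized string methods. The following is a minimal sketch, assuming column a holds plain strings; the dataframe is rebuilt inline from the question's sample values so that the snippet runs on its own.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "obs": [1, 2, 3],
    "a": ["1-23-12", "12-23-13", "21-23-14"],
    "b": [1, 4, 4],
    "c": [2, 5, 5],
    "d": [3, 5, 5],
})

# Split each value once at the first '-' and keep the part before it
df["a"] = df["a"].str.split("-", n=1).str[0]

print(df)
```

The resulting values in column a are still strings; if numbers are needed, df["a"].astype(int) should convert them, provided every prefix is numeric.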
joemar.ct answered Oct 19 '22 05:10