
str.contains to create new column in pandas dataframe

I am exploring the Titanic data set and want to create a column that groups similar names. For example, any name that contains "Charles" should be labelled "Ch", since I want to group by those labels later on. I created a function using the following code:

def cont(Name):
    for a in Name:
        if a.str.contains('Charles'):
            return('Ch')

and then applied using this:

titanic['namest']=titanic['Name'].apply(cont,axis=1)

Error: 'str' object has no attribute 'str'


mezz asked Apr 15 '16 17:04



1 Answer

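The error occurs because the function ends up calling .str on a plain Python string, and that accessor exists only on a pandas Series. Rather than using a loop or apply, you can use the vectorised str.contains to build a boolean mask, then set the rows where the condition is met to your desired value: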

titanic.loc[titanic['Name'].str.contains('Charles'), 'namest'] = 'Ch'
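To see the mask and the assignment end to end, here is a minimal sketch on a small made-up frame. The sample names are hypothetical stand-ins rather than the real Titanic file, and the regex=False and na=False arguments are additions not in the one-liner above: they treat the pattern as a literal substring and count missing names as non-matches.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic data; only the 'Name' column matters here.
titanic = pd.DataFrame({
    "Name": [
        "Braund, Mr. Owen Harris",
        "Fortune, Mr. Charles Alexander",
        "Hays, Mr. Charles Melville",
        None,  # a missing name, to show the effect of na=False
    ]
})

# Boolean mask: True where the name contains the literal substring 'Charles'.
# na=False maps missing names to False instead of leaving them as missing.
mask = titanic["Name"].str.contains("Charles", regex=False, na=False)

# Assign 'Ch' only to the matching rows; the other rows are left as NaN.
titanic.loc[mask, "namest"] = "Ch"

# Group on the new label; groupby drops NaN keys by default.
counts = titanic.groupby("namest").size()
print(counts)
```

Rows that do not match keep NaN in namest, so they drop out of the group by unless you fill them first, for example with fillna.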
EdChum answered Sep 21 '22 00:09
