
How to get the length of a cell value in pandas dataframe?


I have a pandas DataFrame:

idx  Event
0    abc/def
1    abc
2    abc/def/hij

Run: df['EventItem'] = df['Event'].str.split("/")

Got:

idx  EventItem
0    ['abc','def']
1    ['abc']
2    ['abc','def','hij']

I want the length of the list in each cell, so I ran df['EventCount'] = len(df['EventItem'])

Got:

idx  EventCount
0    6
1    6
2    6

How can I get the correct count, as follows?

idx  EventCount
0    2
1    1
2    3
Kevin asked May 19 '16 23:05



2 Answers

You can use .str.len to get the length of a list, even though lists aren't strings:

df['EventCount'] = df['Event'].str.split("/").str.len() 

Alternatively, the count you're looking for is one more than the number of "/" characters in the string, so you could add 1 to the result of .str.count:

df['EventCount'] = df['Event'].str.count("/") + 1 

The resulting output for either method:

         Event  EventCount
0      abc/def           2
1          abc           1
2  abc/def/hij           3
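The two approaches above can be checked end to end with a small script. This is a minimal sketch that rebuilds the three-row frame from the question; the variable names by_split, by_count and series_length are only illustrative. It also shows why the original attempt gave the same number on every row: len() applied to a whole column returns the number of rows, not a per-cell length.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Event": ["abc/def", "abc", "abc/def/hij"]})

# Method 1: split each string into a list, then take each list's length.
by_split = df["Event"].str.split("/").str.len()

# Method 2: count the separators in each string and add 1.
by_count = df["Event"].str.count("/") + 1

# The original attempt: len() of a Series is a single number (the row count),
# which pandas then broadcasts to every row on assignment.
series_length = len(df["Event"].str.split("/"))

df["EventCount"] = by_split
print(df)
print("len() of the whole column:", series_length)
```

Both methods agree on this data; they would also agree on an empty string, since "".split("/") returns a one-element list.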

Timings on a slightly larger DataFrame:

%timeit df['Event'].str.count("/") + 1
100 loops, best of 3: 3.18 ms per loop

%timeit df['Event'].str.split("/").str.len()
100 loops, best of 3: 4.28 ms per loop

%timeit df['Event'].str.split("/").apply(len)
100 loops, best of 3: 4.08 ms per loop
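To rerun a comparison like this outside IPython, something along these lines should work with the standard timeit module. The size of the "slightly larger DataFrame" is not stated above, so the 9,000-row Series here is an assumption, and absolute numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

# Assumption: 9,000 rows made by repeating the question's three sample strings.
events = pd.Series(["abc/def", "abc", "abc/def/hij"] * 3000, name="Event")

# The three expressions timed in the answer, wrapped as zero-argument callables.
candidates = {
    "count + 1": lambda: events.str.count("/") + 1,
    "split + str.len": lambda: events.str.split("/").str.len(),
    "split + apply(len)": lambda: events.str.split("/").apply(len),
}

timings = {}
for label, func in candidates.items():
    # Best of 3 repeats, 10 calls per repeat, reported per call.
    seconds = min(timeit.repeat(func, number=10, repeat=3)) / 10
    timings[label] = seconds
    print(f"{label}: {seconds * 1e3:.2f} ms per call")
```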
root answered Oct 28 '22 15:10



You can use apply to call the len function on each element of the column:

df['EventItem'].apply(len)

0    2
1    1
2    3
Name: EventItem, dtype: int64
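One practical difference between apply(len) and .str.len() appears when the column has missing values. The sketch below assumes a frame like the question's but with one missing Event, which is not part of the original data. In that case .str.len() carries the missing value through as NaN, while apply(len) calls len() on the missing value and raises a TypeError.

```python
import pandas as pd

# Assumption: same shape as the question's data, with one missing value added.
df = pd.DataFrame({"Event": ["abc/def", None, "abc/def/hij"]})
items = df["Event"].str.split("/")

# .str.len() skips the missing entry and reports NaN for it.
lengths = items.str.len()
print(lengths)

# apply(len) hands the missing entry to len(), which cannot measure it.
try:
    items.apply(len)
    apply_failed = False
except TypeError as exc:
    apply_failed = True
    print("apply(len) raised TypeError:", exc)
```

If the column is known to have no missing values, the two calls return the same counts.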
johnchase answered Oct 28 '22 13:10
