I have a pandas DataFrame:
idx  Event
0    abc/def
1    abc
2    abc/def/hij
I run:
df['EventItem'] = df['Event'].str.split("/")
and get:
idx  EventItem
0    ['abc', 'def']
1    ['abc']
2    ['abc', 'def', 'hij']
I want the length of each cell, so I run:
df['EventCount'] = len(df['EventItem'])
but every row gets the same value:
idx  EventCount
0    6
1    6
2    6
How can I get the correct count, as follows?
idx  EventCount
0    2
1    1
2    3
You can use .str.len to get the length of each list, even though lists aren't strings:
df['EventCount'] = df['Event'].str.split("/").str.len()
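As a minimal sketch of why the original attempt failed and how .str.len fixes it: the built-in len() applied to a whole Series returns the number of rows, and assigning that single number to a column broadcasts it to every row. The three-row frame below is the sample from the question; with it, len() gives 3, so the 6 shown in the question presumably came from a longer frame.

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({"Event": ["abc/def", "abc", "abc/def/hij"]})

# len() on a Series is the row count, not a per-cell length.
print(len(df["Event"]))  # 3 for this three-row frame

# .str.split produces one list per row; .str.len measures each list.
df["EventItem"] = df["Event"].str.split("/")
df["EventCount"] = df["EventItem"].str.len()
print(df["EventCount"].tolist())  # [2, 1, 3]
```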
Alternatively, the count you're looking for is just 1 more than the number of "/" characters in the string, so you can add 1 to the result of .str.count:
df['EventCount'] = df['Event'].str.count("/") + 1
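A short sketch checking that the two approaches agree on the question's sample data. One caveat worth knowing: .str.count treats its argument as a regular expression. "/" has no special meaning in a regex, so it works unescaped, but a delimiter such as "." would need escaping. The dotted Series below is a hypothetical example, not part of the question.

```python
import re
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({"Event": ["abc/def", "abc", "abc/def/hij"]})

by_count = df["Event"].str.count("/") + 1
by_split = df["Event"].str.split("/").str.len()

print(by_count.tolist())  # [2, 1, 3]
print(by_split.tolist())  # [2, 1, 3]

# Hypothetical dotted data: escape the delimiter, because "." alone
# is a regex wildcard that matches any character.
dotted = pd.Series(["a.b.c"])
dotted_count = dotted.str.count(re.escape(".")) + 1
print(dotted_count.tolist())  # [3]
```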
The resulting output for either method:
   Event        EventCount
0  abc/def      2
1  abc          1
2  abc/def/hij  3
Timings on a slightly larger DataFrame:
%timeit df['Event'].str.count("/") + 1
100 loops, best of 3: 3.18 ms per loop
%timeit df['Event'].str.split("/").str.len()
100 loops, best of 3: 4.28 ms per loop
%timeit df['Event'].str.split("/").apply(len)
100 loops, best of 3: 4.08 ms per loop
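The %timeit lines only work inside IPython or Jupyter. A rough sketch of the same comparison with the standard timeit module follows. The frame size (the question's three rows repeated 1,000 times) and the repeat counts are assumptions, since the answer doesn't say what frame was used, so the absolute numbers will differ by machine and pandas version.

```python
import timeit
import pandas as pd

# Hypothetical larger frame: the question's three rows repeated 1,000 times.
df = pd.DataFrame({"Event": ["abc/def", "abc", "abc/def/hij"] * 1000})

candidates = {
    "count + 1": lambda: df["Event"].str.count("/") + 1,
    "split + str.len": lambda: df["Event"].str.split("/").str.len(),
    "split + apply(len)": lambda: df["Event"].str.split("/").apply(len),
}

timings = {}
for name, func in candidates.items():
    # Best of 3 repeats of 10 calls each, reported per call in milliseconds.
    best = min(timeit.repeat(func, number=10, repeat=3)) / 10
    timings[name] = best * 1000
    print(f"{name}: {timings[name]:.2f} ms per call")
```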
You can use apply to call the len function on each element of the column:
df['EventItem'].apply(len)
0    2
1    1
2    3
Name: EventItem, dtype: int64
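One practical difference between apply(len) and .str.len appears when the column has missing values. The sketch below uses a hypothetical Series with a None entry, which is not in the question's data: .str.len propagates the missing value, while apply(len) raises because len() is not defined for the missing-value placeholder.

```python
import pandas as pd

# Hypothetical data with a missing entry (not from the question).
events = pd.Series(["abc/def", None, "abc/def/hij"])
items = events.str.split("/")

# .str.len skips the missing entry and leaves it as NaN.
counts = items.str.len()
print(counts.tolist())

# apply(len) calls the built-in len() on every element, including the
# missing one, which raises a TypeError.
apply_failed = False
try:
    items.apply(len)
except TypeError as exc:
    apply_failed = True
    print(f"apply(len) failed: {exc}")
```

If missing values are possible and apply is still preferred, dropping them first with items.dropna().apply(len) avoids the error.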