I have a series of the form:
s = Series([['a','a','b'],['b','b','c','d'],[],['a','b','e']])
which looks like
0 [a, a, b]
1 [b, b, c, d]
2 []
3 [a, b, e]
dtype: object
I would like to count how many elements I have in total. My naive tentatives like
s.values.hist()
or
s.values.flatten()
didn't work. What am I doing wrong?
If we stick with the pandas Series as in the original question, one neat option from the Pandas version 0.25.0 onwards is the Series.explode() routine. It returns an exploded list to rows, where the index will be duplicated for these rows.
The original Series from the question:
s = pd.Series([['a','a','b'],['b','b','c','d'],[],['a','b','e']])
Let's explode it and we get a Series, where the index is repeated. The index indicates the index of the original list.
>>> s.explode()
Out:
0 a
0 a
0 b
1 b
1 b
1 c
1 d
2 NaN
3 a
3 b
3 e
dtype: object
>>> type(s.explode())
Out:
pandas.core.series.Series
To count the number of elements we can now use the Series.value_counts():
>>> s.explode().value_counts()
Out:
b 4
a 3
d 1
c 1
e 1
dtype: int64
To include also NaN values:
>>> s.explode().value_counts(dropna=False)
Out:
b 4
a 3
d 1
c 1
e 1
NaN 1
dtype: int64
Finally, plotting the histogram using Series.plot():
>>> s.explode().value_counts(dropna=False).plot(kind = 'bar')
s.map(len).sum()
does the trick. s.map(len)
applies len()
to each element and returns a series of all the lengths, then you can just use sum
on that series.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With