I have a NumPy array containing string values.
For instance: ["bus", "bar", "bar", "café" .....]
What is the best way to count the number of occurrences of each element in my array? My current solution is:
# my_list contains my data.
bincount = []
for name in set(my_list.tolist()):
    count = sum([1 for elt in my_list if elt == name])
    bincount.append(count)
I have tried bincount but it does not work with this type of data.
Do you know a better solution?
np.unique
l = ['bus', 'bar', 'bar', 'café', 'bus', 'bar', 'café']
a, b = np.unique(l, return_counts=True)
a
# array(['bar', 'bus', 'café'], dtype='<U4')
b
# array([3, 2, 2])
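If you want the names and counts paired up, here is a minimal sketch that applies np.unique directly to a NumPy array like the one in the question. The sample data and the name count_by_name are made up for illustration.

```python
import numpy as np

# Hypothetical sample data standing in for the asker's my_list array.
my_list = np.array(['bus', 'bar', 'bar', 'café', 'bus', 'bar', 'café'])

# With return_counts=True, np.unique returns the sorted unique values
# together with the number of times each one occurs.
names, counts = np.unique(my_list, return_counts=True)

# Pair each name with its count, converting NumPy scalars to plain Python types.
count_by_name = {str(name): int(count) for name, count in zip(names, counts)}
print(count_by_name)
# {'bar': 3, 'bus': 2, 'café': 2}
```

This replaces the loop in the question with a single call, and the counts line up with the sorted unique values.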
pd.value_counts
pd.value_counts(l)
bar     3
bus     2
café    2
dtype: int64
# pandas <= 0.23
pd.value_counts(l).values
# pandas 0.24+
pd.value_counts(l).to_numpy()
# array([3, 2, 2])
Make sure pandas is imported (import pandas as pd).
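As far as I know, recent pandas releases (2.1 and later, I believe) emit a deprecation warning for the top-level pd.value_counts, so it may be safer to wrap the data in a Series first. A minimal sketch, with the variable names chosen only for illustration:

```python
import pandas as pd

l = ['bus', 'bar', 'bar', 'café', 'bus', 'bar', 'café']

# Series.value_counts returns a Series indexed by the unique values,
# sorted by count in descending order.
counts = pd.Series(l).value_counts()

# The labels and the counts can each be pulled out as NumPy arrays.
labels = counts.index.to_numpy()
values = counts.to_numpy()

print(counts.to_dict())
```

The most frequent value ('bar' here) comes first; I would not rely on the relative order of values with equal counts.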
pd.factorize
np.bincount(pd.factorize(l)[0])
# array([2, 3, 2])
This converts the strings to numeric categories (or factors, if you prefer) and counts them. The counts follow the order in which each value first appears, not sorted order.
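pd.factorize also returns the unique values themselves, so the counts can be matched back to their labels. A rough sketch, assuming the input is first wrapped in a NumPy array (count_by_name is just an illustrative name):

```python
import numpy as np
import pandas as pd

l = ['bus', 'bar', 'bar', 'café', 'bus', 'bar', 'café']

# factorize returns integer codes plus the unique values,
# numbered in order of first appearance.
codes, uniques = pd.factorize(np.array(l))

# bincount counts how many times each integer code occurs.
counts = np.bincount(codes)

# Match each unique value with its count.
count_by_name = dict(zip(uniques.tolist(), counts.tolist()))
print(count_by_name)
# {'bus': 2, 'bar': 3, 'café': 2}
```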
pd.get_dummies
pd.get_dummies(l).sum()
bar     3
bus     2
café    2
dtype: int64
Slightly roundabout, but interesting nevertheless.