
How to extract and sum unique words from a pandas DataFrame

Consider the following DataFrame:

import pandas as pd

df = pd.DataFrame({'animals': [['dog', 'cat', 'snake', 'lion', 'tiger'],
                               ['dog', 'moose', 'alligator', 'lion', 'tiger'],
                               ['eagle', 'moose', 'alligator', 'lion', 'tiger'],
                               ['cat', 'alligator', 'lion']]})

I need to extract every unique animal and count how many times it occurs across all rows. The output should look something like:

dog             2  
cat             2  
snake           1  
lion            4  
tiger           3  
moose           2  
alligator       3  
eagle           1 

This is similar to what df.value_counts() does, but counting the individual list elements rather than whole rows.

Much appreciated.

jcf asked Dec 31 '22 04:12


2 Answers

You can use explode (pandas 0.25 or later) to give each list element its own row, then value_counts:

df.animals.explode().value_counts()

Output:

lion         4
tiger        3
alligator    3
moose        2
cat          2
dog          2
eagle        1
snake        1
Name: animals, dtype: int64
Quang Hoang answered Jan 13 '23 14:01
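The explode answer returns counts sorted from most to least frequent, while the question lists animals in order of first appearance. A minimal sketch of one way to get that ordering, assuming the same df as in the question: the reindex step and the variable names exploded and counts are illustrative additions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'animals': [['dog', 'cat', 'snake', 'lion', 'tiger'],
                               ['dog', 'moose', 'alligator', 'lion', 'tiger'],
                               ['eagle', 'moose', 'alligator', 'lion', 'tiger'],
                               ['cat', 'alligator', 'lion']]})

# Turn each list element into its own row (index values are repeated).
exploded = df['animals'].explode()

# Count occurrences, then reorder the result by first appearance.
# unique() returns values in the order they are first seen.
counts = exploded.value_counts().reindex(exploded.unique())

print(counts)
```

The Series name and index name in the printed result can differ between pandas versions, so only the labels and counts should be relied on.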


Another way is to flatten the lists with itertools.chain and count the elements with collections.Counter:

import pandas as pd
from collections import Counter
from itertools import chain

pd.Series(Counter(chain.from_iterable(df['animals'])))

Output:

dog          2
cat          2
snake        1
lion         4
tiger        3
moose        2
alligator    3
eagle        1
dtype: int64
ALollz answered Jan 13 '23 14:01
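The two answers differ only in ordering: Counter keeps first-appearance order, while value_counts sorts by frequency. A rough sketch, assuming the same df as in the question, that sorts the Counter result and checks that both approaches produce the same counts. The names tally, by_counter, by_explode and same are hypothetical and used only for illustration.

```python
from collections import Counter
from itertools import chain

import pandas as pd

df = pd.DataFrame({'animals': [['dog', 'cat', 'snake', 'lion', 'tiger'],
                               ['dog', 'moose', 'alligator', 'lion', 'tiger'],
                               ['eagle', 'moose', 'alligator', 'lion', 'tiger'],
                               ['cat', 'alligator', 'lion']]})

# Flatten the lists into one stream of strings and tally them.
tally = Counter(chain.from_iterable(df['animals']))

# Counter approach, sorted from most to least frequent.
by_counter = pd.Series(tally).sort_values(ascending=False)

# explode + value_counts approach from the other answer.
by_explode = df['animals'].explode().value_counts()

# Compare label -> count mappings, ignoring row order.
same = by_counter.to_dict() == by_explode.to_dict()
print(same)
```

Comparing the dictionaries rather than the Series avoids depending on how ties are ordered or on version-specific Series names.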