I have a dataframe with data in a format similar to this:
     song                                    lyric                                   tokenized_lyrics
0  Song 1  Look at her face, it's a wonderful face  [look , at , her ,face, it's a wonderful, face ]
1  Song 2            Some lyrics of the song taken             [Some, lyrics ,of, the, song, taken]
I want to count the number of words in the lyrics per song and get output like this:
song count
song 1 8
song 2 6
I tried the aggregate function, but it does not yield the correct result.
Code I tried:
df.groupby(['song']).agg(
    word_count=pd.NamedAgg(column='text', aggfunc='count')
)
How can I achieve the desired result?
I couldn't copy tokenized_lyrics as a list (it came in as a string), so I tokenized the lyric column instead, assuming the delimiter is whitespace:
df['token_count'] = df.lyric.str.replace(',','').str.split().str.len()
df.filter(['song','token_count'])
song token_count
0 Song 1 8
1 Song 2 6
Note that if tokenized_lyrics already holds real lists, you can apply .str.len() directly to that column to get your count; for list values it counts the individual items.
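A minimal sketch of that second approach, assuming tokenized_lyrics really contains Python lists rather than strings. The sample data below is made up for illustration, and the groupby step is only needed if a song can span more than one row:

```python
import pandas as pd

# Hypothetical sample data; here tokenized_lyrics holds real Python lists.
df = pd.DataFrame({
    "song": ["Song 1", "Song 2"],
    "tokenized_lyrics": [
        ["look", "at", "her", "face", "it's", "a", "wonderful", "face"],
        ["Some", "lyrics", "of", "the", "song", "taken"],
    ],
})

# .str.len() also works on list values and returns each list's length.
df["token_count"] = df["tokenized_lyrics"].str.len()

# If a song spans several rows, sum the per-row counts for each song.
counts = df.groupby("song", as_index=False)["token_count"].sum()
print(counts)
```

The original attempt used aggfunc='count', which counts rows per group rather than words, which is why it did not give the expected numbers.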