I'm trying to count the individual words in a column of my data frame. It looks like this. In reality the texts are Tweets.
text
this is some text that I want to count
That's all I wan't
It is unicode text
So what I found from other stackoverflow questions is that I could use the following:
Count most frequent 100 words from sentences in Dataframe Pandas
Count distinct words from a Pandas Data Frame
My df is called result and this is my code:
from collections import Counter
result2 = Counter(" ".join(result['text'].values.tolist()).split(" ")).items()
result2
I get the following error:
TypeError Traceback (most recent call last)
<ipython-input-6-2f018a9f912d> in <module>()
1 from collections import Counter
----> 2 result2 = Counter(" ".join(result['text'].values.tolist()).split(" ")).items()
3 result2
TypeError: sequence item 25831: expected str instance, float found
The dtype of text is object, which from what I understand is correct for unicode text data.
The issue is occurring because some of the values in your series (result['text']) is of type float. If you want to consider them during ' '.join() as well, then you would need to convert the floats to string before passing them onto str.join().
You can use Series.astype() to convert all the values to string. Also, you really do not need to use .tolist() , you can simply give the series to str.join() as well. Example -
result2 = Counter(" ".join(result['text'].astype(str)).split(" ")).items()
Demo -
In [60]: df = pd.DataFrame([['blah'],['asd'],[10.1]],columns=['A'])
In [61]: df
Out[61]:
A
0 blah
1 asd
2 10.1
In [62]: ' '.join(df['A'])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-62-77e78c2ee142> in <module>()
----> 1 ' '.join(df['A'])
TypeError: sequence item 2: expected str instance, float found
In [63]: ' '.join(df['A'].astype(str))
Out[63]: 'blah asd 10.1'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With