I'm trying to count the individual words in a column of my data frame. It looks like this. In reality the texts are Tweets.
text
this is some text that I want to count
That's all I wan't
It is unicode text
So what I found from other stackoverflow questions is that I could use the following:
Count most frequent 100 words from sentences in Dataframe Pandas
Count distinct words from a Pandas Data Frame
My df is called result and this is my code:
from collections import Counter
result2 = Counter(" ".join(result['text'].values.tolist()).split(" ")).items()
result2
I get the following error:
TypeError Traceback (most recent call last)
<ipython-input-6-2f018a9f912d> in <module>()
1 from collections import Counter
----> 2 result2 = Counter(" ".join(result['text'].values.tolist()).split(" ")).items()
3 result2
TypeError: sequence item 25831: expected str instance, float found
The dtype of text is object, which from what I understand is correct for unicode text data.
The issue is occurring because some of the values in your series (result['text']
) is of type float
. If you want to consider them during ' '.join()
as well, then you would need to convert the floats to string before passing them onto str.join()
.
You can use Series.astype()
to convert all the values to string. Also, you really do not need to use .tolist()
, you can simply give the series to str.join()
as well. Example -
result2 = Counter(" ".join(result['text'].astype(str)).split(" ")).items()
Demo -
In [60]: df = pd.DataFrame([['blah'],['asd'],[10.1]],columns=['A'])
In [61]: df
Out[61]:
A
0 blah
1 asd
2 10.1
In [62]: ' '.join(df['A'])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-62-77e78c2ee142> in <module>()
----> 1 ' '.join(df['A'])
TypeError: sequence item 2: expected str instance, float found
In [63]: ' '.join(df['A'].astype(str))
Out[63]: 'blah asd 10.1'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With