Changing phrases to vectors with a while loop in Python

Question

I would like to change the following phrases to vectors with sklearn:

Article 1. It is not good to eat pizza after midnight
Article 2. I wouldn't survive a day withouth stackexchange
Article 3. All of these are just random phrases
Article 4. To prove if my experiment works.
Article 5. The red dog jumps over the lazy fox

I have the following code:

from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer(min_df=1)

n = 0
while n < 5:
    n = n + 1
    a = ('Article %(number)s' % {'number': n})
    print(a)
    with open("LISR2.txt") as openfile:
        for line in openfile:
            if a in line:
                X = line
                print(vectorizer.fit_transform(X))

Which gives me the following error:

ValueError: Iterable over raw text documents expected, string object received.

Why does this happen? I expected it to work, because if I type the phrases in directly:

X=("It is not good to eat pizza","I wouldn't survive a day", "All of these")

print(vectorizer.fit_transform(X))

It gives me my desired vectors.

(0, 8)  1
(0, 2)  1
(0, 11) 1
(0, 3)  1
(0, 6)  1
(0, 4)  1
(0, 5)  1
(1, 1)  1
(1, 9)  1
(1, 12) 1
(2, 10) 1
(2, 7)  1
(2, 0)  1

SheepPerplexed · Accepted Answer

Look at the docs. It says CountVectorizer.fit_transform expects an iterable of strings (e.g. a list of strings). You are passing a single string instead.
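A minimal sketch of the difference, assuming a recent scikit-learn; the variable names line and raised are only for illustration. Passing the bare string reproduces the ValueError from the question, while a one-element list is accepted as a (very small) corpus:

```python
from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer()
line = "It is not good to eat pizza after midnight"

# A bare string is rejected: the vectorizer wants an iterable of documents.
try:
    vectorizer.fit_transform(line)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)

# A list holding that one string is a valid corpus of one document.
X = vectorizer.fit_transform([line])
print(X.shape)  # one row, one column per distinct token
```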

This makes sense: fit_transform in scikit-learn does two things. First it learns a model from the data (fit), which for CountVectorizer means building the vocabulary; then it applies that model to the data (transform). You want to build a matrix whose columns are the words in the vocabulary and whose rows are the documents. For that, the vectorizer has to see the whole corpus at once, so that it knows the full vocabulary (all the columns).
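One way to apply this to the loop in the question is to collect the article lines first and call fit_transform once on the whole list. The sketch below is an assumption about how LISR2.txt looks (one "Article N. text" entry per line, as in the question's listing); the in-memory list lines stands in for the file, and stripping the "Article N." prefix is an optional choice, not something the question asks for:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Stand-in for the lines of LISR2.txt (assumed layout, copied from the question).
lines = [
    "Article 1. It is not good to eat pizza after midnight",
    "Article 2. I wouldn't survive a day withouth stackexchange",
    "Article 3. All of these are just random phrases",
    "Article 4. To prove if my experiment works.",
    "Article 5. The red dog jumps over the lazy fox",
]

# Gather every document before vectorizing, dropping the "Article N. " prefix.
documents = [
    line.split(". ", 1)[1]
    for line in lines
    if line.startswith("Article ") and ". " in line
]

# Fit once on the full corpus so all rows share the same columns.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(documents)

print(X.shape)                 # (number of documents, vocabulary size)
print(vectorizer.vocabulary_)  # maps each word to its column index
```

With a real file, the list could be built with `with open("LISR2.txt") as f: lines = f.read().splitlines()` instead.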

purna15111 · Answer

This problem occurs when you pass the raw string directly to the vectorizer. Instead, wrap it in a list, Y = [X], and pass Y as the argument; then it works. I ran into this problem too.
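A caveat worth illustrating: wrapping each line in a list removes the error, but calling fit_transform per line relearns the vocabulary from that line alone, so the column indices of different lines are not comparable. A hedged sketch of the difference, using two of the question's phrases (corpus, widths, shared and rows are illustrative names):

```python
from sklearn.feature_extraction.text import CountVectorizer

corpus = [
    "It is not good to eat pizza after midnight",
    "All of these are just random phrases",
]

# Per-line fit_transform: every matrix has its own private vocabulary.
per_line = [CountVectorizer().fit_transform([doc]) for doc in corpus]
widths = [m.shape[1] for m in per_line]
print(widths)  # a different number of columns for each line

# Fit once on the whole corpus, then transform single wrapped lines.
shared = CountVectorizer().fit(corpus)
rows = [shared.transform([doc]) for doc in corpus]
print([r.shape for r in rows])  # every row now has the same width
```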

Tags: python, scikit-learn

Rafael Martínez
