
Non-Integer Class Labels Scikit-Learn

Quick SVM question for scikit-learn. When you train an SVM, it's something like

from sklearn import svm
s = svm.SVC()
s.fit(training_data, labels)

Is there any way for labels to be a list of a non-numeric type? For instance, I want to classify vectors as 'cat' or 'dog' without keeping some kind of external lookup table that encodes 'cat' and 'dog' as 1s and 2s. When I try to just pass a list of strings, I get ...

ValueError: invalid literal for float(): cat

So, it doesn't look like just shoving strings in labels will work. Any ideas?

follyroof asked Nov 09 '12 00:11


1 Answer

Passing strings as classes directly is on my to-do list, but it is not supported in the SVMs yet. For the moment, we have the LabelEncoder, which can do the bookkeeping for you.
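
For what it's worth, here is a minimal sketch of the LabelEncoder route. The toy feature vectors and the names `encoder`, `encoded`, and `decoded` are made up for illustration and are not from the question; `LabelEncoder` lives in `sklearn.preprocessing`.

```python
from sklearn import svm
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data: two well-separated clusters, one per class.
training_data = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
labels = ['cat', 'cat', 'dog', 'dog']

# LabelEncoder maps each distinct label to an integer (classes are sorted,
# so 'cat' becomes 0 and 'dog' becomes 1) and remembers the mapping.
encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)

s = svm.SVC()
s.fit(training_data, encoded)

# Predictions come back as integers; inverse_transform restores the strings.
predicted = s.predict([[0.05, 0.1], [0.95, 1.0]])
decoded = encoder.inverse_transform(predicted)
print(list(decoded))
```

The encoder object replaces the hand-written lookup table, so the mapping stays attached to the model code instead of living somewhere external.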

Edit: This should now work out of the box.
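
A quick sketch of what "out of the box" looks like, assuming a reasonably recent scikit-learn release (the exact version where this landed is not stated here). The toy data is made up for illustration.

```python
from sklearn import svm

# Hypothetical toy data: two well-separated clusters, one per class.
training_data = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
labels = ['cat', 'cat', 'dog', 'dog']

# String labels are passed straight to fit(); no manual encoding step.
s = svm.SVC()
s.fit(training_data, labels)

# The fitted classifier keeps the distinct labels in classes_,
# and predict() returns the original strings.
print(s.classes_)
print(s.predict([[0.05, 0.1]]))
```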

Andreas Mueller answered Oct 15 '22 22:10