 

R: Trouble using SMOTE package "invalid 'labels'"

Tags:

r

data-mining

I am using the SMOTE function from the DMwR package. After loading the data frame, I try to perform sampling as follows:

crime_bal$target <- as.factor(crime_bal$target)
crime_bal <- SMOTE(target ~ .,crime_bal,perc.under = 200, perc.over = 100)

But it always ends with this error:

Error in factor(newCases[, a], levels = 1:nlevels(data[, a]), labels = levels(data[,  : 
  invalid 'labels'; length 0 should be 1 or 2
In addition: Warning messages:
1: NAs introduced by coercion 
2: NAs introduced by coercion 

Details of my dataset:

> summary(crime_bal)
     text               url            target  
 Length:6326        Length:6326        0:5994  
 Class :character   Class :character   1: 332  
 Mode  :character   Mode  :character

Why do I always end up with the error?

Koustuv Sinha asked Nov 13 '15 12:11


1 Answer

I encountered a similar problem and solved it by converting the string features to integer type. My guess is that SMOTE only works with columns of numeric or factor type. For example, I replaced class_1, class_2 with 1, 2.
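A minimal sketch of that conversion, assuming a data frame shaped like the one in the question. The summary there shows text and url as character columns, which fits the guess above. The toy values and the helper variable is_chr are made up for illustration, and the SMOTE call is left as a comment because it needs DMwR installed and a realistically sized dataset.

```r
# Toy stand-in for crime_bal; the values are invented for illustration.
crime_bal <- data.frame(
  text   = c("a", "b", "c", "d"),
  url    = c("u1", "u2", "u3", "u4"),
  target = c(0, 0, 0, 1),
  stringsAsFactors = FALSE
)

# The class column must be a factor, as in the question.
crime_bal$target <- as.factor(crime_bal$target)

# Find the character columns and replace each with integer codes
# (one code per distinct string), as the answer describes.
is_chr <- vapply(crime_bal, is.character, logical(1))
crime_bal[is_chr] <- lapply(
  crime_bal[is_chr],
  function(x) as.integer(as.factor(x))
)

# No character columns should remain at this point.
str(crime_bal)

# With DMwR installed and the real data loaded, the original call
# could then be retried (not run here):
# crime_bal <- DMwR::SMOTE(target ~ ., crime_bal,
#                          perc.over = 100, perc.under = 200)
```

Integer codes for free text or URLs only make the call run; whether interpolating between such codes is meaningful for the model is a separate question.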

Ali Ahmad answered Nov 15 '22 07:11