In the code below, I am trying to use K nearest neighbors with a single predictor.
To the best of my understanding, there's no need for the number of examples in train.X to match the number of examples in test.X, but R seems to not be parsing my input correctly.
library(ISLR)
library(class)
train=(Weekly$Year<2009)
train.X = Weekly$Lag2[train]
test.X = Weekly$Lag2[!train]
train.Direction = Weekly$Direction[train]
knn.pred = knn(train.X, test.X, train.Direction, k=1)
Running the code above produces this error:
Error in knn(train.X, test.X, train.Direction, k = 1) :
dims of 'test' and 'train' differ
How can I fix train.X and test.X so that R parses them correctly?
The knn function takes matrices or data frames as arguments for the train and test sets. You're passing in vectors, which get interpreted as matrices, but not in the way you want. Specifically, according to the documentation, a vector passed as test is interpreted as a row vector for a single case, i.e. a single data point with the different values denoting the features. This means that the number of features for train and test is different, as the error message suggests.
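To make the shape issue concrete, here is a small base R sketch of the two ways a plain vector can be laid out as a matrix. The values and the names x, row.form and col.form are made up for illustration; this shows the shapes involved and is not the actual internals of knn.

```r
x <- c(0.5, -1.2, 2.3, 0.1)  # made-up values standing in for test.X

# Row layout: a single observation with four features.
row.form <- matrix(x, nrow = 1)
dim(row.form)  # 1 4

# Column layout: four observations with one feature each.
col.form <- matrix(x, ncol = 1)
dim(col.form)  # 4 1

# Wrapping the vector in a data frame also gives one row per value.
dim(data.frame(x))  # 4 1
```

With a single predictor, the column layout is the one you want: each value is its own observation.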
To fix this, convert the vectors explicitly, e.g.
knn.pred = knn(data.frame(train.X), data.frame(test.X), train.Direction, k=1)
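An equivalent approach, sketched here on made-up toy data so it runs without the ISLR package, is to reshape each vector into a one-column matrix before calling knn. The names train.M and test.M and all the values are assumptions for illustration only; with the real data you would pass Weekly$Lag2[train] and Weekly$Lag2[!train] instead.

```r
library(class)  # provides knn()

# Made-up toy data standing in for Weekly$Lag2 and Weekly$Direction.
train.X <- c(-2.0, -1.5, -1.0, 1.0, 1.5, 2.0)
train.Direction <- factor(c("Down", "Down", "Down", "Up", "Up", "Up"))
test.X <- c(-1.8, 1.2, 1.9)

# One row per observation, one column for the single predictor.
train.M <- matrix(train.X, ncol = 1)
test.M <- matrix(test.X, ncol = 1)

# Train and test now have the same number of columns (1),
# even though they have different numbers of rows (6 and 3).
knn.pred <- knn(train.M, test.M, train.Direction, k = 1)

# One prediction per test observation.
length(knn.pred) == length(test.X)
```

The number of rows in train and test is free to differ; only the number of columns (features) has to match.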