#training set
trainSample <- cbind(data[1:980,1], data[1:980,2])
cl <- factor(c(data[1:980,3]))
#test set
testSample <- cbind(data[981:1485,1], data[981:1485,2])
cl.test <- clknn
#prediction
k <- knn(trainSample, testSample, cl, k = 5)
#output
> k
[1] 2 2 1 1 1 1 2 1 2 1 1 2 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 2 2 1 2 2 1 1 2 2 1 1 2 2 2 2 1 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2
[60] 2 2 2 2 1 2 2 2 2 1 2 2 1 2 2 2 1 1 2 1 2 2 1 1 1 2 1 2 2 2 1 2 2 2 2 2 1 2 1 2 2 2 2 2 2 2 2 1 2 2 2 2 1 2 2 2 2 2 2
[119] 2 2 2 1 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 1 2 2 2 2 1 2 1 1 1 1 2 2 2 2 2 2 2 2 1 2 1 2 2 2 2 2 2 1 2 2 1 2 1 2 2 2 2
[178] 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 1 1 1 1 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 1 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 1
[237] 2 2 2 2 2 1 2 2 1 2 2 1 2 2 2 2 2 1 2 2 2 2 2 2 2 1 2 2 2 2 2 2 1 2 2 1 2 2 2 2 1 2 1 2 2 2 2 1 1 2 1 2 2 2 2 1 2 2 2
[296] 2 2 2 1 2 1 2 1 1 1 2 1 2 2 1 1 2 2 1 2 1 2 2 1 2 2 2 1 2 2 2 2 2 1 2 2 2 1 2 2 2 1 2 2 2 2 2 2 2 1 2 1 1 2 2 2 1 1 2
[355] 1 2 1 2 1 2 1 2 2 2 2 2 2 1 1 1 2 1 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 1 2 2 2 2 2 1 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2
[414] 2 2 1 2 2 2 2 2 2 2 2 2 1 1 2 2 2 1 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[473] 2 2 2 2 2 1 1 2 2 2 2 2 1 2 2 1 1 2 2 1 2 2 1 2 1 2 2 1 2 2 2 2 2
Levels: 1 2
I want "c" and "not-c" (as in my original data.csv) instead of 1 and 2. I'm also not sure which number is supposed to represent which.
Can anyone help?
All levels of a factor can be renamed at once: build a character vector of the new names with c() and assign it to levels() of the factor. The new names replace the old ones in the same order.
The number of levels of a factor or independent variable is equal to the number of variations of that factor that were used in the experiment. If an experiment compared the drug dosages 50 mg, 100 mg, and 150 mg, then the factor "drug dosage" would have three levels: 50 mg, 100 mg, and 150 mg.
To rename an object or variable in R, copy it to a new name with the assignment operator (<- or =) and then remove the old name with rm().
Factor: a categorical explanatory variable. Levels: values of a factor. Treatment: a particular combination of values for the factors.
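To see which integer stands for which label, it can help to look at how R stores a factor: each value is an integer position into levels(), and by default the levels are sorted alphabetically. The following is a minimal sketch with made-up values (the vectors orig and codes are hypothetical, not taken from data.csv). It also shows how to attach labels explicitly with the levels and labels arguments of factor(), so the mapping is written down instead of implied.

```r
# Hypothetical label column, as it might look after reading a CSV
orig <- factor(c("not-c", "c", "c"))
levels(orig)      # "c" "not-c"  (alphabetical by default)
as.integer(orig)  # 2 1 1        (position of each value in levels(orig))

# Going the other way: turn bare codes back into labelled values.
# levels = which codes exist, labels = what each code should be called.
codes <- c(2, 2, 1, 1, 2)
labelled <- factor(codes, levels = c(1, 2), labels = c("c", "not-c"))
as.character(labelled)  # "not-c" "not-c" "c" "c" "not-c"
```

Under these assumptions, 1 maps to "c" and 2 maps to "not-c", which matches the order used with levels() in the answer that follows.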
It is very easy to change the factor levels and also not get confused about which is which:
Example data:
> a <- factor(rep(c(1,2,1),50))
> a
[1] 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2
[75] 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1 2 1 1
[149] 2 1
Levels: 1 2
#this will help later as a verification
#this counts the instances for 1 and 2
> table(a)
a
  1   2
100  50
So as you can see above, the order of the levels is 1 first and 2 second. When you change the levels (below), the order remains the same:
#the assignment function levels can be used to change the levels
#the order will remain the same i.e. 'c' for '1' and 'not-c' for '2'
levels(a) <- c('c', 'not-c')
> a
[1] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[25] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[49] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[73] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[97] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[121] c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c c not-c c
[145] c not-c c c not-c c
Levels: c not-c
And this is the verification:
> table(a)
a
    c not-c
  100    50
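An alternative to relabelling afterwards is to hand knn() the label column as a factor that still carries its original labels, since knn() returns a factor with the same levels as cl. The sketch below is only an illustration: the data frame is synthetic, the column names x1, x2 and label and the row split are assumptions, and it relies on the class package (usually installed alongside R) being available.

```r
library(class)  # provides knn()

set.seed(1)
n <- 60
# Synthetic stand-in for data.csv: two numeric features and a label column
data <- data.frame(
  x1 = c(rnorm(n / 2, mean = 0), rnorm(n / 2, mean = 4)),
  x2 = c(rnorm(n / 2, mean = 0), rnorm(n / 2, mean = 4)),
  label = rep(c("c", "not-c"), each = n / 2),
  stringsAsFactors = TRUE
)

idx <- sample(n)           # shuffle rows so both classes land in each split
train_rows <- idx[1:40]
test_rows <- idx[41:60]

trainSample <- as.matrix(data[train_rows, 1:2])
testSample <- as.matrix(data[test_rows, 1:2])
cl <- data[train_rows, 3]  # the factor itself, labels intact

pred <- knn(trainSample, testSample, cl, k = 5)
levels(pred)               # same levels as cl
table(pred)
```

If the predictions have already been computed with numeric levels, the levels() assignment shown above remains the simpler fix.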
Subscripted assignment also works. For example, here's a factor:
> a <- factor(sample(letters[1:5],100,replace=T))
> a
[1] a d d d d a d d a b a b e a c d a c a a b e e d a e d e e a a c a a a b a
[38] b b a a e b d b c a a a b e b c e d d b b c c a b a d c b c c d e b d e d
[75] a a a b e e c b c b c c d d e e d a e e e b c e b e
Levels: a b c d e
Now, let's give a couple of those levels new names:
> levels(a)[c(2,4)] <- c('y','z')
> a
[1] a z z z z a z z a y a y e a c z a c a a y e e z a e z e e a a c a a a y a
[38] y y a a e y z y c a a a y e y c e z z y y c c a y a z c y c c z e y z e z
[75] a a a y e e c y c y c c z z e e z a e e e y c e y e
Levels: a y c z e
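Positions such as c(2,4) depend on the current order of the levels, so a variation some people prefer is to pick the level by its current name. This is a small sketch on a made-up factor (the values are hypothetical and shorter than the sample above), using only levels() and logical subscripting:

```r
a <- factor(c("a", "b", "d", "b", "e"))
levels(a)  # "a" "b" "d" "e"

# Rename by current name rather than by position
levels(a)[levels(a) == "b"] <- "y"
levels(a)[levels(a) == "d"] <- "z"

levels(a)        # "a" "y" "z" "e"
as.character(a)  # "a" "y" "z" "y" "e"
```

If a name does not match any level, the subscript selects nothing and the factor is left unchanged, so it is worth checking levels(a) afterwards.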