I have a new question related to my earlier topic, "deleting outlier in r with account of nominal var". In the new case, the variables x and x1 have different lengths:
x <- c(-10, 1:6, 50)
x1<- c(-20, 1:5, 60)
z<- c(1,2,3,4,5,6,7,8)
bx <- boxplot(x)
bx$out
bx1 <- boxplot(x1)
bx1$out
x<- x[!(x %in% bx$out)]
x1 <- x1[!(x1 %in% bx1$out)]
x_to_remove<-which(x %in% bx$out)
x <- x[!(x %in% bx$out)]
x1_to_remove<-which(x1 %in% bx1$out)
x1 <- x1[!(x1 %in% bx1$out)]
z<-z[-unique(c(x_to_remove,x1_to_remove))]
z
data.frame(cbind(x,x1,z))
Then I get the warning:
Warning message:
In cbind(x, x1, z) :
number of rows of result is not a multiple of vector length (arg 2)
So in the new data frame the observations of z do not correspond to x and x1. How can I solve this problem? The solution from "Rsolnp: In cbind(temp, funv) : number of rows of result is not a multiple of vector length (arg 1)" does not help me, or am I just doing something wrong?
x_to_remove<-which(x %in% bx$out)
x <- x[!(x %in% bx$out)]
x1_to_remove<-which(x1 %in% bx1$out)
x1 <- x1[!(x1 %in% bx1$out)]
z<-z[-unique(c(x_to_remove,x1_to_remove))]
z
d=data.frame(cbind(x,x1,z))
d
The result is wrong:
Warning message:
In cbind(x, x1, z) :
number of rows of result is not a multiple of vector length (arg 2)
d
x x1 z
1 1 1 2
2 2 2 3
3 3 3 4
4 4 4 5
5 5 5 6
6 6 1 2
How can I get this output for the 3 columns?
NA NA NA
1 1 2
2 2 3
3 3 4
4 4 5
5 5 6
NA NA NA
NA NA NA
The sixth row of d is superfluous.
The different lengths of the original x, x1 and z vectors are the first problem: how can you tell which z value belongs to which x and x1 value?
x <- c(-10, 1:6, 50)
x1<- c(-20, 1:5, 60)
z<- c(1,2,3,4,5,6,7,8)
length(x)
[1] 8
length(x1)
[1] 7
length(z)
[1] 8
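The warning and the superfluous sixth row both come from how cbind() treats vectors of unequal length. A minimal sketch of that behaviour, using two illustrative vectors a and b (hypothetical names, with the same lengths as the cleaned x and x1):

```r
a <- 1:6   # same length as x after outlier removal
b <- 1:5   # same length as x1 after outlier removal

# cbind() recycles the shorter vector and only warns about it
m <- suppressWarnings(cbind(a, b))
m[6, ]     # the sixth row reuses b[1]

# data.frame() refuses to recycle 5 values into 6 rows and raises an error
res <- tryCatch(data.frame(a, b), error = function(e) conditionMessage(e))
res
```

Building the data frame directly with data.frame(x, x1, z), without the cbind() step, would therefore have stopped with an error instead of silently misaligning the rows.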
Another problem is here:
x<- x[!(x %in% bx$out)] #remove this
x1 <- x1[!(x1 %in% bx1$out)] #remove this
x_to_remove<-which(x %in% bx$out)
x <- x[!(x %in% bx$out)]
x1_to_remove<-which(x1 %in% bx1$out)
x1 <- x1[!(x1 %in% bx1$out)]
You clean x and x1 before calculating x_to_remove and x1_to_remove, so the which() calls no longer find any outliers.
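The effect of the ordering can be checked with a small sketch. It uses boxplot.stats(), which applies the same outlier rule as boxplot() without drawing a plot; the names x_clean and too_late are only illustrative.

```r
x <- c(-10, 1:6, 50)
z <- c(1, 2, 3, 4, 5, 6, 7, 8)
out <- boxplot.stats(x)$out            # outliers of x

# Correct order: locate the outliers first, then clean
x_to_remove <- which(x %in% out)       # positions of the outliers in x
x_clean <- x[!(x %in% out)]

# Wrong order: searching the already cleaned vector finds nothing
too_late <- which(x_clean %in% out)    # empty index

z[-x_to_remove]   # z without the outlier positions
z[-too_late]      # empty: a negative empty index selects nothing
```

Note the last line: with an empty index, z[-too_late] does not return z unchanged but an empty vector, so the wrong order would discard all of z.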
EDIT: To achieve your desired output, try this code (added lines are marked in the comments):
x <- c(-10, 1:6, 50)
x1<- c(-20, 1:5, 60)
z<- c(1,2,3,4,5,6,7,8)
length_max<-min(length(x),length(x1),length(z)) #Added: shortest original length, used as the total number of rows
bx <- boxplot(x)
bx1 <- boxplot(x1)
x_to_remove<-which(x %in% bx$out)
x <- x[!(x %in% bx$out)]
x1_to_remove<-which(x1 %in% bx1$out)
x1 <- x1[!(x1 %in% bx1$out)]
z<-z[-unique(c(x_to_remove,x1_to_remove))]
length_min<-min(length(x),length(x1),length(z)) #Minimum length after outlier removal
d=data.frame(cbind(x[1:length_min],x1[1:length_min],z[1:length_min])) #Bind columns
colnames(d)<-c("x","x1","z")
d_NA<-as.data.frame(matrix(rep(NA,(length_max-length_min)*3),nrow=(length_max-length_min))) #Create NA rows
colnames(d_NA)<-c("x","x1","z")
d<-rbind(d,d_NA) #Your desired output
d
x x1 z
1 1 1 2
2 2 2 3
3 3 3 4
4 4 4 5
5 5 5 6
6 NA NA NA
7 NA NA NA
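If the NA rows should stay at the positions of the removed observations, as in the output requested in the question, one possible alternative is to blank out whole rows instead of dropping values. This is only a sketch under a strong assumption: the missing x1 value is taken to belong to the last row, so x1 is padded with NA at the end. The helper is_out and the variable bad are illustrative names, not part of the original code.

```r
x  <- c(-10, 1:6, 50)
x1 <- c(-20, 1:5, 60)
z  <- c(1, 2, 3, 4, 5, 6, 7, 8)

# Assumption: shorter vectors are padded with NA at the end
n <- max(length(x), length(x1), length(z))
length(x)  <- n
length(x1) <- n
length(z)  <- n

d <- data.frame(x = x, x1 = x1, z = z)

# boxplot.stats() uses the same outlier rule as boxplot(), without plotting
is_out <- function(v) v %in% boxplot.stats(v)$out

# A row is dropped if x or x1 is an outlier or missing
bad <- is_out(d$x) | is_out(d$x1) | is.na(d$x) | is.na(d$x1)
d[bad, ] <- NA
d
```

With the data above, rows 1, 7 and 8 become NA and rows 2 to 6 keep their x, x1 and z values side by side.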