Preserving many columns when using gather

Tags: r, tidyr

I have a very wide data frame (85 columns) that I want to convert to long format using gather. Rather than typing every column I want to preserve inside -c(...), I stored the column names in an object and passed that instead, which gives this error:

Error in -c(KeepThese) : invalid argument to unary operator

For example, using iris with a few additional fields

library(tidyr)
iris$Season <- sample(c("AAA", "BBB"), nrow(iris), replace = TRUE)
iris$Var <- sample(c("CCC", "DDD"), nrow(iris), replace = TRUE)

> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species Season Var
1          5.1         3.5          1.4         0.2  setosa    AAA DDD
2          4.9         3.0          1.4         0.2  setosa    AAA CCC
3          4.7         3.2          1.3         0.2  setosa    BBB CCC
4          4.6         3.1          1.5         0.2  setosa    BBB CCC
5          5.0         3.6          1.4         0.2  setosa    BBB DDD
6          5.4         3.9          1.7         0.4  setosa    AAA DDD

I want to gather all the columns except 5:7, whose names are stored in the object below.

KeepThese <- colnames(iris)[5:7]

I then try to gather every column except those three, naming the key column Part and the value column Value. The following code raises the error.

dat <- iris %>% gather(Part, Value, -c(KeepThese))


Error in -c(KeepThese) : invalid argument to unary operator

In tidyr, how can I specify a set of columns that I do not want to gather without writing each one out?

ADDITION: Why does my code not work?

B. Davis asked Jan 06 '23 20:01


2 Answers

Updated Answer: As noted in the comment by Hadley, one_of() is what you want.

dat <- iris %>% gather(Part, Value, -one_of(KeepThese))
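
On the "why does my code not work" part: a likely explanation, assuming the tidyr version in the question evaluates the selection with column names standing in for column positions, is that KeepThese is not itself a column. It is looked up as an ordinary character vector, and base R cannot apply unary minus to character data. The sketch below reproduces only that base R behaviour; it does not call gather(), and newer tidyr versions may treat -c(KeepThese) differently.

```r
# The column names to keep, as in the question.
KeepThese <- c("Species", "Season", "Var")

# Negating a character vector fails in base R; capture the message.
msg <- tryCatch(
  -c(KeepThese),
  error = function(e) conditionMessage(e)
)
print(msg)

# Negating numeric positions is fine, which is why -c(5, 6, 7)
# or a helper that resolves names to positions can work.
pos <- c(5L, 6L, 7L)
neg <- -pos
print(neg)
```

one_of() sidesteps the problem because it resolves the character vector to column positions before the minus is applied.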

Original Answer:

Another option is to use as.name(). We can build a list of name objects (symbols) from the column names we want to keep, then use do.call("c", ...) to combine them and pass the result to gather().

dat <- iris %>% gather(Part, Value, -do.call("c", lapply(KeepThese, as.name)))
head(dat)
#   Species Season Var         Part Value
# 1  setosa    AAA CCC Sepal.Length   5.1
# 2  setosa    AAA CCC Sepal.Length   4.9
# 3  setosa    AAA DDD Sepal.Length   4.7
# 4  setosa    AAA CCC Sepal.Length   4.6
# 5  setosa    AAA CCC Sepal.Length   5.0
# 6  setosa    AAA DDD Sepal.Length   5.4

Alternatively, a simple %in% with which() would also do it (quite similar to jbaums' answer).

iris %>% gather(Part, Value, -which(names(.) %in% KeepThese))
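
For readers on tidyr 1.0.0 or later, where gather() is superseded by pivot_longer() and one_of() by all_of() and any_of(), a roughly equivalent call might look like the sketch below. It assumes a recent tidyr is installed; iris2 is a name introduced here so the built-in iris stays untouched, and set.seed() is added only to make the sampled columns reproducible.

```r
library(tidyr)

# Rebuild the example data from the question on a copy of iris.
set.seed(1)  # assumption: added for reproducibility, not in the question
iris2 <- iris
iris2$Season <- sample(c("AAA", "BBB"), nrow(iris2), replace = TRUE)
iris2$Var <- sample(c("CCC", "DDD"), nrow(iris2), replace = TRUE)

KeepThese <- colnames(iris2)[5:7]

# Pivot every column except the ones named in KeepThese.
dat <- iris2 %>%
  pivot_longer(
    cols = -all_of(KeepThese),
    names_to = "Part",
    values_to = "Value"
  )
head(dat)
```

Note that pivot_longer() orders the output rows differently from gather(), so head(dat) will not match the output shown above row for row.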
Rich Scriven answered Jan 09 '23 10:01


You could use match (or pass the column numbers to gather in the first instance):

dat <- iris %>% gather(Part, Value, -(match(KeepThese, colnames(.))))
head(dat)

##   Species Season Var         Part Value
## 1  setosa    BBB DDD Sepal.Length   5.1
## 2  setosa    AAA CCC Sepal.Length   4.9
## 3  setosa    BBB CCC Sepal.Length   4.7
## 4  setosa    AAA CCC Sepal.Length   4.6
## 5  setosa    BBB DDD Sepal.Length   5.0
## 6  setosa    BBB CCC Sepal.Length   5.4
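
The match() and which(... %in% ...) approaches differ slightly when a name is absent or the names are out of order. The base R sketch below illustrates this; the vector keep and its "Missing" entry are hypothetical and only there to show the difference.

```r
# Column names of the example data frame from the question.
cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width",
          "Species", "Season", "Var")

# Hypothetical names to keep: out of order, and one that does not exist.
keep <- c("Var", "Species", "Missing")

# match() follows the order of `keep` and returns NA for unknown names.
by_match <- match(keep, cols)
print(by_match)

# which(... %in% ...) follows the order of `cols` and drops unknown names.
by_which <- which(cols %in% keep)
print(by_which)
```

So with match() a misspelled name surfaces as an NA that needs handling before negating, while which() with %in% ignores it silently.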
jbaums answered Jan 09 '23 10:01