dplyr: select by name and value at the same time

Tags: select, r, dplyr
(This question is probably a duplicate, but I can't find it being asked yet...)

Using dplyr techniques, how can I select columns from a data.frame by both names & values at the same time? For example the following (which doesn't work):

> data.frame(x=4, y=6, z=3) %>%
    select_if(matches('x') | mean(.) > 5)
Error: No tidyselect variables were registered

In base R, I would do something like this:

> df <- data.frame(x=4, y=6, z=3)
> df[names(df) == 'x' | colMeans(df) > 5]
  x y
1 4 6

Ken Williams asked Jan 27 '23 11:01

2 Answers

Update: With dplyr v1.0.0 or later, wrap the value condition in where():

data.frame(x=4, y=6, z=3) %>%
  select(matches("x"), where(~mean(.) > 5))
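
One caveat worth sketching: where() expects the predicate to return a single TRUE or FALSE, so calling mean() on a non-numeric column can fail. The following is a minimal sketch, assuming dplyr 1.0.0 or later; the `label` column is hypothetical and only added here to show the guard.

```r
library(dplyr)

# `label` is a hypothetical non-numeric column, not part of the original question
df <- data.frame(x = 4, y = 6, z = 3, label = "a")

# Keep columns whose name matches "x", plus numeric columns with mean > 5.
# is.numeric() is checked first, so mean() is never called on `label`.
result <- df %>%
  select(matches("x"), where(~ is.numeric(.x) && mean(.x) > 5))

names(result)
```

Note that matches("x") is a regular-expression match, so it selects any column whose name contains "x". If only the exact column is wanted, all_of("x") is probably the safer choice.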

Original answer: You could pass both conditions to select, separated by a comma, using which(colMeans(.)) for the value test:

data.frame(x=4, y=6, z=3) %>%
  select(matches("x"), which(colMeans(.) > 5))
  x y
1 4 6

Andrew answered Jan 29 '23 01:01

We could use select_if to extract the names of the columns that satisfy the condition, then pass those names to select alongside the columns matching 'x':

data.frame(x=4, y=6, z=3) %>% 
     select(matches("x"), names(select_if(., ~ mean(.x) > 5)))
#  x y
#1 4 6

NOTE: Here we are using select_if because the OP specifically wanted an answer with it. Otherwise, it can be done in many other ways.
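
As one possible alternative, assuming dplyr 1.0.0 or later (where tidyselect accepts the boolean operators `|` and `&` between selection helpers), the two conditions can be combined in a single expression. This is a sketch only; the `xx` column is hypothetical and added to make the difference between "or" and "and" visible.

```r
library(dplyr)

# `xx` is a hypothetical extra column, not part of the original question
df <- data.frame(x = 4, y = 6, z = 3, xx = 9)

# Union: name matches "x" OR column mean exceeds 5
either <- df %>%
  select(matches("x") | where(~ mean(.x) > 5))

# Intersection: name matches "x" AND column mean exceeds 5
both <- df %>%
  select(matches("x") & where(~ mean(.x) > 5))

names(either)
names(both)
```

The union form corresponds to the `|` in the base R expression from the question, while the intersection form covers the case where a column must satisfy the name and value conditions simultaneously.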

akrun answered Jan 29 '23 01:01