
Using select-like mechanism to select variables for distinct call in dplyr

Desired results

Using simple syntax, I keep the distinct combinations of the vs and am columns while also retaining the cyl values.

library(dplyr)

data(mtcars)
dta <- mtcars[, c("vs", "am", "cyl")]
# Desired results: first row for each distinct (vs, am) pair, cyl retained
dta %>% distinct(vs, am, .keep_all = TRUE)

Desired syntax

I would like to reverse the syntax above and select the distinct observations on all columns except cyl, along the lines of the example below:

dta %>% distinct(vars(-contains("cyl")), .keep_all = TRUE)

which, naturally, does not work:

>> dta %>% distinct(vars(-contains("cyl")), .keep_all = TRUE)
   vs am cyl vars(-contains("cyl"))
1   0  1   6      ~-contains("cyl")
2   0  1   6      ~-contains("cyl")
3   1  1   4      ~-contains("cyl")
4   1  0   6      ~-contains("cyl")
5   0  0   8      ~-contains("cyl")
6   1  0   6      ~-contains("cyl")
7   0  0   8      ~-contains("cyl")
Konrad asked Oct 17 '22 08:10


1 Answer

If you don't mind not using distinct, you can use group_by_at together with slice to get your desired result, i.e.

library(dplyr)

dta %>% 
 group_by_at(vars(-cyl)) %>% 
 slice(1L)

# A tibble: 4 x 3
# Groups:   vs, am [4]
#     vs    am   cyl
#  <dbl> <dbl> <dbl>
#1     0     0     8
#2     0     1     6
#3     1     0     6
#4     1     1     4
Sotos answered Oct 21 '22 02:10
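As a possible alternative that stays with distinct, the sketch below assumes dplyr 1.0.0 or later, where across() accepts tidyselect helpers inside distinct(). It is not part of the answer above; the name res is only illustrative.

```r
library(dplyr)

data(mtcars)
dta <- mtcars[, c("vs", "am", "cyl")]

# Assumption: dplyr >= 1.0.0, which provides across().
# across(-cyl) selects every column except cyl, so rows are
# de-duplicated on vs and am; .keep_all = TRUE retains cyl
# from the first row of each combination.
res <- dta %>% distinct(across(-cyl), .keep_all = TRUE)
res
```

This should return one row per vs/am combination, in order of first appearance rather than sorted as in the group_by_at version. A helper such as -contains("cyl") can be used in place of -cyl inside across().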