How to specify columns to exclude when retaining all distinct rows?

Tags:

r

dplyr

tidyverse

How do you retain all distinct rows in a data frame while ignoring certain columns, specifying only the columns you want to exclude? Consider the example below:

library(dplyr)
# tibble() replaces the deprecated data_frame()
dat <- tibble(
    x = c("a", "a", "b"),
    y = c("c", "c", "d"),
    z = c("e", "f", "f")
)

I'd like to return a data frame with all distinct rows among variables x and y, specifying only that column z should be excluded. The result should look like the data frame returned by:

dat %>% distinct(x, y)

You might expect the following to work, but it results in an error:

dat %>% distinct(-z)

I would prefer a tidyverse solution.


asked Feb 19 '19 17:02

David Rubinger

2 Answers

Use distinct_at() with vars(-z):

library(dplyr)

dat %>%
  distinct_at(vars(-z))

Output:

# A tibble: 2 x 2
  x     y    
  <chr> <chr>
1 a     c    
2 b     d

As of dplyr 1.0.0, distinct_at() is superseded and you can use across() instead:

dat %>% 
  distinct(across(-z))
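
If the excluded column is not needed afterwards, another option (not part of the answer above, so treat it as a sketch) is to drop it with select() and then call distinct() with no arguments, which de-duplicates on every remaining column. The final line compares the result with the distinct(x, y) call from the question.

```r
library(dplyr)

dat <- tibble(
  x = c("a", "a", "b"),
  y = c("c", "c", "d"),
  z = c("e", "f", "f")
)

# Drop z first, then de-duplicate on whatever columns remain
res <- dat %>%
  select(-z) %>%
  distinct()

# Compare against naming x and y explicitly
all.equal(res, dat %>% distinct(x, y))
```

This only names the column to exclude, so it keeps working if further columns are added to dat later.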

answered Jan 12 '23 00:01

arg0naut91

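We could build the names of the columns to keep with setdiff(), convert them to symbols, and splice them into distinct():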

dat %>% 
    distinct(!!! rlang::syms(setdiff(names(.), "z")))
# A tibble: 2 x 2
#  x     y    
#  <chr> <chr>
#1 a     c    
#2 b     d
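
To make the one-liner above easier to follow, here is the same idea split into steps. The intermediate names keep and keep_syms are illustrative only and do not appear in the original answer.

```r
library(dplyr)

dat <- tibble(
  x = c("a", "a", "b"),
  y = c("c", "c", "d"),
  z = c("e", "f", "f")
)

# Names of every column except z: c("x", "y")
keep <- setdiff(names(dat), "z")

# Convert each name to a symbol so distinct() sees bare column names
keep_syms <- rlang::syms(keep)

# !!! splices the list of symbols in as separate arguments,
# so this is equivalent to distinct(dat, x, y)
res2 <- dat %>% distinct(!!!keep_syms)
```

Working from names(dat) rather than the `.` placeholder means the column list can be computed once and reused outside the pipe.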

answered Jan 11 '23 23:01

akrun
