
tidyr - unique way to get combinations (using tidyverse only)

Tags: r, tidyr, tidyverse

I want to get all unique pairwise combinations of the values in a string column of a data frame (the values themselves are unique), ideally using only the tidyverse.

Here is a dummy example:

library(tidyverse)

a <- letters[1:3] %>% 
        tibble::as_tibble()
a
#> # A tibble: 3 x 1
#>   value
#>   <chr>
#> 1     a
#> 2     b
#> 3     c

tidyr::crossing(a, a) %>% 
    magrittr::set_colnames(c("words1", "words2"))
#> # A tibble: 9 x 2
#>   words1 words2
#>    <chr>  <chr>
#> 1      a      a
#> 2      a      b
#> 3      a      c
#> 4      b      a
#> 5      b      b
#> 6      b      c
#> 7      c      a
#> 8      c      b
#> 9      c      c

Is there a way to remove the 'duplicate' combinations here? That is, the output in this example should be the following:

#> # A tibble: 3 x 2
#>   words1 words2
#>    <chr>  <chr>
#> 1      a      b
#> 2      a      c
#> 3      b      c

I was hoping there would be a nice purrr::map or filter approach to pipe into to complete the above.

EDIT: There are similar questions to this one, e.g. here, marked by @Sotos. Here I am specifically looking for tidyverse (purrr, dplyr) ways to complete the pipeline I have set up. The other answers use various other packages that I do not want to include as dependencies.

user4687531 asked Sep 29 '17 14:09


1 Answer

I wish there were a better way, but I usually use this:

library(tidyverse)

df <- tibble(value = letters[1:3])

df %>% 
  expand(value, value1 = value) %>% 
  filter(value < value1)

# # A tibble: 3 x 2
#   value value1
#   <chr> <chr> 
# 1 a     b     
# 2 a     c     
# 3 b     c  
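
The same filter should also slot into the crossing() pipeline from the question. The following is only a sketch under a few assumptions: the column names words1 and words2 are taken from the question, the columns are named directly in crossing() instead of going through magrittr::set_colnames(), and the values are unique and sortable with `<`, so each unordered pair survives exactly once and self-pairs are dropped.

```r
library(tidyr)
library(dplyr)

# Same dummy data as in the question: one column of unique strings
a <- tibble::tibble(value = letters[1:3])

# Build every ordered pair, then keep only rows where the first word
# sorts strictly before the second; this drops self-pairs (a, a) and
# mirrored pairs such as (b, a)
pairs <- crossing(words1 = a$value, words2 = a$value) %>%
  filter(words1 < words2)

pairs
```

Because string comparison with `<` follows the session's locale collation, the direction in which a given pair is kept may differ between systems for mixed-case or accented values, although each pair should still appear only once.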
CJ Yetman answered Oct 03 '22 22:10