
Turn ordered pairs into unordered pairs in a data frame with dplyr

Tags: r, dplyr

I have a data frame that looks like this:

library(dplyr)
df <- data_frame(doc.x = c("a", "b", "c", "d"),
                 doc.y = c("b", "a", "d", "c"))

So that df is:

Source: local data frame [4 x 2]

  doc.x doc.y
  (chr) (chr)
1     a     b
2     b     a
3     c     d
4     d     c

This is a list of ordered pairs: a to b but also b to a, and so on. What is a dplyr-like way to return only the unordered pairs in this data frame? That is:

  doc.x doc.y
  (chr) (chr)
1     a     b
2     c     d
Lincoln Mullen asked Sep 17 '25 10:09


2 Answers

Use pmin and pmax to sort each pair alphabetically, so that (b,a) becomes (a,b), and then filter away the duplicates.

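df %>%
    mutate(dx = pmin(doc.x, doc.y), dy = pmax(doc.x, doc.y)) %>%  # sorted ends of each pair
    distinct(dx, dy, .keep_all = TRUE) %>%  # keep the first row of each unordered pair
    select(-dx, -dy)                        # drop the helper columns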
  doc.x doc.y
  (chr) (chr)
1     a     b
2     c     d
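
A variation on the same idea, as a minimal sketch: instead of keeping the first original row of each pair, return the sorted pair itself, so a pair that only ever appears as (b,a) still comes out as (a,b). The helper names lo and hi are placeholders of my own, and tibble() is used here in place of data_frame().

```r
library(dplyr)

df <- tibble(doc.x = c("a", "b", "c", "d"),
             doc.y = c("b", "a", "d", "c"))

# lo and hi are hypothetical helper names for the sorted ends of each pair.
# Separate names are used because mutate() evaluates its arguments in order,
# so overwriting doc.x first would corrupt the later pmax() call.
pairs <- df %>%
  mutate(lo = pmin(doc.x, doc.y), hi = pmax(doc.x, doc.y)) %>%
  distinct(lo, hi) %>%              # one row per unordered pair, helper columns only
  rename(doc.x = lo, doc.y = hi)    # restore the original column names

pairs
```

Which version fits depends on whether you need the rows exactly as they appeared in the input or just a canonical form of each pair.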
Backlin answered Sep 20 '25 04:09


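An alternative using data.table. Note that this puts the larger value first, so the pairs come out as (b,a) and (d,c); swap max and min to get (a,b) and (c,d):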

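library(data.table)

df <- data.frame(doc.x = c("a", "b", "c", "d"),
                 doc.y = c("b", "a", "d", "c"), stringsAsFactors = FALSE)

setDT(df)                      # convert to a data.table by reference
df[, row := seq_len(.N)]       # row id, so max/min are evaluated one row at a time
# Within each row, put the larger value in Left and the smaller in Right
df <- df[, list(Left = max(doc.x, doc.y), Right = min(doc.x, doc.y)), by = row]
df <- df[, list(Left, Right)]  # drop the helper column
unique(df)                     # keep one row per unordered pair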
   Left Right
1:    b     a
2:    d     c
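
The per-row grouping above can probably be avoided. As a rough sketch, assuming the same two character columns, pmin and pmax compare the columns element-wise in a single call. Here the smaller value goes in Left, which is my choice and the reverse of the output above; dt and res are placeholder names.

```r
library(data.table)

dt <- data.table(doc.x = c("a", "b", "c", "d"),
                 doc.y = c("b", "a", "d", "c"))

# pmin/pmax work element-wise across the two columns, so no row id or
# by-group step is needed; unique() then drops the repeated pairs.
res <- unique(dt[, .(Left = pmin(doc.x, doc.y), Right = pmax(doc.x, doc.y))])

res
```

On a large table this should also be noticeably faster than grouping by row, since it makes one vectorised call instead of one call per row.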
Chris answered Sep 20 '25 03:09