
Is there an R function that checks whether all values in a group are the same as all values in another group?

Tags: r, group, mutate

Data I have:

A B
1 a
2 c
2 e
3 f
4 h
5 c
5 e

What I want:

A B Group
1 a 1
2 c 2
2 e 2
3 f 3
4 h 4
5 c 2
5 e 2

Code I attempted:

library(readxl)
library(dplyr)
library(stringr)
data1 <- read_excel("testing.xlsx")
data2 <- data1 %>% 
  group_by(A) %>% 
  group_by(B) %>% 
  mutate(Group = cur_group_id()) %>% 
  ungroup()

What I’m getting from this code:

A B Group
1 a 1
2 c 2
2 e 3
3 f 4
4 h 5
5 c 2
5 e 3

EDIT: I get the error “Can’t supply ‘.by’ when ‘.data’ is a grouped data frame.” with all of the suggestions below. The original data I am manipulating was left-joined and then grouped. How should I approach this?


2 Answers

You can try the following:

library(dplyr)
df %>%
    left_join(
        (.) %>%
            summarise(group = as.factor(toString(sort(B))), .by = A) %>%
            mutate(group = as.integer(group))
    )

Alternatively, you can use membership() from the igraph package:

library(dplyr)
library(igraph)
df %>%
    mutate(group = {
        (.) %>%
            graph_from_data_frame() %>%
            components() %>%
            membership()
    }[B])

which gives

  A B group
1 1 a     1
2 2 c     2
3 2 e     2
4 3 f     3
5 4 h     4
6 5 c     2
7 5 e     2

Bonus (for those interested in igraph):

df %>%
    graph_from_data_frame() %>%
    plot()

shows the groups.

Answered Oct 17 '25 at 17:10 by ThomasIsCoding


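library(dplyr)

data1 |>
  # Build one key per A from its sorted B values; the separator avoids
  # collisions such as c("a", "bc") vs. c("ab", "c")
  mutate(group = paste(sort(B), collapse = "|"), .by = A) |>
  # Number the distinct keys in order of first appearance
  mutate(group = cur_group_id(), .by = group)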

Output

  A B group
1 1 a     1
2 2 c     2
3 2 e     2
4 3 f     3
5 4 h     4
6 5 c     2
7 5 e     2
Answered Oct 17 '25 at 17:10 by LMc
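
On the EDIT in the question: the `.by` argument cannot be combined with a data frame that is already grouped with group_by(), which is what the error message reports. A minimal sketch of one way around it, assuming the joined data is still grouped by A, is to call ungroup() before the `.by` steps. The data frame below is a hypothetical stand-in for the joined data, the `key` column is an illustrative helper that does not appear in the original answers, and `.by` requires dplyr 1.1.0 or later. It is also worth noting that in the attempted code the second group_by(B) replaces the grouping by A rather than adding to it, which is why the IDs there follow B alone.

```r
library(dplyr)

# Hypothetical stand-in for the left-joined, still-grouped data
data1 <- data.frame(
  A = c(1, 2, 2, 3, 4, 5, 5),
  B = c("a", "c", "e", "f", "h", "c", "e")
) |>
  group_by(A)

result <- data1 |>
  ungroup() |>                                  # drop the existing grouping so .by is allowed
  mutate(key = toString(sort(B)), .by = A) |>   # one key per A: its sorted B values
  mutate(Group = match(key, unique(key))) |>    # number the keys by first appearance
  select(-key)                                  # remove the helper column

print(result)
```

If the grouping is needed afterwards, it can be restored with group_by() once the Group column has been added.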


