Hello, I have a data frame such as:
Groups COL1 COL2
G1 1 A
G1 1 C
G1 2 A
G1 2 B
G1 5 C
G1 6 C
G2 7 B
G2 7 B
G2 8 C
G3 10 C
G3 10 A
G3 11 B
G4 12 C
G4 12 C
The idea is to add a new column COL3, with something like this (pseudocode):
group_by(Groups, COL1) %>%
mutate(COL3 = COL1(A>B>C))
That is, within each combination of Groups and COL1, when the COL2 values differ: if A is present (alongside B or C), all values become A; if A is not present but B is, all values become B; and if there is only C, all values stay C (they already are).
So A > B and B > C. Here is the expected output:
Groups COL1 COL2 COL3
G1 1 A A
G1 1 C A
G1 2 A A
G1 2 B A
G1 5 C C
G1 6 C C
G2 7 B B
G2 7 B B
G2 8 C C
G3 10 C A
G3 10 A A
G3 11 B B
G4 12 C C
G4 12 C C
Does anyone have an idea?
If COL2 can be meaningfully sorted, min() should work:
df <- structure(
list(Groups = c("G1", "G1", "G1", "G1", "G1", "G1", "G2", "G2", "G2", "G3",
"G3", "G3", "G4", "G4"),
COL1 = c(1L, 1L, 2L, 2L, 5L, 6L, 7L, 7L, 8L, 10L, 10L, 11L, 12L, 12L),
COL2 = c("A", "C", "A", "B", "C", "C", "B", "B", "C", "C", "A", "B",
"C", "C")),
class = "data.frame", row.names = c(NA, -14L))
library("dplyr")
df %>%
group_by(Groups, COL1) %>%
mutate(COL3 = min(COL2))
#> # A tibble: 14 x 4
#> # Groups: Groups, COL1 [9]
#> Groups COL1 COL2 COL3
#> <chr> <int> <chr> <chr>
#> 1 G1 1 A A
#> 2 G1 1 C A
#> 3 G1 2 A A
#> 4 G1 2 B A
#> 5 G1 5 C C
#> 6 G1 6 C C
#> 7 G2 7 B B
#> 8 G2 7 B B
#> 9 G2 8 C C
#> 10 G3 10 C A
#> 11 G3 10 A A
#> 12 G3 11 B B
#> 13 G4 12 C C
#> 14 G4 12 C C
Created on 2020-05-28 by the reprex package (v0.3.0)
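The min() approach above relies on the priority matching alphabetical order. If the priority were some other order, one possible sketch is to look up each value's position in an explicit priority vector with match() and take the smallest position per group. The name `priority` and the small sample data frame are assumptions made for illustration, not part of the original answer:

```r
library(dplyr)

# Small sample in the same shape as the question's data (illustrative only)
df <- data.frame(
  Groups = c("G1", "G1", "G1", "G2", "G3", "G3"),
  COL1   = c(1L, 1L, 5L, 7L, 10L, 10L),
  COL2   = c("A", "C", "C", "B", "C", "A"),
  stringsAsFactors = FALSE
)

# Hypothetical priority vector: earlier entries win over later ones
priority <- c("A", "B", "C")

result <- df %>%
  group_by(Groups, COL1) %>%
  # match() gives each value's rank in `priority`; the smallest rank wins
  mutate(COL3 = priority[min(match(COL2, priority))]) %>%
  ungroup()
```

Reordering `priority` changes which value wins without touching the rest of the pipeline. A COL2 value missing from `priority` would produce NA for its group, so the vector should list every value that can occur.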
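I think this gives the expected result, assuming COL2 is a factor whose levels are ordered A, B, C (note that it overwrites COL2 instead of adding COL3):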
df %>%
group_by(Groups, COL1) %>%
mutate(COL2 = levels(COL2)[min(as.numeric(COL2))])
#> # Groups: Groups, COL1 [9]
#> Groups COL1 COL2
#> <fct> <int> <chr>
#> 1 G1 1 A
#> 2 G1 1 A
#> 3 G1 2 A
#> 4 G1 2 A
#> 5 G1 5 C
#> 6 G1 6 C
#> 7 G2 7 B
#> 8 G2 7 B
#> 9 G2 8 C
#> 10 G3 10 A
#> 11 G3 10 A
#> 12 G3 11 B
#> 13 G4 12 C
#> 14 G4 12 C
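The factor-based answer above only works if COL2 is a factor with its levels in priority order. As a rough sketch of that idea without dplyr, the version below sets the levels explicitly and uses base R's ave() to take the per-group minimum level code, writing the result to COL3. The sample data and the helper names `key` and `idx` are assumptions made for illustration:

```r
# Small sample in the same shape as the question's data (illustrative only)
df <- data.frame(
  Groups = c("G1", "G1", "G1", "G2", "G3", "G3"),
  COL1   = c(1L, 1L, 5L, 7L, 10L, 10L),
  COL2   = c("A", "C", "C", "B", "C", "A"),
  stringsAsFactors = FALSE
)

# Explicit level order encodes the priority A > B > C
df$COL2 <- factor(df$COL2, levels = c("A", "B", "C"))

# One key per (Groups, COL1) pair, so only observed combinations form groups
key <- paste(df$Groups, df$COL1)

# Smallest level code within each group, repeated for every row of the group
idx <- ave(as.integer(df$COL2), key, FUN = min)

# Translate the level codes back to labels in a new column
df$COL3 <- levels(df$COL2)[idx]
```

Building a single pasted key, rather than passing Groups and COL1 to ave() separately, avoids calling min() on empty group combinations.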