I have a big data.frame in R, whose oversimplified version looks like this (the real data.frame has 20 colors in the "Color" column and 10 different numbers in "Number" column:
Color Number Y
blue 1 5
blue 2 3
blue 3 2
red 1 5
red 2 8
red 3 2
green 1 2
green 2 9
green 3 3
For every color in "Color", I would like to apply a function between all combinations of numbers in the "Numbers" column, by comparing the corresponding values of the "Y" column. Lets take a simple function as example:
if x >= y, print 1, else print 0 # where x and y represent the first and second values to be compared, respectively
I would obtain this as an output data.frame:
Color Comparison Y
blue 1_vs_2 1
blue 1_vs_3 1
blue 2_vs_1 0
blue 2_vs_3 1
blue 3_vs_1 0
blue 3_vs_2 0
red 1_vs_2 0
red 1_vs_3 1
red 2_vs_1 1
red 2_vs_3 1
red 3_vs_1 0
red 3_vs_2 0
green 1_vs_2 0
green 1_vs_3 0
green 2_vs_1 1
green 2_vs_3 1
green 3_vs_1 1
green 3_vs_2 0
Have you considered SQL? You could merge the data back onto itself. If you restrict the Color
to be the same and the Number
to be different you should get every pairwise comparison you want. This is the same concept as @Psidom's answer - he just does it in a data.table
join.
library(sqldf)
res <- sqldf("SELECT l.Color, l.Number as l_number, r.Number as r_number,
case when l.Y >= r.Y then 1 else 0 end as Y
FROM df as l
INNER JOIN df as r
ON l.Color = r.Color AND
l.Number != r.Number
")
res$comparison <- paste0(res$l_number,"_vs_",res$r_number)
res
Color l_number r_number Y comparison
1 blue 1 2 1 1_vs_2
2 blue 1 3 1 1_vs_3
3 blue 2 1 0 2_vs_1
4 blue 2 3 1 2_vs_3
5 blue 3 1 0 3_vs_1
6 blue 3 2 0 3_vs_2
7 red 1 2 0 1_vs_2
8 red 1 3 1 1_vs_3
9 red 2 1 1 2_vs_1
10 red 2 3 1 2_vs_3
11 red 3 1 0 3_vs_1
12 red 3 2 0 3_vs_2
13 green 1 2 0 1_vs_2
14 green 1 3 0 1_vs_3
15 green 2 1 1 2_vs_1
16 green 2 3 1 2_vs_3
17 green 3 1 1 3_vs_1
18 green 3 2 0 3_vs_2
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With