Sort two columns, conditionally select values, then run cumsum frequency

My data looks like this

a   b   c
1   1   0
1   2   8
2   1   0
2   2   2
3   1   3
3   2   3
4   1   7
4   2   4
5   1   3
5   2   5
6   1   1
6   2   8
7   1   1
7   2   2

I want to sort by columns a and c so that, within each pair of rows sharing a value in column a, the second (even-numbered) row holds the larger value of column c. Then I want to take those rows and store them in a new object. The result should look something like this:

a   c   b
1   8   2
2   2   2
3   3   2
4   7   1
5   5   2
6   8   2
7   2   2
Provisional.Modulation asked Dec 09 '22 05:12


1 Answer

With the data.table package you can sort your data by reference using setorder or setkey, without creating a copy and reassigning it with <-:

library(data.table)
setorder(setDT(df), a, c)[]
#     a b c
#  1: 1 1 0
#  2: 1 2 8
#  3: 2 1 0
#  4: 2 2 2
#  5: 3 1 3
#  6: 3 2 3
#  7: 4 2 4
#  8: 4 1 7
#  9: 5 1 3
# 10: 5 2 5
# 11: 6 1 1
# 12: 6 2 8
# 13: 7 1 1
# 14: 7 2 2
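
The question only shows the printed table, so as a minimal sketch for anyone who wants to reproduce the sort, the sample data can be rebuilt like this. The construction of df (an ordinary data.frame with numeric columns) is an assumption, not something stated in the post.

```r
library(data.table)

# Sample data from the question. How df was originally created is not
# shown in the post, so this construction is an assumption.
df <- data.frame(
  a = rep(1:7, each = 2),
  b = rep(1:2, times = 7),
  c = c(0, 8, 0, 2, 3, 3, 7, 4, 3, 5, 1, 8, 1, 2)
)

# setDT() converts the data.frame to a data.table in place;
# setorder() then sorts ascending by a, and by c within each a,
# again without making a copy.
setorder(setDT(df), a, c)

# Inspect one pair: after sorting, the second row holds the larger c.
df[a == 4]
```

Because both calls work by reference, df itself is modified and no reassignment is needed.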

Once the data is sorted, the last row of each group in a holds the largest c, so you can extract it in several simple ways. For example:

df[duplicated(a)]
#    a b c
# 1: 1 2 8
# 2: 2 2 2
# 3: 3 2 3
# 4: 4 1 7
# 5: 5 2 5
# 6: 6 2 8
# 7: 7 2 2

Or maybe

df[, tail(.SD, 1), a]
#    a b c
# 1: 1 2 8
# 2: 2 2 2
# 3: 3 2 3
# 4: 4 1 7
# 5: 5 2 5
# 6: 6 2 8
# 7: 7 2 2

Or

df[, .SD[2], a]
#    a b c
# 1: 1 2 8
# 2: 2 2 2
# 3: 3 2 3
# 4: 4 1 7
# 5: 5 2 5
# 6: 6 2 8
# 7: 7 2 2
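
The three approaches agree here only because every value of a appears exactly twice. The following is a rough sketch of how they diverge otherwise; the table dt is rebuilt from the question's sample, and dt3 with its extra row (a = 1, b = 3, c = 9) is a hypothetical addition for illustration only.

```r
library(data.table)

# Sample data rebuilt from the question (construction is an assumption).
dt <- data.table(
  a = rep(1:7, each = 2),
  b = rep(1:2, times = 7),
  c = c(0, 8, 0, 2, 3, 3, 7, 4, 3, 5, 1, 8, 1, 2)
)
setorder(dt, a, c)

by_dup  <- dt[duplicated(a)]            # every row after the first per a
by_tail <- dt[, tail(.SD, 1), by = a]   # last row per a
by_idx  <- dt[, .SD[2], by = a]         # second row per a

# With exactly two rows per group, all three select the same rows.
all.equal(by_dup, by_tail)
all.equal(by_dup, by_idx)

# Hypothetical case: a third row for a = 1.
dt3 <- rbind(dt, data.table(a = 1L, b = 3L, c = 9))
setorder(dt3, a, c)

n_dup  <- nrow(dt3[duplicated(a)])           # keeps two rows for a = 1
n_tail <- nrow(dt3[, tail(.SD, 1), by = a])  # still one row per a
```

If group sizes can vary, tail(.SD, 1) is the variant that always returns the row with the largest c after sorting; duplicated(a) returns extra rows and .SD[2] returns the second-smallest.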

P.S. If you want to change the order of the columns, you can also do this by reference using the setcolorder function, e.g.,

setcolorder(df, c("a", "c", "b"))
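
As a small illustration of what "by reference" means here, the sketch below reorders the columns of a toy table without assigning the result back. The table dt and its values are made up for the example.

```r
library(data.table)

# Toy table, invented for illustration.
dt <- data.table(a = 1:2, b = c(2L, 1L), c = c(8, 7))

# No `dt <- ...` needed: setcolorder() rearranges the columns of dt itself.
setcolorder(dt, c("a", "c", "b"))

names(dt)
```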
David Arenburg answered Dec 11 '22 12:12