Is it possible to store the order of rows in a data.table while preserving its keys?
Let's say I have the following dummy table:
library(data.table)
dt <- data.table(id=letters[1:6],
                 group=sample(c("red", "blue"), replace=TRUE),
                 value.1=rnorm(6),
                 value.2=runif(6))
setkey(dt, id)
dt
   id group    value.1    value.2
1:  a  blue  1.4557851 0.73249612
2:  b   red -0.6443284 0.49924102
3:  c  blue -1.5531374 0.72977197
4:  d   red -1.5977095 0.08033604
5:  e  blue  1.8050975 0.43553048
6:  f   red -0.4816474 0.23658045
I would like to store this table so that rows are ordered by group, and then by value.1, in decreasing order, i.e.:
> dt[order(group, value.1, decreasing=T),]
   id group    value.1    value.2
1:  f   red -0.4816474 0.23658045
2:  b   red -0.6443284 0.49924102
3:  d   red -1.5977095 0.08033604
4:  e  blue  1.8050975 0.43553048
5:  a  blue  1.4557851 0.73249612
6:  c  blue -1.5531374 0.72977197
Obviously I can save this as a new variable, but I also want to keep the id column as my primary key.
Arun's answer to "What is the purpose of setting a key in data.table?" suggests that this can be achieved with clever use of setkey, since it orders the data.table in the order of its keys (although there is no option to set the key to decreasing order):
> setkey(dt, group, value.1, id)
> dt
   id group    value.1    value.2
1:  c  blue -1.5531374 0.72977197
2:  a  blue  1.4557851 0.73249612
3:  e  blue  1.8050975 0.43553048
4:  d   red -1.5977095 0.08033604
5:  b   red -0.6443284 0.49924102
6:  f   red -0.4816474 0.23658045
However, I lose the ability to use id as my primary key, because group is the first key provided:
> dt["a"]
   group id value.1 value.2
1:     a NA      NA      NA
Sounds like you simply want to modify print.data.table:
print.data.table = function(x, ...) {
  # put whatever condition identifies your tables here
  if ("group" %in% names(x) && "value.1" %in% names(x)) {
    # print a reordered copy; x itself and its key are left untouched
    data.table:::print.data.table(x[order(group, value.1, decreasing = TRUE)], ...)
  } else {
    data.table:::print.data.table(x, ...)
  }
}
set.seed(2)
dt = data.table(id=letters[1:6],
                group=sample(c("red", "blue"), replace=TRUE),
                value.1=rnorm(6),
                value.2=runif(6))
setkey(dt, id)
dt
#   id group     value.1    value.2
#1:  a   red  0.18484918 0.40528218
#2:  e   red  0.13242028 0.44480923
#3:  c   red -1.13037567 0.97639849
#4:  b  blue  1.58784533 0.85354845
#5:  f  blue  0.70795473 0.07497942
#6:  d  blue -0.08025176 0.22582546
dt["c"]
#   id group   value.1   value.2
#1:  c   red -1.130376 0.9763985
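If you would rather have the rows physically stored in that order instead of only printed that way, a different route may be worth trying. This is a minimal sketch, not the print override above, and it assumes a data.table version that supports the on= argument (1.9.6 or later): sort by reference with setorder(), which accepts a minus sign for decreasing order, and look rows up by id with an ad hoc join rather than a key. The group column is built with rep() here purely to keep the illustration deterministic.

```r
library(data.table)

set.seed(2)
dt <- data.table(id = letters[1:6],
                 group = rep(c("red", "blue"), 3),  # illustrative data only
                 value.1 = rnorm(6),
                 value.2 = runif(6))

# Reorder the rows by reference: group descending, then value.1 descending.
# No key is set, so the stored row order is exactly what setorder() produced.
setorder(dt, -group, -value.1)

# Look up by id without a key, using an ad hoc join on the id column.
res <- dt["a", on = "id"]
```

The trade-off is that id is no longer a key, so every lookup needs on = "id" spelled out; in exchange the stored order is whatever you asked for.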