Rank values in R data.table grouped by another variable

Tags: r, data.table, rank

I would like to use data.table's frank function to rank the date column within each id. However, my rankings only seem to take the date column into account and not the id it belongs to. I also receive six of these warnings, which I don't understand:

1..... 6: In `[.data.table`(dups, , `:=`(rank, frank(dups, date, ties.method = "average")), : RHS 1 is length 10 (greater than the size (1) of group 6). The last 9 element(s) will be discarded.


library(data.table)
library(lubridate)  # for mdy()

dups <- data.table(id = c('11', '11', '11', '22', '22',
                          '88', '99', '44', '44', '55'),
                   date = mdy(c("1-01-2016", "1-02-2016", "1-02-2016",
                                "2-01-2016", "2-02-2016")))

so.sample <- dups[, rank := frank(dups, date, ties.method = "average"), by = id]

For example, id = 11 and date = 2016-01-01 should rank 1 instead of 1.5, because there is only one id and date with that combination.

Thanks for any help.


asked May 18 '16 16:05

user3067851

1 Answer

It works just fine with both rank and frank. Maybe your date variable was not formatted correctly. Here is the code:


library(data.table)

# The 5 dates are recycled across the 10 ids
dt1 <- data.table(id = c('11', '11', '11', '22', '22',
                         '88', '99', '44', '44', '55'),
                  date = as.Date(c("01-01-2016",
                                   "01-02-2016",
                                   "01-02-2016",
                                   "02-01-2016",
                                   "02-02-2016"),
                                 format = "%m-%d-%Y"))
# Sort by id, then by date within each id
setkey(dt1, id, date)
dt1

    id       date
 1: 11 2016-01-01
 2: 11 2016-01-02
 3: 11 2016-01-02
 4: 22 2016-02-01
 5: 22 2016-02-02
 6: 44 2016-01-02
 7: 44 2016-02-01
 8: 55 2016-02-02
 9: 88 2016-01-01
10: 99 2016-01-02

dt1[, rank := frank(date),
    by = list(id)]
dt1

    id       date  rank
 1: 11 2016-01-01   1.0
 2: 11 2016-01-02   2.5
 3: 11 2016-01-02   2.5
 4: 22 2016-02-01   1.0
 5: 22 2016-02-02   2.0
 6: 44 2016-01-02   1.0
 7: 44 2016-02-01   2.0
 8: 55 2016-02-02   1.0
 9: 88 2016-01-01   1.0
10: 99 2016-01-02   1.0
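The warnings in the question are consistent with frank(dups, date, ...) being handed the whole table inside every group: the call returns one rank per row of dups (length 10) regardless of the current id, which does not fit into a group of size 1. Passing the date column itself, as above, ranks only the current group's values. A minimal sketch of the difference follows; the names dt, whole_table and rank_by_id are illustrative and not from the original post, and the dates are written out in full rather than recycled.

```r
library(data.table)

# Same ids and dates as the example data, with all 10 dates spelled out.
dt <- data.table(
  id   = c("11", "11", "11", "22", "22", "88", "99", "44", "44", "55"),
  date = as.Date(c("2016-01-01", "2016-01-02", "2016-01-02",
                   "2016-02-01", "2016-02-02", "2016-01-01",
                   "2016-01-02", "2016-01-02", "2016-02-01",
                   "2016-02-02"))
)

# Passing the table: ranks all 10 rows by date and ignores id.
# The two 2016-01-01 rows tie for ranks 1 and 2, hence 1.5.
whole_table <- frank(dt, date, ties.method = "average")
print(whole_table)

# Passing the column inside by = id: only the group's own dates are ranked.
dt[, rank_by_id := frank(date, ties.method = "average"), by = id]
print(dt)
```

With the table as first argument, the first row gets 1.5 because it shares its date with the id 88 row; ranked within id 11 only, it gets 1.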

Additionally, if you just want to enumerate your records, using .N can be quite helpful:


dt1[, Visit := seq_len(.N),
    by = list(id)]
dt1

    id       date rank Visit
 1: 11 2016-01-01  1.0     1
 2: 11 2016-01-02  2.5     2
 3: 11 2016-01-02  2.5     3
 4: 22 2016-02-01  1.0     1
 5: 22 2016-02-02  2.0     2
 6: 44 2016-01-02  1.0     1
 7: 44 2016-02-01  2.0     2
 8: 55 2016-02-02  1.0     1
 9: 88 2016-01-01  1.0     1
10: 99 2016-01-02  1.0     1
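Note that the .N enumeration follows the current row order, so it matches date order only because the table was sorted first. If the rows are not sorted, frank with a different ties.method may be an alternative. The sketch below is an illustration on a small, deliberately unsorted table; dt and the rank_* column names are made up for this example.

```r
library(data.table)

# Two ids, rows deliberately not in date order; id 11 has a tied date.
dt <- data.table(
  id   = c("11", "11", "11", "22", "22"),
  date = as.Date(c("2016-01-02", "2016-01-01", "2016-01-02",
                   "2016-02-02", "2016-02-01"))
)

dt[, `:=`(
  rank_avg   = frank(date, ties.method = "average"),  # ties share the mean rank
  rank_dense = frank(date, ties.method = "dense"),    # ties share a rank, no gaps
  rank_first = frank(date, ties.method = "first")     # ties broken by row order
), by = id]
print(dt)
```

Here ties.method = "first" gives a gap-free 1, 2, 3, ... numbering by date within each id without sorting the table, while "dense" keeps tied dates on the same rank.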

I hope this helps.


answered Oct 15 '22 16:10

Davit Sargsyan
