lagging in data.table R

Tags:

data.table

Currently I have a utility function that lags things in data.table by group. The function is simple:

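panel_lag <- function(var, k) {
  if (k > 0) {
    # Lag: shift past values forward by k positions, padding the start with NA
    return(c(rep(NA, k), head(var, -k)))
  } else if (k < 0) {
    # Lead: shift future values backward by |k| positions, padding the end with NA
    return(c(tail(var, k), rep(NA, -k)))
  } else {
    # k == 0: nothing to shift
    return(var)
  }
}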

I can then call this from a data.table:

library(data.table)

x = data.table(a=1:10,
               dte=sample(seq.Date(from=as.Date("2012-01-20"),
                                   to=as.Date("2012-01-30"), by=1),
                          10))
x[, L1_a:=panel_lag(a, 1)]  # Wrong: x isn't keyed by date, so this lags by row position
setkey(x, dte)              # Sorts x by dte
x[, L1_a:=panel_lag(a, 1)]  # Correct: rows are now in date order

This requires that I check inside panel_lag whether x is keyed. Is there a better way to do lagging? The tables tend to be large, so they should really be keyed anyway. At the moment I just call setkey before I lag, but I would like to make sure I don't forget to key them. Is there a standard way people do this?
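For illustration, a check like that could be sketched as below. The helper name assert_keyed_by is hypothetical, not part of data.table; it relies only on key(), which returns the key columns of a table or NULL when there is none. The toy table is also made up for the example.

```r
library(data.table)

# Hypothetical helper: stop unless `dt` is keyed with `col` as its first key column
assert_keyed_by <- function(dt, col) {
  k <- key(dt)  # NULL when the table has no key
  if (is.null(k) || k[1] != col) {
    stop("table must be keyed by '", col, "' before lagging")
  }
  invisible(TRUE)
}

# Toy table whose rows are deliberately not in date order
x <- data.table(a = 1:3, dte = as.Date("2012-01-20") + c(2, 0, 1))

setkey(x, dte)             # sorts x by dte and marks it as keyed
assert_keyed_by(x, "dte")  # passes silently; it would stop() on an unkeyed table
```

Calling the helper right before the lag turns a forgotten setkey into an error instead of a silently wrong column.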


asked Jan 17 '13 16:01

Alex

1 Answer

If you want to ensure that you lag in order of some other column, you could use the order function:

x[order(dte),L1_a:=panel_lag(a,1)]

Though if you're doing a lot of things in date order it would make sense to key it that way.
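As a small sketch of the order approach, here is a deterministic example. The three-row table is made up for illustration, and panel_lag is condensed from the question. The last line assumes a data.table version that provides the built-in shift() function, which lags by default and could replace the helper.

```r
library(data.table)

# Condensed version of the question's helper (lag for k > 0, lead for k < 0)
panel_lag <- function(var, k) {
  if (k > 0) c(rep(NA, k), head(var, -k)) else c(tail(var, k), rep(NA, -k))
}

# Toy data: rows are deliberately NOT in date order
x <- data.table(a   = c(3L, 1L, 2L),
                dte = as.Date(c("2012-01-22", "2012-01-20", "2012-01-21")))

# Lag `a` in date order; the table itself is neither reordered nor keyed
x[order(dte), L1_a := panel_lag(a, 1)]

# Same result with data.table's built-in shift() (assumes a version that has it)
x[order(dte), L1_shift := shift(a, 1)]

print(x)  # L1_a and L1_shift are both 2, NA, 1 in the original row order
```

The earliest date (2012-01-20) gets NA, and every other row gets the value of `a` from the previous date, regardless of where the row sits in the table.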

answered Oct 14 '22 02:10

Blue Magister
