Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Create new lagged data.frame column

Tags:

r

I'm not understanding how to create a new 'lagged' column in a data.frame. My current data is collected at the end of the data. One program I need to send this to assumes it's collected first thing in the morning, so I need to lag column 2 by 1 row. The code I wrote just returns the same data.

How can I do this properly?

Thanks.

D8 = structure(list(Date = structure(c(14396, 14397, 14398, 14399, 
14400, 14403, 14404, 14405, 14406,
14407, 14410, 14411, 14412,  14413,
14414, 14417, 14418, 14419, 14420,
14421, 14424, 14425,  14426, 14427,
14428, 14431, 14432, 14433, 14434,
14435), class = "Date"), 
    PL8 = c(0, 0, 0, 0, 76, 0, -334, -974, -104, 356, 378, -1102, 
    -434, 266, -434, 444, 464, 0, 486, 406, -224, -214, 0, -4, 
    0, -188, 356, 322, -484, 436)), .Names = c("Date", "PL8"), row.names =
c(NA,  30L), class = "data.frame")


D8

D8[,3] = lag(D8[,2],k=-1)

D8
like image 216
LGTrader Avatar asked Mar 19 '11 18:03

LGTrader


People also ask

How do I create a lag column in pandas?

You can use the shift() function in pandas to create a column that displays the lagged values of another column. Note that the value in the shift() function indicates the number of values to calculate the lag for.

How do you create a new data frame from an existing one?

You can create a new DataFrame of a specific column by using DataFrame. assign() method. The assign() method assign new columns to a DataFrame, returning a new object (a copy) with the new columns added to the original ones.


2 Answers

Try this:

transform(D8, PL8.lag = c(PL8[-1], NA))

It would be a bit easier if you used a time series class. In that case you could use lag:

library(zoo)
z <- read.zoo(D8)
lag(z, 0:1)

In the other direction we would have:

transform(D8, PL8.lag = c(NA, head(PL8, -1)))

and

lag(z, 0:-1)
like image 172
G. Grothendieck Avatar answered Oct 06 '22 09:10

G. Grothendieck


Here is an alternative solution:

D8$my.PL8.lag <- c(D8$PL8[-1], NA)

Here is code to lag a column by group using tapply:

my.df = read.table(text = "
    REFNO  MONTH   DAY   YEAR   STATE
       1       3     5   2012      1
       1       3     7   2012      2
       1       3    10   2012      3
       1       3    14   2012     NA
       2       3     1   2012     20
       2       3    10   2012     40
       2       3    14   2012     60
       2       3    17   2012     80
       3       4     3   2012     -4
       3       4    24   2012     -8
       3       4    28   2012    -12
", header = TRUE, stringsAsFactors = FALSE)

desired.result = read.table(text = "
    REFNO  MONTH   DAY   YEAR   STATE   STATE.lag
       1       3     5   2012      1       NA
       1       3     7   2012      2        1
       1       3    10   2012      3        2
       1       3    14   2012     NA        3
       2       3     1   2012     20       NA
       2       3    10   2012     40       20
       2       3    14   2012     60       40
       2       3    17   2012     80       60
       3       4     3   2012     -4       NA
       3       4    24   2012     -8       -4
       3       4    28   2012    -12       -8
", header = TRUE, stringsAsFactors = FALSE)

my.df$STATE.lag <- unlist(tapply(my.df$STATE, my.df$REFNO, function(x) {
     c(NA, x[-length(x)])
}))

all.equal(my.df, desired.result)
# [1] TRUE

If the column you wish to lag is in Date format you can use:

my.df$MY.DATE <- do.call(paste, list(my.df$MONTH, my.df$DAY, my.df$YEAR))

my.df$MY.DATE <- as.Date(my.df$MY.DATE, format=c("%m %d %Y"))

my.df$MY.DATE.lag <- as.Date(unlist(tapply(as.character(my.df$MY.DATE), my.df$REFNO, 

      function(x) { c(NA, x[-length(x)]) } )))

   REFNO MONTH DAY YEAR STATE    MY.DATE MY.DATE.lag
1      1     3   5 2012     1 2012-03-05        <NA>
2      1     3   7 2012     2 2012-03-07  2012-03-05
3      1     3  10 2012     3 2012-03-10  2012-03-07
4      1     3  14 2012    NA 2012-03-14  2012-03-10
5      2     3   1 2012    20 2012-03-01        <NA>
6      2     3  10 2012    40 2012-03-10  2012-03-01
7      2     3  14 2012    60 2012-03-14  2012-03-10
8      2     3  17 2012    80 2012-03-17  2012-03-14
9      3     4   3 2012    -4 2012-04-03        <NA>
10     3     4  24 2012    -8 2012-04-24  2012-04-03
11     3     4  28 2012   -12 2012-04-28  2012-04-24
like image 41
Mark Miller Avatar answered Oct 06 '22 08:10

Mark Miller