I am wondering if there are any easy R commands or packages that will allow me to add variables to data.frames that are the "difference", or change over time, of existing variables.
If my data looked like this:
set.seed(1)
MyData <- data.frame(Day=0:9 %% 5+1,
Price=rpois(10,10),
Good=rep(c("apples","oranges"), each=5))
MyData
   Day Price    Good
1    1     8  apples
2    2    10  apples
3    3     7  apples
4    4    11  apples
5    5    14  apples
6    1    12 oranges
7    2    11 oranges
8    3     9 oranges
9    4    14 oranges
10   5    11 oranges
Then after "first differencing" the price variable, my data would look like this.
   Day Price    Good P1d
1    1     8  apples  NA
2    2    10  apples   2
3    3     7  apples  -3
4    4    11  apples   4
5    5    14  apples   3
6    1    12 oranges  NA
7    2    11 oranges  -1
8    3     9 oranges  -2
9    4    14 oranges   5
10   5    11 oranges  -3
Panel (data) analysis is a statistical method, widely used in social science, epidemiology, and econometrics to analyze two-dimensional (typically cross sectional and longitudinal) panel data. The data are usually collected over time and over the same individuals and then a regression is run over these two dimensions.
A simple way to view a single (or "first order") difference is to see it as x(t) - x(t-k) where k is the number of lags to go back. Higher order differences are simply the reapplication of a difference to each prior result. In R, the difference operator for xts is made available using the diff() command.
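As a rough illustration of the lag and order ideas above, here is a minimal base R sketch on a plain numeric vector. The vector x is made up for the example; base R's diff() accepts lag and differences arguments, and xts supplies its own method for its objects.

```r
# Hypothetical example vector, not taken from any real series
x <- c(8, 10, 7, 11, 14)

# First difference with lag k = 1: x(t) - x(t-1)
d1 <- diff(x)                    # 2 -3  4  3

# Lag k = 2: x(t) - x(t-2)
d_lag2 <- diff(x, lag = 2)       # -1  1  7

# Second-order difference: the first difference applied twice
d2 <- diff(x, differences = 2)   # -5  7 -1

# diff() returns a shorter vector, so pad with NA to keep the original length
d1_padded <- c(NA, d1)
```

Padding with NA is what lets the differenced values be stored as a new column alongside the original ones.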
An unbalanced panel is a dataset in which at least one panel member is not observed in every period. Standard fixed effects models can still be run on the entire unbalanced data to obtain estimates.
The first-differenced (FD) estimator is an approach that is used to address the problem of omitted variables in econometrics and statistics by using panel data.
ave
transform(MyData, P1d = ave(Price, Good, FUN = function(x) c(NA, diff(x))))
ave/gsubfn
The last solution can be shortened slightly using fn$ in the gsubfn package:
library(gsubfn)
transform(MyData, P1d = fn$ave(Price, Good, FUN = ~ c(NA, diff(x))))
dplyr
library(dplyr)
MyData %>%
group_by(Good) %>%
mutate(P1d = Price - lag(Price)) %>%
ungroup
data.table
library(data.table)
dt <- data.table(MyData)
dt[, P1d := c(NA, diff(Price)), by = Good]
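The approaches above difference consecutive rows within each Good, so they implicitly assume the rows are already ordered by Day within each group. If that is not guaranteed, a minimal base R sketch is to sort first. The names MyShuffled and MySorted are made up for this illustration, and the data are rebuilt from the printed values in the question.

```r
# Rebuild the question's data explicitly (values copied from the printed output)
MyData <- data.frame(Day = rep(1:5, 2),
                     Price = c(8, 10, 7, 11, 14, 12, 11, 9, 14, 11),
                     Good = rep(c("apples", "oranges"), each = 5))

# Hypothetical shuffled copy to mimic unsorted input
MyShuffled <- MyData[c(3, 9, 1, 7, 5, 10, 2, 6, 4, 8), ]

# Sort by Good, then Day, before differencing within each Good
MySorted <- MyShuffled[order(MyShuffled$Good, MyShuffled$Day), ]
MySorted$P1d <- ave(MySorted$Price, MySorted$Good,
                    FUN = function(x) c(NA, diff(x)))
rownames(MySorted) <- NULL
MySorted
```

The same idea carries over to the other solutions: arrange(Good, Day) before mutate() in dplyr, or setkey(dt, Good, Day) before the := step in data.table.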
dplyr now uses %>% instead of %.%.
One can easily do it like this:
library(reshape2)
library(dplyr)
MyNewData <-
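MyData %>%
melt(id = c("Good", "Day")) %>%
# one column per Good, one row per Day
dcast(Day ~ Good, value.var = "value") %>%
mutate(apples = apples - lag(apples),
oranges = oranges - lag(oranges)) %>%
# back to long format, then join the differences onto the original data
melt(id = "Day", variable.name = "Good", value.name = "P1d") %>%
merge(MyData) %>%
arrange(Good, Day)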
Regards