I need to go from this
id | date
-----------------
A | 2000-01-13
A | 2000-01-18
A | 2000-01-25
B | 2012-10-10
B | 2012-10-11
C | 2005-07-25
C | 2005-07-31
to this
id | date | days from start
---------------------------
A | 2000-01-13 | 0
A | 2000-01-18 | 5
A | 2000-01-25 | 12
A | 2000-02-08 | 26
B | 2012-10-10 | 0
B | 2012-10-11 | 1
C | 2005-07-25 | 0
C | 2005-07-31 | 6
i.e. creating a variable that holds the number of days passed since the first date, grouped by id.
Any ideas?
To find the number of days between these two dates, you can enter “=B2-B1” (without the quotes into cell B3). Once you hit enter, Excel will automatically calculate the number of days between the two dates entered.
How do I go about calculating the days between two dates? To calculate the number of days between two dates, you need to subtract the start date from the end date.
Calculate elapsed time between two dates and times To calculate the time between two dates and times, you can simply subtract one from the other.
Using data.table
: (I assume the date
column is character here. If its date
format, then you can remove the as.Date(.)
function call.
df <- structure(list(id = c("A", "A", "A", "B", "B", "C", "C"),
date = c("2000-01-13", "2000-01-18", "2000-01-25", "2012-10-10",
"2012-10-11", "2005-07-25", "2005-07-31")),
.Names = c("id", "date"), row.names = c(NA, -7L),
class = "data.frame")
require(data.table)
dt <- data.table(df, key="id")
dt[, days_from_start := cumsum(c(0, diff(as.Date(date)))),by=id]
# id date days_from_start
# 1: A 2000-01-13 0
# 2: A 2000-01-18 5
# 3: A 2000-01-25 12
# 4: B 2012-10-10 0
# 5: B 2012-10-11 1
# 6: C 2005-07-25 0
# 7: C 2005-07-31 6
You can also use a combination of functions difftime
and split
:
dat
id date
1 A 2000-01-13
2 A 2000-01-18
3 A 2000-01-25
4 B 2012-10-10
5 B 2012-10-11
6 C 2005-07-25
7 C 2005-07-31
dat$date <- as.POSIXct(dat$date)
dat$"Days spent" <- unlist(lapply(split(dat,f=dat$id),
function(x){as.numeric(difftime(x$date,x$date[1], units="days"))}))
dat
id date Days spent
1 A 2000-01-13 0
2 A 2000-01-18 5
3 A 2000-01-25 12
4 B 2012-10-10 0
5 B 2012-10-11 1
6 C 2005-07-25 0
7 C 2005-07-31 6
Following @agstudy and @Arun suggestions, this can be simplified as follows:
dat$"Days spent" <- unlist(by(dat, dat$id,
function(x)difftime(x$date,x$date[1], units= "days")))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With