How do you subtract two data frames from one another in R?

Tags: r

I have two data frames. I need to subtract the matching columns from each other for each time stamp and store the result in a different data frame:

dput(t)

structure(list(time = structure(c(2L, 1L, 3L), .Label = c("1/13/15 1:18 PM", 
"1/13/15 12:18 PM", "1/13/15 2:18 PM"), class = "factor"), web01 = c(24083L, 
24083L, 24083L), web03 = c(24083L, 24083L, 24083L)), .Names = c("time", 
"web01", "web03"), class = "data.frame", row.names = c(NA, -3L
))

dput(d)

structure(list(time = structure(c(2L, 1L, 3L), .Label = c("1/13/15 1:18 PM", 
"1/13/15 12:18 PM", "1/13/15 2:18 PM"), class = "factor"), web01 = c(7764.8335, 
7725, 7711.5), web03 = c(10885.5, 10582.333, 10104.5)), .Names = c("time", 
"web01", "web03"), class = "data.frame", row.names = c(NA, -3L
))

Data frames t and d are just samples; my actual data frames have 20 columns. In this case t and d have the same column names, and the time in each row is the same in both data frames.

I need to subtract d from t for the same time period and store the result in a different data frame. Any ideas how I could do this in R?

user1471980 asked Jan 23 '15 19:01

2 Answers

Update

rbind_list and rbind_all have been deprecated. Instead use bind_rows.

Based on discussions in comments and inspired by Andrew's answer:

library(dplyr)
df <- bind_rows(d,t) %>% 
  group_by(time = as.POSIXct(time, format="%m/%d/%y %I:%M %p")) %>%
  summarise_each(funs(diff(.))) %>% 
  data.frame()

This keeps time in chronological order and converts the result into a regular data.frame.
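
summarise_each() and funs() have themselves since been deprecated in dplyr, so here is a rough sketch of the same pipeline with across(), assuming dplyr 1.0.0 or later. The data frames are rebuilt from the dput() output in the question, with time as character instead of factor. The tz = "UTC" argument is an assumption added only to make the result reproducible, and parsing %p relies on an English-style locale.

```r
library(dplyr)

# Sample data rebuilt from the question's dput() output
# (time stored as character here, which is an assumption)
times <- c("1/13/15 12:18 PM", "1/13/15 1:18 PM", "1/13/15 2:18 PM")
t <- data.frame(time = times,
                web01 = c(24083L, 24083L, 24083L),
                web03 = c(24083L, 24083L, 24083L))
d <- data.frame(time = times,
                web01 = c(7764.8335, 7725, 7711.5),
                web03 = c(10885.5, 10582.333, 10104.5))

res <- bind_rows(d, t) %>%
  # two-digit year in the data, hence %y
  mutate(time = as.POSIXct(time, format = "%m/%d/%y %I:%M %p", tz = "UTC")) %>%
  group_by(time) %>%
  # d is bound first, so diff() gives t - d within each time stamp
  summarise(across(everything(), diff)) %>%
  as.data.frame()
```

Because d comes first in bind_rows(), diff() returns t minus d. Swap the order to get d minus t.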

Steven Beaupré answered Oct 31 '22 21:10


Here's a data.table approach:

library(data.table)
rbindlist(list(d,t))[, lapply(.SD, diff),
                 by = .(time = as.POSIXct(time, format="%m/%d/%y %I:%M %p"))]

#                  time    web01    web03
#1: 2015-01-13 12:18:00 16318.17 13197.50
#2: 2015-01-13 13:18:00 16358.00 13500.67
#3: 2015-01-13 14:18:00 16371.50 13978.50

Edit: corrected date format and output, removed .SDcols = ... .
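
As a cross-check on the output above, here is a minimal base R sketch. It relies on the question's statement that both data frames have the same column names and the same time in each row, so the numeric columns can be subtracted directly without binding and grouping. The name num_cols is introduced here purely for illustration, and the data is rebuilt from the question's dput() output with time as character.

```r
# Sample data rebuilt from the question's dput() output
times <- c("1/13/15 12:18 PM", "1/13/15 1:18 PM", "1/13/15 2:18 PM")
t <- data.frame(time = times,
                web01 = c(24083L, 24083L, 24083L),
                web03 = c(24083L, 24083L, 24083L))
d <- data.frame(time = times,
                web01 = c(7764.8335, 7725, 7711.5),
                web03 = c(10885.5, 10582.333, 10104.5))

# Guard the assumption that columns and rows line up
stopifnot(identical(names(t), names(d)), identical(t$time, d$time))

# Subtract every column except time, then put time back in front
num_cols <- setdiff(names(t), "time")
res <- cbind(t["time"], t[num_cols] - d[num_cols])
```

Unlike the grouped approaches, this does not reorder rows or parse time, so it is only appropriate when the rows already line up.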

talat answered Oct 31 '22 21:10