
Plotting by group in data.table

Tags: r, data.table

I've got individual-level data for which I'm trying to summarize an outcome dynamically by group.

Example:

library(data.table)

set.seed(12039)
DT <- data.table(id = rep(1:100, each = 50),
                 grp = rep(letters[1:4], each = 1250),
                 time = rep(1:50, 100),
                 outcome = rnorm(5000))

I want to know the simplest way to plot the group-level summary, the data for which is contained in:

DT[ , mean(outcome), by = .(grp, time)]
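For reference, a small sketch of what that summary looks like. With an unnamed expression in `j`, data.table calls the result column `V1`; the name `mean_outcome` below is only an illustrative choice, not something from the original post.

```r
library(data.table)

set.seed(12039)
DT <- data.table(id = rep(1:100, each = 50),
                 grp = rep(letters[1:4], each = 1250),
                 time = rep(1:50, 100),
                 outcome = rnorm(5000))

# Name the summary column explicitly instead of relying on the default V1
agg <- DT[ , .(mean_outcome = mean(outcome)), by = .(grp, time)]

names(agg)  # "grp" "time" "mean_outcome"
nrow(agg)   # 200: 4 groups x 50 time points, in long format
```

The result is in long format (one row per group and time), which is why it cannot be handed to `matplot` directly.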

I wanted something like:

DT[ , plot(mean(outcome)), by = .(grp, time)]

But this doesn't work at all.

The workable option I am surviving on (which could be looped pretty easily) is:

plot(DT[grp == "a", mean(outcome), by = time])
lines(DT[grp == "b", mean(outcome), by = time])
lines(DT[grp == "c", mean(outcome), by = time])
lines(DT[grp == "d", mean(outcome), by = time])

(with added parameters for colors, etc, excluded for conciseness)
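A minimal sketch of the looped version mentioned above, assuming the same simulated `DT`. The names `agg`, `grps` and `k`, and the use of the group's position as its colour, are illustrative choices rather than part of the original code.

```r
library(data.table)

set.seed(12039)
DT <- data.table(id = rep(1:100, each = 50),
                 grp = rep(letters[1:4], each = 1250),
                 time = rep(1:50, 100),
                 outcome = rnorm(5000))

# Aggregate once, then reuse the result for every line
agg <- DT[ , .(mean_outcome = mean(outcome)), by = .(grp, time)]
grps <- unique(agg$grp)

# Empty canvas sized to cover every group's range
plot(range(agg$time), range(agg$mean_outcome), type = "n",
     xlab = "time", ylab = "mean outcome")

# One line per group; the colour is the group's position in grps
for (k in seq_along(grps)) {
  agg[grp == grps[k], lines(time, mean_outcome, col = k)]
}
legend("topright", legend = grps, col = seq_along(grps), lty = 1)
```

Setting up the axes from the full range first avoids the problem in the four-line version, where the limits are fixed by group "a" alone and later lines can fall outside the plot.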

This strikes me as not the best way to do this--given data.table's craft in handling groups, is there not a more elegant solution?

Other sources have been pointing me to matplot but I can't see a straightforward way to use it--do I need to reshape DT, and is there a simple reshape that would get the job done?

Asked by MichaelChirico on Feb 08 '15 at 23:02


1 Answer

A base-graphics solution using matplot and data.table's dcast:

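library(data.table)
# Aggregate to one mean per (grp, time); DT is the table from the question
dt_agg <- DT[ , .(mean = mean(outcome)), by = .(grp, time)]
# Reshape to wide: one row per time, one column per group
dt_cast <- dcast(dt_agg, time ~ grp, value.var = "mean")
# matplot draws one line per column of the y argument
dt_cast[ , matplot(time, .SD[ , !"time"], type = "l", ylab = "mean", xlab = "")]
# alternative: drop time from .SD via .SDcols
dt_cast[ , matplot(time, .SD, type = "l", ylab = "mean", xlab = ""), .SDcols = !"time"]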
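If reshaping is not wanted, a possible alternative is to keep the long format and let `by` draw the lines. This is only a sketch under the same simulated data, not part of the answer above; `agg` and `mean_outcome` are illustrative names, and `.GRP` (data.table's group counter) is used here as a colour index.

```r
library(data.table)

set.seed(12039)
DT <- data.table(id = rep(1:100, each = 50),
                 grp = rep(letters[1:4], each = 1250),
                 time = rep(1:50, 100),
                 outcome = rnorm(5000))

agg <- DT[ , .(mean_outcome = mean(outcome)), by = .(grp, time)]

# Set up empty axes that cover all groups
plot(range(agg$time), range(agg$mean_outcome), type = "n",
     xlab = "time", ylab = "mean outcome")

# lines() is called once per group; .GRP is 1, 2, ... in order of appearance.
# invisible() suppresses the empty data.table that the grouped call returns.
invisible(agg[ , lines(time, mean_outcome, col = .GRP), by = grp])
```

The grouped call is used purely for its side effect of drawing, so its return value can be discarded.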


Answered by Rentrop on Sep 29 '22 at 08:09