This is a personal project to learn the syntax of the data.table package. I am trying to use the data values to create multiple graphs and label each one based on the by group value. For example, given the following data:
# Generate dummy data
require(data.table)
set.seed(222)
DT = data.table(grp = rep(c("a", "b", "c"), each = 10),
                x = rnorm(30, mean = 5, sd = 1),
                y = rnorm(30, mean = 8, sd = 1))
setkey(DT, grp)
The data consists of random x and y values for 3 groups (a, b, and c). I can create a formatted plot of all values with the following code:
# Example of plotting all groups in one plot
require(ggplot2)
p <- ggplot(data = DT, aes(x = x, y = y)) +
  aes(shape = factor(grp)) +
  geom_point(aes(colour = factor(grp), shape = factor(grp)), size = 3) +
  labs(title = "Group: ALL")
p
This creates the following plot:
Instead I would like to create a separate plot for each by group, and change the plot title from “Group: ALL” to “Group: a”, “Group: b”, “Group: c”, etc. The documentation for data.table says:
.BY is a list containing a length 1 vector for each item in by. This can be useful when by is not known in advance. The by variables are also available to j directly by name; useful for example for titles of graphs if j is a plot command, or to branch with if()
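To illustrate the quoted passage without any plotting, here is a minimal sketch. The table `demo` and its values are hypothetical and are not the data from the question; the column names `via_BY` and `via_name` are also made up for the illustration. It builds the same title string twice per group, once through .BY and once through the by variable referenced directly by name.

```r
library(data.table)

# Hypothetical toy table, used only to show how .BY behaves
demo <- data.table(grp = rep(c("a", "b", "c"), each = 2), x = 1:6)

# Inside j, .BY is a named list with one length-1 element per `by` column,
# and the `by` column itself (grp) is also available as a length-1 value
titles <- demo[, .(via_BY   = paste0("Group: ", .BY$grp),
                   via_name = paste0("Group: ", grp),
                   n        = .N),
               by = grp]
print(titles)
```

Both columns should hold the same strings, which is why either form can be used to build a plot title inside j.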
That being said, I do not understand how to use .BY or .SD to create separate plots for each group. Your help is appreciated.
Here is the data.table solution, though again, not what I would recommend:
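make_plot <- function(dat, grp.name) {
  # grp.name receives .BY, a named list; grp.name$grp is the current group's label
  print(
    ggplot(dat, aes(x = x, y = y)) +
      geom_point() +
      labs(title = paste0("Group: ", grp.name$grp))
  )
  # Return NULL so the call contributes no rows to the result
  NULL
}
# Called once per group: .SD holds that group's rows, .BY holds the group value
DT[, make_plot(.SD, .BY), by = grp]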
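If you would rather keep the plots than print them as a side effect, a variation on the same idea is to store each plot in a list column. This is only a sketch, not part of the original answer: the names `plots` and `plot` are made up for the illustration, and the `copy(.SD)` call is a precaution based on my understanding that data.table may reuse the memory behind .SD from one group to the next, which could otherwise leave every stored plot pointing at the same data.

```r
library(data.table)
library(ggplot2)

# Same dummy data as in the question
set.seed(222)
DT <- data.table(grp = rep(c("a", "b", "c"), each = 10),
                 x = rnorm(30, mean = 5, sd = 1),
                 y = rnorm(30, mean = 8, sd = 1))

# One row per group; the `plot` column is a list holding one ggplot object each.
# copy(.SD) gives every plot its own copy of the group's rows (assumed precaution).
plots <- DT[, .(plot = list(
  ggplot(copy(.SD), aes(x = x, y = y)) +
    geom_point(size = 3) +
    labs(title = paste0("Group: ", .BY$grp))
)), by = grp]
```

Each plot can then be drawn or saved later, for example with print(plots$plot[[1]]) for group a.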
What you really should do for this particular application is what @dmartin recommends. At least, that's what I would do.