I'm pretty new to R and just can't figure out how to do this, despite some similar but not-quite-the-same questions floating around. What I have is several (~10) CSV files that look like this:
time, value
0, 5
100, 4
200, 8
etc.
That is, they record a long series of times and the value at each time. I want to plot all of them on one chart in R using ggplot2, so that it looks something like this (example image not preserved). I've been trying all kinds of melts and merges and have been unsuccessful so far (though read.csv is working fine and I can plot the files one by one easily). One thing I can't figure out is whether to combine all the data before it gets to ggplot2, or somehow pass all the data individually to ggplot2.
I should probably note that each data series shares the exact same time points. By this I mean, if file 1 has values at times 100, 200, 300, ..., 1000 then so do all the other files. But ideally, I'd like the solution not to depend on that, because I could see a future situation where the times are similarly scaled but not exactly the same, e.g. file 1 has times 99, 202, 302, 399, ... and file 2 has times 101, 201, 398, 400, ...
Thanks much.
EDIT: I can do this (clunkily) with the regular plot function, like so. This might illustrate the kind of thing I want to do:
f1 = read.csv("file1.txt")
f2 = read.csv("file2.txt")
f3 = read.csv("file3.txt")
plot(f1$time,f1$value,type="l",col="red")
lines(f2$time, f2$value, type="l",col="blue" )
lines(f3$time, f3$value, type="l",col="green" )
I would divide this into 4 tasks. Splitting it up also makes it easier to search for answers to each one.
1. Reading a few files automatically, without hardcoding the file names
2. Merging these data.frames using a "left join"
3. Reshaping the data for ggplot2
4. Plotting a line graph
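To make step 2 concrete, here is a minimal sketch of a left join with merge. The data frames base and f are made up for illustration only. With all.x=TRUE, every row of the first data.frame is kept, and times that have no match in the second one get NA.

```r
# Hypothetical toy data, only to illustrate what all.x = TRUE does
base <- data.frame(time = c(0, 100, 200, 300))
f <- data.frame(time = c(0, 200), value = c(5, 8))

# Left join: keep every row of `base`, fill unmatched times with NA
joined <- merge(base, f, by = "time", all.x = TRUE)
print(joined)
```

The result has one row per time in base, which is why every file ends up with the same number of rows and the value columns can be bound side by side.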
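# Define a "base" data.frame holding every time point to plot
# (adjust the start, max_time and the step so they cover the times in your files)
max_time <- 600
base_df <- data.frame(time = seq(1, max_time, 1))
# Get the file names (pattern is a regular expression, so anchor the extension)
all_files <- list.files(pattern = "\\.csv$")
# This reads the csv files, check if you need to make changes in read.csv
all_data <- lapply(all_files, read.csv)
# This left-joins each file onto the "base" data.frame and keeps its value column
values <- do.call(cbind, lapply(all_data, function(y) {
  df <- merge(base_df, y, all.x = TRUE, by = "time")
  df[, -1]
}))
# This would have the data in "wide" format, one column per file
wide <- data.frame(time = base_df$time, values)
# The plot
library(ggplot2)
library(reshape2)
mdf <- melt(wide, id.vars = "time")
# Drop time points where a file has no value; geom_line breaks the line at NA
mdf <- mdf[!is.na(mdf$value), ]
ggplot(mdf, aes(time, value, color = variable, group = variable)) +
  geom_line() +
  theme_bw()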
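If the files do not share the same time points, an alternative is to skip the join and stack the files in "long" format. The sketch below is not the approach above but a swapped-in variant, under a few assumptions: each file has the columns time and value, read_one is a hypothetical helper, and the two demo files are made-up data written to a temporary folder so the example runs on its own.

```r
library(ggplot2)

# Hypothetical helper: read one file and tag its rows with the file name
read_one <- function(path) {
  d <- read.csv(path)
  d$file <- basename(path)
  d
}

# Two made-up files with different time points, written to a temporary folder
demo_dir <- tempfile("series_")
dir.create(demo_dir)
write.csv(data.frame(time = c(99, 202, 302), value = c(5, 4, 8)),
          file.path(demo_dir, "file1.csv"), row.names = FALSE)
write.csv(data.frame(time = c(101, 201, 398), value = c(3, 6, 7)),
          file.path(demo_dir, "file2.csv"), row.names = FALSE)

# Stack all files on top of each other: one row per (file, time) pair
all_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
long_df <- do.call(rbind, lapply(all_files, read_one))

# One line per file; no shared time grid is needed
p <- ggplot(long_df, aes(time, value, color = file, group = file)) +
  geom_line() +
  theme_bw()
print(p)
```

With real data, point list.files at the folder holding your CSV files instead of demo_dir. Because each row carries its own time, no base data.frame and no melt are needed.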