merging several data frames into a single expanded frame

Tags:

r

I have a list of data frames, where each frame contains the same kind of measurements for a single system. E.g.,

$system1                           
                file    cumSize     cumloadTime     query1
1  ../data/data1.dat    100000      158.1000        0.4333333
2  ../data/data2.dat    200000      394.9000        0.5000000
3  ../data/data3.dat    250000      561.8667        0.6666667

$system2                           
                file    cumSize     cumloadTime     query1
1  ../data/data1.dat    100000      120.1000        0.4333333
2  ../data/data2.dat    200000      244.9000        0.4500000
3  ../data/data3.dat    250000      261.8667        0.2666667

Now I would like to display several aspects of these data frames in separate plots using the matplot function. To do that, I need to transform the above input data structure into the following output structure:

$cumloadTime

cumSize     system1     system2
100000      158.1000    120.1000
200000      394.9000    244.9000
250000      561.8667    261.8667

$query1

cumSize     system1     system2
100000      0.4333333   0.4333333
200000      0.5000000   0.4500000
250000      0.6666667   0.2666667

I played around with the reshape, merge, and melt functions but haven't found the solution yet.

Thanks for any hints...

asked Jan 20 '11 at 14:01 by behas




1 Answer

Use rbind to create one data frame containing everything.

# Build the example list of data frames from the question.
# The header has one field fewer than the data rows, so read.table
# treats the first column as row names.
data_list <- list()
data_list[["system1"]] <- read.table(text = "file    cumSize     cumloadTime     query1
1  ../data/data1.dat    100000      158.1000        0.4333333
2  ../data/data2.dat    200000      394.9000        0.5000000
3  ../data/data3.dat    250000      561.8667        0.6666667", header = TRUE)

data_list[["system2"]] <- read.table(text = "file    cumSize     cumloadTime     query1
1  ../data/data1.dat    100000      120.1000        0.4333333
2  ../data/data2.dat    200000      244.9000        0.4500000
3  ../data/data3.dat    250000      261.8667        0.2666667", header = TRUE)

# Tag each frame with the name of its system.
for (n in names(data_list)) data_list[[n]]$system <- n

# Stack the frames into one long data frame.
all_data <- do.call(rbind, data_list)

Forget matplot and use ggplot (from the ggplot2 package) instead, e.g.,

library(ggplot2)

# One line per system, coloured by the system column.
p1 <- ggplot(all_data, aes(cumSize, cumloadTime, color = system)) + geom_line()
p1
p2 <- ggplot(all_data, aes(cumSize, query1, color = system)) + geom_line()
p2
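
If the wide, per-measurement tables from the question are still wanted (for matplot, say), one possible base R sketch is to join the per-system columns on cumSize with merge. The names measures and wide_tables are illustrative and not from the original post, and the sketch assumes that every frame has a cumSize column whose values identify the rows.

```r
# Example input, with values copied from the question.
files <- c("../data/data1.dat", "../data/data2.dat", "../data/data3.dat")
data_list <- list(
  system1 = data.frame(
    file = files,
    cumSize = c(100000, 200000, 250000),
    cumloadTime = c(158.1000, 394.9000, 561.8667),
    query1 = c(0.4333333, 0.5000000, 0.6666667)
  ),
  system2 = data.frame(
    file = files,
    cumSize = c(100000, 200000, 250000),
    cumloadTime = c(120.1000, 244.9000, 261.8667),
    query1 = c(0.4333333, 0.4500000, 0.2666667)
  )
)

# Measurement columns to turn into wide tables (illustrative name).
measures <- c("cumloadTime", "query1")

wide_tables <- lapply(measures, function(m) {
  # For each system, keep cumSize plus the measurement,
  # renaming the measurement column to the system's name.
  per_system <- lapply(names(data_list), function(s) {
    setNames(data_list[[s]][, c("cumSize", m)], c("cumSize", s))
  })
  # Join the per-system columns on cumSize.
  Reduce(function(a, b) merge(a, b, by = "cumSize"), per_system)
})
names(wide_tables) <- measures

print(wide_tables$cumloadTime)
print(wide_tables$query1)
```

Each element of wide_tables then has the columns cumSize, system1, system2, so a call along the lines of matplot(wt$cumSize, as.matrix(wt[, -1]), type = "l"), with wt <- wide_tables$cumloadTime, should draw one line per system. Because merge keeps only cumSize values present in every frame, systems measured at different sizes would need all = TRUE in the merge call.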

answered Oct 16 '22 at 04:10 by Richie Cotton