Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Loading data from RData files into a single data table

I'm trying to load data from data frame objects of all .RData files in a specified directory into a single data table. This is how I've tried to do this:

library(data.table)

fileList <- list.files("../cache/FLOSSmole", pattern="\\.RData$", full.names=TRUE)
dataset <- rbindlist(lapply(fileList, FUN=function(file) {as.data.table(load(file))}))

However, the result is different from the expected (single data table containing all data) - it contains just names of data frame objects from the source .RData files:

> str(dataset)
Classes ‘data.table’ and 'data.frame':  39 obs. of  1 variable:
 $ V1: chr  "lpdOfficialBugTags" "lpdLicenses" "lpdMilestones" "lpdSeries" ...
 - attr(*, ".internal.selfref")=<externalptr>
> head(dataset)
                        V1
1:      lpdOfficialBugTags
2:             lpdLicenses
3:           lpdMilestones
4:               lpdSeries
5:             lpdProjects
6: lpdProgrammingLanguages

What am I doing wrong? Your help is greatly appreciated!

My R environment:

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] data.table_1.9.2

loaded via a namespace (and not attached):
[1] plyr_1.8.1    Rcpp_0.11.1   reshape2_1.4  stringr_0.6.2 tools_3.1.0
like image 787
Aleksandr Blekh Avatar asked Feb 07 '26 16:02

Aleksandr Blekh


1 Answers

.RData is a saved workspace, it may contain data frames, but it is not a data frame. How many data frames are there in each .RData? You can load multiple .RData files and they add to the current workspace. Just load them all then merge or rbind the data.frames once they are in your current workspace

# lapply(FileList,function(x) load(x)) # Changed to a for loop, I guess the lapply was only loading into the lapply environment which disappears when the function ends
for (i in 1:length(FileList)) {
   load(FileList[i])
}
my.list <- vector(length(ls()),mode="list")
for (i in 1:length(ls())) {
    my.list[[i]] <- get(ls()[i])
}
my.rbind <- do.call(rbind,my.list)

This is one way. An easier way would be to save individual tables as delimited text files in the first place.

like image 187
JeremyS Avatar answered Feb 09 '26 07:02

JeremyS



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!