Reading multiple RDS files

Tags:

r

I have a directory with multiple RDS files (300+) that I would like to read and combine. These RDS files share the same basic format but have different numbers of rows and a few different columns in each file. I have simple code to read one RDS file (all file names follow the pattern "Events-3digitnumber-4digitnumber-6digitnumber.RDS"):

    mydata <- readRDS("Events-104-2014-752043.RDS")

Being new to data science, I'm sure there is a simple answer that I'm missing, but would I have to use something like list.files() and either lapply or some for loop?

L.England asked Nov 30 '22 23:11


2 Answers

Just to add a tidyverse answer:

    library(tidyverse)

    # pattern is a regular expression: escape the dot and anchor the extension
    df <- list.files(pattern = "\\.RDS$") %>%
      map(readRDS) %>%
      bind_rows()

Update:

It is advised to use map_dfr for binding rows and map_dfc for binding columns, as this is much more efficient:

    # pattern is a regular expression: escape the dot and anchor the extension
    df <- list.files(pattern = "\\.RDS$") %>%
      map_dfr(readRDS)
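
Since the question mentions files with a few different columns: bind_rows() matches columns by name and fills the ones a file lacks with NA. Below is a minimal sketch of that behaviour which also records the source file of each row. The scratch directory, the two tiny data frames and the column name source_file are made up for illustration; only the file name pattern comes from the question.

```r
library(purrr)
library(dplyr)

# Hypothetical scratch directory holding two small example files
dir <- file.path(tempdir(), "rds_demo")
dir.create(dir, showWarnings = FALSE)
saveRDS(data.frame(id = 1:2, a = c("x", "y")),
        file.path(dir, "Events-104-2014-752043.RDS"))
saveRDS(data.frame(id = 3:5, b = c(1.5, 2.5, 3.5)),
        file.path(dir, "Events-104-2015-782043.RDS"))

files <- list.files(dir, pattern = "\\.RDS$", full.names = TRUE)

combined <- files %>%
  set_names(basename(files)) %>%   # name each element after its file
  map(readRDS) %>%                 # read every file into a list
  bind_rows(.id = "source_file")   # stack rows; list names go into source_file

# Columns missing from a file (a or b here) are filled with NA
print(combined)
```

Naming the list elements before binding is what makes .id useful: without names, the id column would only hold the position of each file in the list.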

FMM answered Dec 25 '22 03:12


First a reproducible example:

    data(iris)
    # make sure that the two data sets (iris, iris2) have different columns
    iris2 = iris  # plain assignment copies in R; copy() would need data.table loaded first
    iris2$Species2 = iris2$Species
    iris2$Species = NULL

    saveRDS(iris, "Events-104-2014-752043.RDS")
    saveRDS(iris2, "Events-104-2015-782043.RDS")

Now you need to

  1. find all file names
  2. read the data
  3. combine the data into one table (if you want that)

I would use data.table::rbindlist because it handles differing columns for you when you set fill = TRUE:

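    library(data.table)
    # 1. find all file names that match the naming scheme
    files = list.files(path = '.', pattern = '^Events-[0-9]{3}-[0-9]{4}-[0-9]{6}\\.RDS$')
    # 2. read each file and convert it to a data.table
    dat_list = lapply(files, function (x) data.table(readRDS(x)))
    # 3. stack the tables; fill = TRUE pads missing columns with NA
    dat = rbindlist(dat_list, fill = TRUE)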
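
To see what fill = TRUE does to the combined table, here is a self-contained sketch along the same lines. It rebuilds the two example files in a scratch directory (an assumption, so nothing is written to your working directory) and uses rbindlist's idcol argument to keep track of which file each row came from; the column name source_file is chosen here for illustration.

```r
library(data.table)

# Hypothetical scratch directory for the two example files
dir <- file.path(tempdir(), "rbindlist_demo")
dir.create(dir, showWarnings = FALSE)

iris2 <- iris
iris2$Species2 <- iris2$Species
iris2$Species <- NULL
saveRDS(iris,  file.path(dir, "Events-104-2014-752043.RDS"))
saveRDS(iris2, file.path(dir, "Events-104-2015-782043.RDS"))

files <- list.files(path = dir,
                    pattern = "^Events-[0-9]{3}-[0-9]{4}-[0-9]{6}\\.RDS$",
                    full.names = TRUE)
dat_list <- lapply(files, function(x) as.data.table(readRDS(x)))
names(dat_list) <- basename(files)  # list names become the idcol values

dat <- rbindlist(dat_list, fill = TRUE, idcol = "source_file")

dim(dat)                          # rows of both files, union of all columns
dat[, .N, by = source_file]       # row count per source file
colSums(is.na(dat))               # NAs appear only in Species and Species2
```

Each file contributes NA in the column it does not have, so counting NAs per column is a quick check that the files were stacked rather than silently dropped.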

sbstn answered Dec 25 '22 01:12