I have a directory with multiple RDS files (300+) that I would like to read and combine. These RDS files share the same basic format but have a different number of rows and a few different columns in each file. I have simple code to read one RDS file (all files are named like "Events-3digitnumber-4digitnumber-6digitnumber.RDS"):
mydata <- readRDS("Events-104-2014-752043.RDS")
Being new to data science, I'm sure there is a simple answer that I'm missing, but would I have to use something like list.files() and either lapply or a for loop?
Just to add a tidyverse answer:
library(tidyverse)
# pattern is a regular expression: escape the dot and anchor at the end
df <- list.files(pattern = "\\.RDS$") %>%
  map(readRDS) %>%
  bind_rows()
Update:
It is advised to use map_dfr for binding rows and map_dfc for binding columns, which combine the map and bind steps in one call:
df <- list.files(pattern = "\\.RDS$") %>%
  map_dfr(readRDS)
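With 300+ files it can help to know which file each row came from. The following is only a sketch under assumptions: the temporary directory, the two small demo files, and the column name source_file are made up for illustration and are not part of the original answer. It names the list of file paths before reading, so that bind_rows() can record the file name through its .id argument.

```r
library(purrr)
library(dplyr)

# Hypothetical demo data: two small files with slightly different columns,
# written to a fresh temporary directory so the example is self-contained
demo_dir <- tempfile("rds_demo_")
dir.create(demo_dir)

events_a <- head(iris, 3)
events_b <- head(iris, 2)
events_b$Species2 <- events_b$Species
events_b$Species <- NULL

saveRDS(events_a, file.path(demo_dir, "Events-104-2014-752043.RDS"))
saveRDS(events_b, file.path(demo_dir, "Events-104-2015-782043.RDS"))

# full.names = TRUE returns paths that readRDS can open from any working directory
files <- list.files(demo_dir, pattern = "\\.RDS$", full.names = TRUE)

# Name each path by its file name; bind_rows(.id = ...) stores that name
# in a new column and fills columns missing from a file with NA
combined <- files %>%
  set_names(basename(files)) %>%
  map(readRDS) %>%
  bind_rows(.id = "source_file")

print(combined)
```

Rows from the second file get NA in Species, and rows from the first file get NA in Species2, since bind_rows() matches columns by name.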
First a reproducible example:
data(iris)
# make sure that the two data sets (iris, iris2) have different columns
iris2 = iris  # plain assignment; copy() would need data.table, which is not loaded yet
iris2$Species2 = iris2$Species
iris2$Species = NULL
saveRDS(iris, "Events-104-2014-752043.RDS")
saveRDS(iris2, "Events-104-2015-782043.RDS")
Now you need to list the matching files, read each one, and combine the results.
I would use data.table::rbindlist because it handles differing columns for you when you set fill = TRUE:
library(data.table)
files = list.files(path = '.', pattern = '^Events-[0-9]{3}-[0-9]{4}-[0-9]{6}\\.RDS$')
dat_list = lapply(files, function (x) data.table(readRDS(x)))
dat = rbindlist(dat_list, fill = TRUE)
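To see what fill = TRUE actually does, here is a minimal in-memory sketch. The two tiny tables and their column names (id, x, y) are hypothetical and only stand in for files with differing columns.

```r
library(data.table)

# Two hypothetical tables that share "id" but otherwise have different columns
a <- data.table(id = 1:2, x = c("p", "q"))
b <- data.table(id = 3L, y = 9.5)

# fill = TRUE matches columns by name and pads the missing ones with NA
combined <- rbindlist(list(a, b), fill = TRUE)

print(combined)
```

The result has the union of the columns (id, x, y), with NA wherever a table did not have that column.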