
Quick read and merge with data.table's fread and rbindlist

I am looking for a way to quickly read and merge a bunch of data files using data.table's fread and rbindlist functions. If fread could take a vector of file names as an argument, it could be one elegant line like

mergeddata = rbindlist(fread(list.files("my/data/directory/")))

but since that doesn't seem to be an option, I've taken the more awkward approach of looping through the files, reading each one in, assigning it to a temporary name, and collecting those temporary data.table names in a vector. However, I get tripped up whenever I try to use that vector of data.table names. So my questions are: (1) how can I pass a list of data.table names to rbindlist in this context, and (2) more broadly, is there a better approach to this problem?

Thanks in advance for the time and help!

datafiles = list.files()

datatablelist = c()

for(i in 1:length(datafiles)){
  assign(paste("dt",i,sep=""),fread(datafiles[1]))
  datatablelist = append(datatablelist ,paste("dt",i,sep=""))
}

mergeddata = rbindlist(list(datatablelist))
DaedalusBloom asked Dec 20 '22 04:12


1 Answer

Here is a simple way to read multiple files with fread and bind them into a single data.table:

# Load the library
library(data.table)

# List all files matching a pattern, e.g. all `.csv` files
filenames <- list.files("C:/your/folder", pattern = glob2rx("*.csv"), full.names = TRUE)

# Read every file and bind the results into one data.table
data <- rbindlist(lapply(filenames, fread))
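
If it matters which file each row came from, the same pattern can carry the file name along. The following is a minimal sketch, not part of the original answer: the folder `fread_demo`, the two demo files, and the column name `source_file` are made up for illustration, while `idcol`, `use.names`, and `fill` are existing arguments of `rbindlist`.

```r
library(data.table)

# Hypothetical demo data: two small CSV files in a temporary folder
demo_dir <- file.path(tempdir(), "fread_demo")
dir.create(demo_dir, showWarnings = FALSE)
fwrite(data.table(id = 1:2, value = c(10, 20)), file.path(demo_dir, "a.csv"))
fwrite(data.table(id = 3:4, value = c(30, 40)), file.path(demo_dir, "b.csv"))

# full.names = TRUE returns paths that fread can open from any working directory
filenames <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Name each list element after its file so rbindlist can record the origin
tables <- setNames(lapply(filenames, fread), basename(filenames))

# idcol adds a column holding the list names; fill = TRUE pads columns
# that are missing from some files with NA
merged <- rbindlist(tables, idcol = "source_file", use.names = TRUE, fill = TRUE)
print(merged)
```

Naming the list before binding is a design choice: without names, `idcol` would store the list position (1, 2, ...) rather than the file name.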

And in case you want to keep the files separate, as a list of data.tables rather than one merged table, it's as simple as

# Read each file into its own list element
list.DFs <- lapply(filenames, fread)
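
As for question (1): in the loop from the question, `datatablelist` is a character vector of names, so `rbindlist(list(datatablelist))` receives a list holding one character vector rather than the tables themselves. The loop also reads `datafiles[1]` on every pass where `datafiles[i]` was presumably intended. One way to turn names back into objects is base R's `mget()`. The sketch below assumes two stand-in tables, `dt1` and `dt2`, in place of the ones the loop would create.

```r
library(data.table)

# Hypothetical stand-ins for the tables the loop assigns to dt1, dt2, ...
dt1 <- data.table(id = 1:2, value = c("a", "b"))
dt2 <- data.table(id = 3:4, value = c("c", "d"))
datatablelist <- c("dt1", "dt2")

# mget() looks up each name and returns a named list of the objects
mergeddata <- rbindlist(mget(datatablelist))
print(mergeddata)
```

The `lapply` approach above avoids the temporary names altogether, which is usually the simpler route.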
rafa.pereira answered Jan 20 '23 07:01