Read and rbind multiple csv files

Tags:

merge

dataframe

r

csv

data-binding

I have a series of csv files (one per annum) with the same column headers and different numbers of rows. Originally I was reading them in and merging them like so:

setwd("N:/Ring data by cruise/Shetland")
LengthHeight2013 <- read.csv("N:/Ring data by cruise/Shetland/R_0113A_S2013_WD.csv",sep=",",header=TRUE)
LengthHeight2012 <- read.csv("N:/Ring data by cruise/Shetland/R_0212A_S2012_WD.csv",sep=",",header=TRUE)
LengthHeight2011 <- read.csv("N:/Ring data by cruise/Shetland/R_0211A_S2011_WOD.csv",sep=",",header=TRUE)
LengthHeight2010 <- read.csv("N:/Ring data by cruise/Shetland/R_0310A_S2010_WOD.csv",sep=",",header=TRUE)
LengthHeight2009 <- read.csv("N:/Ring data by cruise/Shetland/R_0309A_S2009_WOD.csv",sep=",",header=TRUE)

LengthHeight <- merge(LengthHeight2013,LengthHeight2012,all=TRUE)
LengthHeight <- merge(LengthHeight,LengthHeight2011,all=TRUE)
LengthHeight <- merge(LengthHeight,LengthHeight2010,all=TRUE)
LengthHeight <- merge(LengthHeight,LengthHeight2009,all=TRUE)

I would like to know if there is a shorter/tidier way to do this, also considering that each time I run the script I might want to look at a different range of years.

I also found this bit of code by Tony Cookson which looks like it would do what I want, but the data frame it produces for me has the correct headers and no data rows.

multmerge = function(mypath){
  filenames = list.files(path=mypath, full.names=TRUE)
  datalist = lapply(filenames, function(x){read.csv(file=x, header=TRUE)})
  Reduce(function(x,y) {merge(x,y)}, datalist)
}

mymergeddata = multmerge("C://R//mergeme")

asked Jun 02 '14 13:06

helen.h

2 Answers

Find the files with list.files, read each one in a loop with lapply, then use do.call to call rbind on the resulting list, which stacks all the files by rows.

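# full.names = TRUE returns complete paths, so read.csv can find each file
files <- list.files(path = "N:/Ring data by cruise/Shetland",
                    pattern = "\\.csv$", full.names = TRUE)
myMergedData <- do.call(rbind, lapply(files, read.csv))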
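Row binding suits this case because the files share column headers but hold different rows. The multmerge helper in the question calls merge(x, y) with its defaults, which is an inner join on every shared column, so it keeps only rows that are identical in both inputs; that would explain the header-only result. A minimal sketch with two made-up data frames (the column names Length and Height and all values are invented for illustration):

```r
# Two toy data frames with identical headers but different rows
# (hypothetical values standing in for two yearly CSV files).
y2012 <- data.frame(Length = c(10, 12), Height = c(4, 5))
y2013 <- data.frame(Length = c(11, 13, 15), Height = c(6, 7, 8))

# merge() defaults: by = all shared columns, all = FALSE (inner join).
# No row appears in both inputs, so only the headers survive.
merged <- merge(y2012, y2013)
nrow(merged)   # 0

# rbind() simply stacks the rows of both data frames.
stacked <- rbind(y2012, y2013)
nrow(stacked)  # 5
```

Passing all = TRUE to merge, as the original script in the question does, turns the inner join into a full outer join, which is why that version returned data.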

Update: There is also the vroom package, which according to its documentation is much faster than data.table::fread and base read.csv. The syntax is neat, too, because vroom accepts a vector of file paths and combines the files by rows:

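library(vroom)
files <- list.files(path = "N:/Ring data by cruise/Shetland",
                    pattern = "\\.csv$", full.names = TRUE)
myMergedData <- vroom(files)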
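To read a different range of years on each run, as the question asks, the file list can be filtered before anything is read. The sketch below assumes the year appears in each file name as S followed by four digits, as in R_0113A_S2013_WD.csv. The helper name read_years, the demo directory, and the demo files are hypothetical and exist only to make the example runnable.

```r
# Hypothetical helper: read and stack the CSV files in `path` whose
# file names carry a year (S + four digits) that falls within `years`.
read_years <- function(path, years) {
  files <- list.files(path, pattern = "\\.csv$", full.names = TRUE)
  # Pull out the digits after "_S", e.g. "R_0113A_S2013_WD.csv" -> 2013.
  # Names that do not match the pattern become NA and are skipped.
  file_year <- suppressWarnings(
    as.integer(sub(".*_S([0-9]{4})_.*", "\\1", basename(files)))
  )
  keep <- files[file_year %in% years]
  do.call(rbind, lapply(keep, read.csv))
}

# Demo data: one small CSV per year in a temporary directory
# (invented contents, named after the pattern used in the question).
demo_dir <- file.path(tempdir(), "ring_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (yr in 2009:2013) {
  write.csv(data.frame(Year = yr, Length = c(1, 2)),
            file.path(demo_dir, sprintf("R_demo_S%d_WD.csv", yr)),
            row.names = FALSE)
}

# Only the 2011, 2012 and 2013 files are read and stacked.
LengthHeight <- read_years(demo_dir, 2011:2013)
```

Changing the second argument, for example to 2009:2010, is then the only edit needed between runs.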

answered Sep 23 '22 00:09

zx8754

If you're looking for speed, then try this:

library(data.table) ## 1.9.2 or 1.9.3
filenames = list.files("N:/Ring data by cruise/Shetland",
                       pattern = "\\.csv$", full.names = TRUE)
ans = rbindlist(lapply(filenames, fread))

answered Sep 20 '22 00:09

Arun
