
How to combine multiple .csv files in R?

Tags: r, csv

I have a directory containing close to 2,000 .csv files.

Each file has the following structure (showing 4 out of 500 rows):

                       Date;QOF
1    2004-01-04 - 2004-01-10;9
2    2004-01-11 - 2004-01-17;11
3    2004-01-18 - 2004-01-24;13
4    2004-01-25 - 2004-01-31;13

The column "QOF" is also the name of the .csv file, and each file has a unique name (e.g. "MSTF", "XQS", etc.). I would like this column from each .csv file to be merged onto the first .csv file read, which also contains the date variable. In other words, I want to keep all columns from the first file and merge only the second column from every other .csv file onto it. The end result should look something like:

                    Date;QOF;MSTF;XQS
1    2004-01-04 - 2004-01-10;9;10;8
2    2004-01-11 - 2004-01-17;11;11;5
3    2004-01-18 - 2004-01-24;13;31;2
4    2004-01-25 - 2004-01-31;13;45;23

So far I have tried this:

filenames <- list.files()

do.call("cbind", lapply(filenames, read.csv, header = TRUE))
Sunv asked Oct 03 '22 02:10



1 Answer

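# Base names without ".csv" (e.g. "QOF", "MSTF"); files are semicolon-delimited, so sep = ";"
listfiles <- sub("\\.csv$", "", list.files(pattern = "\\.csv$"))
mybig <- do.call(rbind, lapply(listfiles, function(nam) {
  dat <- read.csv(paste0(nam, ".csv"), header = TRUE, sep = ";")
  names(dat)[2] <- "value"  # each file names this column after itself; unify for rbind
  cbind(name = nam, dat)    # tag every row with its source file
}))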

Untested. Notice that I intentionally did not follow the structure you suggested: I cannot think of a more confusing data structure to work with down the line. If you are thinking of using that format for output, you would first need to build a data frame and then write it to a file with a semicolon delimiter.
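If the wide layout really is needed for output, one way to build that data frame and write it out is sketched below. It is only a sketch under assumptions: the two sample files, the temporary directory, and the names `paths`, `tables`, `wide` and `out` are made up for illustration, and every file is assumed to hold a `Date` column plus one value column named after the file. It merges on `Date` rather than using `cbind`, so rows are matched by key instead of by position.

```r
# Hypothetical sample data standing in for the real directory of ~2,000 files.
dir <- tempfile("csvdemo")
dir.create(dir)
dates <- c("2004-01-04 - 2004-01-10", "2004-01-11 - 2004-01-17")
write.table(data.frame(Date = dates, QOF = c(9, 11)),
            file.path(dir, "QOF.csv"), sep = ";", row.names = FALSE, quote = FALSE)
write.table(data.frame(Date = dates, MSTF = c(10, 11)),
            file.path(dir, "MSTF.csv"), sep = ";", row.names = FALSE, quote = FALSE)

# Read every file; extra arguments are passed through lapply() to read.csv().
paths <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(paths, read.csv, sep = ";", header = TRUE, check.names = FALSE)

# Fold the list into one wide data frame, keeping all rows of the first file.
wide <- Reduce(function(x, y) merge(x, y, by = "Date", all.x = TRUE), tables)

# Write the result with a semicolon delimiter, outside the input directory.
out <- tempfile(fileext = ".csv")
write.table(wide, out, sep = ";", row.names = FALSE, quote = FALSE)
```

Note that `list.files()` returns names in alphabetical order, so the "first" file is whichever sorts first; reorder `paths` if a specific file should come first. Repeated `merge()` calls over a couple of thousand files may be slow, which is another reason to prefer the long format for analysis.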

IRTFM answered Nov 28 '22 06:11
