When importing CSV into R how to generate column with name of the CSV?

I have a large number of CSV files that I want to read into R. All the column headings in the CSVs are the same. At first I thought I would need to loop over the list of file names, but after searching I found a faster way. This reads in and combines all the CSVs correctly (as far as I know).

filenames <- list.files(path = ".", pattern = NULL, all.files = FALSE, full.names = FALSE, recursive = FALSE, ignore.case = FALSE)

library(plyr)
import.list <- llply(filenames, read.csv)

combined <- do.call("rbind", import.list)

The only problem is that I want to know which CSV a specific row of data comes from. I want a column labeled 'Source' that contains the name of the CSV that the row came from. For example, if the CSV was called Chicago_IL.csv, then once the data is in R the row would look something like this:

> City    State   Market  etc Source  
> Burbank IL      Western etc Chicago_IL

asked Mar 03 '11 21:03

Arndt

1 Answer

You have already done all the hard work. With a fairly small modification this should be straightforward.

The logic is:

1. Create a small helper function that reads an individual CSV and adds a column with the file name.
2. Call this helper function in ldply().

The following should work:

read_csv_filename <- function(filename){
    ret <- read.csv(filename)
    ret$Source <- filename  # record which file these rows came from
    ret
}

# ldply() row-binds the results, so this is already one combined data frame
import.list <- ldply(filenames, read_csv_filename)

Note that I have proposed another small improvement to your code: read.csv() returns a data.frame, which means you can use ldply() rather than llply() and skip the separate do.call("rbind", ...) step.
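
The helper above stores the full file name, extension included, whereas the example in the question shows Chicago_IL without ".csv". Here is a minimal, self-contained sketch of one way to get that, using only base R instead of plyr. The function name read_csv_with_source, the demo directory and the second demo file are made up for illustration; basename() and tools::file_path_sans_ext() are the standard functions doing the name clean-up.

```r
# Hypothetical helper: read one CSV and tag every row with the file's
# base name (directory and ".csv" extension removed).
read_csv_with_source <- function(path) {
  ret <- read.csv(path, stringsAsFactors = FALSE)
  src <- tools::file_path_sans_ext(basename(path))
  # rep() keeps the assignment valid even if a file has zero data rows
  ret$Source <- rep(src, nrow(ret))
  ret
}

# Made-up demo data written to a temporary directory (illustration only).
demo_dir <- file.path(tempdir(), "csv_source_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(City = "Burbank", State = "IL", Market = "Western"),
          file.path(demo_dir, "Chicago_IL.csv"), row.names = FALSE)
write.csv(data.frame(City = "Pasadena", State = "CA", Market = "Pacific"),
          file.path(demo_dir, "LosAngeles_CA.csv"), row.names = FALSE)

# Only pick up files ending in .csv; full.names = TRUE so read.csv() can
# find them regardless of the working directory.
filenames <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Base-R equivalent of ldply(): read each file, then row-bind the results.
combined <- do.call(rbind, lapply(filenames, read_csv_with_source))
print(combined)
```

Restricting list.files() with a pattern is optional, but without it any non-CSV file in the folder would also be passed to read.csv().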

answered Oct 19 '22 01:10

Andrie
