I have a large number of csv files that I want to read into R. All the column headings in the csvs are the same. At first I thought I would need to write a loop over the list of file names, but after searching I found a faster way. This reads in and combines all the csvs correctly (as far as I know).
filenames <- list.files(path = ".", pattern = NULL, all.files = FALSE, full.names = FALSE, recursive = FALSE, ignore.case = FALSE)
library(plyr)
import.list <- llply(filenames, read.csv)
combined <- do.call("rbind", import.list)
The only problem is that I want to know which csv a specific row of data comes from. I want a column labeled 'Source' that contains the name of the csv that the particular row came from. For example, if the csv was called Chicago_IL.csv, then once the data is in R the row would look something like this:
> City State Market etc Source
> Burbank IL Western etc Chicago_IL
Method 1: Using the read.table() function. To import only selected columns of a CSV file, the user calls read.table(), a built-in function of the R programming language, and specifies the selected columns in its arguments.
We can import the data into R using the read_csv() function; this is part of the readr package, which is part of the tidyverse.
You have already done all the hard work. With a fairly small modification this should be straightforward.
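The logic is: write a small helper function that reads a single csv and adds a column containing the file name, then call that helper for every file in your list.
The following should work: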
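library(plyr)  # provides ldply()

# Read one csv and record which file each row came from
read_csv_filename <- function(filename){
  ret <- read.csv(filename)
  ret$Source <- filename  # the file name as listed, including the .csv extension
  ret
}

# ldply() applies the helper to every file and row-binds the results into one data.frame
import.list <- ldply(filenames, read_csv_filename)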
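Note that I have proposed another small improvement to your code: read.csv() returns a data.frame, which means you can use ldply() rather than llply(). ldply() combines the results into a single data.frame, so the separate do.call("rbind", ...) step is no longer needed.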
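If you would rather not depend on plyr, or you want the Source column to read Chicago_IL rather than Chicago_IL.csv as in your example, here is a minimal base R sketch of the same idea. The temporary directory, the Dallas_TX.csv file, the sample rows and the helper name read_csv_with_source are all made up for illustration; swap in your own folder and files.

```r
# Create two small example csv files in a temporary directory (illustrative data only)
tmp_dir <- file.path(tempdir(), "csv_source_demo")
dir.create(tmp_dir, showWarnings = FALSE)
write.csv(data.frame(City = "Burbank", State = "IL"),
          file.path(tmp_dir, "Chicago_IL.csv"), row.names = FALSE)
write.csv(data.frame(City = "Irving", State = "TX"),
          file.path(tmp_dir, "Dallas_TX.csv"), row.names = FALSE)

# Only pick up files ending in .csv; full.names = TRUE so read.csv() can find them
filenames <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical helper: read one csv and tag every row with its file of origin
read_csv_with_source <- function(filename) {
  ret <- read.csv(filename, stringsAsFactors = FALSE)
  # basename() drops the directory, file_path_sans_ext() drops ".csv"
  ret$Source <- tools::file_path_sans_ext(basename(filename))
  ret
}

# lapply() returns a list of data frames; do.call(rbind, ...) stacks them
combined <- do.call(rbind, lapply(filenames, read_csv_with_source))
print(combined)
```

Restricting list.files() with pattern = "\\.csv$" also guards against other files in the folder being passed to read.csv() by accident.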