How can I turn the filename into a variable when reading multiple csvs into R

Question

I have a bunch of csv files that follow the naming scheme: est2009US.csv.

I am reading them into R as follows:

myFiles <- list.files(path="~/Downloads/gtrends/", pattern = "^est[[:digit:]][[:digit:]][[:digit:]][[:digit:]]US*\\.csv$")

myDB <- do.call("rbind", lapply(myFiles, read.csv, header = TRUE))

I would like to find a way to create a new variable that, for each record, is populated with the name of the file the record came from.

GSee · Accepted Answer

You can avoid looping twice by using an anonymous function that assigns the file name as a column to each data.frame in the same lapply that you use to read the csvs.

myDB <- do.call("rbind", lapply(myFiles, function(x) {
  dat <- read.csv(x, header=TRUE)
  dat$fileName <- tools::file_path_sans_ext(basename(x))
  dat
}))

I stripped out the directory and file extension. basename() returns the file name, not including the directory, and tools::file_path_sans_ext() removes the file extension.
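
To see what the two helpers return, and to try the approach end to end without touching real data, here is a minimal self-contained sketch. The temporary directory, the two sample files and their contents are made up for illustration. `full.names = TRUE` is an addition that is not in the question's code; it is there so that `read.csv` receives complete paths regardless of the working directory.

```r
# Hypothetical sample data written to a temporary directory
tmp <- file.path(tempdir(), "gtrends_demo")
dir.create(tmp, showWarnings = FALSE)
write.csv(data.frame(term = c("a", "b"), est = c(1, 2)),
          file.path(tmp, "est2009US.csv"), row.names = FALSE)
write.csv(data.frame(term = c("c", "d"), est = c(3, 4)),
          file.path(tmp, "est2010US.csv"), row.names = FALSE)

# full.names = TRUE (an assumption, see above) keeps the directory in each path
myFiles <- list.files(path = tmp,
                      pattern = "^est[[:digit:]][[:digit:]][[:digit:]][[:digit:]]US*\\.csv$",
                      full.names = TRUE)

# basename() drops the directory; file_path_sans_ext() drops ".csv"
basename(myFiles[1])                              # "est2009US.csv"
tools::file_path_sans_ext(basename(myFiles[1]))   # "est2009US"

# Read each file and tag its rows with the stripped file name
myDB <- do.call("rbind", lapply(myFiles, function(x) {
  dat <- read.csv(x, header = TRUE)
  dat$fileName <- tools::file_path_sans_ext(basename(x))
  dat
}))
```

With these sample files, `myDB` has four rows, and `fileName` is `est2009US` for the first two and `est2010US` for the last two.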

hadley · Answer

plyr makes this very easy:

library(plyr)
paths <- dir(pattern = "\\.csv$")
names(paths) <- basename(paths)

all <- ldply(paths, read.csv)

Because paths is named, all will automatically get a column (called .id by default) containing those names.
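
If plyr is not installed, the same idea (carrying the names of `paths` into the result) can be sketched in base R. This is not what `ldply` does internally, only an analogous approach. The temporary files and the helper `read_with_id` are hypothetical, and the column is called `.id` here only to mirror plyr's default.

```r
# Hypothetical sample files in a temporary directory
tmp <- file.path(tempdir(), "ldply_demo")
dir.create(tmp, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(tmp, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4), file.path(tmp, "b.csv"), row.names = FALSE)

paths <- dir(tmp, pattern = "\\.csv$", full.names = TRUE)
names(paths) <- basename(paths)

# Hypothetical helper: read one file and prepend its name as an identifier column
read_with_id <- function(path, id) {
  data.frame(.id = id, read.csv(path), stringsAsFactors = FALSE)
}

# Map() walks over the paths and their names in parallel;
# unname() keeps the list names out of the row names of the result
all_rows <- do.call(rbind, unname(Map(read_with_id, paths, names(paths))))
```

The result is stored in `all_rows` rather than `all` so that the base function `all()` is not masked.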

Tags: import, r, csv
