
How can I read multiple files from multiple directories into R for processing?

Tags: r, batch-file

I am running a simulation study and need to process and save the results from several text files. The data is organized into subdirectories, and within each subdirectory I need to process 1000 data files and get a result for each one. This is very easy to do in SAS using macros. However, I am new to R and cannot figure out how to do it. Below is the layout I am working with.

DATA Folder-> DC1 -> DC1R1.txt ... DC1R1000.txt
              DC2 -> DC2R1.txt ... DC2R1000.txt

Any help would be greatly appreciated!

Stefanie asked Sep 11 '11 05:09




2 Answers

I'm not near a computer with R right now, but read the help for file-related functions:

The dir function lists files and directories, and its recursive argument makes it descend into subdirectories. dir is an alias for list.files, so the two are interchangeable. The file.info function tells you (among other things) whether a path is a directory, and file.path combines path parts into a single path.

The basename and dirname functions might also be useful.

Note that all these functions are vectorized.
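As a rough sketch of how these helpers fit together, the snippet below builds a tiny throwaway tree under tempdir() and inspects it. The folder name "DATA" and the single file are assumptions made only for illustration, not your real data.

```r
# Build a small throwaway tree under tempdir() so the calls can be tried safely.
root <- file.path(tempdir(), "DATA")   # hypothetical root folder
dir.create(file.path(root, "DC1"), recursive = TRUE, showWarnings = FALSE)
writeLines("x", file.path(root, "DC1", "DC1R1.txt"))

# dir() lists entries; recursive = TRUE descends into subdirectories,
# and full.names = TRUE returns paths that include the root.
paths <- dir(root, recursive = TRUE, full.names = TRUE)

file_names   <- basename(paths)            # file name without the directory part
folder_names <- basename(dirname(paths))   # name of the folder holding each file
is_dir       <- file.info(paths)$isdir     # FALSE for regular files

# All of these accept a vector of paths and return one value per path.
print(data.frame(folder = folder_names, file = file_names, is_dir = is_dir))
```

Because the functions are vectorized, the same calls work unchanged on a vector of thousands of paths.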

EDIT Now at a computer, so here's an example:

# Make a function to process each file
processFile <- function(f) {
  df <- read.csv(f)
  # ...and do stuff...
  file.info(f)$size # dummy result
}

# Find all .csv files
files <- dir("/foo/bar/", recursive=TRUE, full.names=TRUE, pattern="\\.csv$")

# Apply the function to all files.
result <- sapply(files, processFile)
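The same pattern could be adapted to the layout in the question (DC1/DC1R1.txt and so on). The following is only a sketch under stated assumptions: it generates a small fake tree with three files per folder, assumes each file is a whitespace-delimited table with a header and a column named y, and uses the mean of y as a placeholder for the real analysis.

```r
# Hypothetical stand-in for the DATA folder: two conditions, three replicates each.
root <- file.path(tempdir(), "DATA_demo")
for (cond in c("DC1", "DC2")) {
  dir.create(file.path(root, cond), recursive = TRUE, showWarnings = FALSE)
  for (r in 1:3) {
    write.table(data.frame(y = rnorm(10)),
                file.path(root, cond, paste0(cond, "R", r, ".txt")),
                row.names = FALSE)
  }
}

# Placeholder analysis: read one file and return a single summary number.
processFile <- function(f) {
  df <- read.table(f, header = TRUE)
  mean(df$y)
}

# Collect every .txt file below the root, with full paths.
files <- list.files(root, pattern = "\\.txt$", recursive = TRUE, full.names = TRUE)

# One row per file; the condition is recovered from the folder name.
results <- data.frame(
  condition = basename(dirname(files)),
  file      = basename(files),
  estimate  = vapply(files, processFile, numeric(1), USE.NAMES = FALSE)
)

# Save the combined results, then summarise per condition.
write.csv(results, file.path(root, "results.csv"), row.names = FALSE)
print(aggregate(estimate ~ condition, data = results, FUN = mean))
```

vapply is used here instead of sapply so that a file returning something other than a single number raises an error rather than silently changing the shape of the result.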
Tommy answered Oct 19 '22 03:10


If you need to run the same analysis on each of the files, then you can get all of their paths in one shot using list.files(recursive = TRUE). This assumes that you have already set your working directory to the Data folder, because the returned paths are relative to it. The recursive = TRUE argument makes the listing include files inside subdirectories as well.
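A minimal sketch of the working-directory point, using a throwaway folder under tempdir() (the name "DataFolder" and the one-line file are made up for illustration):

```r
# Hypothetical stand-in for the data folder, with one file in one subdirectory.
root <- file.path(tempdir(), "DataFolder")
dir.create(file.path(root, "DC1"), recursive = TRUE, showWarnings = FALSE)
writeLines("1", file.path(root, "DC1", "DC1R1.txt"))

old <- setwd(root)                    # switch to the data folder, remembering the old one
rel <- list.files(recursive = TRUE)   # paths relative to the working directory
values <- lapply(rel, readLines)      # relative paths resolve only because of setwd()
setwd(old)                            # restore the previous working directory

# Alternative that does not depend on the working directory: ask for full paths.
full <- list.files(root, recursive = TRUE, full.names = TRUE)
```

If you would rather not change the working directory at all, the full.names = TRUE form is usually the safer choice.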

Ramnath answered Oct 19 '22 01:10