How to use R to Iterate through Subfolders and bind CSV files of the same ID?

Tags:

r

I am stuck. I need a way to iterate through the subfolders of a directory, pull out the 4 .csv files in each one, bind the contents of those 4 files, and then write the result to a new directory as a new .csv named after the subfolder it came from.

I know R can do this, but I am stuck on how to iterate across the subfolders and bind the csv files together. My obstacle is that every subfolder contains the same 4 .csv file names, each an 8-digit id. For example, subfolder A contains 09061234.csv, 09061345.csv, 09061456.csv, and 09061560.csv; subfolder B contains 09061234.csv, 09061345.csv, 09061456.csv, and 09061560.csv; and so on. There are 42 subfolders, and hence 168 csv files with repeated names. I want to compact them down to 42 files, one per subfolder.

I can use list.files to retrieve all the subfolders. But then what?

##Get Files from directory
TF <- "H:/working/TC/TMS/Counts/June09"
##List Sub folders (full paths, so they can be passed back to list.files)
SF <- list.files(TF, full.names = TRUE)
##List of File names inside folders
FN <- list.files(SF)
#Returns list of 168 filenames

###?????###
#How to iterate through each subfolder, read each 8-digit integer id file, 
#bind them all together into one single csv, 
#Then write to new directory using 
#the name of the subfolder as the name of the new csv?

There is probably an easy way to do this, but I am a noob with R. Something involving functions, paste and write.table perhaps? Any hints, help, or suggestions are greatly appreciated. Thanks!


asked Mar 02 '13 01:03

myClone

2 Answers

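You can use the recursive=TRUE option of list.files to search all the subfolders at once, with a pattern that picks out one id at a time:

 lapply(c('1234', '1345', '1456', '1560'), function(x) {
     ## pattern is a regular expression matched against the file name
     sources.files <- list.files(path = TF,
                                 recursive = TRUE,
                                 pattern = paste0('^0906', x, '\\.csv$'),
                                 full.names = TRUE)
     ## read all files with this id and bind them
     dat <- do.call(rbind, lapply(sources.files, read.csv))
     ## write the combined file for this id
     write.csv(dat, paste0('agg', x, '.csv'), row.names = FALSE)
 })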
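
Note that the code above groups the files by id, giving one output per id. If one output per subfolder is wanted instead, as the question describes, a minimal sketch could look like the following. The names `bind_by_subfolder`, `in_dir` and `out_dir` are made up for illustration, and the sketch assumes every csv in a subfolder has the same columns and that `out_dir` is not inside `in_dir`.

```r
# Sketch: bind the csv files inside each subfolder and write one
# output file per subfolder, named after that subfolder.
# bind_by_subfolder, in_dir and out_dir are hypothetical names.
bind_by_subfolder <- function(in_dir, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  # Immediate subfolders only, as full paths
  subfolders <- list.dirs(in_dir, full.names = TRUE, recursive = FALSE)
  for (sf in subfolders) {
    # Keep only files whose name ends in .csv; other types are ignored
    files <- list.files(sf, pattern = "\\.csv$", full.names = TRUE)
    if (length(files) == 0) next
    # Assumes all files share the same columns, otherwise rbind fails
    dat <- do.call(rbind, lapply(files, read.csv))
    out_file <- file.path(out_dir, paste0(basename(sf), ".csv"))
    write.csv(dat, out_file, row.names = FALSE)
  }
  invisible(basename(subfolders))
}
```

With the paths from the question, it would be called as `bind_by_subfolder("H:/working/TC/TMS/Counts/June09", "H:/working/TC/TMS/June09Output")`.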

answered Sep 21 '22 00:09

agstudy

After some tweaking of agstudy's code, I came up with the solution I was ultimately after. There were a couple of missing pieces that are more due to the nature of my specific problem, so I am leaving agstudy's answer as "accepted".

Turns out a function really wasn't needed. At least not for now. If I need to perform this same task again, I will create a function out of it. For now, I can solve this particular problem without it.

Also, in my case, I needed to skip any non-csv files living in the subfolders (some folders contain .xls files). Restricting the list.files pattern to names that end in .csv does this, so only comma-separated files are passed to read.csv.
Code:

##Define directory path##
TF <- "H:/working/TC/TMS/Counts/June09"
##Define the list of file ids to search for##
x <- c('1234', '1345', '1456', '1560')
##List files in all subfolders whose name is "0906" + one of the ids + ".csv"##
##Non-csv files (e.g. .xls) do not match the pattern and are skipped##
sources.files <- list.files(TF, recursive = TRUE, full.names = TRUE,
                            pattern = paste0("^0906(", paste(x, collapse = "|"), ")\\.csv$"))

##Read every matching file and bind the rows into one data frame##
dat <- do.call(rbind, lapply(sources.files, read.csv))
##Write the batched result##
write.table(dat, file = "H:/working/TC/TMS/June09Output/June09Batched.csv", row.names = FALSE, sep = ",")
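
To see how a pattern can keep the wanted csv files and drop everything else, here is a small check on made-up file names. It relies on the fact that list.files treats its pattern argument as a regular expression matched against the file name, so the same expression can be tried with grepl first. The file names below are hypothetical, not taken from the real folders.

```r
# Hypothetical file names, for illustration only
files <- c("09061234.csv", "09061345.csv", "09061234.xls",
           "notes.txt", "09069999.csv")
ids <- c("1234", "1345", "1456", "1560")

# Regular expression: "0906", then one of the ids, then a literal ".csv"
pat <- paste0("^0906(", paste(ids, collapse = "|"), ")\\.csv$")

# Keep only the names that match the whole pattern
keep <- files[grepl(pat, files)]
print(keep)
# prints "09061234.csv" "09061345.csv"
```

The .xls and .txt names fail on the extension, and 09069999.csv fails because 9999 is not one of the listed ids.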

answered Sep 19 '22 00:09

myClone
