
Read nested folder and file names as nested list

Tags:

r

I'm trying to read all the folder and file names of a given directory into a nested list. The list should be as long as the number of folders on the top level; each list element should then have as many elements as its sub-directory (if it is a folder), and so on, down to the level where there are only files and no more folders.

My use case is with my iTunes Music folder:

m <- "/Users/User/Music/iTunes/iTunes Media/Music"  # set the path to the library folder
x <- list.files(m, recursive = FALSE)               # get all artists names (folder names on top level)
# read all albums and the title of each song per album
lst <- setNames(lapply(paste(m, x, sep = "/"), list.files, recursive = TRUE), x)

The structure of each element in lst is now:

#$`The Kooks`                                       # artist name "The Kooks"
# [1] "Inside In Inside Out/01 Seaside.mp3"         # album name "Inside In Inside Out", title "01 Seaside.mp3"
# [2] "Inside In Inside Out/02 See The World.mp3"                 
#...                           
#[16] "Konk/01 See The Sun.mp3"                     # second album of The Kooks
#[17] "Konk/02 Always Where I Need To Be.mp3"               

What I'm trying to do is turn each artist's entry into a nested list. In the example there would be a list element $`The Kooks` containing 2 (sub-)lists, one per album: $`Inside In Inside Out` and $Konk. Each album list would hold a vector of title names (without the album name prefix).
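
For illustration only, the target shape for the excerpt above could be written out by hand roughly like this. It is a sketch limited to the four titles shown in the listing, and `desired` is a made-up name, not something produced by the code so far:

```r
# Hand-written sketch of the desired structure (only the titles shown above).
# `desired` is a hypothetical name used purely for illustration.
desired <- list(
  `The Kooks` = list(
    `Inside In Inside Out` = c("01 Seaside.mp3", "02 See The World.mp3"),
    Konk                   = c("01 See The Sun.mp3", "02 Always Where I Need To Be.mp3")
  )
)

# Inspect the nesting: artist -> album -> character vector of titles
str(desired)
```

Each artist is a named list of albums, and each album is a plain character vector of file names.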

I couldn't find the right answers (yet) on SO and tried (unsuccessfully), among other things:

list.files(m, recursive = TRUE)

and

lapply(lst, function(l) {
  strsplit(l, "/")
})

How to do it properly?

P.S.:

  • You can think of the desired output as a list-structure where each file/folder name only occurs as often as in the actual file/folders.
  • As a best case, I'm hoping to find a solution that will be flexible enough to allow for different folder levels and will not require as many explicit lapply calls as the folder depths
talat asked Jan 05 '15 13:01



2 Answers

The following function identifies files and folders in a directory. It then calls itself again for each identified folder, creating a list with any files and subfolders found.

fileFun <- function(theDir) {
    ## Look for files (directories included for now)
    allFiles <- list.files(theDir, no.. = TRUE)
    ## Look for directory names
    allDirs <- list.dirs(theDir, full.names = FALSE, recursive = FALSE)
    ## If there are any directories,
    if(length(allDirs)) {
        ## then call this function again
        moreFiles <- lapply(file.path(theDir, allDirs), fileFun)
        ## Set names for the new list
        names(moreFiles) <- allDirs
        ## Determine files found, excluding directory names
        outFiles <- allFiles[!allFiles %in% allDirs]
        ## Combine appropriate results for current list
        if(length(outFiles)) {
            allFiles <- c(outFiles, moreFiles)
        } else {
            allFiles <- moreFiles
        }
    }
    return(allFiles)
}
## Try with your directory?
fileFun(m)
BenBarnes answered Sep 29 '22 15:09


This solution should work, assuming that your directory structure is always artist/album/songs. If some directories are deeper (or shallower), you won't get what you want.

First, I get the list of directories (that is, the list of artists):

artists <- list.dirs(path=m,recursive=FALSE,full.names=FALSE)

Then I create the nested list:

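## Store the result so that it can be named in the next step
music.list <- lapply(artists,function(dir) {
  ## Album folders of this artist
  albums <- list.dirs(path=paste0(m,"/",dir),recursive=FALSE,full.names=FALSE)
  ## Song files of each album
  album.list <- lapply(albums,function(dir2) {
    list.files(path=paste0(m,"/",dir,"/",dir2))
  })
  names(album.list) <- albums
  album.list
})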

And finally, I name the top level of the list:

names(music.list) <- artists

The album level works identically to the artist level: I get the directories (corresponding to the albums), then I list the files inside (corresponding to songs) and finally, I name the list elements by the album names.

EDIT: As docendo discimus points out, the above solution is not general. The following recursive solution should do the job in a more elegant way:

rfl <- function(path) {
  folders <- list.dirs(path,recursive=FALSE,full.names=FALSE)
  if (length(folders)==0) list.files(path)
  else {
    sublist <- lapply(paste0(path,"/",folders),rfl)
    setNames(sublist,folders)
  }  
}
rfl(m)

It is still not fully general: whenever a folder contains subfolders, the function descends into them and does not store any files that sit next to those subfolders at the same depth.
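
One possible way around that limitation is sketched below. The function name `rfl_all` and the throwaway directory under `tempdir()` are assumptions made for this illustration; they are not part of the original answer, and the result is close to what the first answer's `fileFun` returns.

```r
# Hypothetical variant of rfl() that also keeps files lying next to subfolders.
rfl_all <- function(path) {
  # Immediate subfolders (names only, no recursion)
  folders <- list.dirs(path, recursive = FALSE, full.names = FALSE)
  # Entries at this level that are not subfolders, i.e. the plain files
  files <- setdiff(list.files(path), folders)
  # No subfolders: return the file names as a character vector, as rfl() does
  if (length(folders) == 0) return(files)
  # Recurse into each subfolder and name the results after the folders
  sublist <- setNames(lapply(file.path(path, folders), rfl_all), folders)
  # Files next to subfolders become unnamed list elements in front of them
  c(as.list(files), sublist)
}

# Small throwaway tree to try it on (assumed layout, created in tempdir())
root <- file.path(tempdir(), "rfl_demo")
unlink(root, recursive = TRUE)
dir.create(file.path(root, "Artist", "Album"), recursive = TRUE)
file.create(file.path(root, "Artist", "Album", "01 Song.mp3"))
file.create(file.path(root, "Artist", "loose.txt"))

res <- rfl_all(root)
str(res)
```

In this sketch, folders that contain only files still yield a character vector, while mixed folders yield a list in which the loose files are unnamed elements and the subfolders are named ones.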

Stibu answered Sep 29 '22 15:09