I have a simple function in R that runs summary() via lapply() on many CSVs from one directory that I specify. The function is shown below:
# id -- the file name (i.e. 001.csv) so ID == 001.
# directory -- location of the CSV files (not my working directory)
# summarize -- boolean val if summary of the CSV to be output to console.
getMonitor <- function(id, dir, summarize = FALSE)
{
fl <- list.files(dir, pattern = "*.csv", full.names = FALSE)
fdl <- lapply(fl, read.csv)
dataSummary <- lapply(fdl, summary)
if(summarize == TRUE)
{ dataSummary[[id]] }
}
When I try to specify the directory and then pass it as a parameter to the function like so:
dir <- "C:\\Users\\ST\\My Documents\\R\\specdata"
funcVar <- getMonitor("001", dir, FALSE)
I receive the error:
Error in file(file, "rt") : cannot open the connection. In addition: Warning message: In file(file, "rt") : cannot open file '001.csv': No such file or directory
Yet when I run the code below on its own:
fl <- list.files("C:\\Users\\ST\\My Documents\\R\\specdata",
pattern = "*.csv",
full.names = FALSE)
fl[1]
It find the directory I'm pointing to and fl[1] correctly outputs [1] "001.csv" which is the first file listed.
My question is what am I doing wrong when trying to pass this path variable as a parameter to my function. Is R incapable of handling a parameter this way? Is there something I'm just completely missing? I've tried searching around and am familiar with other programming languages so, frankly, I feel kind of stupid/defeated for getting stuck on this right now.
The correct pattern would be '\\. csv$' . What you have now says "match zero or more of the empty character, followed by any character and the letters csv," which would match a lot more than just files ending in '. csv'.
Relative file paths. When R starts a session, it has a location to look for other files. This path is called the current working directory, and this is often shortened to the working directory. Relative paths in a program are specified as starting at the current working directory.
The current working directory is displayed by the RStudio IDE within the title region of the Console pane. You can also check your current working directory by running the command getwd() in the console. There are a number of ways to change the current working directory: Use the setwd R function.
You're passing fl[1]
directly to read.csv
with the qualifying path. If, instead, you use full.names=TRUE
you'll get the full path and your read.csv
step will work properly. However, you'll have to do a little munge to make your if
statement function again.
You could also expand on your lapply
function to paste the directory and file name together:
fdl <- lapply(fl, function(x) read.csv(paste(dir, x, sep='\\')))
Or create this pasted full path in a separate line:
fl.qualified <- paste(dir, fl, sep='\\')
fdl <- lapply(fl.qualified, read.csv)
When you do the paste
step, if you want to be really explicit, I would encourage a regex
to make sure you don't have someone passing a directory with a trailing slash:
fl.qualified <- paste(gsub('\\\\$', '', dir), f1, sep='\')
or something along those lines.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With