Goal:
- Import the newest file (.csv) from a local directory into R
Goal Details:
- A csv file is uploaded to a folder daily on my Mac. I would like to be able to incorporate a function in my R script that automatically imports the newest file into my workspace for further analysis. The file is uploaded daily around 4:30AM
- I would like this function to be run in the morning (no earlier than 6AM so there's plenty of time for leeway here)
Input Details:
- file type: .csv
- naming convention: example file name: "28 Jul 2014 04:37:47 -0400.csv"
- frequency: daily import @ ~ 04:30
What I've Tried:
- I know this may seem like a weak attempt but I'm really at a loss on how to amend this function below.
- My thought on paper is to 'grab' the name of the newest file, then paste() it onto the directory name, then voilà! (but alas my programming skills are lacking to code this here)
- The code below is what I tried to run, but it just 'hangs' and doesn't finish. I got this code from this R forum found here
Code:
lastChange = file.info(directory)$mtime
while(TRUE){
currentM = file.info(directory)$mtime
if(currentM != lastChange){
lastChange = currentM
read.csv(directory)
}
# try again in 10 minutes
Sys.sleep(600)
}
My Environment:
- R 3.1
- Mac OS X 10.9.4 (Mavericks)
Thank you so much in advance for any help! :-)
-- readfile.R --
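# full.names = TRUE keeps the directory in each path, so file.info() and read.csv() can find the files
files <- file.info(list.files(directory, pattern = "\\.csv$", full.names = TRUE))
# Read the most recently modified file
newest <- read.csv(rownames(files)[which.max(files$mtime)])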
I'd put the above script in a cron job that runs every morning at a time when the file for the day will have been written. The below crontab runs it every morning at 8am.
-- in crontab --
0 8 * * * Rscript readfile.R
Read more about cron here.
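Modification times can change if a file is copied or touched, so an alternative is to pick the newest file from the timestamp in its name. The following is a minimal sketch, assuming every file follows the "28 Jul 2014 04:37:47 -0400.csv" pattern from the question and that R runs in an English (or C) locale so that %b can parse "Jul". newest_by_name is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: pick the newest .csv by the timestamp in its file name.
# Assumes names look like "28 Jul 2014 04:37:47 -0400.csv".
newest_by_name <- function(directory) {
  paths <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
  if (length(paths) == 0) stop("no .csv files found in ", directory)

  # Strip the extension, then parse day, month name, year, time and UTC offset.
  stamps <- as.POSIXct(sub("\\.csv$", "", basename(paths)),
                       format = "%d %b %Y %H:%M:%S %z", tz = "UTC")

  # which.max() discards NA, so names that do not match the pattern are ignored.
  if (all(is.na(stamps))) stop("no file name matched the expected pattern")
  paths[which.max(stamps)]
}
```

The result is a full path, so it could be passed straight to read.csv(), for example read.csv(newest_by_name(directory)).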
A more efficient solution using dplyr/magrittr
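pacman::p_load(magrittr)
# Path of the most recently modified .csv file in `directory`
path <- list.files(path = directory,
                   pattern = "\\.csv$",
                   full.names = TRUE) %>%
  extract(which.max(file.mtime(.)))
# Import it
newest <- read.csv(path)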
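Since the import is meant to run unattended each morning, it may be worth checking that the newest file is actually recent before reading it; otherwise a missed upload would silently yield the previous day's data. Below is a rough sketch in base R. The function name read_newest_csv and the max_age_hours threshold are assumptions for illustration, not something from the question.

```r
# Hypothetical helper: import the most recently modified .csv in `directory`,
# refusing files older than `max_age_hours` (an assumed threshold).
read_newest_csv <- function(directory, max_age_hours = 24) {
  paths <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
  if (length(paths) == 0) stop("no .csv files found in ", directory)

  mtimes <- file.info(paths)$mtime
  newest <- which.max(mtimes)

  # Age of the newest file in hours, measured from now.
  age <- as.numeric(difftime(Sys.time(), mtimes[newest], units = "hours"))
  if (age > max_age_hours) {
    stop("newest file is ", round(age, 1), " hours old: ", paths[newest])
  }

  read.csv(paths[newest])
}
```

With a file written around 04:30 and a run at 06:00 or later, a call such as read_newest_csv(directory, max_age_hours = 12) would accept the same morning's file and stop with an error if only the previous day's file is present.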