I have a folder with about 700 text files that I want to import and add a column to. I've figured out how to do this using the following code:
files = list.files(pattern = "*c.txt")
DF <- NULL
for (f in files) {
data <- read.table(f, header = F, sep=",")
data$species <- strsplit(f, split = "c.txt")  # (column value is the filename)
DF <- rbind(DF, data)
}
write.xlsx(DF,"B:/trends.xlsx")
The problem is that about 100 of the files are empty, so the code stops at the first empty file and I get this error message:
Error in read.table(f, header = F, sep = ",") : no lines available in input
Is there a way to skip over these empty files?
You can skip empty files by checking that file.size(some_file) > 0:
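# full.names = TRUE so the paths resolve from any working directory
files <- list.files("~/tmp/tmpdir", pattern = "\\.csv$", full.names = TRUE)
##
df_list <- lapply(files, function(x) {
  # an empty file has size 0; skipped files leave NULL in the list
  if (file.size(x) > 0) {
    read.csv(x)
  }
})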
##
R> dim(do.call("rbind", df_list))
#[1] 50 2
This skips over the 10 files that are empty, and reads in the other 10 that are not.
Data:
for (i in 1:10) {
df <- data.frame(x = 1:5, y = 6:10)
write.csv(df, sprintf("~/tmp/tmpdir/file%i.csv", i), row.names = FALSE)
## empty file
system(sprintf("touch ~/tmp/tmpdir/emptyfile%i.csv", i))
}
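Applied to the files from the question, the same size check could be sketched as below. This is only an illustration: read_nonempty is a hypothetical helper name, and stripping the trailing "c.txt" with sub() is an assumption about how the species label should be derived from the file name.

```r
# Hypothetical helper: read every non-empty file and tag its rows with a species label
read_nonempty <- function(files) {
  # file.size() is vectorised, so the empty files can be dropped in one step
  files <- files[file.size(files) > 0]
  pieces <- lapply(files, function(f) {
    data <- read.table(f, header = FALSE, sep = ",")
    # assumption: the label is the file name with the trailing "c.txt" removed
    data$species <- sub("c\\.txt$", "", basename(f))
    data
  })
  # with no files left, do.call(rbind, list()) returns NULL
  do.call(rbind, pieces)
}

DF <- read_nonempty(list.files(pattern = "c\\.txt$"))
```

Filtering the file vector before the loop keeps the size check out of the reading code, and binding once at the end avoids growing DF with rbind on every iteration.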
For a different approach that introduces explicit error handling, consider a tryCatch to handle anything else that might go wrong in your read.table call:
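DF <- NULL
for (f in files) {
  data <- tryCatch({
    if (file.size(f) > 0) {
      read.table(f, header = FALSE, sep = ",")
    }  # an empty file falls through and yields NULL
  }, error = function(err) {
    # the handler's return value becomes the value of tryCatch()
    message("read.table didn't work for ", f, ": ", conditionMessage(err))
    NULL
  })
  if (is.null(data)) next  # skip empty or unreadable files
  # strsplit() returns a list; keep the first piece as a plain string
  data$species <- strsplit(f, split = "c.txt", fixed = TRUE)[[1]][1]
  DF <- rbind(DF, data)
}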
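One detail worth spelling out is that tryCatch evaluates to whatever the error handler returns, so a handler that ends in NULL gives a value that is easy to filter out afterwards. A minimal sketch of that idea follows, with safe_read and combine_reads as hypothetical helper names that are not part of the code above:

```r
# Hypothetical helper: returns a data frame, or NULL if read.table signals an error
safe_read <- function(f) {
  tryCatch(
    read.table(f, header = FALSE, sep = ","),
    error = function(err) {
      # report the failure, then return NULL as the value of tryCatch()
      message("Skipping ", f, ": ", conditionMessage(err))
      NULL
    }
  )
}

# Hypothetical helper: rbind() ignores the NULL entries left by failed reads
combine_reads <- function(files) {
  do.call(rbind, lapply(files, safe_read))
}
```

Because this version relies on the error itself rather than on the file size, it would presumably also skip files that are not zero bytes but still cannot be parsed, at the cost of hiding problems you might have wanted to stop on.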