I'm loading a CSV file into R with data.table's fread function. It has a bunch of columns that I don't need, so the select parameter comes in handy. I've noticed, however, that if one of the columns specified in select does not exist in the CSV file, fread will silently continue. Is it possible to make R throw an error if one of the selected columns doesn't exist in the CSV file?
# csvfile has columns "col1", "col2", "col3", "col4", etc.
colsToKeep <- c("col1", "col2", "missing")
data <- fread(csvfile, header = TRUE, select = colsToKeep, verbose = TRUE)
In the above example, data will have two columns: col1 and col2. The remaining columns will be dropped as expected, but missing is silently skipped. It would certainly be nice to know that fread is skipping that column because it did not find it.
I'd suggest parsing the first row pre-emptively, then throwing your own error. You could do:
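library(data.table)

read_cols <- function(file_name, colsToKeep) {
  # Read only the first row, as data, to get the column names in the file.
  header <- fread(file_name, nrows = 1, header = FALSE)
  # Stop with an error unless every requested column is in the header.
  all_in_header <- all(colsToKeep %chin% unlist(header))
  stopifnot(all_in_header)
  fread(file_name, header = TRUE, select = colsToKeep, verbose = TRUE)
}
my_data <- read_cols(csvfile, c("col1", "col2", "missing"))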
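If you also want the error to say which columns are absent, a variant along these lines might work. This is only a sketch: read_selected is a hypothetical helper name, not part of data.table, and it assumes the file has a header row and that calling fread with nrows = 0 returns a zero-row table carrying the header's column names.

```r
library(data.table)

# Hypothetical helper (not part of data.table): read only the selected
# columns, but stop with a descriptive error if any of them is absent.
read_selected <- function(file_name, cols) {
  # Assumption: nrows = 0 reads just the header line, so names() gives
  # the columns available in the file without loading any data rows.
  available <- names(fread(file_name, nrows = 0, header = TRUE))
  missing_cols <- setdiff(cols, available)
  if (length(missing_cols) > 0) {
    stop("Column(s) not found in ", file_name, ": ",
         paste(missing_cols, collapse = ", "))
  }
  fread(file_name, header = TRUE, select = cols)
}

# Usage with a small temporary file
tmp <- tempfile(fileext = ".csv")
writeLines(c("col1,col2,col3", "1,2,3", "4,5,6"), tmp)

# All requested columns exist, so this returns a two-column table.
ok <- read_selected(tmp, c("col1", "col2"))

# "missing" is not in the file; capture the error message instead of stopping.
res <- tryCatch(
  read_selected(tmp, c("col1", "missing")),
  error = function(e) conditionMessage(e)
)
```

Using setdiff rather than all() keeps the names of the absent columns around, so the message can list them. The cost is the same as above: the file is opened twice, once for the header and once for the data.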