I'm trying to extract information from several data files -- specifically, how many complete records exist in each file.
Here's what I've written:
complete <- function(directory, id=1:332) {
  files_senscomp <- list.files(directory, full.names=TRUE)[id]
  pre_dat <- data.frame()
  full_dat <- data.frame()
  for (i in seq(files_senscomp)) {
    pre_dat <- rbind(pre_dat, read.csv(files_senscomp[i]))
    nobs <- sum(complete.cases(pre_dat))
    id <- i
    full_dat <- rbind(full_dat,data.frame(id,nobs))
  }
  full_dat
}
What it returns, though, is cumulative. And the IDs are incorrect. Here's the function in action and the result:
> complete("specdata", 40:45)
  id nobs
1  1   21
2  2  248
3  3  308
4  4  382
5  5  665
6  6 1089
Why does this not return the IDs 40-45, along with a "nobs" result for an individual file rather than all of the files combined to that point?
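Two things cause this. `rbind(pre_dat, ...)` appends each new file to everything read so far, so `complete.cases` counts rows across all files up to that point. And `id <- i` stores the loop index (1 to 6) rather than the file number you asked for. Reading each file on its own and looking up `id[i]` fixes both: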
for (i in seq(files_senscomp)) {
  pre_dat <- read.csv(files_senscomp[i]) ## no `rbind`
  nobs <- sum(complete.cases(pre_dat))
  ID <- id[i] ## `id` is your function argument, taking `40:45`
  full_dat <- rbind(full_dat,data.frame(id = ID, nobs = nobs))
}
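If you'd rather not grow `full_dat` inside a loop at all, here is a rough sketch of the same idea using `vapply`. It is not your original approach, just one possible alternative. Like your code, it assumes `list.files()` returns the files in an order where position matches the file number (e.g. `001.csv`, `002.csv`, ...), and that every requested `id` actually exists in the directory.

```r
# Sketch only: count complete rows per file, one file at a time.
# Assumes the position of each file in list.files() matches its id.
complete <- function(directory, id = 1:332) {
  files <- list.files(directory, full.names = TRUE)[id]

  # Each file is read independently, so the counts are never cumulative.
  nobs <- vapply(
    files,
    function(f) sum(complete.cases(read.csv(f))),
    integer(1)
  )

  # Report the requested ids, not the loop positions.
  data.frame(id = id, nobs = unname(nobs))
}
```

Calling `complete("specdata", 40:45)` with this version should give one row per requested file, with `id` running from 40 to 45.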