I would like to return a matrix/data.frame in which each row contains the arguments and the content of a file.
However, there may be many files, so I would prefer to load each file lazily, so that a file is only read if its content is actually requested. The function below loads the files eagerly if as.func=F.
It would be perfect if it could load them lazily, but it would also be acceptable if, instead of the content, a function were returned that reads the content.
I can make functions that read the content (see below with as.func=T), but for some reason I cannot put them into the data.frame to return.
    load_parallel_results <- function(resdir,as.func=F) {
      ## Find files called .../stdout
      stdoutnames <- list.files(path=resdir, pattern="stdout", recursive=T);
      ## Find files called .../stderr
      stderrnames <- list.files(path=resdir, pattern="stderr", recursive=T);
      if(as.func) {
        ## Create functions to read them
        stdoutcontents <-
          lapply(stdoutnames, function(x) { force(x); return(function() { return(paste(readLines(paste(resdir,x,sep="/")),collapse="\n")) } ) } );
        stderrcontents <-
          lapply(stderrnames, function(x) { force(x); return(function() { return(paste(readLines(paste(resdir,x,sep="/")),collapse="\n")) } ) } );
      } else {
        ## Read them
        stdoutcontents <-
          lapply(stdoutnames, function(x) { return(paste(readLines(paste(resdir,x,sep="/")),collapse="\n")) } );
        stderrcontents <-
          lapply(stderrnames, function(x) { return(paste(readLines(paste(resdir,x,sep="/")),collapse="\n")) } );
      }
      if(length(stdoutnames) == 0) {
        ## Return empty data frame if no files found
        return(data.frame());
      }
      ## Make the columns containing the variable values
      m <- matrix(unlist(strsplit(stdoutnames, "/")),nrow = length(stdoutnames),byrow=T);
      mm <- as.data.frame(m[,c(F,T)]);
      ## Append the stdout and stderr column
      mmm <- cbind(mm,unlist(stdoutcontents),unlist(stderrcontents));
      colnames(mmm) <- c(strsplit(stdoutnames[1],"/")[[1]][c(T,F)],"stderr");
      ## Example:
      ## parallel --results my/res/dir --header : 'echo {};seq {myvar1}' ::: myvar1 1 2 ::: myvar2 A B
      ## > load_parallel_results("my/res/dir")
      ##      myvar1 myvar2 stdout      stderr
      ## [1,] "1"    "A"    "1 A\n1"    ""
      ## [2,] "1"    "B"    "1 B\n1"    ""
      ## [3,] "2"    "A"    "2 A\n1\n2" ""
      ## [4,] "2"    "B"    "2 B\n1\n2" ""
      return(mmm);
    }
Background
GNU Parallel has a --results option that stores output in a structured way. If there are 1000000 output files, it may be hard to manage them. R is good for that, but it would be awfully slow if you had to read all 1000000 files just to get the ones where argument 1 = "Foo" and argument 2 = "Bar".
Unfortunately I don't think you can save a function in a data.frame column. But you could store the deparsed text of the function and evaluate it when needed:
e.g.
    myFunc <- function(x) { print(x) }
    # convert the function to text
    funcAsText <- deparse(myFunc)
    # convert the text back to a function
    newMyFunc <- eval(parse(text=funcAsText))
    # now you can use the function newMyFunc exactly like myFunc
    newMyFunc("foo")
    # [1] "foo"
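One caveat worth checking before using this for the reader functions in the question: deparse() keeps only the source text of a function, not the environment it was created in, so a closure that captured a variable may no longer find it after the round trip. A minimal sketch, assuming a hypothetical make_reader() helper and a captured variable named captured_path (neither comes from the question):

```r
# Hypothetical factory: returns a closure that remembers captured_path.
make_reader <- function(captured_path) {
  force(captured_path)
  function() captured_path
}

reader <- make_reader("myvar1/1/myvar2/A/stdout")

# Round trip through text, as in the example above.
restored <- eval(parse(text = deparse(reader)))

# The original closure still sees its captured variable.
original_result <- reader()

# The restored function is looked up in the calling environment instead,
# where captured_path does not exist, so the call fails.
restored_result <- tryCatch(restored(), error = function(e) NA_character_)
```

So the text round trip is only safe for functions that do not depend on variables from their enclosing environment.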
EDIT:
Since there are a lot of files, I suggest simply storing a string indicating the type of each file and creating a function that understands the types and reads the file accordingly; you can then call it when needed by passing the type and the file path.
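A rough sketch of that idea, under some assumptions: the data.frame keeps only the argument values plus the relative directory of each job, and a helper reads a file on request. The names read_result, index and jobdir are made up for this illustration, and the temporary directory merely imitates the layout shown in the question.

```r
# Hypothetical helper: read one result file only when asked for it.
# type selects which file inside the job directory to read.
read_result <- function(resdir, jobdir, type = c("stdout", "stderr")) {
  type <- match.arg(type)
  paste(readLines(file.path(resdir, jobdir, type)), collapse = "\n")
}

# Build a tiny fake results tree so the sketch can run on its own.
resdir <- tempfile("results_")
jobdir <- file.path("myvar1", "1", "myvar2", "A")
dir.create(file.path(resdir, jobdir), recursive = TRUE)
writeLines(c("1 A", "1"), file.path(resdir, jobdir, "stdout"))
writeLines(character(0), file.path(resdir, jobdir, "stderr"))

# The index holds only strings, so building and filtering it reads no files.
index <- data.frame(myvar1 = "1", myvar2 = "A", jobdir = jobdir,
                    stringsAsFactors = FALSE)

# Filter first, then read just the matching files.
hit <- index[index$myvar1 == "1" & index$myvar2 == "A", ]
out <- read_result(resdir, hit$jobdir, "stdout")
err <- read_result(resdir, hit$jobdir, "stderr")
```

With this layout, selecting rows by argument value never touches the disk; only the final read_result() calls do.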
(Without reading the question body:)
You can store functions in a data.frame like this:
df <- data.frame(fun = 1:3)
df$fun <- c(mean, sd, function(x) x^2)
I am not sure whether this will break other things, so consider using tibble or data.table, from the packages of the same names, which properly support arbitrary object types.
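To show how such a column behaves in base R, here is a small sketch. It assigns the functions as an explicit list, which makes the column a list column, and extracts a single function with [[ ]] before calling it. The name column and the example inputs are assumptions added for illustration.

```r
# A data.frame with one ordinary column.
df <- data.frame(name = c("mean", "sd", "square"), stringsAsFactors = FALSE)

# Assigning a list of the right length creates a list column.
df$fun <- list(mean, sd, function(x) x^2)

# Pick a row by its ordinary column, then pull the function out with [[ ]].
square <- df$fun[[which(df$name == "square")]]
squared <- square(4)

# Or index by position and call directly.
average <- df$fun[[1]](c(1, 2, 3))
```

Nothing is evaluated until a function is extracted and called, which is the behaviour the question is after.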
You can use 2D lists to store your functions. Obviously, you lose some of the checks you get with DFs, but that's the whole point here:
    > funs <- c(replicate(5, function(x) NULL), replicate(5, function(y) TRUE))
    > names <- as.list(letters[1:10])
    > # df doesn't work
    > df <- data.frame(names=names)
    > df.2 <- cbind(df, funs)
    Error in as.data.frame.default(x[[i]], optional = TRUE) :
      cannot coerce class '"function"' to a data.frame
    > # but 2d lists do
    > lst.2d <- cbind(funs, names)
    > lst.2d[2, 1]
    $funs
    function (x)
    NULL

    > lst.2d[6, 1]
    $funs
    function (y)
    TRUE
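As a follow-up sketch, note that single-bracket indexing as above returns a one-element list, so to actually call a stored function you would extract it with [[row, col]]. The functions and labels below are made up for the illustration:

```r
# Two small functions and matching labels, each held in a list.
funs <- list(function(x) x + 1, function(x) x * 2)
labels <- list("inc", "dbl")

# cbind() on lists yields a list with dimensions (a "2D list").
lst2d <- cbind(funs, labels)
shape <- dim(lst2d)

# [[row, col]] returns the element itself, which can then be called.
doubled <- lst2d[[2, 1]](10)
label <- lst2d[[2, 2]]
```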