Getting system.time or replicate to work with piping %>%

I use the piping operator a lot (%>%, from magrittr, also available through dplyr). Then one day I tried to use the system.time command on the right-hand side of a pipe.

system.time(mean(rnorm(1E7))) # ok
#### user     system      elapsed 
#### 3.52        0.05        3.58 
rnorm(1E7) %>% mean %>% system.time # ?
#### user     system      elapsed 
#### 0        0        0

So I went reading the documentation and tried the following. It says you can force the RHS to be evaluated first by enclosing it in parentheses, but that gives the same behaviour:

rnorm(1E7) %>% mean %>% (function(x) system.time(x))
#### user     system      elapsed 
#### 0        0        0

My question is the following:

1. Why exactly is the command system.time not working as expected when placed at the end of a piping line?

2. Is there a way to measure the calculation time of a line of code composed of pipes, without having to place the whole line inside parentheses (which would annihilate the practical benefits of piping...) or using proc.time?

Note: same issue with the replicate command.

agenis asked Dec 08 '22 21:12

1 Answer

The second-best thing I can do is write a wrapper around system.time that takes an unevaluated expression and evaluates it. You then have to wrap the timed expression in curly brackets and quote it before piping it, so that it isn't evaluated until my wrapper function gets its paws on it:

> psystime = function(e){system.time(eval(e))}
> quote({rnorm(1e7) %>%  mean}) %>% psystime
   user  system elapsed 
  0.764   0.004   0.767 

I say second best, because the best answer is simply not to do this at all. Sometimes pipes are the problem, not the solution.
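Why the quoting matters can be sketched in base R, without any pipe at all. system.time can only measure work that happens while it evaluates its argument; the zeros in the question suggest that the value had already been computed by the time system.time saw it. The variable names below are illustrative only, and the sample size is reduced to keep the run short.

```r
# Base R only; no pipe package is needed for this illustration.

# Case 1: the value is computed first, then handed to system.time().
x <- mean(rnorm(1e6))        # all the work happens on this line
t_after <- system.time(x)    # only looks up a value that already exists

# Case 2: the expression is kept unevaluated with quote() and is only
# evaluated inside system.time(), so the work itself is what gets timed.
e <- quote(mean(rnorm(1e6)))
t_inside <- system.time(y <- eval(e))

print(t_after)   # expected to be at or near zero
print(t_inside)  # includes the cost of rnorm() and mean()
```

This is the same idea the psystime wrapper relies on: the expression has to reach system.time still unevaluated.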

Another possibility is to put your piped expression in a string and feed it to a system.time wrapper that parses the text and evaluates the result:

> esystime = function(e){system.time(eval(parse(text=e)))}
> "rnorm(1e7) %>%  mean" %>% esystime
   user  system elapsed 
  1.075   0.033   1.137 
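
The question notes that replicate has the same issue, and the quoting trick appears to carry over. Below is a minimal sketch, assuming a hypothetical helper called preplicate (not part of base R or magrittr) that takes a quoted expression and re-evaluates it n times. It is called directly here so that it runs without a pipe package.

```r
# preplicate is a hypothetical helper, named by analogy with psystime.
# e: an unevaluated expression created with quote()
# n: number of times to evaluate it
preplicate <- function(e, n) {
  # replicate() re-evaluates its expression argument on every iteration,
  # so eval(e) runs the quoted code afresh each time.
  replicate(n, eval(e))
}

# Five independent means of ten random draws each.
draws <- preplicate(quote(mean(rnorm(10))), n = 5)
print(draws)
```

With magrittr loaded, the piped form would presumably be quote(mean(rnorm(10))) %>% preplicate(5), mirroring the psystime call above.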

I'm guessing the use case for this is a long pipeline that you want to time quickly, so you naturally want to just bung %>% system.time on the end. It's probably just as easy, assuming you know the keyboard shortcuts for "start of line" and "end of line", to put system.time( at the start and ) at the end.
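
As a rough illustration of that last suggestion, the sketch below wraps several steps in system.time using curly brackets. The steps are written as plain assignments so that the example runs in base R; a piped expression could sit in the same place. The variable names are illustrative only.

```r
# Wrap the whole computation in system.time(); curly brackets allow
# several lines to be timed together.
timing <- system.time({
  x <- rnorm(1e6)
  m <- mean(x)
})

print(timing)  # user, system and elapsed times for the block
print(m)       # assignments made inside the block are still available
```

Because system.time evaluates the block in the calling environment, intermediate results such as m can be used afterwards.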

Spacedman answered Dec 10 '22 09:12