
Faster %in% operator

The fastmatch package implements a much faster version of match for repeated matches (e.g. in a loop):

set.seed(1)
library(fastmatch)
table <- 1L:100000L
x <- sample(table, 10000, replace=TRUE)
system.time(for(i in 1:100) a <-  match(x, table))
system.time(for(i in 1:100) b <- fmatch(x, table))
identical(a, b)

Is there a similar implementation for %in% I could use to speed up repeated lookups?

Zach asked Oct 04 '15 15:10



1 Answer

Look at the definition of %in%:

R> `%in%`
function (x, table) 
match(x, table, nomatch = 0L) > 0L
<bytecode: 0x1fab7a8>
<environment: namespace:base>
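As a quick check of that definition, the base-R sketch below compares `%in%` with the equivalent `match()` expression. The small vectors `x` and `table` are made-up illustration values, not data from the question.

```r
# Illustrative values only (assumed for this sketch)
table <- c(10L, 20L, 30L)
x <- c(20L, 5L, 30L, NA)

# match() returns each element's position in `table`,
# or the nomatch value (0L here) when the element is absent
pos <- match(x, table, nomatch = 0L)

# Any position greater than zero is a hit, which is all %in% reports
via_match <- pos > 0L
via_in <- x %in% table

print(pos)
print(via_match)

# Both routes should agree element by element
stopifnot(identical(via_match, via_in))
```

Because `%in%` is only a thin wrapper, swapping the underlying `match()` call for another function with the same arguments is enough to change its speed.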

It's easy to write your own %fin% function:

`%fin%` <- function(x, table) {
  stopifnot(require(fastmatch))
  fmatch(x, table, nomatch = 0L) > 0L
}
system.time(for(i in 1:100) a <- x %in% table)
#    user  system elapsed 
#   1.780   0.000   1.782 
system.time(for(i in 1:100) b <- x %fin% table)
#    user  system elapsed 
#   0.052   0.000   0.054
identical(a, b)
# [1] TRUE

Joshua Ulrich answered Sep 29 '22 18:09
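One possible variation, sketched here on the assumption that the code may also run where fastmatch is not installed: the hypothetical `%fin2%` operator below (a name invented for this example) checks for the package with `requireNamespace()` instead of attaching it on every call, and falls back to base `match()` when it is missing.

```r
# `%fin2%` is a hypothetical name used only for this sketch
`%fin2%` <- function(x, table) {
  if (requireNamespace("fastmatch", quietly = TRUE)) {
    # fmatch() accepts the same x/table/nomatch arguments as match()
    fastmatch::fmatch(x, table, nomatch = 0L) > 0L
  } else {
    # Fallback: same result, without the fastmatch speed-up
    match(x, table, nomatch = 0L) > 0L
  }
}

set.seed(1)
table <- 1L:100000L
x <- sample(table, 10000, replace = TRUE)

# The result should match base %in% whichever branch is taken
stopifnot(identical(x %fin2% table, x %in% table))
```

The speed-up applies to repeated lookups against the same `table` object, as in the loops above, since that is the case `fmatch()` is built for.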