
Will the Scala compiler hoist regular expressions?

I wonder if this:

object Foo {
  val regex = "some complex regex".r
  def foo(): Unit = {
    // use regex
  }
}

and this:

object Foo {
  def foo(): Unit = {
    val regex = "some complex regex".r
    // use regex
  }
}

will differ in performance, i.e., will the Scala compiler recognize that "some complex regex".r is a constant and cache it, so that it is not recompiled on every call?

asked Sep 26 '14 08:09 by lyomi


1 Answer

Yes, there is a difference at runtime. The expression in the first example is evaluated only once, when the object Foo is initialized. The expression in the second is evaluated every time you call Foo.foo(). Evaluation here means applying the method "r", which scala-library adds to String through an implicit conversion:

scala> ".*".r
res40: scala.util.matching.Regex = .*

This method compiles the regular expression (through java.util.regex.Pattern.compile) every time you call it; there is no caching.
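
As a quick check of the no-caching claim, here is a minimal sketch that compares the underlying java.util.regex.Pattern instances by reference. The object and method names are made up for illustration, and the sketch assumes only that Regex exposes its compiled Pattern as `pattern`:

```scala
import scala.util.matching.Regex

// Hypothetical demo object; the names are illustrative only.
object RegexIdentityDemo {
  // Hoisted: evaluated once, when RegexIdentityDemo is initialized.
  val hoisted: Regex = "[a-z]+[0-9]*".r

  // Local: every call applies .r again and so compiles a fresh Pattern.
  def local(): Regex = "[a-z]+[0-9]*".r

  // Returns the hoisted instance without recompiling anything.
  def shared(): Regex = hoisted
}
```

Two calls to local() should give Regex values whose `pattern` fields are different objects, while two calls to shared() give the same one.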

Btw, any naive runtime caching of regexps is vulnerable to OutOfMemoryError. I believe it could be implemented safely with a WeakHashMap, but Java's current Pattern implementation (which underlies Scala's Regex) does not do this, probably because the effect on performance would be hard to predict (the GC might have to remove most of the cached values every time it runs). A cache with eviction is more predictable, but still not an easy option (who is going to choose the timeout or size for it?). As for a Scala-specific approach, some smart macro could do the optimization at compile time (caching only regexps built from string constants), but by default:

The Scala compiler also has no regexp-specific optimizations, because regexps are not part of the Scala language.

So it's better to move static "".r constructions out of the function.
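
For patterns that are only known at runtime and so cannot simply be hoisted, the size-bounded cache mentioned above might look roughly like the sketch below. RegexCache, maxSize and entryCount are hypothetical names, not part of scala-library, and the choice of least-recently-used eviction on top of java.util.LinkedHashMap is an assumption made for the example, not something the standard library provides for regexps:

```scala
import java.util.{LinkedHashMap => JLinkedHashMap, Map => JMap}
import scala.util.matching.Regex

// Hypothetical helper: an LRU cache holding at most maxSize compiled regexps.
final class RegexCache(maxSize: Int) {
  // accessOrder = true keeps entries ordered from least to most recently used.
  private val cache = new JLinkedHashMap[String, Regex](16, 0.75f, true) {
    // Called by put(); returning true drops the least recently used entry.
    override def removeEldestEntry(eldest: JMap.Entry[String, Regex]): Boolean =
      this.size() > maxSize
  }

  // Returns the cached Regex for source, compiling it only on a cache miss.
  def get(source: String): Regex = synchronized {
    val cached = cache.get(source)
    if (cached != null) cached
    else {
      val compiled = source.r
      cache.put(source, compiled)
      compiled
    }
  }

  // Number of entries currently held.
  def entryCount: Int = synchronized(cache.size())
}
```

Someone still has to pick maxSize, which is exactly the tuning question raised above, so for constant patterns a plain val remains the simpler option.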

answered Oct 24 '22 02:10 by dk14