
Is java.util.regex efficient enough?

Tags:

java

regex

I need to run many searches for certain patterns in source files while the user is editing them, so the regexp matching has to be efficient in both time and memory. The same pattern is used repeatedly, so it should be compiled once, but I also need to retrieve subparts of each match (rather than just confirm that a match exists).

I'm considering java.util.regex, the Jakarta Perl5Util (if it still exists; it has been a few years since I used it), or perhaps the Eclipse search engine, though I doubt that it's any smarter.

Is there any significant performance difference between these options?

Uri asked Oct 10 '08 05:10


2 Answers

I am not sure there is a huge performance gap between the different Java regexp engines.

But how you construct the regexp certainly affects performance (at least when the data is large enough, as noted by Jeff Atwood).

The main thing to avoid is catastrophic backtracking, which atomic grouping helps prevent.
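As a rough illustration of that point, here is a minimal sketch comparing a nested-quantifier pattern with an atomic-group version of it in java.util.regex. The patterns and the class name `AtomicGroupDemo` are made up for the example; how badly the first pattern behaves depends on the input and on the JDK version, so measure on your own data.

```java
import java.util.regex.Pattern;

class AtomicGroupDemo {
    // Nested quantifiers: on an input that cannot match, a backtracking
    // engine may try many different ways to split the run of 'a' characters.
    static final Pattern NESTED = Pattern.compile("(a+)+b");

    // Atomic group (?>...): once a+ has consumed the run of 'a' characters,
    // the engine will not re-enter the group to try shorter alternatives.
    static final Pattern ATOMIC = Pattern.compile("(?>a+)+b");

    static boolean matchesNested(String input) {
        return NESTED.matcher(input).matches();
    }

    static boolean matchesAtomic(String input) {
        return ATOMIC.matcher(input).matches();
    }

    // Hypothetical helper: builds "aaa...a" followed by the given suffix.
    static String runOfAs(int count, String suffix) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append('a');
        }
        return sb.append(suffix).toString();
    }

    public static void main(String[] args) {
        // Kept short on purpose so the nested pattern finishes quickly.
        String nearMiss = runOfAs(18, "c");

        long t0 = System.nanoTime();
        boolean nested = matchesNested(nearMiss);
        long t1 = System.nanoTime();
        boolean atomic = matchesAtomic(nearMiss);
        long t2 = System.nanoTime();

        System.out.println("nested: " + nested + " in " + (t1 - t0) + " ns");
        System.out.println("atomic: " + atomic + " in " + (t2 - t1) + " ns");
    }
}
```

Both patterns accept the same strings; the atomic version simply gives the engine fewer ways to retry after a failure.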

So by default I would use the java.util.regex engine, unless you have existing Perl-compatible regexps that you need to reuse in your program.

Then I would carefully construct the regexp I intend to use.

But in terms of choosing one engine or another, as has been said in many other questions:

  • "make it work, make it fast - in that order"
  • beware of "premature optimization".
VonC answered Sep 18 '22 02:09


As VonC says, you need to know your regexps. It also helps to compile the regexes beforehand; otherwise, the cost of compiling a regex on every use can hurt performance badly.
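A minimal sketch of compiling once and pulling out subparts with java.util.regex might look like the following. The pattern (a simplified `void name(` declaration) and the names `MethodNameScanner` and `findVoidMethodNames` are hypothetical, chosen only to show the shape of the code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MethodNameScanner {
    // Compiled once and reused for every search. Pattern objects are
    // immutable and safe to share; each search gets its own Matcher.
    // Hypothetical, simplified pattern: "void", a name, then "(".
    private static final Pattern VOID_METHOD =
            Pattern.compile("\\bvoid\\s+(\\w+)\\s*\\(");

    // Returns the captured method names (group 1) in order of appearance.
    static List<String> findVoidMethodNames(CharSequence source) {
        List<String> names = new ArrayList<>();
        Matcher m = VOID_METHOD.matcher(source);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String source = "class A { void run() {} public void stop (int code) {} }";
        System.out.println(findVoidMethodNames(source)); // prints [run, stop]
    }
}
```

Because `matcher` accepts any `CharSequence`, the same compiled pattern can be applied to an editor buffer without first copying it into a `String`, assuming the buffer implements that interface.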

For some categories of patterns there are alternative libraries, such as http://jint.sourceforge.net/jint.html, which might perform better. Then again, it depends on which version of Java you're using.

As of JDK 1.6, the built-in regex engine is mature, combining good features with good performance.

anjanb answered Sep 21 '22 02:09