
How can I make a pattern search faster?

Tags: java, regex

I am working with an incrementally growing file of about 1 GB, and I want to search it for a particular pattern. Currently I am using Java regular expressions. Do you have any idea how I can do this faster?
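For reference, a regex-only baseline for this kind of file might look roughly like the sketch below. It compiles the pattern once and, on each call, scans only the bytes appended since the previous call. The class name `IncrementalScanner`, the method `scanNew`, and the 8 MB chunk size are illustrative assumptions, not something stated in the question; the sketch also assumes UTF-8 text with `\n` line endings and lines shorter than one chunk.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: remembers how far the file has been scanned.
final class IncrementalScanner {
    // Assumed chunk size; every line must fit inside one chunk.
    private static final int CHUNK_BYTES = 8 * 1024 * 1024;

    private final Pattern pattern; // compiled once, reused for every line
    private long offset = 0;       // byte position of the first unscanned byte

    IncrementalScanner(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // Returns the complete lines appended since the last call that match the pattern.
    List<String> scanNew(Path file) throws IOException {
        List<String> hits = new ArrayList<>();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(CHUNK_BYTES);
            while (true) {
                buf.clear();
                int total = 0;
                while (buf.hasRemaining()) {
                    int n = ch.read(buf, offset + total);
                    if (n <= 0) break; // end of file reached
                    total += n;
                }
                // Only process up to the last newline; a trailing partial
                // line is left for the next call.
                byte[] bytes = buf.array();
                int end = total;
                while (end > 0 && bytes[end - 1] != '\n') end--;
                if (end == 0) break; // no complete line available

                String text = new String(bytes, 0, end, StandardCharsets.UTF_8);
                for (String line : text.split("\n")) {
                    if (pattern.matcher(line).find()) hits.add(line);
                }
                offset += end;
            }
        }
        return hits;
    }
}
```

This avoids rescanning old data, but every new byte is still run through the regex engine once, so it only helps when the cost is dominated by repeated full scans.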

Kamahire asked Oct 21 '10 15:10




1 Answer

Sounds like a job for Apache Lucene.

You will probably have to rethink your search strategy, but this library is made for this kind of task, including adding to an index incrementally.

It works by building reverse (inverted) indexes of your data (documents, in Lucene parlance) and then quickly checking those indexes to find which documents contain parts of your pattern.

You can store metadata alongside the document indexes, so in the majority of use cases you might not have to consult the big file at all.
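To illustrate the idea, here is a minimal sketch of an inverted index with per-document metadata in plain Java. It is not Lucene's API: `TinyInvertedIndex`, `DocInfo`, `add`, and `search` are hypothetical names, the tokenizer is a naive split on non-alphanumeric characters, and the stored metadata (a byte offset into the big file plus a short summary) is just one assumption about what might be worth keeping.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: maps each term to the ids of the documents containing it.
final class TinyInvertedIndex {
    // Metadata kept per document so a hit can be answered without the big file.
    static final class DocInfo {
        final long fileOffset; // where the record starts in the big file (assumed)
        final String summary;  // first 40 characters of the record

        DocInfo(long fileOffset, String summary) {
            this.fileOffset = fileOffset;
            this.summary = summary;
        }
    }

    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<DocInfo> docs = new ArrayList<>();

    // Indexes one document incrementally and returns its id.
    int add(String text, long fileOffset) {
        int docId = docs.size();
        String summary = text.length() > 40 ? text.substring(0, 40) : text;
        docs.add(new DocInfo(fileOffset, summary));
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Returns metadata for the documents containing every term of the query.
    List<DocInfo> search(String query) {
        Set<Integer> result = null;
        for (String term : tokenize(query)) {
            Set<Integer> ids = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(ids);
            } else {
                result.retainAll(ids); // intersect posting lists
            }
        }
        List<DocInfo> out = new ArrayList<>();
        if (result != null) {
            for (int id : result) out.add(docs.get(id));
        }
        return out;
    }

    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) terms.add(t);
        }
        return terms;
    }
}
```

A lookup touches only the posting lists for the query terms, so its cost depends on the number of matching documents rather than on the size of the file. A full regex could then be run only on the candidate records found this way, if exact pattern matching is still needed.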

Peter Tillemans answered Oct 04 '22 09:10