Regular expression matching algorithm in Java

Tags: java, regex, algorithm

This article says that regexp matching in Java is slow because regexps with back references cannot be matched efficiently. The article explains Thompson's efficient NFA-based matching algorithm (invented in 1968), which works for regexps without back references. However, the Pattern javadoc says Java regexps use an NFA-based approach.

Now I wonder how efficient Java regexp matching is and what algorithm it uses.
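To make the two points above concrete, here is a minimal sketch using only java.util.regex. It shows a back reference in use, and a pattern shape whose matching time tends to grow quickly with n if the engine backtracks. The class name, method names and the specific patterns are illustrative assumptions, not taken from the article, and the timings will vary by machine and JDK.

```java
import java.util.regex.Pattern;

class RegexBehaviourSketch {
    // Back reference: \1 must repeat exactly what group 1 captured,
    // so this finds a word immediately followed by itself.
    static boolean hasDoubledWord(String input) {
        Pattern doubled = Pattern.compile("\\b(\\w+) \\1\\b");
        return doubled.matcher(input).find();
    }

    // Hypothetical helper: s repeated n times.
    static String repeat(String s, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(s);
        }
        return sb.toString();
    }

    // Builds n optional 'a's followed by n required 'a's and matches it
    // against exactly n 'a's. The match succeeds, but a backtracking
    // matcher may try many combinations before finding it.
    static boolean matchesPathological(int n) {
        String regex = repeat("a?", n) + repeat("a", n);
        return Pattern.matches(regex, repeat("a", n));
    }

    public static void main(String[] args) {
        System.out.println(hasDoubledWord("this is is a test"));
        for (int n = 10; n <= 20; n += 5) {
            long start = System.nanoTime();
            boolean matched = matchesPathological(n);
            long micros = (System.nanoTime() - start) / 1000;
            System.out.println("n=" + n + " matched=" + matched + " time(us)=" + micros);
        }
    }
}
```

If the printed times grow much faster than n, that is consistent with a backtracking implementation rather than a Thompson-style simulation, though a timing run like this is only a hint and not proof of which algorithm is used.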


asked Oct 08 '13 15:10

Michael

1 Answer

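java.util.regex.Pattern uses the Boyer–Moore string search algorithm when the root node of the compiled pattern is a Slice (a sequence of literal characters), as this excerpt from its source shows. Other patterns go through the Start/StartS nodes instead.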

/* Attempts to match a slice in the input using the Boyer-Moore string
 * matching algorithm. The algorithm is based on the idea that the
 * pattern can be shifted farther ahead in the search text if it is
 * matched right to left.
 */

private void compile() {
    // ... (lines elided in the original answer)

    if (matchRoot instanceof Slice) {
        root = BnM.optimize(matchRoot);
        if (root == matchRoot) {
            root = hasSupplementary ? new StartS(matchRoot) : new Start(matchRoot);
        }
    } else if (matchRoot instanceof Begin || matchRoot instanceof First) {
        root = matchRoot;
    } else {
        root = hasSupplementary ? new StartS(matchRoot) : new Start(matchRoot);
    }
}
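To illustrate the idea in the quoted comment (compare right to left, then shift the pattern farther ahead), here is a minimal sketch of the Boyer–Moore–Horspool variant, which keeps only the bad-character shift table. It is a simplified stand-in and not the JDK's BnM implementation; the class and method names are hypothetical.

```java
import java.util.Arrays;

class HorspoolSketch {
    // Returns the index of the first occurrence of pattern in text, or -1.
    static int indexOf(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        if (m == 0) return 0;
        if (m > n) return -1;

        // shift[c] = how far the window may move when the text character
        // aligned with the pattern's last position is c. Characters not in
        // the pattern (ignoring its last position) allow a full shift of m.
        // One entry per 16-bit char keeps the sketch simple, not compact.
        int[] shift = new int[Character.MAX_VALUE + 1];
        Arrays.fill(shift, m);
        for (int i = 0; i < m - 1; i++) {
            shift[pattern.charAt(i)] = m - 1 - i;
        }

        int pos = 0;
        while (pos <= n - m) {
            // Compare the window against the pattern from right to left.
            int j = m - 1;
            while (j >= 0 && text.charAt(pos + j) == pattern.charAt(j)) {
                j--;
            }
            if (j < 0) {
                return pos; // every character matched
            }
            // Skip ahead based on the last character of the current window.
            pos += shift[text.charAt(pos + m - 1)];
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "here is a simple example";
        System.out.println(indexOf(text, "example"));
        System.out.println(text.indexOf("example")); // same result expected
    }
}
```

The benefit comes from the shift step: when the last character of the window does not occur in the pattern, the search can jump a whole pattern length without inspecting the characters in between.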

answered Nov 10 '22 00:11

Prabhakaran Ramaswamy
