Optimization techniques used by std::regex_constants::optimize

Tags: c++, regex, optimization, c++11

I am working with std::regex, and while reading about the constants defined in std::regex_constants I came across std::regex_constants::optimize. From its description it sounds useful for my application: I only need one instance of the regex, initialized at the beginning, but it is used many times throughout the loading process.

According to the working paper n3126 (pg. 1077), std::regex_constants::optimize:

Specifies that the regular expression engine should pay more attention to the speed with which regular expressions are matched, and less to the speed with which regular expression objects are constructed. Otherwise it has no detectable effect on the program output.
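For concreteness, here is a minimal sketch of the usage pattern in question: one regex object built up front with the optimize flag and then reused for every match. The function name count_matching_lines and the key=value pattern are hypothetical, chosen only for illustration. Whether the flag changes anything measurable depends on the library implementation.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: counts the lines that look like "key=123".
// The regex is constructed once (on first call) and reused afterwards,
// which is the situation the optimize flag is aimed at.
std::size_t count_matching_lines(const std::vector<std::string>& lines) {
    // ECMAScript is the default grammar; optimize is OR-ed in as a hint
    // that matching speed matters more than construction speed.
    static const std::regex pattern(
        R"((\w+)=(\d+))",
        std::regex_constants::ECMAScript | std::regex_constants::optimize);

    std::size_t count = 0;
    for (const auto& line : lines) {
        // regex_match requires the whole line to match the pattern.
        if (std::regex_match(line, pattern)) {
            ++count;
        }
    }
    return count;
}
```

Since the flag has "no detectable effect on the program output", adding or removing it should never change which lines match, only (possibly) how long construction and matching take.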

I was curious as to what type of optimization would be performed, but there doesn't seem to be much literature about it (indeed, it seems to be implementation-defined). One of the only things I found was at cppreference.com, which states that std::regex_constants::optimize:

Instructs the regular expression engine to make matching faster, with the potential cost of making construction slower. For example, this might mean converting a non-deterministic FSA to a deterministic FSA.

However, I have no formal background in computer science. I'm aware of the basics of what an FSA is, and I understand the basic difference between a deterministic FSA (for each state and input symbol there is exactly one next state) and a non-deterministic FSA (a state may have several possible next states for the same input), but I do not understand how converting one into the other improves matching time. Also, I would be interested to know if there are any other optimizations in various C++ Standard Library implementations.
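To make the NFA/DFA distinction concrete, here is a small hand-built sketch for the pattern (a|b)*abb. It only illustrates the general automata idea; it is not how any particular standard library implements optimize. The function names and state numbering are assumptions made for this example. The NFA matcher has to track a set of possible states and loop over it for every input character, while the DFA matcher (whose states are the reachable sets, precomputed) does a single table lookup per character.

```cpp
#include <string>

// NFA for (a|b)*abb with states 0..3 (3 is accepting), no epsilon moves.
// Each table entry is a bitmask of possible next states.
bool nfa_match(const std::string& input) {
    static const unsigned delta[4][2] = {
        {0x3, 0x1},  // state 0: 'a' -> {0,1}, 'b' -> {0}
        {0x0, 0x4},  // state 1: 'b' -> {2}
        {0x0, 0x8},  // state 2: 'b' -> {3}
        {0x0, 0x0},  // state 3: no outgoing transitions
    };
    unsigned current = 0x1;  // start in the set {0}
    for (char c : input) {
        if (c != 'a' && c != 'b') return false;
        const int symbol = (c == 'b') ? 1 : 0;
        unsigned next = 0;
        // Work per character grows with the number of NFA states.
        for (int s = 0; s < 4; ++s) {
            if (current & (1u << s)) next |= delta[s][symbol];
        }
        current = next;
    }
    return (current & 0x8) != 0;  // is accepting state 3 in the set?
}

// Equivalent DFA: its states are the reachable NFA state sets,
// 0 = {0}, 1 = {0,1}, 2 = {0,2}, 3 = {0,3} (accepting).
bool dfa_match(const std::string& input) {
    static const int delta[4][2] = {
        {1, 0},  // from {0}
        {1, 2},  // from {0,1}
        {1, 3},  // from {0,2}
        {1, 0},  // from {0,3}
    };
    int state = 0;
    for (char c : input) {
        if (c != 'a' && c != 'b') return false;
        // Exactly one table lookup per character.
        state = delta[state][(c == 'b') ? 1 : 0];
    }
    return state == 3;
}
```

Both functions accept the same strings. The trade-off is that building the DFA table costs time and memory up front (in the worst case the number of DFA states is exponential in the number of NFA states), which matches the flag's description of slower construction in exchange for faster matching.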

asked Jul 21 '12 13:07

Thomas Russell

2 Answers

There's some useful information on the topic of regex engines and performance trade-offs (far more than can fit in a Stack Overflow answer) in Mastering Regular Expressions by Jeffrey Friedl.

It's worth noting that Boost.Regex, which was the source for N3126, documents optimize as "This currently has no effect for Boost.Regex."

P.S.

indeed, it seems to be implementation-defined

No, it's unspecified. Implementation-defined means an implementation is required to document its choice of behaviour. Implementations are not required to document how their regex engines are implemented or what (if any) difference the optimize flag makes.

P.S. 2

in various STL implementations

std::regex is not part of the STL; the C++ Standard Library is not the same thing as the STL.

answered Oct 19 '22 05:10

Jonathan Wakely

See http://swtch.com/~rsc/regexp/regexp1.html for a nice explanation of how automaton-based regex implementations (a Thompson NFA simulation, or a DFA built from it) can avoid the exponential backtracking that occurs in backtracking matchers in certain circumstances.
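As a rough illustration of the blow-up that article describes, the sketch below hand-codes a backtracking match of the pattern a?{n}a{n} (n optional a's followed by n mandatory a's) against a string of n a's, and counts the recursive calls. The helper names backtrack and count_steps are hypothetical, and this is a toy model of a backtracking engine, not the code of any real regex library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: tries to match `optional_left` copies of a? followed by
// `required` copies of a against text[pos..], counting every attempt in `steps`.
bool backtrack(const std::string& text, std::size_t pos,
               std::size_t optional_left, std::size_t required,
               unsigned long long& steps) {
    ++steps;
    if (optional_left == 0) {
        // Only the mandatory a's remain: they must consume the rest exactly.
        if (text.size() - pos != required) return false;
        for (std::size_t i = pos; i < text.size(); ++i) {
            if (text[i] != 'a') return false;
        }
        return true;
    }
    // Greedy choice first: let this a? consume one 'a' ...
    if (pos < text.size() && text[pos] == 'a' &&
        backtrack(text, pos + 1, optional_left - 1, required, steps)) {
        return true;
    }
    // ... and on failure, backtrack and let it match the empty string.
    return backtrack(text, pos, optional_left - 1, required, steps);
}

// Number of attempts needed to match a?{n}a{n} against n copies of 'a'.
unsigned long long count_steps(std::size_t n) {
    unsigned long long steps = 0;
    backtrack(std::string(n, 'a'), 0, n, n, steps);
    return steps;
}
```

In this model the only successful assignment is the one where every a? matches the empty string, and the greedy search tries that last, so the step count roughly doubles each time n grows by one. A matcher that tracks the set of reachable automaton states instead processes each input character once and does not show this growth.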

answered Oct 19 '22 05:10

JohannesD
