Is there a Python equivalent for Perl's `study`?

Question

From Perl's documentation:

study takes extra time to study SCALAR ($_ if unspecified) in anticipation of doing many pattern matches on the string before it is next modified. This may or may not save time, depending on the nature and number of patterns you are searching and the distribution of character frequencies in the string to be searched;

I'm trying to speed up some regular expression-driven parsing that I'm doing in Python, and I remembered this trick from Perl. I realize I'll have to benchmark to determine if there is a speedup, but I can't find an equivalent method in Python.

tchrist · Accepted Answer

Perl’s study doesn’t really do much anymore. The regex compiler has gotten a whole, whole lot smarter than it was when study was created.

For example, it compiles alternatives into a trie structure with Aho–Corasick prediction.

Run with perl -Mre=debug to see the sorts of cleverness the regex compiler and execution engine apply.

Danica · Answer

As far as I know there's nothing like this built into Python. But according to the perldoc:

The way study works is this: a linked list of every character in the string to be searched is made, so we know, for example, where all the 'k' characters are. From each search string, the rarest character is selected, based on some static frequency tables constructed from some C programs and English text. Only those places that contain this "rarest" character are examined.

This doesn't sound very sophisticated, and you could probably hack together something equivalent yourself.
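To make the "hack together something equivalent" idea concrete, here is a minimal sketch of the same trick for literal search strings. The names `build_char_index` and `find_all` are made up for illustration, and it picks the rarest character by counting occurrences in the actual text rather than using static frequency tables as the perldoc describes, so treat it as an approximation of the idea and not as what Perl does internally.

```python
from collections import defaultdict


def build_char_index(text):
    """Map each character to the ascending list of positions where it occurs."""
    index = defaultdict(list)
    for pos, ch in enumerate(text):
        index[ch].append(pos)
    return index


def find_all(text, index, needle):
    """Return the start offsets of every occurrence of the literal `needle`."""
    if not needle:
        return []
    # Choose the needle character that is rarest in this text.
    # (Assumption: real counts stand in for Perl's static frequency tables.)
    offset, rare = min(enumerate(needle), key=lambda p: len(index.get(p[1], ())))
    hits = []
    # Only positions holding the rare character are examined.
    for pos in index.get(rare, ()):
        start = pos - offset
        if start >= 0 and text.startswith(needle, start):
            hits.append(start)
    return hits


text = "the quick brown fox jumps over the lazy dog"
index = build_char_index(text)
print(find_all(text, index, "the"))
```

Building the index costs a full pass over the string plus extra memory, so this only has a chance of paying off when the same string is searched many times, and it would still need benchmarking against plain `str.find` or `re`.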

esmre is kind of vaguely similar. And as @Frg noted, you'll want to use re.compile if you're reusing a single regex (to avoid re-parsing the regex itself over and over).
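For the re.compile part, a small sketch of compiling once and reusing the pattern object might look like this. The key=value pattern and the `parse_pairs` helper are hypothetical, chosen only to have something to match.

```python
import re

# Hypothetical pattern for illustration: key=value pairs.
# Compiled once at import time and reused for every line.
PAIR = re.compile(r"(\w+)=(\w+)")


def parse_pairs(lines):
    """Yield (key, value) tuples found in each line using the precompiled pattern."""
    for line in lines:
        for match in PAIR.finditer(line):
            yield match.group(1), match.group(2)


print(list(parse_pairs(["a=1 b=2", "no pairs here", "c=3"])))
```

Note that the `re` module also keeps an internal cache of recently compiled patterns, so the gain from compiling explicitly may be modest; it is worth measuring.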

Or you could use suffix trees (here's one implementation, or here's a C extension with Unicode support) or suffix arrays (implementation).
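To show roughly what the suffix array option involves, here is a naive pure-Python sketch. It is not taken from any of the implementations mentioned above; the function names are made up, and the construction is quadratic or worse in time and memory, so it is only meant to illustrate the idea on small inputs.

```python
def build_suffix_array(text):
    """Return suffix start offsets, ordered by the suffix text (naive construction)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, needle):
    """Return sorted start offsets of `needle`, found by binary search over `sa`."""
    n = len(needle)

    # Lower bound: first suffix whose n-character prefix is >= needle.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + n] < needle:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose n-character prefix is > needle.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + n] <= needle:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[first:lo])


text = "banana"
sa = build_suffix_array(text)
print(find_occurrences(text, sa, "ana"))
```

Like the character index, the array is built once per string and then reused for many literal lookups. It does not help with general regular expressions.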

Tags: python, regex, optimization, perl

bonsaiviking
