
Method regex.scanner() cannot be found in the Python 3.5.1 documentation, but the Interpreter works well

I am learning Python with the Python Cookbook, 3rd edition. Page 67 has sample code like this:

import re
NAME = r'(?P<NAME>[a-zA-Z_][a-zA-Z_0-9]*)'
NUM = r'(?P<NUM>\d+)'
PLUS = r'(?P<PLUS>\+)'
TIMES = r'(?P<TIMES>\*)'
EQ = r'(?P<EQ>=)'
WS = r'(?P<WS>\s+)'    
master_pat = re.compile('|'.join([NAME, NUM, PLUS, TIMES, EQ, WS]))
scanner = master_pat.scanner('foo = 42')
scanner.match()
 ......

I tried to find the signature of the method regex.scanner() in the Python standard documentation, but failed: there is nothing about regex.scanner(). On the other hand, the sample code runs quite well in the interpreter. Does anyone know what the situation is? Or is this just a common case of CPython lacking signature details?

ZuoHe Erya asked May 06 '16 15:05



1 Answer

It's a hidden gem :-)

This is where things get interesting. For the last 15 years or so, there has been a completely undocumented feature in the regular expression engine: the scanner. The scanner is a property of the underlying SRE pattern object where the engine, after it has found a match, keeps matching for the next one. There even exists an re.Scanner class (also undocumented), built on top of the SRE pattern scanner, which gives this a slightly higher-level interface.
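As a rough illustration of the pattern-level scanner, here is a minimal sketch that turns the question's `master_pat` into a token generator. The `Token` tuple and the `generate_tokens` helper are names made up for this example, and since `scanner()` is undocumented, its behaviour is an assumption based on how CPython currently works rather than a guaranteed API.

```python
import re
from collections import namedtuple

# Hypothetical helper type for this sketch; not part of the re module.
Token = namedtuple("Token", ["type", "value"])

NAME = r'(?P<NAME>[a-zA-Z_][a-zA-Z_0-9]*)'
NUM = r'(?P<NUM>\d+)'
PLUS = r'(?P<PLUS>\+)'
TIMES = r'(?P<TIMES>\*)'
EQ = r'(?P<EQ>=)'
WS = r'(?P<WS>\s+)'
master_pat = re.compile('|'.join([NAME, NUM, PLUS, TIMES, EQ, WS]))

def generate_tokens(pat, text):
    # pat.scanner() is undocumented; each scanner.match() call tries to
    # match at the position where the previous match ended and returns
    # None once nothing matches there.
    scanner = pat.scanner(text)
    for m in iter(scanner.match, None):
        # m.lastgroup is the name of the alternative that matched.
        yield Token(m.lastgroup, m.group())

# Drop whitespace tokens and keep the rest.
tokens = [t for t in generate_tokens(master_pat, 'foo = 42') if t.type != 'WS']
print(tokens)
```

Each call to `scanner.match()` picks up where the previous one stopped, so the loop ends at the end of the input or at the first character no alternative can match.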

Unfortunately, the scanner as it exists in the re module is not very useful for making the 'not matching' part faster, but looking at its source code reveals how it's implemented: on top of the SRE primitives.
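For comparison, the higher-level `re.Scanner` class can be sketched like this. It is also undocumented, so treat the details as an assumption drawn from the CPython source: it takes a list of (pattern, action) pairs, calls each action with the scanner and the matched text, and skips matches whose action is `None`. The token names are just the ones from the question.

```python
import re

# Each entry is (pattern, action); an action of None discards the match.
scanner = re.Scanner([
    (r'[a-zA-Z_][a-zA-Z_0-9]*', lambda s, tok: ('NAME', tok)),
    (r'\d+', lambda s, tok: ('NUM', tok)),
    (r'\+', lambda s, tok: ('PLUS', tok)),
    (r'\*', lambda s, tok: ('TIMES', tok)),
    (r'=', lambda s, tok: ('EQ', tok)),
    (r'\s+', None),
])

# scan() returns the collected results plus whatever input was left over.
tokens, remainder = scanner.scan('foo = 42')
print(tokens, repr(remainder))

# On input it cannot match, scanning simply stops and the rest is returned
# as the remainder; it does not skip ahead to the next matchable position.
partial, rest = scanner.scan('foo ? 42')
print(partial, repr(rest))
```

The second call shows the limitation mentioned above: the scanner gives up at the first unmatched character instead of searching forward.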

totoro answered Nov 15 '22 00:11