
Python RegEx Meaning

Tags: python, regex

I'm new to Python regular expressions and was wondering if someone could walk me through what this pattern means. I've written what I think each part means below.

Thanks!

RegExp:
r'(^.*def\W*)(\w+)\W*\((.*)\):'

r'...' = Python definition of a regular expression within the ''
(...) = a regex term
(^. = match the beginning of any character
*def\W* = ???
(\w+) = match any of [a, z] 1 or more times
\W*\ = ? I think it's the same as the line above, but 0 or more times instead of 1 or more. Since it matches the def\W part above (which I don't really understand), I'm not sure.
((.*)\): = match any additional characters within brackets ()


Ben Nelson asked Oct 31 '11 18:10


1 Answer

It seems like a failed attempt to match a Python function signature:

import re

regex = re.compile(r""" # r'' marks a raw string: \n and the like stay two chars,
                        # '\\','n', rather than a single newline character

    ( # begin capturing group #1; you can get it: regex.match(text).group(1)
      ^   # match beginning of the string, or of each line if re.MULTILINE is set
      .*  # match zero or more characters except newline (unless
          # re.DOTALL is set)
      def # match string 'def'
      \W* # match zero or more non-\w chars, i.e. [^a-zA-Z0-9_] unless
          # re.LOCALE or re.UNICODE is set (Python 3 str patterns are
          # Unicode-aware by default; re.ASCII restricts \w to ASCII)
    ) # end capturing group #1

    (\w+) # capturing group #2: one or more word chars, [a-zA-Z0-9_]
          # unless one of the flags above is set

    \W*   # see above

    \(    # match literal paren '('
      (.*)  # capturing group #3. NOTE: `*` is greedy and `.` matches even
            # ')', so re.match(r'\((.*)\)', '(a)(b)').group(1) == 'a)(b'
    \)    # match literal paren ')'
     :    # match literal ':'
    """, re.VERBOSE|re.DEBUG)

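To see why the pattern falls short as a signature matcher, here is a minimal sketch that runs it on a few sample lines. The sample strings and the `stricter` pattern are made up for illustration and are not part of the original question or answer; `stricter` is only one possible tightening, not a full parser for Python signatures.

```python
import re

# The pattern from the question, compiled without re.DEBUG.
pattern = re.compile(r'(^.*def\W*)(\w+)\W*\((.*)\):')

# A plain signature: prefix, function name, argument list.
plain = pattern.match("def add(a, b):").groups()
print(plain)    # ('def ', 'add', 'a, b')

# Pitfall 1: `^.*def` accepts any text that merely ends in "def".
loose = pattern.match("undef foo(x):").groups()
print(loose)    # ('undef ', 'foo', 'x')

# Pitfall 2: the greedy `(.*)` runs to the last "):" on the line.
greedy = pattern.match("def f(a): return {g(a): 1}").groups()
print(greedy)   # ('def ', 'f', 'a): return {g(a')

# Hypothetical tighter variant (an assumption, not from the answer):
# require "def" as the first word and make the argument group lazy.
stricter = re.compile(r'^\s*def\s+(\w+)\s*\((.*?)\)\s*:')
print(stricter.match("undef foo(x):"))                        # None
print(stricter.match("def f(a): return {g(a): 1}").groups())  # ('f', 'a')
```

The lazy `(.*?)` stops at the first `)` that is followed by a colon, which is why the second pitfall disappears in the tighter variant.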
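With the re.DEBUG flag set, re.compile prints the parsed pattern. The output below is in the format of the Python version used at the time; newer versions print a similar tree in a different notation: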

subpattern 1
  at at_beginning
  max_repeat 0 65535
    any None
  literal 100
  literal 101
  literal 102
  max_repeat 0 65535
    in
      category category_not_word
subpattern 2
  max_repeat 1 65535
    in
      category category_word
max_repeat 0 65535
  in
    category category_not_word
literal 40
subpattern 3
  max_repeat 0 65535
    any None
literal 41
literal 58
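
The numbers after "literal" in this output are character codes. A short sketch to decode them, with the list of codes copied by hand from the output above:

```python
# Character codes taken from the "literal" lines of the debug output.
codes = [100, 101, 102, 40, 41, 58]

# chr() turns each code point back into its character.
decoded = [chr(c) for c in codes]
print(decoded)  # ['d', 'e', 'f', '(', ')', ':']
```

So the three literals inside subpattern 1 spell 'def', and the trailing ones are the '(', ')' and ':' of the signature. Likewise, "max_repeat 0 65535" corresponds to `*` and "max_repeat 1 65535" to `+`, with 65535 standing in for "no upper limit" in this output.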


jfs answered Oct 14 '22 04:10