I'm trying to parse the following string:
constructor: function(some, parameters, here) {
With the following regex:
re.search("(\w*):\s*function\((?:(\w*)(?:,\s)*)*\)", line).groups()
And I'm getting:
('constructor', '')
But I was expecting something more like:
('constructor', 'some', 'parameters', 'here')
What am I missing?
If you change your pattern to:
print(re.search(r"(\w*):\s*function\((?:(\w+)(?:,\s)?)*\)", line).groups())
You'll get:
('constructor', 'here')
This is because (from docs):
If a group is contained in a part of the pattern that matched multiple times, the last match is returned.
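To see that behaviour in isolation, here is a small sketch using only the stock re module. The variable names (last_only, simple) are just illustrative; the pattern and input line are the ones from the question.

```python
import re

line = "constructor: function(some, parameters, here) {"

# Group 2 sits inside a repeated (?:...)* group, so every iteration
# overwrites the previous capture and only the final one survives.
m = re.search(r"(\w*):\s*function\((?:(\w+)(?:,\s)?)*\)", line)
last_only = m.groups()

# A smaller case: (\w) matches three times, but only the last letter is kept.
simple = re.match(r"(?:(\w))+", "abc").group(1)

print(last_only)  # ('constructor', 'here')
print(simple)     # c
```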
I don't know of a way to do this in one step with re. The alternative, of course, is to do it in two steps, something like:
import re

def parse_line(line):
    # First grab the name and the raw argument list, then pull out each argument.
    cons, args = re.search(r'(\w*):\s*function\((.*)\)', line).groups()
    mats = re.findall(r'(\w+)(?:,\s*)?', args)
    return [cons] + mats

print(parse_line(line))  # ['constructor', 'some', 'parameters', 'here']
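If the input may contain lines that don't match at all, or functions with no parameters, a slightly more defensive variant of the same two-step idea might look like the sketch below. The name parse_signature is made up for illustration, and using [^)]* instead of .* assumes the parameter list itself contains no parentheses.

```python
import re

def parse_signature(line):
    """Return [name, arg1, arg2, ...], or None if the line does not match."""
    # [^)]* stops at the first closing parenthesis instead of the last one.
    m = re.search(r"(\w+):\s*function\(([^)]*)\)", line)
    if m is None:
        return None
    name, args = m.groups()
    # Split on commas and drop empty pieces, so "function()" yields no arguments.
    params = [p.strip() for p in args.split(",") if p.strip()]
    return [name] + params

print(parse_signature("constructor: function(some, parameters, here) {"))
# ['constructor', 'some', 'parameters', 'here']
```

Returning None on a non-matching line avoids the AttributeError you would otherwise get from calling .groups() on a failed search.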
One option is to use the more advanced third-party regex module instead of the stock re. Among other nice things, it supports captures, which, unlike groups, save every matching substring:
>>> line = "constructor: function(some, parameters, here) {"
>>> import regex
>>> regex.search(r"(\w*):\s*function\((?:(\w+)(?:,\s)*)*\)", line).captures(2)
['some', 'parameters', 'here']