I wonder if anyone can provide me with the regular expressions needed to parse a string like:
'foo bar "multiple word tag"'
into an array of tags like:
["foo","bar","multiple word tag"]
Thanks
In Ruby
scan(/\"([\w ]+)\"|(\w+)/).flatten.compact
E.g.
"foo bar \"multiple words\" party_like_1999".scan(/\"([\w ]+)\"|(\w+)/).flatten.compact
=> ["foo", "bar", "multiple words", "party_like_1999"]
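For comparison, here is a rough Python sketch of the same alternation using re.findall. The names TAG_PATTERN and parse_tags are made up for illustration. Like the Ruby pattern, it assumes quoted tags contain only word characters and spaces.

```python
import re

# Same alternation as the Ruby pattern: a quoted run of word characters
# and spaces, or a bare word. Only one of the two groups matches per hit.
TAG_PATTERN = re.compile(r'"([\w ]+)"|(\w+)')

def parse_tags(text):
    """Return the tags found in text (parse_tags is a hypothetical helper)."""
    # findall yields (quoted, bare) tuples; the group that did not match is ''.
    return [quoted or bare for quoted, bare in TAG_PATTERN.findall(text)]

print(parse_tags('foo bar "multiple words" party_like_1999'))
# ['foo', 'bar', 'multiple words', 'party_like_1999']
```

The `quoted or bare` expression plays the role of Ruby's flatten.compact: it keeps whichever capture group actually matched.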
You could implement a scanner to do this. For instance, in Python it'd look something like this, using re.Scanner (an undocumented class in the standard re module):
import re
scanner = re.Scanner([
    (r"[a-zA-Z_]\w*", lambda s, t: t),    # regular tag
    (r"\".*?\"", lambda s, t: t[1:-1]),   # multi-word tag, quotes stripped
    (r"\s+", None),                       # whitespace not in multi-word tag
])
# scan() returns the token list and any trailing text it could not match
tags, _ = scanner.scan('foo bar "multiple word tag"')
print(tags)
# ['foo', 'bar', 'multiple word tag']
This is called lexical analysis.
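If relying on an undocumented class is a concern, the same lexing idea can be sketched with only documented re calls. This is a minimal illustration, not a definitive implementation: the names TOKEN_RE and tokenize and the group names are hypothetical, and it assumes that any unrecognized input (such as an unterminated quote) should be treated as an error.

```python
import re

# Token rules, tried in order at each position (names are illustrative).
TOKEN_RE = re.compile(r'''
    (?P<quoted>"[^"]*")   # multi-word tag wrapped in double quotes
  | (?P<word>\w+)         # single-word tag
  | (?P<space>\s+)        # whitespace between tags, skipped
''', re.VERBOSE)

def tokenize(text):
    """Split text into tags, raising ValueError on unrecognized input."""
    tags = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup == "quoted":
            tags.append(match.group()[1:-1])  # strip the surrounding quotes
        elif match.lastgroup == "word":
            tags.append(match.group())
        pos = match.end()  # continue right after the consumed token
    return tags

print(tokenize('foo bar "multiple word tag"'))
# ['foo', 'bar', 'multiple word tag']
```

Unlike the single-regex approach, this version fails loudly on input it cannot tokenize instead of silently skipping it.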