I am trying to work out a good regular expression for extracting Python comments from a long string. So far I have
regex:
#(.?|\n)*
string:
'### this is a comment\na = \'a string\'.toupper()\nprint a\n\na_var_name = " ${an.injection} "\nanother_var = " ${bn.injection} "\ndtabse_conn = " ${cn.injection} "\n\ndef do_something()\n # this call outputs an xml stream of the current parameter dictionary.\n paramtertools.print_header(params)\n\nfor i in xrange(256): # wow another comment\n print i**2\n\n'
I feel like there is a much better way to get all of the individual comments from the string, but I am not an expert in regular expressions. Does anyone have a better solution?
Each comment is captured in group 1 of the following pattern:
(#+[^\\\n]*)
Sample code:
import re
p = re.compile(ur'(#+[^\\\n]*)')
test_str = u"..."
re.findall(p, test_str)
Matches:
1. ### this is a comment
2. # this call outputs an xml stream of the current parameter dictionary.
3. # wow another comment
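For anyone trying this on Python 3, here is a minimal sketch of the same pattern. The `ur''` prefix above is Python 2 only, so a plain raw string is used instead. The sample strings and variable names below are my own, chosen only to illustrate two limits of the regex approach: it cannot tell a comment from a `#` inside a string literal, and because the character class also excludes backslashes, a comment is cut off at the first backslash.

```python
import re

# Same pattern as above, written as a Python 3 raw string.
pattern = re.compile(r'(#+[^\\\n]*)')

# A shortened version of the string from the question.
source = (
    "### this is a comment\n"
    "a = 'a string'.toupper()\n"
    "for i in xrange(256): # wow another comment\n"
    " print i**2\n"
)

comments = pattern.findall(source)
print(comments)
# ['### this is a comment', '# wow another comment']

# Limitation 1: a '#' inside a string literal is treated as a comment start.
in_string = pattern.findall("url = 'http://example.com/#anchor'  # real comment")
print(in_string)
# ["#anchor'  # real comment"]

# Limitation 2: [^\\\n] also excludes backslashes, so the match stops there.
truncated = pattern.findall("# path C:\\temp")
print(truncated)
# ['# path C:']
```

If the comments never contain backslashes and the code never has `#` inside strings, the pattern is good enough; otherwise a real tokenizer is the safer choice.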
Since the string contains Python code, I'd use the tokenize module to parse it and extract the comments:
import tokenize
import StringIO
text = '### this is a comment\na = \'a string\'.toupper()\nprint a\n\na_var_name = " ${an.injection} "\nanother_var = " ${bn.injection} "\ndtabse_conn = " ${cn.injection} "\n\ndef do_something():\n # this call outputs an xml stream of the current parameter dictionary.\n paramtertools.print_header(params)\n\nfor i in xrange(256): # wow another comment\n print i**2\n\n'
tokens = tokenize.generate_tokens(StringIO.StringIO(text).readline)
for toktype, ttext, (slineno, scol), (elineno, ecol), ltext in tokens:
    if toktype == tokenize.COMMENT:
        print ttext
Prints:
### this is a comment
# this call outputs an xml stream of the current parameter dictionary.
# wow another comment
Note that the code in the question's string has a syntax error: the : after the do_something() function definition is missing.
Also, note that the ast module would not help here, since it does not preserve comments.
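The code above is Python 2. As a rough sketch of the same idea on Python 3, where `StringIO` lives in the `io` module and tokens are named tuples, something like the following should work. The helper name `extract_comments` and the sample string are my own, not part of the original answer. The last lines illustrate the point about `ast`: the comment text does not appear anywhere in the parsed tree.

```python
import ast
import io
import tokenize


def extract_comments(source):
    """Return the text of every COMMENT token found in `source`."""
    readline = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.COMMENT
    ]


# Hypothetical sample; note the '#' inside the string literal on line 2.
sample = (
    "### this is a comment\n"
    "url = 'http://example.com/#anchor'  # a real comment\n"
    "for i in range(3): # wow another comment\n"
    "    print(i ** 2)\n"
)

print(extract_comments(sample))
# ['### this is a comment', '# a real comment', '# wow another comment']

# ast drops comments entirely: 'note' is nowhere in the dumped tree.
dump = ast.dump(ast.parse("x = 1  # note"))
print('note' in dump)
# False
```

Unlike a regex, the tokenizer knows that `#anchor` is part of a string literal, so only the real comments are returned.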