Find all strings that are in between two substrings

Tags:

python

regex

I have the following string as an example:

string = "@@ cat $$ @@dog$^"

I want to extract all the strings that are enclosed between "@@" and "$", so the output will be:

[" cat ","dog"]

I only know how to extract the first occurrence:

import re
r = re.compile('@@(.*?)$')
m = r.search(string)
if m:
   result_str = m.group(1) 

Thoughts and suggestions on how to catch them all are welcome.

Captain_Meow_Meow asked Dec 06 '14 19:12



1 Answer

Use re.findall() to get every occurrence of your substring. $ is a special character in regular expressions: it is the anchor for the end of the string. To match a literal dollar sign you need to escape it as \$.

>>> import re
>>> s = '@@ cat $$ @@dog$^'
>>> re.findall(r'@@(.*?)\$', s)
[' cat ', 'dog']
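
If the delimiters are not fixed, the same idea can be wrapped in a small helper. This is only a sketch: find_between is a hypothetical name, not something from the re module, and it assumes you want the shortest (non-greedy) match between each pair. re.escape() takes care of delimiters such as $ that have a special meaning in a pattern.

```python
import re

def find_between(text, start, end):
    """Return every substring of text found between start and end.

    find_between is a hypothetical helper name used for illustration.
    """
    # re.escape() makes delimiters such as "$" match literally.
    pattern = re.escape(start) + r'(.*?)' + re.escape(end)
    return re.findall(pattern, text)

s = '@@ cat $$ @@dog$^'
print(find_between(s, '@@', '$'))  # [' cat ', 'dog']
```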

To remove the leading and trailing whitespace, you can simply match it outside of the capture group.

>>> re.findall(r'@@\s*(.*?)\s*\$', s)
['cat', 'dog']
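
An alternative, if you would rather keep the pattern simple, is to capture the raw text and strip it afterwards. A minimal sketch using the same sample string:

```python
import re

s = '@@ cat $$ @@dog$^'

# Capture the raw text first, then strip whitespace from each match.
matches = [m.strip() for m in re.findall(r'@@(.*?)\$', s)]
print(matches)  # ['cat', 'dog']
```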

Also, if the content may span newlines, consider a negated character class such as [^$]* instead, because . does not match a newline by default.

>>> re.findall(r'@@\s*([^$]*)\s*\$', s)
['cat ', 'dog']
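
To see the difference, here is a small sketch with a made-up input that contains a newline between the delimiters (the sample string is my own assumption, not from the question). It also shows re.DOTALL as another option, which lets . match newlines while keeping the lazy quantifier.

```python
import re

# Hypothetical sample: the first item spans two lines.
s = '@@ cat\nnap $ @@dog$'

# "." does not match "\n" by default, so the first item is skipped.
default = re.findall(r'@@\s*(.*?)\s*\$', s)

# A negated class matches any character except "$", newlines included.
negated = re.findall(r'@@\s*([^$]*)\s*\$', s)

# re.DOTALL lets "." match newlines too.
dotall = re.findall(r'@@\s*(.*?)\s*\$', s, flags=re.DOTALL)

print(default)  # ['dog']
print(negated)  # ['cat\nnap ', 'dog']
print(dotall)   # ['cat\nnap', 'dog']
```

Note that the negated class is greedy, so it swallows trailing whitespace before the trailing \s* gets a chance to match it; the lazy version with re.DOTALL does not have that side effect.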
hwnd answered Oct 04 '22 04:10
