
Regex to extract between two strings (which are variables)

I want to use a regex to extract the text that occurs between two strings. I know how to do this when the two strings are the same every time (countless questions cover that, e.g. "Regex matching between two strings?"), but I want to do it with variables that change and that may themselves contain characters with special meaning in a regex. I want any such special characters, e.g. *, treated as literal text.

For example, if I had:

text = "<b*>Test</b>"
left_identifier = "<b*>"
right_identifier = "</b>"

I would want to build the regex dynamically so that the following code ends up being run:

re.findall(r'<b\*>(.*)<\/b>', text)

It is the <b\*>(.*)<\/b> part that I don't know how to dynamically create.

kyrenia asked Apr 15 '15 17:04




1 Answer

You can do something like this:

import re
pattern_string = re.escape(left_identifier) + "(.*?)" + re.escape(right_identifier)
pattern = re.compile(pattern_string)
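
As a minimal sketch, the pattern above can be applied to the example from the question. The helper name extract_between is hypothetical and not part of the re module:

```python
import re

def extract_between(text, left_identifier, right_identifier):
    """Return every substring found between the two literal identifiers."""
    # re.escape makes the identifiers match literally, so "*" is not a quantifier
    pattern = re.compile(
        re.escape(left_identifier) + "(.*?)" + re.escape(right_identifier)
    )
    return pattern.findall(text)

text = "<b*>Test</b>"
result = extract_between(text, "<b*>", "</b>")
print(result)  # ['Test']
```

Because both identifiers are passed in as arguments, the same function works for any pair of delimiters without hand-written escaping.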

The re.escape function escapes the regex metacharacters in its argument, so the identifiers are matched as literal text. For example:

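>>> import re
>>> print(re.escape("<b*>"))
<b\*>

(Python versions before 3.7 also escape < and >, printing \<b\*\>. Both forms match the same literal text.)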
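
Note that the pattern in this answer uses the non-greedy group (.*?) rather than the (.*) from the question. The difference only shows up when the identifiers occur more than once. Here is a small sketch, where the sample text and the build_pattern helper are made up for illustration:

```python
import re

left_identifier = "<b*>"
right_identifier = "</b>"
sample = "<b*>one</b> and <b*>two</b>"  # hypothetical input with two matches

def build_pattern(group):
    # Escape both identifiers so they are matched as literal text
    return re.escape(left_identifier) + group + re.escape(right_identifier)

lazy = re.findall(build_pattern("(.*?)"), sample)
greedy = re.findall(build_pattern("(.*)"), sample)

print(lazy)    # ['one', 'two']
print(greedy)  # ['one</b> and <b*>two']
```

By default, . does not match newlines, so pass re.DOTALL as the flags argument if the text between the identifiers can span several lines.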
Alexandru Chirila answered Oct 09 '22 05:10
