
Python regex partial extract

I want to find all data enclosed in double square brackets, [[ ]].

[[aaaaa]] -> aaaaa

My Python code (using the re library) was:

la = re.findall(r'\[\[(.*?)\]\]', fa.read())

What if I want to extract only 'a' from [[a|b]]?

Is there a concise regular expression for this task (extracting the data before the |)?

Or should I use an additional if statement?

Asked by SUNDONG, Sep 28 '15 03:09




1 Answer

You can try:

r'\[\[([^\]|]*)(?=.*\]\])'

([^\]|]*) captures everything up to the first | or ]. (?=.*\]\]) is a lookahead that ensures a closing ]] appears somewhere to the right of the match.

Testing:

>>> re.search( r'\[\[([^\]|]*)(?=.*\]\])', '[[aaa|bbb]]' ).group(1)
'aaa'
>>> re.search( r'\[\[([^\]|]*)(?=.*\]\])', '[[aaabbb]]' ).group(1)
'aaabbb'
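
To apply this to a whole string as in the question's re.findall call, here is a minimal sketch. The helper name extract_targets and the sample text are made up for illustration. STRICT_PATTERN is a hypothetical variant that is not part of the answer above: instead of a lookahead, it consumes the optional |... part and the closing ]] directly.

```python
import re

# Pattern from the answer: capture the text after "[[" up to the first "|" or "]",
# then look ahead for a "]]" somewhere later on the same line.
ANSWER_PATTERN = re.compile(r'\[\[([^\]|]*)(?=.*\]\])')

# Hypothetical stricter variant (an assumption, not from the answer):
# consume an optional "|..." part and require the closing "]]" right after it.
STRICT_PATTERN = re.compile(r'\[\[([^\]|]*)(?:\|[^\]]*)?\]\]')


def extract_targets(text, pattern=ANSWER_PATTERN):
    """Return the part before '|' (or the whole content) of every [[...]] in text."""
    # With a single capture group, findall returns a list of that group's matches.
    return pattern.findall(text)


sample = "see [[aaaaa]] and [[a|b]] here"
print(extract_targets(sample))                  # ['aaaaa', 'a']
print(extract_targets(sample, STRICT_PATTERN))  # ['aaaaa', 'a']
```

Because the pattern has exactly one capture group, re.findall returns the captured strings directly, so no additional if statement is needed. Note that the lookahead only checks that ]] occurs later on the same line, since . does not match a newline by default.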
Answered by anubhava, Oct 16 '22 07:10