How to say "match anything until a specific character, then work your way backwards"?

Question

I am often faced with patterns where the interesting part is delimited by a specific character and the rest does not matter. A typical example:

/dev/sda1       472437724  231650856 216764652  52% /

I would like to extract 52 (which can also be 9, or 100 - so 1 to 3 digits) by saying "match anything, then when you get to % (which is unique in that line), see before for the matches to extract".

I tried to code this as .*(\d*)%.* but the group is not matched:

.* match anything, any number of times
% ... until you get to the literal % (the \d is also matched by .* but my understanding is that once % is matched, the regex engine will work backwards, since it now has an "anchor" on which to analyze what was before -- please tell me if this reasoning is incorrect, thank you)
(\d*) ... and now before that % you had a (\d*) to match and group
.* ... and the rest does not matter (match everything)

Sweeper · Accepted Answer

Your regex does not work because .* matches too much, and the group matches too little. Because of the * quantifier, the group \d* is allowed to match nothing at all, so everything before the % ends up matched by the .*.

Your description of .* is also somewhat incorrect. It first matches everything up to the end of the line, then moves backwards one character at a time until the rest of the pattern ((\d*)%.*) can match. For more info, see here.
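To make the backtracking concrete, here is a minimal sketch using Python's re module on the sample line from the question. The variable names are only illustrative, and the lazy variant at the end is an extra comparison, not something proposed in the answer.

```python
import re

line = "/dev/sda1       472437724  231650856 216764652  52% /"

# Greedy: the first .* runs to the end of the line, then backs off only
# as far as needed for the rest of the pattern, (\d*)%.*, to match.
# Since \d* accepts zero digits, stopping right before % is enough.
greedy_group = re.search(r".*(\d*)%.*", line).group(1)

# Requiring at least one digit forces .* to give back one more character.
one_digit_group = re.search(r".*(\d+)%.*", line).group(1)

# A lazy .*? grows from the left instead, so the group gets every digit.
lazy_group = re.search(r".*?(\d+)%", line).group(1)

print(repr(greedy_group))     # ''
print(repr(one_digit_group))  # '2'
print(repr(lazy_group))       # '52'
```

The greedy .* keeps as much as it can, which is why the group ends up with nothing, or with a single digit once \d+ is used.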

In fact, I think your text can be matched simply by:

(\d{1,3})%

And getting group 1.

The logic of "keep looking until you find..." is kind of baked into the regex engine, so you don't need to explicitly say .* unless you want it in the match. In this case you just want the number before the %, right?
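As a rough sketch of how this could be used from Python: re.search scans the line for the first place where the pattern matches, so no leading .* is needed. The helper name extract_percent and the choice to return an int (or None when there is no match) are assumptions made for illustration.

```python
import re

def extract_percent(line):
    """Return the 1 to 3 digits right before '%' as an int, or None."""
    match = re.search(r"(\d{1,3})%", line)
    if match is None:
        return None
    return int(match.group(1))

sample = "/dev/sda1       472437724  231650856 216764652  52% /"

print(extract_percent(sample))                # 52
print(extract_percent("/dev/sdb1  9% /mnt"))  # 9
print(extract_percent("tmpfs  100% /run"))    # 100
print(extract_percent("no percent here"))     # None
```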

flokibb · Answer

If you only want to extract the number, then I would use:

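import re

# \d+ rather than \d*, so the lookahead cannot succeed on an empty match at the %
pattern = r"\d+(?=%)"
string = "/dev/sda1   472437724  231650856 216764652  52% /"
# findall returns every run of digits that is immediately followed by %
returnedMatches = re.findall(pattern, string)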

The regular expression uses a positive lookahead, (?=%), which requires a % right after the digits without including it in the match.
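One detail worth checking with the lookahead approach is the quantifier in front of it. The sketch below compares \d* and \d+ with re.findall on the same sample string; the variable names are illustrative only.

```python
import re

line = "/dev/sda1   472437724  231650856 216764652  52% /"

# \d* may match zero digits, so the lookahead can also succeed
# on an empty match located right at the % sign.
star_matches = re.findall(r"\d*(?=%)", line)

# \d+ needs at least one digit, so only the number itself is returned.
plus_matches = re.findall(r"\d+(?=%)", line)

print(star_matches)  # ['52', ''] on Python 3.7 and later
print(plus_matches)  # ['52']
```

With \d+ (or \d{1,3}) the result can be used directly, without filtering out empty strings.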

Tags:

python

regex

python-3.x

WoJ