
Find overlapping Regexp matches

I want to find all matches within a given string including overlapping matches. How could I achieve it?

# Example
"a-b-c-d".???(/\w-\w/)  # => ["a-b", "b-c", "c-d"] expected

# Solution without overlapped results
"a-b-c-d".scan(/\w-\w/) # => ["a-b", "c-d"], but "b-c" is missing

sschmeck asked Dec 07 '16 22:12



1 Answer

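Use a capturing group inside a positive lookahead. The lookahead is zero-width, so each match consumes no characters and scan retries at the next position, while the capturing group records the text found there: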

"a-b-c-d".scan(/(?=(\w-\w))/).flatten
 # => ["a-b", "b-c", "c-d"]
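
If the same trick is needed for several patterns, it could be wrapped in a small helper. The sketch below is only an illustration: the method name overlapping_matches is hypothetical and not part of Ruby or the original answer. It assumes the pattern has no named groups, and it keeps only the outermost capture so that patterns with their own numbered groups still yield one string per match.

```ruby
# Hypothetical helper (an assumption for illustration, not a built-in):
# wraps a pattern in a capturing lookahead so that scan reports
# overlapping matches as well.
def overlapping_matches(string, pattern)
  # Rebuild the regexp as (?=(...)), preserving flags such as /i.
  lookahead = Regexp.new("(?=(#{pattern.source}))", pattern.options)
  # scan returns one array of groups per match; the first group is the
  # outer capture that holds the full overlapping match.
  string.scan(lookahead).map(&:first)
end

overlapping_matches("a-b-c-d", /\w-\w/) # => ["a-b", "b-c", "c-d"]
overlapping_matches("aaaa", /aa/)       # => ["aa", "aa", "aa"]
```

Because the lookahead matches at most once per starting position, this returns at most one match per position: with a greedy pattern, a shorter alternative that starts at the same position is not reported.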

Wiktor Stribiżew answered Sep 20 '22 22:09