I'm a network engineer trying to dip my toes into programming, and I was recommended to try Python.
What I'm trying to do is save some specific data by matching a multi-line string against a regular expression. The data to work with is stored in SourceData.
SourceData = '''
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 33.33.33.33 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.13.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 44.44.44.0 255.255.255.0 TenGigabitEthernet0/1/0 1.1.1.1'''
The number of lines stored in SourceData is never known in advance. It could be anything from 0 lines (empty) to an arbitrarily large number of lines.
I want to match all lines containing IPv4 addresses starting with 11.
This is what I've come up with as a start:
import re
ip1 = re.search(r'11\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}', SourceData)
if ip1:
    ip1 = ip1.group()
Verify:
>>> print ip1
11.22.33.44
OK, that seems to work. The idea is that once the whole of SourceData has been matched, with the example provided, the final result for this case would be 4 matches:
ip1 = 11.22.33.44
ip2 = 11.11.12.11
ip3 = 11.11.13.11
ip4 = 11.11.14.0
Next thing to learn: how do I continue to check SourceData for more matches as described above, and how do I store the multiple matches for use later on in the code? For example, later in the code I would like to use the value from a specific match, let's say match number 4 (11.11.14.0).
I have read some guides on Python and regex, but it seems I don't quite understand them yet :)
re.findall() searches for all occurrences that match a given pattern and returns them as a list.
To get all occurrences of a pattern in a given string together with their positions, you can use re.finditer(pattern, string). The result is an iterator of match objects; you can retrieve the indices of each match using match.start() and match.end().
re.search() returns only the first match of the pattern in the target string. Use re.search() to look for the pattern anywhere in the string.
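To make the difference between these functions concrete, here is a minimal sketch in Python 3. The variable names (source_data, pattern, spans) and the shortened sample text are my own choices for illustration and are not part of the question.

```python
import re

# Shortened sample in the style of the question's SourceData (illustrative only).
source_data = """\
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
"""

pattern = re.compile(r'11\.\d{1,3}\.\d{1,3}\.\d{1,3}')

# search() stops at the first hit and returns one match object (or None).
first = pattern.search(source_data)
first_ip = first.group() if first else None

# finditer() yields one match object per hit, including its position in the text.
spans = [(m.group(), m.start(), m.end()) for m in pattern.finditer(source_data)]

print(first_ip)
for ip, start, end in spans:
    print(ip, start, end)
```

Slicing source_data[start:end] gives back the matched text, which can be handy if the position of each address in the text matters later on.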
Adding an * (asterisk) after a token makes it match zero or more repetitions of that token. \s (whitespace metacharacter) matches any whitespace character (space, tab, line break, ...), and \S (the opposite of \s) matches any character that is not whitespace.
You can use re.findall to return all of the matches:
>>> re.findall(r'11\.\d{1,3}\.\d{1,3}\.\d{1,3}', SourceData)
['11.22.33.44', '11.11.12.11', '11.11.13.11', '11.11.14.0']
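Since the question also asks how to store the matches and pick out a specific one later, here is a small sketch in Python 3 under the assumption that a plain list is enough. The names source_data, matches and fourth are made up for illustration.

```python
import re

source_data = '''
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 33.33.33.33 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.13.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 44.44.44.0 255.255.255.0 TenGigabitEthernet0/1/0 1.1.1.1'''

pattern = r'11\.\d{1,3}\.\d{1,3}\.\d{1,3}'

# findall() returns a list, so one variable holds every match.
matches = re.findall(pattern, source_data)

# Lists are zero-indexed: "match number 4" lives at index 3.
# Guard against short or empty input before indexing.
fourth = matches[3] if len(matches) >= 4 else None

# With empty input, findall() simply returns an empty list.
empty_matches = re.findall(pattern, '')

print(matches)
print(fourth)
print(empty_matches)
```

A single list avoids having to create separate variables such as ip1, ip2 and so on, which would not work well when the number of lines is unknown.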
There are several ways to do this, one of them being:
import re
string = """
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 33.33.33.33 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.13.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 44.44.44.0 255.255.255.0 TenGigabitEthernet0/1/0 1.1.1.1
"""
rx = re.compile(r'^[^\d\n]*(11(?:\.\d+){3})', re.M)
lines = [match.group(1) for match in rx.finditer(string)]
print(lines)
This yields:
['11.22.33.44', '11.11.12.11', '11.11.13.11', '11.11.14.0']
^                # match the start of the line
[^\d\n]*         # anything that is NOT a digit or a newline, 0+ times
(                # start of capture group 1
  11             # the literal 11
  (?:\.\d+){3}   # a dot followed by 1+ digits, three times
)                # end of capture group 1
The rest is done via re.finditer() and a list comprehension.
See a demo on regex101.com.
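If the whole route line is needed and not only the address, one possible variation is to apply the same pattern line by line and keep both pieces. This is only a sketch in Python 3; the routes list and the (ip, line) tuple layout are assumptions of mine, not something from the question.

```python
import re

source_data = """
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 33.33.33.33 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.13.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 44.44.44.0 255.255.255.0 TenGigabitEthernet0/1/0 1.1.1.1
"""

# Same pattern as above; re.M is not needed because each line is tested separately.
rx = re.compile(r'^[^\d\n]*(11(?:\.\d+){3})')

routes = []  # list of (ip, full_line) tuples, in file order
for line in source_data.splitlines():
    m = rx.match(line)
    if m:
        routes.append((m.group(1), line))

# Fourth match: the address and the complete line it came from.
ip, full_line = routes[3]
print(ip)
print(full_line)
```

Blank lines and lines whose first number does not start with 11 fail the match and are skipped, so empty input simply leaves routes empty.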