I want to extract an IP address from a string (actually a one-line HTML document) using Python.
>>> s = "<html><head><title>Current IP Check</title></head><body>Current IP Address: 165.91.15.131</body></html>"
-- '165.91.15.131' is what I want!
I tried using regular expressions, but so far I can only get to the first number.
>>> import re
>>> ip = re.findall( r'([0-9]+)(?:\.[0-9]+){3}', s )
>>> ip
['165']
But I don't have a firm grasp of regular expressions; the above code was found and modified from elsewhere on the web.
An IPv4 address is a 32-bit numeric address written as four decimal numbers (called octets) separated by periods; each octet ranges from 0 to 255, so addresses run from 0.0.0.0 to 255.255.255.255.
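Remove your capturing group. When a pattern contains a capturing group, re.findall returns only the text matched by that group, not the whole match: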
ip = re.findall( r'[0-9]+(?:\.[0-9]+){3}', s )
Result:
['165.91.15.131']
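To see the difference side by side, here is a small sketch that runs both patterns on the string from the question. The variable names (with_group, without_group, whole_match) are only illustrative, and the re.search variant is an extra option that is not part of the answer above.

```python
import re

# Sample input copied from the question.
s = ("<html><head><title>Current IP Check</title></head>"
     "<body>Current IP Address: 165.91.15.131</body></html>")

# With a capturing group, findall returns only the group's text.
with_group = re.findall(r'([0-9]+)(?:\.[0-9]+){3}', s)

# With no capturing group, findall returns each whole match.
without_group = re.findall(r'[0-9]+(?:\.[0-9]+){3}', s)

# re.search exposes the whole match via group(0) even when groups are present.
m = re.search(r'([0-9]+)(?:\.[0-9]+){3}', s)
whole_match = m.group(0) if m else None

print(with_group)     # ['165']
print(without_group)  # ['165.91.15.131']
print(whole_match)    # 165.91.15.131
```

If the original pattern has to stay as it is, taking group(0) from re.search (or from re.finditer for several matches) is one way to get the full address.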
Notes:
This regular expression also matches invalid IP addresses such as 0.00.999.9999. This isn't necessarily a problem, but you should be aware of it and possibly handle this situation. You could change the + to {1,3} for a partial fix without making the regular expression overly complex.

You can use the following regex to capture only valid IP addresses:
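re.findall(r'\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b', s)
returns
['165.91.15.131']
Each octet's alternation has to be wrapped in a non-capturing group. Without the grouping, | splits the whole pattern into separate alternatives and findall returns the octets one by one: ['165', '91', '15', '131'].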
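Another option, instead of encoding the 0 to 255 range in the pattern itself, is to pair the {1,3} partial fix from the notes with a range check in plain Python. The following is a minimal sketch under that assumption; the names CANDIDATE and find_ipv4 are made up for this example. It still accepts octets with leading zeros (such as 00), so stricter validation would need an extra check.

```python
import re

# Candidate pattern: four dot-separated runs of 1 to 3 digits.
# The lookarounds stop it from matching part of a longer run of digits.
CANDIDATE = re.compile(r'(?<![0-9])[0-9]{1,3}(?:\.[0-9]{1,3}){3}(?![0-9])')


def find_ipv4(text):
    """Return dotted-quad strings in text whose four parts are all <= 255."""
    results = []
    for match in CANDIDATE.finditer(text):
        candidate = match.group(0)
        # Reject candidates such as 300.1.1.1 that fit the shape but not the range.
        if all(int(part) <= 255 for part in candidate.split('.')):
            results.append(candidate)
    return results


s = ("<html><head><title>Current IP Check</title></head>"
     "<body>Current IP Address: 165.91.15.131</body></html>")
print(find_ipv4(s))  # ['165.91.15.131']
print(find_ipv4("bad 0.00.999.9999 and 300.1.1.1, good 10.0.0.1"))  # ['10.0.0.1']
```

Splitting the work this way keeps the regular expression short and puts the numeric rule in ordinary code, where it is easier to read and change.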