What is the fastest way to check if a string matches a certain pattern? Is regex the best way?
For example, I have a bunch of strings and want to check each one to see if it is a valid IP address (valid in this case meaning correct format). Is regex the fastest way to do this, or is there something faster, such as plain string operations?
Something like this is what I have been doing so far:
    import re

    for st in strs:
        if re.match(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', st) is not None:
            print('IP!')
The simplest way to validate whether a string represents an IP address is to use the Python ipaddress module. Let's open the Python shell and see what the ipaddress.ip_address() function returns when we pass it strings that represent a valid and an invalid IPv4 address.
The regular expression for valid IPv4 addresses is: ((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
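As a rough sketch of how that expression could be applied, the snippet below compiles it once and checks whole strings with re.fullmatch. The names OCTET, IPV4_RE and matches_ipv4 are made up for this illustration, and the pattern is the one quoted above with the stray spaces removed.

```python
import re

# One octet, 0-255, as in the pattern quoted above (hypothetical helper name).
OCTET = r'(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
# Three "octet." groups followed by a final octet.
IPV4_RE = re.compile(r'(' + OCTET + r'\.){3}' + OCTET)


def matches_ipv4(s):
    """Return True if the whole string is a dotted-quad IPv4 address."""
    # fullmatch requires the entire string to match, so trailing junk fails.
    return IPV4_RE.fullmatch(s) is not None


for candidate in ('192.168.2.10', '266.255.9.10', '999.999.999.999'):
    print(candidate, matches_ipv4(candidate))
```

Only the first candidate prints True. Compiling the pattern once outside the loop avoids looking it up again for every string.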
To test against a single IP, you can use the subnet mask /32 (/128 for IPv6), which means "only this IP address" as a subnet, or you can pass the IP address to the IPv4Network or IPv6Network constructors and they will return a subnet value for you.
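A small sketch of the single-address subnet idea, using the standard ipaddress module on Python 3. The address 192.168.2.10 and the variable names are only illustrative.

```python
import ipaddress

# Explicit /32: a network that contains exactly one IPv4 address.
single = ipaddress.ip_network('192.168.2.10/32')
# Passing a bare address to the constructor yields the same /32 network.
implicit = ipaddress.IPv4Network('192.168.2.10')

print(single == implicit)                               # True
print(single.num_addresses)                             # 1
print(ipaddress.ip_address('192.168.2.10') in single)   # True
print(ipaddress.ip_address('192.168.2.11') in single)   # False
```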
update: The original answer below is good for 2011, but since 2012 you are likely better off using Python's ipaddress stdlib module. Besides checking IP validity for IPv4 and IPv6, it can do a lot of other things as well.
It looks like you are trying to validate IP addresses. A regular expression is probably not the best tool for this.
If you want to accept all valid IP addresses (including some addresses that you probably didn't even know were valid) then you can use IPy:

    from IPy import IP
    IP('127.0.0.1')
If the IP address is invalid it will throw an exception.
Or you could use socket:

    import socket

    addr = '127.0.0.1'  # example input
    try:
        socket.inet_aton(addr)
        print('legal')
    except socket.error:
        print('not legal')
If you really want to only match IPv4 with 4 decimal parts then you can split on dot and test that each part is an integer between 0 and 255.
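    def validate_ip(s):
        # A dotted-quad address has exactly four parts.
        a = s.split('.')
        if len(a) != 4:
            return False
        for x in a:
            # Each part must consist of digits only.
            if not x.isdigit():
                return False
            # Each part must be in the range 0-255.
            i = int(x)
            if i < 0 or i > 255:
                return False
        return True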
Note that your regular expression doesn't do this extra check. It would accept 999.999.999.999 as a valid address.
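This is easy to confirm with a quick check. The sketch below wraps the pattern from the question in a helper, question_regex_accepts, whose name is invented for this example. It also shows a second gap: re.match only anchors at the start of the string, so trailing text is accepted too.

```python
import re

# Pattern from the question, written as a raw string.
QUESTION_RE = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')


def question_regex_accepts(s):
    """Return True if the question's pattern matches at the start of s."""
    # re.match checks only the beginning of the string, as in the question.
    return QUESTION_RE.match(s) is not None


for candidate in ('192.168.2.10', '999.999.999.999', '1.2.3.4 and more'):
    print(candidate, question_regex_accepts(candidate))
```

All three candidates print True, even though only the first is a well-formed address.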
If you use Python 3, you can use the ipaddress module (http://docs.python.org/py3k/library/ipaddress.html). Example:
    >>> import ipaddress
    >>> ipv6 = "2001:0db8:0a0b:12f0:0000:0000:0000:0001"
    >>> ipv4 = "192.168.2.10"
    >>> ipv4invalid = "266.255.9.10"
    >>> str = "Tay Tay"
    >>> ipaddress.ip_address(ipv6)
    IPv6Address('2001:db8:a0b:12f0::1')
    >>> ipaddress.ip_address(ipv4)
    IPv4Address('192.168.2.10')
    >>> ipaddress.ip_address(ipv4invalid)
    Traceback (most recent call last):
      File "<console>", line 1, in <module>
      File "/usr/lib/python3.4/ipaddress.py", line 54, in ip_address
        address)
    ValueError: '266.255.9.10' does not appear to be an IPv4 or IPv6 address
    >>> ipaddress.ip_address(str)
    Traceback (most recent call last):
      File "<console>", line 1, in <module>
      File "/usr/lib/python3.4/ipaddress.py", line 54, in ip_address
        address)
    ValueError: 'Tay Tay' does not appear to be an IPv4 or IPv6 address
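For checking many strings in a loop, the exception can be folded into a boolean helper. This is a minimal sketch for Python 3; is_valid_ip is a name chosen for this example and not part of the ipaddress module.

```python
import ipaddress


def is_valid_ip(s):
    """Return True if s parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(s)
    except ValueError:
        # ip_address raises ValueError for anything it cannot parse.
        return False
    return True


candidates = [
    "2001:0db8:0a0b:12f0:0000:0000:0000:0001",
    "192.168.2.10",
    "266.255.9.10",
    "Tay Tay",
]
for candidate in candidates:
    print(candidate, is_valid_ip(candidate))
```

The first two candidates print True and the last two print False, matching the shell session above.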