
Extract IPs and Ports from a list in Python 3.x

I would like to extract an IP and port from a returned list. I am currently using str(var).replace() to remove the extra characters. This has caused, and will keep causing, problems whenever the string format changes, because the .replace() calls then stop matching.

def discover_device():
    """Look for an available device on the local network and extract its IP from the result."""
    discover_device = '[<Device: 192.168.222.123:8075>]'  # Actually: call to broadcasting device
    device_ip = str(discover_device).replace('[<Device: ', '').replace(':8075>]', '')
    return device_ip

The problem arises if this: [<Device: xxx.xxx.xxx.xxx:xxxx>]

changes to this: [<now_what: xxx.xxx.xxx.xxx:xxxx>]

discover_device() would then fail, because the hard-coded .replace() arguments no longer match.

What is the best practice for identifying an IP/port pattern and extracting the IP and port without relying on the integrity of the surrounding characters?

From this: [<Device: 192.168.222.123:8075>]

To this: 192.168.222.123:8075

and preferably: [192.168.222.123, 8075]

The solution should take into account that each dot-separated block of the IP can vary in length (1 to 3 digits), and that the largest port number is bounded by 16 bits (normally 4 digits after the colon, at most 5).

Enrique Bruzual asked Sep 15 '17 19:09


2 Answers

Assuming an IPv4 address, try extracting the digits and the critical punctuation ("." and ":"), then slice the valid part of the result when necessary. Validating the IP address as well may be a safer approach.

In Python 3:

Code

import string
import ipaddress


def validate_port(func):
    """Return the results or raise an exception for invalid ports."""
    def wrapper(arg):
        result = func(arg)
        if len(result) == 2 and not result[-1].isdigit():
            raise ValueError("Invalid port number.")
        return result
    return wrapper


@validate_port
def discover_device(device):
    """Return a list of ip and optional port number.  Raise exception for invalid ip."""
    result = "".join(i for i in device if i in (string.digits +".:")).strip(":").split(":")

    try:
        ipaddress.ip_address(result[0])
    except ValueError as e:
        # Numbers in the device name (index 0) or invalid ip
        try:
            ipaddress.ip_address(result[1])
        except IndexError:
            raise e
        else:
            return result[1:]
    else:
        return result

Demo

discover_device("[<Device: 192.168.222.123>]")
# ['192.168.222.123']

discover_device("[<Device: 192.168.222.123:8075>]")
# ['192.168.222.123', '8075']

discover_device("[<Device.34: 192.168.222.123:8080>]")
# ['192.168.222.123', '8080']

discover_device("[<Device: 192.168.222123>]")
# ValueError: '192.168.222123' does not appear to be an IPv4 or IPv6 address

discover_device("[<Device21: 192.168.222123>]")
# ValueError: '192.168.222123' does not appear to be an IPv4 or IPv6 address

discover_device("[<device.451: 192.168.222.123:80.805>]")
# ValueError: Invalid port number.

Features

  • insensitive to surrounding characters
  • ip address validation (not IPv6) and exception handling
  • safeguard against numbers in the device name
  • validate port numbers (optional)

Details

Typically result is a list comprising the ip and an optional port number. However, in cases where numbers are in the device name, the first index of the result will include unwanted numbers. Here are examples of result:

    # ['192.168.222.123']                                  ip   
    # ['192.168.222.123', '8075']                          ip, port
    # ['192.168.222123']                                   invalid ip
    # ['.34', '192.168.222.123', '8080']                   device #, ip, port
    # ['192.168.222.123', '80.805']                        invalid port

The exception handling tests for numbers in the device name and validates ip addresses in the first or second indices. If none are found, an exception is raised.

Although validating port numbers is outside the scope of the question, ports are assumed to be a number. A simple test was added to the validate_port decorator, which can be applied or updated as desired. The decorator screens the output from discover_device(). If the port is not a pure number, an exception is raised. See this post for modifying restrictions. See this blog for a great tutorial on Python decorators.
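The digit test can be tightened to a range check. The following is only a sketch, assuming valid ports run from 0 to 65535 (the 16-bit limit mentioned in the question). The names validate_port_range and parse_address are hypothetical; parse_address is a simplified stand-in for discover_device() so the block runs on its own.

```python
import functools


def validate_port_range(func):
    """Raise ValueError unless the optional port is an integer in 0..65535."""
    @functools.wraps(func)
    def wrapper(arg):
        result = func(arg)
        if len(result) == 2:
            port = result[-1]
            # isdigit() rejects signs, dots, and empty strings before int() runs.
            if not port.isdigit() or not 0 <= int(port) <= 65535:
                raise ValueError("Invalid port number.")
        return result
    return wrapper


@validate_port_range
def parse_address(text):
    # Hypothetical stand-in for discover_device(): split "ip:port" on the last colon.
    return text.rsplit(":", 1)
```

With this decorator, parse_address("192.168.222.123:8075") is returned unchanged, while a port such as "70000" raises ValueError.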

Options

If validation is not a concern, the following code should suffice, provided "." is absent from the device name:

import string


def discover_device(device):
    result = "".join(i for i in device if i in (string.digits +".:")).strip(":").split(":")
    if "." not in result[0]:
        return result[1:]
    return result

If a non-decorator solution is preferred, define the following function:

def validate_port(result):
    """Return the results or raise an exception for invalid ports."""
    if len(result) == 2 and not result[-1].isdigit():
        raise ValueError("Invalid port number.")
    return result

Now pass the return values of discover_device() into the latter function, i.e. return validate_port(result[1:]) and return validate_port(result).
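Put together, the non-decorator variant might look like the sketch below. It reuses the logic already shown in this answer; only the two return statements change, and the intermediate name kept is introduced here for readability.

```python
import ipaddress
import string


def validate_port(result):
    """Return the results or raise an exception for invalid ports."""
    if len(result) == 2 and not result[-1].isdigit():
        raise ValueError("Invalid port number.")
    return result


def discover_device(device):
    """Return a list of ip and optional port number, validating both."""
    # Keep only digits, dots, and colons, then split on the colons.
    kept = "".join(c for c in device if c in string.digits + ".:")
    result = kept.strip(":").split(":")
    try:
        ipaddress.ip_address(result[0])
    except ValueError as e:
        # Index 0 held numbers from the device name, or the ip is invalid.
        try:
            ipaddress.ip_address(result[1])
        except IndexError:
            raise e
        else:
            return validate_port(result[1:])
    else:
        return validate_port(result)
```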

Regards to @coder for suggestions.

pylang answered Nov 11 '22 13:11


No regex is needed for this. Use str's builtin method split.

>>> device = '[<Device: 192.168.222.123:8075>]'
>>> _, ip, port = device.strip('[<>]').split(':')
>>> print((ip.strip(), port))
('192.168.222.123', '8075')
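The three-way unpacking above expects exactly one colon in the label and one before the port. A slightly more tolerant sketch, assuming the address is always the last whitespace-separated token inside the brackets (split_address is a hypothetical name):

```python
def split_address(device):
    """Return [ip] or [ip, port] using only str methods."""
    # Keep the last whitespace-separated token, e.g. "192.168.222.123:8075".
    token = device.strip("[<>]").split()[-1]
    # partition() returns an empty separator when no colon is present.
    ip, sep, port = token.partition(":")
    return [ip, port] if sep else [ip]
```

This does not validate the IP or the port; it only stops depending on the label before the address.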

If you really want to use a regex, I would use a simple one:

>>> import re
>>> ip, port = re.findall(r'([\d.]+)', device)
>>> print((ip, port))
('192.168.222.123', '8075')
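The simple pattern also picks up digits in the device name (for example "Device.34"). A stricter sketch, assuming an IPv4 address with an optional port in the 0 to 65535 range, is shown below; ADDRESS_RE and extract_address are hypothetical names, and the octet check is delegated to the standard ipaddress module.

```python
import ipaddress
import re

# Four dot-separated groups of 1-3 digits, optionally followed by ":" and a port candidate.
ADDRESS_RE = re.compile(
    r"(?<![\d.])(\d{1,3}(?:\.\d{1,3}){3})(?![\d.])(?::([\d.]+))?"
)


def extract_address(text):
    """Return [ip] or [ip, port] (as strings) for the first IPv4-like match."""
    match = ADDRESS_RE.search(text)
    if match is None:
        raise ValueError("No IPv4 address found.")
    ip, port = match.groups()
    ipaddress.IPv4Address(ip)  # raises ValueError if an octet exceeds 255
    if port is None:
        return [ip]
    # Reject ports containing dots and ports above the 16-bit limit.
    if not port.isdigit() or int(port) > 65535:
        raise ValueError("Invalid port number.")
    return [ip, port]
```

The lookarounds keep the match from starting or ending in the middle of a longer run of digits and dots, so a malformed address such as 192.168.222123 is rejected rather than partially matched.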
Zach Gates answered Nov 11 '22 13:11