I would like to encode an IP address in as short a string as possible using all the printable characters. According to https://en.wikipedia.org/wiki/ASCII#Printable_characters these are codes 20hex to 7Ehex.
For example:
shorten("172.45.1.33") --> "^.1 9" maybe.
In order to make decoding easy I also need the length of the encoding always to be the same. I also would like to avoid using the space character in order to make parsing easier in the future.
How can one do this?
I am looking for a solution that works in Python 2.7.x.
My attempt so far to modify Eloims's answer to work in Python 2:
First I installed the ipaddress backport for Python 2 (https://pypi.python.org/pypi/ipaddress) .
#This is needed because ipaddress expects character strings and not byte strings for textual IP address representations
from __future__ import unicode_literals
import ipaddress
import base64
#Taken from http://stackoverflow.com/a/20793663/2179021
def to_bytes(n, length, endianess='big'):
h = '%x' % n
s = ('0'*(len(h) % 2) + h).zfill(length*2).decode('hex')
return s if endianess == 'big' else s[::-1]
def def encode(ip):
ip_as_integer = int(ipaddress.IPv4Address(ip))
ip_as_bytes = to_bytes(ip_as_integer, 4, endianess="big")
ip_base85 = base64.a85encode(ip_as_bytes)
return ip_base
print(encode("192.168.0.1"))
This now fails because base64 doesn't have an attribute 'a85encode'.
IP (Internet Protocol) -Address is the basic fundamental concept of computer networks which provides the address assigning capabilities to a network. Python provides ipaddress module which is used to validate and categorize the IP address according to their types (IPv4 or IPv6).
Encoded Unicode text is represented as binary data ( bytes ). The str type can contain any literal Unicode character, such as "Δv / Δt", all of which will be stored as Unicode. Python 3 accepts many Unicode code points in identifiers, meaning résumé = "~/Documents/resume.pdf" is valid if this strikes your fancy.
Other Encodings Available in Python. So far, you’ve seen four character encodings: ASCII; UTF-8; UTF-16; UTF-32; There are a ton of other ones out there. One example is Latin-1 (also called ISO-8859-1), which is technically the default for the Hypertext Transfer Protocol (HTTP), per RFC 2616. Windows has its own Latin-1 variant called cp1252.
Python’s re module defaults to the re.UNICODE flag rather than re.ASCII. This means, for instance, that r"\w" matches Unicode word characters, not just ASCII letters. The default encoding in str.encode() and bytes.decode() is UTF-8.
An IP stored in binary is 4 bytes.
You can encode it in 5 printable ASCII characters using Base85.
Using more printable characters won't be able to shorten the resulting string more than that.
import ipaddress
import base64
def encode(ip):
ip_as_integer = int(ipaddress.IPv4Address(ip))
ip_as_bytes = ip_as_integer.to_bytes(4, byteorder="big")
ip_base85 = base64.a85encode(ip_as_bytes)
return ip_base85
print(encode("192.168.0.1"))
I found this question looking for a way to use base85/ascii85 on python 2. Eventually I discovered a couple of projects available to install via pypi. I settled on one called hackercodecs
because the project is specific to encoding/decoding whereas the others I found just offered the implementation as a byproduct of necessity
from __future__ import unicode_literals
import ipaddress
from hackercodecs import ascii85_encode
def encode(ip):
return ascii85_encode(ipaddress.ip_address(ip).packed)[0]
print(encode("192.168.0.1"))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With