Iam building a web app and I would like my URL scheme to look something like this:
someurl.com/object/FJ1341lj
Currently I just use the primary key from my SQL Alchemy objects, but the problem is that I dont want the Urls to be sequential or low numbers. For instance my URLs look like this:
someurl.com/object/1
someurl.com/object/2
Version-1 UUIDs are generated from a time and a node ID (usually the MAC address); version-2 UUIDs are generated from an identifier (usually a group or user ID), time, and a node ID; versions 3 and 5 produce deterministic UUIDs generated by hashing a namespace identifier and name; and version-4 UUIDs are generated ...
UUID() function in MySQL This function in MySQL is used to return a Universal Unique Identifier (UUID) generated according to RFC 4122, “A Universally Unique Identifier (UUID) URN Namespace”. It is designed as a number that is universally unique.
You could use a reversible encoding for your integers:
def int_str(val, keyspace):
""" Turn a positive integer into a string. """
assert val >= 0
out = ""
while val > 0:
val, digit = divmod(val, len(keyspace))
out += keyspace[digit]
return out[::-1]
def str_int(val, keyspace):
""" Turn a string into a positive integer. """
out = 0
for c in val:
out = out * len(keyspace) + keyspace.index(c)
return out
Quick testing code:
keyspace = "fw59eorpma2nvxb07liqt83_u6kgzs41-ycdjh" # Can be anything you like - this was just shuffled letters and numbers, but...
assert len(set(keyspace)) == len(keyspace) # each character must occur only once
def test(v):
s = int_str(v, keyspace)
w = str_int(s, keyspace)
print "OK? %r -- int_str(%d) = %r; str_int(%r) = %d" % (v == w, v, s, s, w)
test(1064463423090)
test(4319193500)
test(495689346389)
test(2496486533)
outputs
OK? True -- int_str(1064463423090) = 'antmgabi'; str_int('antmgabi') = 1064463423090
OK? True -- int_str(4319193500) = 'w7q0hm-'; str_int('w7q0hm-') = 4319193500
OK? True -- int_str(495689346389) = 'ev_dpe_d'; str_int('ev_dpe_d') = 495689346389
OK? True -- int_str(2496486533) = '1q2t4w'; str_int('1q2t4w') = 2496486533
To make the IDs non-contiguous, you could, say, multiply the original value with some arbitrary value, add random "chaff" as the digits-to-be-discarded - with a simple modulus check in my example:
def chaffify(val, chaff_size = 150, chaff_modulus = 7):
""" Add chaff to the given positive integer.
chaff_size defines how large the chaffing value is; the larger it is, the larger (and more unwieldy) the resulting value will be.
chaff_modulus defines the modulus value for the chaff integer; the larger this is, the less chances there are for the chaff validation in dechaffify() to yield a false "okay".
"""
chaff = random.randint(0, chaff_size / chaff_modulus) * chaff_modulus
return val * chaff_size + chaff
def dechaffify(chaffy_val, chaff_size = 150, chaff_modulus = 7):
""" Dechaffs the given chaffed value. The chaff_size and chaff_modulus parameters must be the same as given to chaffify() for the dechaffification to succeed.
If the chaff value has been tampered with, then a ValueError will (probably - not necessarily) be raised. """
val, chaff = divmod(chaffy_val, chaff_size)
if chaff % chaff_modulus != 0:
raise ValueError("Invalid chaff in value")
return val
for x in xrange(1, 11):
chaffed = chaffify(x)
print x, chaffed, dechaffify(chaffed)
outputs (with randomness):
1 262 1
2 440 2
3 576 3
4 684 4
5 841 5
6 977 6
7 1197 7
8 1326 8
9 1364 9
10 1528 10
EDIT: On second thought, the randomness of the chaff may not be a good idea, as you lose the canonicality of each obfuscated ID -- this lacks the randomness but still has validation (changing one digit will likely invalidate the whole number if chaff_val
is Large Enough).
def chaffify2(val, chaff_val = 87953):
""" Add chaff to the given positive integer. """
return val * chaff_val
def dechaffify2(chaffy_val, chaff_val = 87953):
""" Dechaffs the given chaffed value. chaff_val must be the same as given to chaffify2(). If the value does not seem to be correctly chaffed, raises a ValueError. """
val, chaff = divmod(chaffy_val, chaff_val)
if chaff != 0:
raise ValueError("Invalid chaff in value")
return val
document_id = random.randint(0, 1000000)
url_fragment = int_str(chaffify(document_id))
print "URL for document %d: http://example.com/%s" % (document_id, url_fragment)
request_id = dechaffify(str_int(url_fragment))
print "Requested: Document %d" % request_id
outputs (with randomness)
URL for document 831274: http://example.com/w840pi
Requested: Document 831274
probably a little longer than you would like.
Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import uuid
>>> uuid.uuid4()
UUID('ba587488-2a96-4daa-b422-60300eb86155')
>>> str(uuid.uuid4())
'001f8565-6330-44a6-977a-1cca201aedcc'
>>>
And if you are using sqlalchemy you can define an id column of type uuid like so
from sqlalchemy import types
from sqlalchemy.databases.mysql import MSBinary
from sqlalchemy.schema import Column
import uuid
class UUID(types.TypeDecorator):
impl = MSBinary
def __init__(self):
self.impl.length = 16
types.TypeDecorator.__init__(self,length=self.impl.length)
def process_bind_param(self,value,dialect=None):
if value and isinstance(value,uuid.UUID):
return value.bytes
elif value and not isinstance(value,uuid.UUID):
raise ValueError,'value %s is not a valid uuid.UUID' % value
else:
return None
def process_result_value(self,value,dialect=None):
if value:
return uuid.UUID(bytes=value)
else:
return None
def is_mutable(self):
return False
id_column_name = "id"
def id_column():
import uuid
return Column(id_column_name,UUID(),primary_key=True,default=uuid.uuid4)
If you are using Django, Preet's answer is probably more appropriate since a lot of django's stuff depends on primary keys that are ints.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With