Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to implement "autoincrement" on Google AppEngine

I have to label something in a "strong monotone increasing" fashion. Be it Invoice Numbers, shipping label numbers or the like.

  1. A number MUST NOT BE used twice
  2. Every number SHOULD BE used when exactly all smaller numbers have been used (no holes).

Fancy way of saying: I need to count 1,2,3,4 ... The number Space I have available are typically 100.000 numbers and I need perhaps 1000 a day.

I know this is a hard Problem in distributed systems and often we are much better of with GUIDs. But in this case for legal reasons I need "traditional numbering".

Can this be implemented on Google AppEngine (preferably in Python)?

like image 466
max Avatar asked Oct 21 '10 09:10

max


3 Answers

If you absolutely have to have sequentially increasing numbers with no gaps, you'll need to use a single entity, which you update in a transaction to 'consume' each new number. You'll be limited, in practice, to about 1-5 numbers generated per second - which sounds like it'll be fine for your requirements.

like image 196
Nick Johnson Avatar answered Nov 15 '22 14:11

Nick Johnson


If you drop the requirement that IDs must be strictly sequential, you can use a hierarchical allocation scheme. The basic idea/limitation is that transactions must not affect multiple storage groups.

For example, assuming you have the notion of "users", you can allocate a storage group for each user (creating some global object per user). Each user has a list of reserved IDs. When allocating an ID for a user, pick a reserved one (in a transaction). If no IDs are left, make a new transaction allocating 100 IDs (say) from the global pool, then make a new transaction to add them to the user and simultaneously withdraw one. Assuming each user interacts with the application only sequentially, there will be no concurrency on the user objects.

like image 27
Martin v. Löwis Avatar answered Nov 15 '22 13:11

Martin v. Löwis


The gaetk - Google AppEngine Toolkit now comes with a simple library function to get a number in a sequence. It is based on Nick Johnson's transactional approach and can be used quite easily as a foundation for Martin von Löwis' sharding approach:

>>> from gaeth.sequences import * 
>>> init_sequence('invoce_number', start=1, end=0xffffffff)
>>> get_numbers('invoce_number', 2)
[1, 2]

The functionality is basically implemented like this:

def _get_numbers_helper(keys, needed):
  results = []

  for key in keys:
    seq = db.get(key)
    start = seq.current or seq.start
    end = seq.end
    avail = end - start
    consumed = needed
    if avail <= needed:
      seq.active = False
      consumed = avail
    seq.current = start + consumed
    seq.put()
    results += range(start, start + consumed)
    needed -= consumed
    if needed == 0:
      return results
  raise RuntimeError('Not enough sequence space to allocate %d numbers.' % needed)

def get_numbers(needed):
  query = gaetkSequence.all(keys_only=True).filter('active = ', True)
  return db.run_in_transaction(_get_numbers_helper, query.fetch(5), needed)
like image 7
max Avatar answered Nov 15 '22 13:11

max