
How do I store version number strings in a Django database so they are correctly comparable/sortable?

I'm developing/maintaining/curating a database of test results gathered from various wearable research devices. Each device has three main components, each of which has two version numbers (firmware and hardware). I'm using a Django app to provide a web interface to the database. The version numbers are represented either as plain integers or as triplets (major, minor, build). The integers are easy enough to handle, and I can obviously store the triplets as strings, but as strings they won't sort or compare correctly, for example if I want only test results produced by devices with a firmware version less than 14.x.y.

I can't use a float because of the second 'decimal point' separator. I thought of maybe going for a hack by storing it as a date, but that will limit minor numbers to less than 12 and build numbers to less than 29, and besides I know this is a terrible solution. I probably shouldn't even confess to having thought of it.

Short of extending the database with some PL/SQL to provide a comparison function that treats the strings correctly, is there a simple way to do this? If not, can I even use my custom SQL function with django?

sirlark asked Jan 29 '15 06:01


2 Answers

Store them as zero-padded strings:

>>> def sortable_version(version):
...     return '.'.join(bit.zfill(5) for bit in version.split('.'))
... 
>>> sortable_version('1.1')
'00001.00001'
>>> sortable_version('2')
'00002'
>>> sortable_version('2.1.10')
'00002.00001.00010'
>>> sortable_version('10.1')
'00010.00001'
>>> sortable_version('2') > sortable_version('1.3.4')
True
>>> sortable_version('10') > sortable_version('2')
True
>>> sortable_version('2.3.4') > sortable_version('2')
True

You can always recover the regular version string from this zero-padded format:

>>> def normalize_version(padded_version):
...     return '.'.join(bit.lstrip('0') or '0' for bit in padded_version.split('.'))
... 
>>> normalize_version('00010')
'10'
>>> normalize_version('00002.00001.00010')
'2.1.10'
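
To tie this back to the question, here is a minimal sketch of how the padded form behaves when sorting and when filtering for "firmware version less than 14". The `width` parameter, the validation step, and the sample version list are assumptions added for illustration, not part of the original answer. The padding only orders correctly while every component fits in the chosen width, so the sketch rejects anything that does not.

```python
def sortable_version(version, width=5):
    """Zero-pad each dot-separated component to a fixed width."""
    parts = version.split('.')
    for part in parts:
        # Assumption: components are plain ASCII digit runs that fit in `width`.
        # Anything else would silently break the ordering, so reject it.
        if not (part.isascii() and part.isdigit()) or len(part) > width:
            raise ValueError(f'unsupported version component: {part!r}')
    return '.'.join(part.zfill(width) for part in parts)


# Hypothetical sample data.
versions = ['10.1', '2.1.10', '14.0.2', '2', '1.3.4', '2.1.9']

# Plain string sorting puts '10.1' before '2' and '2.1.10' before '2.1.9'.
plain_order = sorted(versions)

# Sorting on the padded form gives the numeric order.
padded_order = sorted(versions, key=sortable_version)

# "Firmware version less than 14.x.y" becomes a plain string comparison.
older_than_14 = [v for v in versions
                 if sortable_version(v) < sortable_version('14')]

print(plain_order)
print(padded_order)
print(older_than_14)
```

If the padded string is what gets stored in a `CharField`, the same filter could presumably be written as an `__lt` lookup against `sortable_version('14')`. Whether the database compares the strings exactly as Python does depends on the column's collation, so that is worth checking on the actual backend.
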
catavaran answered Oct 21 '22 22:10


What you are looking for is called "natural sort".

Another way to implement this is to use database sorting. Here is an example for Postgres:

from django.db import models


class VersionRecordManager(models.Manager):

    def get_queryset(self):
        return super().get_queryset().extra(
            select={
                'natural_version':
                    "string_to_array(                     "
                    "   regexp_replace(                   "  # Remove everything except digits
                    r"       version, '[^\d\.]+', '', 'g'  "  # and dots, then split string into
                    "   ), '.'                            "  # an array of integers.
                    ")::int[]                             "
            }
        )

    def available_versions(self):
        return self.filter(available=True).order_by('-natural_version')

    def last_stable(self):
        return self.available_versions().filter(stable=True).first()

class VersionRecord(models.Model):
    objects = VersionRecordManager()
    version = models.CharField(max_length=64, db_index=True)
    available = models.BooleanField(default=False, db_index=True)
    stable = models.BooleanField(default=False, db_index=True)
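
For readers without a Postgres instance at hand, the following is a rough pure-Python sketch of what the `natural_version` expression computes: strip everything except digits and dots, split on dots, and compare the resulting integer lists element by element. The function name and the sample data are illustrative assumptions, and the sketch mirrors the SQL only approximately.

```python
import re


def natural_version(version):
    """Approximate the SQL expression: keep digits and dots, split, cast to int."""
    cleaned = re.sub(r'[^0-9.]+', '', version)
    # Like the ::int[] cast, this fails on empty components such as '1..2'.
    return [int(part) for part in cleaned.split('.')]


# Hypothetical sample data.
versions = ['2.1.10', 'v10.1', '2', '1.3.4-beta', '2.1.9']

# Lists compare element by element, and a shorter prefix sorts first,
# so reverse=True plays the role of order_by('-natural_version').
newest_first = sorted(versions, key=natural_version, reverse=True)
print(newest_first)

# Caveat: digits inside a suffix survive the cleanup and merge into the
# preceding component, so '1.3.4-rc1' is treated as 1.3.41.
print(natural_version('1.3.4-rc1'))
```

The last line shows a limitation that seems to apply to the regular expression in the SQL as well: it suits purely numeric versions with an optional non-numeric prefix or suffix, but not pre-release tags that contain digits.
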
Max Malysh answered Oct 21 '22 23:10