Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

compare two python strings that contain numbers

UPDATE: I should have specified this sooner, but not all of the names are simply floats. For example, some of them are "prefixed" with "YT". So for example" YT1.1. so, you have the same problem YT1.9 < YT1.11 should be true. I'm really surprised that the string comparison fails....

hello, this should be a pretty simple question but I can't seem to find the answer. I'd like to sort a bunch of XL worksheets by name. Each of the names are numbers but in the same way that textbook "sections" are numbered, meaning section 4.11 comes after 4.10 which both come after 4.9 and 4.1. I thought simply comparing these numbers as string would do but I get the following:

>>> s1 = '4.11'
>>> s2 = '4.2'
>>> s1> s2
False
>>> n1 = 4.11
>>> n2 = 4.2
>>> n1 > n2
False

how can I compare these two values such that 4.11 is greater than 4.2?

like image 454
Ramy Avatar asked May 19 '11 18:05

Ramy


3 Answers

Convert the names to tuples of integers and compare the tuples:

def splittedname(s):
    return tuple(int(x) for x in s.split('.'))

splittedname(s1) > splittedname(s2)

Update: Since your names apparently can contain other characters than digits, you'll need to check for ValueError and leave any values that can't be converted to ints unchanged:

import re

def tryint(x):
    try:
        return int(x)
    except ValueError:
        return x

def splittedname(s):
    return tuple(tryint(x) for x in re.split('([0-9]+)', s))

To sort a list of names, use splittedname as a key function to sorted:

>>> names = ['YT4.11', '4.3', 'YT4.2', '4.10', 'PT2.19', 'PT2.9']
>>> sorted(names, key=splittedname)
['4.3', '4.10', 'PT2.9', 'PT2.19', 'YT4.2', 'YT4.11']
like image 122
Pär Wieslander Avatar answered Sep 20 '22 18:09

Pär Wieslander


This is not a built-in method, but it ought to work:

>>> def lt(num1, num2):
...     for a, b in zip(num1.split('.'), num2.split('.')):
...         if int(a) < int(b):
...             return True
...         if int(a) > int(b):
...             return False
...     return False
... 
... lt('4.2', '4.11')
0: True

That can be cleaned up, but it gives you the gist.

like image 35
nmichaels Avatar answered Sep 20 '22 18:09

nmichaels


What you're looking for is called "natural sorting". That is opposed to "lexicographical sorting". There are several recipes out there that do this, since the exact output of what you want is implementation specific. A quick google search yields this (note* this is not my code, nor have I tested it):

import re

def tryint(s):
    try:
        return int(s)
    except:
        return s

def alphanum_key(s):
    """ Turn a string into a list of string and number chunks.
        "z23a" -> ["z", 23, "a"]
    """
    return [ tryint(c) for c in re.split('([0-9]+)', s) ]

def sort_nicely(l):
    """ Sort the given list in the way that humans expect.
    """
    l.sort(key=alphanum_key)

http://nedbatchelder.com/blog/200712.html#e20071211T054956

like image 35
Falmarri Avatar answered Sep 18 '22 18:09

Falmarri