
Python: sort strings with digits at the end

What is the easiest way to sort a list of strings that end in digits, where some have 3 digits and some have 4?

>>> data = ['asdf123', 'asdf1234', 'asdf111', 'asdf124']
>>> data.sort()
>>> print(data)
['asdf111', 'asdf123', 'asdf1234', 'asdf124']

The sort should put the 1234 one at the end. Is there an easy way to do this?

crosswired asked Nov 30 '10 20:11


3 Answers

Is there an easy way to do this?

Yes

You can use the natsort module.

>>> from natsort import natsorted
>>> natsorted(['asdf123', 'asdf1234', 'asdf111', 'asdf124'])
['asdf111', 'asdf123', 'asdf124', 'asdf1234']

Full disclosure, I am the package's author.

SethMMorton answered Oct 06 '22 11:10


What you're probably describing is called a Natural Sort, or a Human Sort. If you're using Python, you can borrow from Ned's implementation.

The algorithm for a natural sort is approximately as follows:

  • Split each value into alphabetical "chunks" and numerical "chunks"
  • Sort by the first chunk of each value
    • If the chunk is alphabetical, sort it as usual
    • If the chunk is numerical, sort by the numerical value represented
  • Take the values that have the same first chunk and sort them by the second chunk
  • And so on
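
The chunking steps above can be sketched as a sort key. This is only a minimal illustration, not Ned's implementation: the name `natural_key` is made up for this example, and tagging numeric chunks so they sort before text chunks at the same position is an assumption rather than part of any standard.

```python
import re

def natural_key(value):
    """Split value into text and number chunks for natural ordering."""
    # A capturing group makes re.split keep the digit runs, so the result
    # alternates: text, digits, text, digits, ... (digits at odd indexes).
    chunks = re.split(r"(\d+)", value)
    # Tag each chunk so ints and strs are never compared directly:
    # (0, int) for numeric chunks, (1, str) for alphabetical chunks.
    # Empty text chunks (e.g. after trailing digits) are dropped.
    return [(0, int(c)) if i % 2 else (1, c)
            for i, c in enumerate(chunks) if c]

names = ['asdf123', 'asdf1234', 'asdf111', 'asdf124']
print(sorted(names, key=natural_key))
# ['asdf111', 'asdf123', 'asdf124', 'asdf1234']
```

Because Python compares lists element by element, sorting by this key compares the first chunk, then the second, and so on, which mirrors the steps in the list.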

Wesley answered Oct 06 '22 11:10


Is there an easy way to do this?

No

It's not at all clear what the real rules are. "Some have 3 digits and some have 4" isn't a very precise or complete specification. All your examples show 4 letters in front of the digits. Is this always true?

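import re

# One or more non-digits, followed by one or more digits up to the end.
key_pat = re.compile(r"^(\D+)(\d+)$")

def key(item):
    # Sort by the text prefix first, then by the numeric value of the suffix.
    # Raises AttributeError if item does not match the pattern.
    m = key_pat.match(item)
    return m.group(1), int(m.group(2))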

That key function might do what you want. Or it might be too complex. Or maybe the pattern is really r"^(.*)(\d{3,4})$" or maybe the rules are even more obscure.

>>> data = ['asdf123', 'asdf1234', 'asdf111', 'asdf124']
>>> data.sort(key=key)
>>> data
['asdf111', 'asdf123', 'asdf124', 'asdf1234']
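
If the rules really do call for the alternative pattern mentioned above, it may be worth checking what it captures first. The sketch below shows that the greedy `.*` takes as much as it can and leaves only three digits for the second group. The lazy variant is a hypothetical adjustment for illustration, not something from the question.

```python
import re

# The alternative pattern from the answer, with a greedy prefix group.
greedy = re.compile(r"^(.*)(\d{3,4})$")
# Hypothetical variant: a lazy prefix lets the digit group take all 4 digits.
lazy = re.compile(r"^(.*?)(\d{3,4})$")

print(greedy.match('asdf1234').groups())  # ('asdf1', '234')
print(lazy.match('asdf1234').groups())    # ('asdf', '1234')
```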

S.Lott answered Oct 06 '22 12:10