
Sort a multidimensional list by a variable number of keys

Tags: python, sorting

I've read this post and it hasn't ended up working for me.

Edit: the functionality I'm describing is just like the sorting function in Excel... if that makes it any clearer

Here's my situation: I have a tab-delimited text document. There are about 125,000 lines and 6 columns per line (columns are separated by a tab character). I've split the document into a two-dimensional list.

I am trying to write a generic function to sort two-dimensional lists. Basically I would like to have a function where I can pass the big list, and the key of one or more columns I would like to sort the big list by. Obviously, I would like the first key passed to be the primary sorting point, then the second key, etc.

Still confuzzled?

Here's an example of what I would like to do.

Joel    18  Orange  1
Anna    17  Blue    2
Ryan    18  Green   3
Luke    16  Blue    1
Katy    13  Pink    5
Tyler   22  Blue    6
Bob     22  Blue    10
Garrett 24  Red     7
Ryan    18  Green   8
Leland  18  Yellow  9

Say I passed this list to my magical function, like so:

sortByColumn(bigList, 0)

Anna    17  Blue    2
Bob     22  Blue    10
Garrett 24  Red     7
Joel    18  Orange  1
Katy    13  Pink    5
Leland  18  Yellow  9
Luke    16  Blue    1
Ryan    18  Green   3
Ryan    18  Green   8
Tyler   22  Blue    6

and...

sortByColumn(bigList, 2, 3)

Luke    16  Blue    1
Anna    17  Blue    2
Tyler   22  Blue    6
Bob     22  Blue    10
Ryan    18  Green   3
Ryan    18  Green   8
Joel    18  Orange  1
Katy    13  Pink    5
Garrett 24  Red     7
Leland  18  Yellow  9

Any clues?

Joel Verhagen asked Nov 05 '09 21:11



4 Answers

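import operator

def sortByColumn(bigList, *args):
    # itemgetter(*args) builds the sort key from the given column indices;
    # the first index is the primary key, later ones break ties.
    bigList.sort(key=operator.itemgetter(*args)) # sorts the list in place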
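
As a quick check against the example in the question, here is a minimal sketch that runs the function on the ten sample rows. It assumes the numeric columns have already been converted to ints; the names `rows` and `names_by_colour` are made up for this illustration.

```python
import operator

def sortByColumn(bigList, *args):
    # Sort in place; the first column index given is the primary key.
    bigList.sort(key=operator.itemgetter(*args))

# Sample rows from the question, with numeric columns stored as ints (assumption).
rows = [
    ["Joel", 18, "Orange", 1],
    ["Anna", 17, "Blue", 2],
    ["Ryan", 18, "Green", 3],
    ["Luke", 16, "Blue", 1],
    ["Katy", 13, "Pink", 5],
    ["Tyler", 22, "Blue", 6],
    ["Bob", 22, "Blue", 10],
    ["Garrett", 24, "Red", 7],
    ["Ryan", 18, "Green", 8],
    ["Leland", 18, "Yellow", 9],
]

# Sort by colour (column 2), then by the last column (column 3).
sortByColumn(rows, 2, 3)
names_by_colour = [row[0] for row in rows]
print(names_by_colour)
```

The resulting order of names should match the second expected listing in the question (Luke, Anna, Tyler, Bob, ...).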
Tendayi Mawushe answered Nov 19 '22 21:11


This will sort by columns 2 and 3:

a.sort(key=operator.itemgetter(2,3))
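
To see why this sorts on two columns, it may help to look at what `operator.itemgetter` returns for a single row. The sketch below is only an illustration; the `row` value and the variable names are made up here.

```python
import operator

row = ["Anna", 17, "Blue", 2]

# With two indices, itemgetter returns a callable that yields a tuple.
colour_and_rank = operator.itemgetter(2, 3)(row)

# With a single index, it yields the bare value rather than a 1-tuple.
name_only = operator.itemgetter(0)(row)

# Tuples compare element by element, so ties on the first item
# are broken by the second.
ordered = sorted([("Green", 3), ("Blue", 10), ("Blue", 2)])

print(colour_and_rank, name_only, ordered)
```

Because the key for each row is a tuple like `("Blue", 2)`, rows with the same colour end up ordered by the second column.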
interjay answered Nov 19 '22 21:11


The key idea here (pun intended) is to use a key function that returns a tuple. Below, the key function is lambda x: tuple(x[idx] for idx in args), where x is an element of aList, that is, a row of data. It returns a tuple of values, not just one value. The tuple() call matters: a bare generator expression would produce a generator object, not a tuple. The sort() method sorts according to the first element of the tuple, then breaks ties with the second, and so on. See http://wiki.python.org/moin/HowTo/Sorting#Sortingbykeys

#!/usr/bin/env python
import csv

def sortByColumn(aList, *args):
    # Build a tuple key from the requested columns; tuples compare
    # element by element, so args[0] is the primary key.
    aList.sort(key=lambda x: tuple(x[idx] for idx in args))
    return aList

def convert_ints(astr):
    # Convert numeric strings to int so they sort numerically.
    try:
        return int(astr)
    except ValueError:
        return astr

filename = 'file.txt'
with open(filename, 'r') as f:
    biglist = [[convert_ints(elt) for elt in line]
               for line in csv.reader(f, delimiter='\t')]

for row in sortByColumn(biglist, 0):
    print(row)

for row in sortByColumn(biglist, 2, 3):
    print(row)
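
To try this approach without a file on disk, here is a rough sketch assuming Python 3. The `sample` string and `io.StringIO` stand in for `file.txt`, and `sort_by_column` is a hypothetical variant that returns a new list and builds its key with an explicit `tuple(...)`.

```python
import csv
import io

# Hypothetical stand-in for file.txt: four tab-separated rows from the question.
sample = "Joel\t18\tOrange\t1\nAnna\t17\tBlue\t2\nBob\t22\tBlue\t10\nLuke\t16\tBlue\t1\n"

def convert_ints(astr):
    # Turn numeric strings into ints; leave everything else as text.
    try:
        return int(astr)
    except ValueError:
        return astr

def sort_by_column(rows, *cols):
    # tuple(...) makes the key a real tuple, so ties on the first
    # column fall through to the next one.
    return sorted(rows, key=lambda row: tuple(row[i] for i in cols))

rows = [[convert_ints(cell) for cell in line]
        for line in csv.reader(io.StringIO(sample), delimiter="\t")]

# Sort by colour (column 2), then by the last column (column 3).
by_colour_then_rank = sort_by_column(rows, 2, 3)
for row in by_colour_then_rank:
    print(row)
```

Note that in Python 3 a column mixing ints and strings cannot be compared, so this sketch relies on each sorted column holding a single type.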
unutbu answered Nov 19 '22 21:11


Make sure you have converted the numbers to ints; otherwise they will sort alphabetically rather than numerically.

# Sort the list in place
def sortByColumn(A,*args):
    import operator
    A.sort(key=operator.itemgetter(*args))
    return A

or

# Leave the original list alone and return a new sorted one
def sortByColumn(A,*args):
    import operator
    return sorted(A,key=operator.itemgetter(*args))
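
The point about converting to ints can be seen with a small illustration. The values below are taken from the last column of the question's sample data; the variable names are made up here.

```python
# Values from the last column, still as text straight from the file.
ranks_as_text = ["1", "2", "10", "6"]

# Strings compare character by character, so "10" lands before "2".
alphabetical = sorted(ranks_as_text)

# After conversion the values compare as numbers.
numerical = sorted(int(r) for r in ranks_as_text)

# Alternatively, keep the strings but compare them by their int value.
text_in_numeric_order = sorted(ranks_as_text, key=int)

print(alphabetical, numerical, text_in_numeric_order)
```

Converting once when the file is read, as the other answer does with convert_ints, avoids repeating the conversion on every sort.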
John La Rooy answered Nov 19 '22 22:11