Python - Sorting elements in a list of lists

Apologies if this has been answered elsewhere; I've tried searching, but haven't found anything that answers my question (or perhaps I have, but didn't understand it)...

I'm fairly new to Python (v2.6.2) and have a list of lists containing floating point values which looks something like the following (except the full thing has 2+ million entries for each list):

cat = [[152.123, 150.456, 151.789, ...], [4.123, 3.456, 1.789, ...], [20.123, 22.456, 21.789, ...]]

Now what I would like to do is sort all 3 of the lists by ascending order of the elements of the 3rd list, such that I get:

cat_sorted = [[152.123, 151.789, 150.456, ...], [4.123, 1.789, 3.456, ...], [20.123, 21.789, 22.456, ...]]

I've tried a few things, but they don't give me what I'm looking for (or perhaps I'm using them incorrectly). Is there a way to do what I am looking for and if so, what's the easiest & quickest (considering I have 3 x 2million entries)? Is there a way of sorting one list using another?

asked Jan 04 '13 17:01 by Shanagar



2 Answers

This is going to be painful, but with plain Python you have two options:

  • decorate the 1st and 2nd lists with enumerate(), then sort these using the index to refer to values from the 3rd list:

    cat_sorted = [
        [e for i, e in sorted(enumerate(cat[0]), key=lambda p: cat[2][p[0]])],
        [e for i, e in sorted(enumerate(cat[1]), key=lambda p: cat[2][p[0]])],
        sorted(cat[2])
    ]
    

    It may help to sort cat[2] in place instead of using sorted(), but only after the other two lists have been sorted, because their sort keys read from cat[2]. You cannot get around using sorted() for the other two.

  • zip() the three lists together, then sort the resulting list of tuples on its third element, then zip() again to get back to the original structure (the rows of the result are tuples rather than lists):

    from operator import itemgetter
    cat_sorted = zip(*sorted(zip(*cat), key=itemgetter(2)))
    

Neither will be fast, not with plain Python lists of millions of numbers.
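
As a rough illustration, here is a minimal sketch of both options on a three-element sample, written for Python 3 (an assumption; the question uses Python 2.6, where zip() already returns a list). The first part is a variant of the enumerate() approach rather than the exact code shown earlier: it computes the sort order of the third list once and reuses it for every list. The names order, by_index and by_zip are illustrative only.

```python
from operator import itemgetter

# Small stand-in for the real data (the question has 2+ million entries per list).
cat = [
    [152.123, 150.456, 151.789],
    [4.123, 3.456, 1.789],
    [20.123, 22.456, 21.789],
]

# Variant of option 1: work out the index order that sorts the third list,
# then apply that same order to every list.
order = sorted(range(len(cat[2])), key=cat[2].__getitem__)
by_index = [[row[i] for i in order] for row in cat]

# Option 2 on Python 3: zip() is lazy and yields tuples, so each
# resulting row is converted back to a list explicitly.
by_zip = [list(row) for row in zip(*sorted(zip(*cat), key=itemgetter(2)))]

print(by_index)
print(by_zip)
```

Both results hold the same values here. The index-based variant sorts only once, which may matter at this size, but that has not been measured.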

answered Oct 24 '22 07:10 by Martijn Pieters


If you're willing to use an additional library, I suggest pandas. It has a DataFrame object similar to R's data.frame, and its constructor accepts a list of lists. Note that each inner list becomes a row, so the three lists need to be transposed (or passed as a dict of columns) to get a 3-column table. You can then sort by the third column, ascending or descending, with the built-in sort method: pandas.DataFrame.sort in the pandas releases of the time, pandas.DataFrame.sort_values in current ones.

There are many plain Python ways to do this, but given the size of your problem, using the optimized functions in Pandas is a better approach. And if you need any kind of aggregated statistics from your sorted data, then Pandas is a no-brainer for this.
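
A minimal sketch of the pandas route, assuming a reasonably recent pandas release where the sort method is called sort_values. The column labels "a", "b" and "c" are hypothetical, chosen only for this example.

```python
import pandas as pd

# Small stand-in for the real data.
cat = [
    [152.123, 150.456, 151.789],
    [4.123, 3.456, 1.789],
    [20.123, 22.456, 21.789],
]

# Build one column per inner list; the labels are made up for this sketch.
df = pd.DataFrame({"a": cat[0], "b": cat[1], "c": cat[2]})

# Sort every row by the third column, ascending by default.
df_sorted = df.sort_values("c")

# Convert back to the original list-of-lists layout if needed.
cat_sorted = [df_sorted[name].tolist() for name in ("a", "b", "c")]
print(cat_sorted)
```

Passing ascending=False to sort_values gives descending order instead.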

answered Oct 24 '22 08:10 by ely