Is there a better way to sort a list by a nested tuple values than writing an itemgetter alternative that extracts the nested tuple value: <pre class="prettyprint"><code>def deep_get(*idx): def g(t): for i in idx: t = t[i] return t return g >>> l = [((2,1), 1),((1,3), 1),((3,6), 1),((4,5), 2)] >>> sorted(l, key=deep_get(0,0)) [((1, 3), 1), ((2, 1), 1), ((3, 6), 1), ((4, 5), 2)] >>> sorted(l, key=deep_get(0,1)) [((2, 1), 1), ((1, 3), 1), ((4, 5), 2), ((3, 6), 1)] </code></pre> I thought about using compose, but that's not in the standard library: <pre class="prettyprint"><code>sorted(l, key=compose(itemgetter(1), itemgetter(0)) </code></pre> Is there something I missed in the libs that would make this code nicer? The implementation should work reasonably with 100k items. Context: I would like to sort a dictionary of items that are a histogram. The keys are a tuples (a,b) and the value is the count. In the end the items should be sorted by count descending, a and b. An alternative is to flatten the tuple and use the itemgetter directly but this way a lot of tuples will be generated.

Yes, you could just use a <code>key=lambda x: x[0][1]</code>

Your approach is quite good, given the data structure that you have. Another approach would be to use another structure. If you want speed, the de-factor standard NumPy is the way to go. Its job is to efficiently handle large arrays. It even has some nice sorting routines for arrays like yours. Here is how you would write your sort over the counts, and then over (a, b): <pre class="prettyprint"><code>>>> arr = numpy.array([((2,1), 1),((1,3), 1),((3,6), 1),((4,5), 2)], dtype=[('pos', [('a', int), ('b', int)]), ('count', int)]) >>> print numpy.sort(arr, order=['count', 'pos']) [((1, 3), 1) ((2, 1), 1) ((3, 6), 1) ((4, 5), 2)] </code></pre> This is very fast (it's implemented in C). If you want to stick with standard Python, a list containing (count, a, b) tuples would automatically get sorted in the way you want by Python (which uses lexicographic order on tuples).

This might be a little faster version of your approach: <pre class="prettyprint"><code>l = [((2,1), 1), ((1,3), 1), ((3,6), 1), ((4,5), 2)] def deep_get(*idx): def g(t): return reduce(lambda t, i: t[i], idx, t) return g >>> sorted(l, key=deep_get(0,1)) [((2, 1), 1), ((1, 3), 1), ((4, 5), 2), ((3, 6), 1)] </code></pre> Which could be shortened to: <pre class="prettyprint"><code>def deep_get(*idx): return lambda t: reduce(lambda t, i: t[i], idx, t) </code></pre> or even just simply written-out: <pre class="prettyprint"><code>sorted(l, key=lambda t: reduce(lambda t, i: t[i], (0,1), t)) </code></pre>

Sort list by nested tuple values

Tags:

python

sorting

tuples

Is there a better way to sort a list by a nested tuple values than writing an itemgetter alternative that extracts the nested tuple value:

Click to copy

def deep_get(*idx):
  def g(t):
      for i in idx: t = t[i]
      return t
  return g

>>> l = [((2,1), 1),((1,3), 1),((3,6), 1),((4,5), 2)]
>>> sorted(l, key=deep_get(0,0))
[((1, 3), 1), ((2, 1), 1), ((3, 6), 1), ((4, 5), 2)]
>>> sorted(l, key=deep_get(0,1))
[((2, 1), 1), ((1, 3), 1), ((4, 5), 2), ((3, 6), 1)]

I thought about using compose, but that's not in the standard library:

Click to copy

sorted(l, key=compose(itemgetter(1), itemgetter(0))

Is there something I missed in the libs that would make this code nicer?

The implementation should work reasonably with 100k items.

Context: I would like to sort a dictionary of items that are a histogram. The keys are a tuples (a,b) and the value is the count. In the end the items should be sorted by count descending, a and b. An alternative is to flatten the tuple and use the itemgetter directly but this way a lot of tuples will be generated.

766

asked May 28 '11 16:05

Thomas Jung

3 Answers

Yes, you could just use a key=lambda x: x[0][1]

139

answered Oct 05 '22 22:10

ninjagecko

Your approach is quite good, given the data structure that you have.

Another approach would be to use another structure.

If you want speed, the de-factor standard NumPy is the way to go. Its job is to efficiently handle large arrays. It even has some nice sorting routines for arrays like yours. Here is how you would write your sort over the counts, and then over (a, b):

Click to copy

>>> arr = numpy.array([((2,1), 1),((1,3), 1),((3,6), 1),((4,5), 2)],
                  dtype=[('pos', [('a', int), ('b', int)]), ('count', int)])
>>> print numpy.sort(arr, order=['count', 'pos'])
[((1, 3), 1) ((2, 1), 1) ((3, 6), 1) ((4, 5), 2)]

This is very fast (it's implemented in C).

If you want to stick with standard Python, a list containing (count, a, b) tuples would automatically get sorted in the way you want by Python (which uses lexicographic order on tuples).

answered Oct 06 '22 00:10

Eric O Lebigot

This might be a little faster version of your approach:

Click to copy

l = [((2,1), 1), ((1,3), 1), ((3,6), 1), ((4,5), 2)]

def deep_get(*idx):
    def g(t):
        return reduce(lambda t, i: t[i], idx, t)
    return g

>>> sorted(l, key=deep_get(0,1))
[((2, 1), 1), ((1, 3), 1), ((4, 5), 2), ((3, 6), 1)]

Which could be shortened to:

Click to copy

def deep_get(*idx):
    return lambda t: reduce(lambda t, i: t[i], idx, t)

or even just simply written-out:

Click to copy

sorted(l, key=lambda t: reduce(lambda t, i: t[i], (0,1), t))

answered Oct 05 '22 23:10

martineau

Related questions
                            
                                python unittest methods
                            
                                Indexer with two keys in python
                            
                                Abort a running task in Celery within django
                            
                                Smart way to find out string encoding?
                            
                                Multiple values for key in dictionary in Python
                            
                                Can this Python postfix notation (reverse polish notation) interpreter be made more efficient and accurate?
                            
                                Find gaps in a sequence of Strings
                            
                                What is the equivalent of Perl's (<>) in Python? fileinput doesn't work as expected
                            
                                Does anyone know a regular expression to validate MSISDN-format mobile numbers?
                            
                                MySQL-python Cannot connect to server
                            
                                List directory file contents in a Django template
                            
                                Pass many pieces of data from Python to C program
                            
                                Unable to reinstall PyTables for Python 2.7
                            
                                Dynamic template "Includes" with django
                            
                                Enum converter in python
                            
                                how do I use py2app?
                            
                                How can I install my project from source with Buildout?
                            
                                Python syntax inconsistency?
                            
                                Are there any story based BDD test framework in python? [closed]
                            
                                C++ equivalent of Python difference_update?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Sort list by nested tuple values

Tags:

python

sorting

tuples

Thomas Jung

People also ask

3 Answers

ninjagecko

Eric O Lebigot

martineau

Recent Activity

Donate For Us