I have an object that is a list of lists of dictionaries:
myObject =[[{ "play": 5.00, "id": 1, "uid": "abc" }, \
{ "play": 1.00, "id": 2, "uid": "def" }], \
[{ "play": 6.00, "id": 3, "uid": "ghi" }, \
{ "play": 7.00, "id": 4, "uid": "jkl" }], \
[{ "play": 3.00, "id": 5, "uid": "mno" }, \
{ "play": 1.00, "id": 6, "uid": "pqr" }]]
I want to sort the list by the sum of the play values in the dictionaries of each nested list. The object would then be sorted like this:
myObject =[[{ "play": 6.00, "id": 3, "uid": "ghi" }, \
{ "play": 7.00, "id": 4, "uid": "jkl" }], \
[{ "play": 5.00, "id": 1, "uid": "abc" }, \
{ "play": 1.00, "id": 2, "uid": "def" }], \
[{ "play": 3.00, "id": 5, "uid": "mno" }, \
{ "play": 1.00, "id": 6, "uid": "pqr" }]]
If it were just a list of dicts then:
sorted(myObject, key=sum(map(itemgetter(play))), reverse=True)
would work. I can't figure out how to do this without looping over the list, calculating the sum, then sorting. That is what I am doing now, but I'm trying to make this code more efficient by removing loops, because my list has hundreds of millions of lists in it.
Your idea is already very good: use a custom key function when sorting, built from sum, map and an itemgetter on the play key:
key=sum(map(itemgetter(play)))
You do have a problem there, though: the key argument expects a function that takes one item of the list you are sorting. But neither sum nor map returns a function, so their result cannot be used as a key function. Instead, you could write a lambda function that performs this combination for each item.
The other problems are that play should be the string 'play' instead, and that map should take the sublist as its second argument. So your key function would look like this:
key=lambda x: sum(map(itemgetter('play'), x))
This is, by the way, functionally equivalent to the following generator expression, which might be more readable:
key=lambda x: sum(y['play'] for y in x)
Using this with sorted should work, but you should consider sorting your list in place with list.sort instead:
>>> myObject = [[{ "play": 5.00, "id": 1, "uid": "abc" },
{ "play": 1.00, "id": 2, "uid": "def" }],
[{ "play": 6.00, "id": 3, "uid": "ghi" },
{ "play": 7.00, "id": 4, "uid": "jkl" }],
[{ "play": 3.00, "id": 5, "uid": "mno" },
{ "play": 1.00, "id": 6, "uid": "pqr" }]]
>>> myObject.sort(key=lambda x: sum(y['play'] for y in x), reverse=True)
>>> for x in myObject:
...     print(x)
...
[{'play': 6.0, 'uid': 'ghi', 'id': 3}, {'play': 7.0, 'uid': 'jkl', 'id': 4}]
[{'play': 5.0, 'uid': 'abc', 'id': 1}, {'play': 1.0, 'uid': 'def', 'id': 2}]
[{'play': 3.0, 'uid': 'mno', 'id': 5}, {'play': 1.0, 'uid': 'pqr', 'id': 6}]
(By the way, myObject is kind of a bad name for a list of things.)
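As a quick check that the two key functions above really are interchangeable, here is a minimal sketch that sorts the sample data from the question with each of them. The names groups, total_play_map and total_play_genexp are my own and do not come from the question.

```python
from operator import itemgetter

# Sample data copied from the question; `groups` is a hypothetical name.
groups = [
    [{"play": 5.00, "id": 1, "uid": "abc"}, {"play": 1.00, "id": 2, "uid": "def"}],
    [{"play": 6.00, "id": 3, "uid": "ghi"}, {"play": 7.00, "id": 4, "uid": "jkl"}],
    [{"play": 3.00, "id": 5, "uid": "mno"}, {"play": 1.00, "id": 6, "uid": "pqr"}],
]


def total_play_map(group):
    """Sum the 'play' values of one sublist using map and itemgetter."""
    return sum(map(itemgetter("play"), group))


def total_play_genexp(group):
    """Sum the 'play' values of one sublist using a generator expression."""
    return sum(item["play"] for item in group)


# sorted() returns a new list and leaves `groups` untouched.
by_map = sorted(groups, key=total_play_map, reverse=True)
by_genexp = sorted(groups, key=total_play_genexp, reverse=True)

# The sums are 6, 13 and 4, so the sublist starting with id 3 comes first.
first_ids = [group[0]["id"] for group in by_map]
print(first_ids)            # [3, 1, 5]
print(by_map == by_genexp)  # True
```

Named functions are used here instead of lambdas only so that each variant can carry a docstring; passing the lambda directly works the same way.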
As far as the efficiency or complexity of your problem goes, you cannot avoid looping through every sublist at some point: it is impossible to determine the sum of those values without looking at them.
However, you should ensure that every sum is calculated only once, so that the items in the sublists are not looked at more than once. Luckily, list.sort guarantees exactly that:
The key corresponding to each item in the list is calculated once and then used for the entire sorting process.
So you will have a very efficient solution for this sorting problem.
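If you want to see the "calculated once" behaviour for yourself, one way is to count how often the key function runs. The following is only a sketch on made-up data; counted_total, calls and the generated sublists are assumptions for illustration, not part of the question.

```python
# Mutable counter, incremented every time the key function runs.
calls = {"n": 0}


def counted_total(group):
    """Key function: sum of 'play' values, with a call counter attached."""
    calls["n"] += 1
    return sum(item["play"] for item in group)


# Made-up data: 1000 sublists of two dictionaries each.
groups = [[{"play": float(n % 7)}, {"play": float(n % 3)}] for n in range(1000)]

groups.sort(key=counted_total, reverse=True)

# One key call per sublist, no matter how many comparisons the sort makes.
print(calls["n"] == len(groups))  # True
```

The sort still performs many comparisons, but each comparison works on the precomputed sums rather than walking the sublists again.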