
Sorting JSON in Python by a specific value

Hello, I am trying to sort the following JSON by the "data_two" field in Python:

{
"1.2.3.4": {
    "data_one": 1,
    "data_two": 8,
    "list_one": [],
    "list_two": [
        "item_one"
    ],
    "data_three": "string1"
},
"5.6.7.8": {
    "data_one": 1,
    "data_two": 9,
    "list_two": [
        "item_one"
    ],
    "data_three": "string1",
    "data_four": "string2",
    "data_five": "string3"
}
}

I have tried using something like

entries = sorted(json_data['1.2.3.4'], key=lambda k: k['data_two'])

However, I am not having much luck and keep getting confused. My ultimate goal is to sort all of the JSON entries by the "data_two" value, where the key for each entry is a random IP-like string. I am new to the world of JSON, so forgive me if this is a simple question. Any help would be greatly appreciated.

Thank you

LinuxShell696 asked Dec 08 '15 05:12


2 Answers

If you have control over how the data is aggregated, it's better to use a list of dicts, with the IP stored as a value inside each data dict ({..., 'ip': '127.0.0.1'}) rather than as a key in the parent container dict.
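As a rough sketch of that layout, assuming the same mapping as in the question trimmed to two fields (the names ips_data and records are only illustrative), the IP can be moved into each record and the resulting list sorted directly:

```python
from operator import itemgetter

# Hypothetical input: the question's mapping, trimmed to two fields per entry.
ips_data = {
    "1.2.3.4": {"data_one": 1, "data_two": 8},
    "5.6.7.8": {"data_one": 1, "data_two": 9},
}

# Copy each entry and move its IP from the dict key into an "ip" field.
records = [dict(data, ip=ip) for ip, data in ips_data.items()]

# A list keeps its order, so it can be sorted in place by any field.
records.sort(key=itemgetter("data_two"), reverse=True)

print([record["ip"] for record in records])  # ['5.6.7.8', '1.2.3.4']
```

With this shape there is no need to carry (key, value) pairs around, since each record already knows its own IP.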

Convert to a container that preserves element order

You can only sort a structure that maintains element order, such as a list. Some dict implementations, such as OrderedDict, also maintain order.

You can always convert to one of those, although that might not be your first choice if the data is big or the conversion is slow.

Converting to a list [(key, value), ...] or list [value, ...]

One possible way is to retrieve all the values in the dict and return a list of those values, sorted by your field of choice.

You can also sort the (key, value) pairs returned by ips_data.items(), where the key is the IP and the value is the IP data. Note that sorted() creates a new list.

sorted_list_of_keyvalues = sorted(ips_data.items(), key=lambda item: item[1]['data_two'])

The list above is in the form [(key1, value1), (key2, value2), ...]

You can also pluck the values and remove the keys

sorted_list_of_values = [item[1] for item in sorted_list_of_keyvalues]

This list is in the form of [value1, value2, ...]

Note that you might be tempted to sort just the values instead of the (key, value) pairs, but your data has the IP in the key and you might want to keep that.

Converting to an OrderedDict

If you absolutely want to keep the structure as a dict, you can use an OrderedDict

from collections import OrderedDict
ordered_items = sorted(ips_data.items(), key=lambda item: item[1]['data_two'])
ordered_ips_data_dict = OrderedDict(ordered_items)

The ordered dict behaves just like a dict, but iterating over its keys and items maintains the order of elements. In Python 3.7 and later, a plain dict also preserves insertion order.
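Since the question is about JSON, here is a minimal sketch of the full round trip, assuming the data arrives as a JSON string (the raw value below is made up for illustration). It relies on json.dumps writing keys in iteration order, which holds as long as sort_keys=True is not passed:

```python
import json
from collections import OrderedDict

# Hypothetical raw JSON text, standing in for whatever is read from a file or API.
raw = '{"1.2.3.4": {"data_two": 9}, "5.6.7.8": {"data_two": 8}}'

ips_data = json.loads(raw)

# Sort the (ip, data) pairs by data_two, then rebuild an ordered mapping.
ordered_items = sorted(ips_data.items(), key=lambda item: item[1]["data_two"])
ordered_ips_data = OrderedDict(ordered_items)

# The serialized text follows the order of the OrderedDict.
sorted_json = json.dumps(ordered_ips_data)
print(sorted_json)  # {"5.6.7.8": {"data_two": 8}, "1.2.3.4": {"data_two": 9}}
```

Keep in mind that JSON objects are unordered by specification, so a consumer of this text is not obliged to respect the key order.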

Or, keep a sorted list of keys and process in that order

Alternatively, you can sort the keys of the dict into a list and then process the dict in that order. The advantage is that you don't have to copy or convert the data.

>>> ips_data = {
... "1.2.3.4": {
...     "data_one": 1,
...     "data_two": 8,
...     "list_one": [],
...     "list_two": [
...         "item_one"
...     ],
...     "data_three": "string1"
... },
... "5.6.7.8": {
...     "data_one": 1,
...     "data_two": 9,
...     "list_two": [
...         "item_one"
...     ],
...     "data_three": "string1",
...     "data_four": "string2",
...     "data_five": "string3"
... }
... }
>>> list(ips_data.keys())
['1.2.3.4', '5.6.7.8']
>>> ips = list(ips_data.keys())

Now you can sort the keys by the field data_two

>>> sorted_ips = sorted(ips, key=lambda ip: ips_data[ip]['data_two'], reverse=True)
>>> sorted_ips
['5.6.7.8', '1.2.3.4']

With the keys sorted, you can process the dict in that order however you like. Processing it this way might be more efficient than copying the dict into a new structure such as a list.

# Trivial example of processing that just puts the values into a list   
>>> [ips_data[ip] for ip in sorted_ips]
[{'data_three': 'string1', 'data_two': 9, 'data_five': 'string3', 'data_four': 'string2', 'list_two': ['item_one'], 'data_one': 1}, {'list_two': ['item_one'], 'data_two': 8, 'data_one': 1, 'data_three': 'string1', 'list_one': []}]

bakkal answered Oct 19 '22 22:10


It looks like what you tried was really close. Below will get you a sorted list of tuples, with the key in the 0th position and the value (which is a dictionary) in the 1st position. You should be able to use this to do what you'd like afterward.

entries = sorted(json_data.items(), key=lambda items: items[1]['data_two'])

So for example

{ "k1": {"data_one": 1, "data_two": 50 ...}, "k2": {"data_one": 50, "data_two": 2}}

would result in:

[("k2", {..."data_two": 2...}), ("k1", {..."data_two": 50...})]
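Here is a small runnable version of that example, using only the two fields shown above (anything elided is simply left out). It also sketches why the attempt in the question fails: iterating over one inner dict yields its field names, which are strings, so the key function ends up indexing a string with 'data_two'.

```python
# Trimmed-down data: only the fields that appear in the example above.
json_data = {
    "k1": {"data_one": 1, "data_two": 50},
    "k2": {"data_one": 50, "data_two": 2},
}

# The original attempt sorts the field names of a single entry, so
# k["data_two"] indexes a string and raises TypeError.
try:
    sorted(json_data["k1"], key=lambda k: k["data_two"])
    attempt_failed = False
except TypeError:
    attempt_failed = True

# Sorting the (key, value) pairs of the outer dict works as intended.
entries = sorted(json_data.items(), key=lambda item: item[1]["data_two"])

print(attempt_failed)                  # True
print([key for key, _ in entries])     # ['k2', 'k1']
```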

Hope that helps!

tonyl7126 answered Oct 19 '22 22:10