
How to filter JSON data using Python?

Tags:

python

json

How can I convert the JSON data in input.json into output.json using Python, merging the objects that share the same id? In general, what data structures are used for filtering JSON data?

File: input.json

[
{
    "id":1,
    "a":22,
    "b":11
},
{
    "id":1,
    "e":44,
    "c":77,
    "f":55,
    "d":66
},
{
    "id":3,
    "b":11,
    "a":22
},
{
    "id":3,
    "d":44,
    "c":88
}
]

File: output.json

[
{
    "id":1,
    "a":22,
    "b":11,
    "e":44,
    "c":77,
    "f":55,
    "d":66
},
{
    "id":3,
    "b":11,
    "a":22,
    "d":44,
    "c":88
}
]

Any pointers would be appreciated!

Xplora asked Apr 25 '16 20:04



1 Answer

The idea is to:

  • use json.load() to load the JSON content from the file into a Python list
  • regroup the data by id, using collections.defaultdict and the dict.update() method (if two objects with the same id share a key, the later value wins)
  • use json.dump() to write the result to the output JSON file

Implementation:

import json
from collections import defaultdict

# read JSON data
with open("input.json") as input_file:
    old_data = json.load(input_file)

# regroup data
d = defaultdict(dict)
for item in old_data:
    d[item["id"]].update(item)

# write JSON data
with open("output.json", "w") as output_file:
    json.dump(list(d.values()), output_file, indent=4)

Now output.json would contain the following (the key order within each object may differ depending on the Python version):

[
    {
        "d": 66,
        "e": 44,
        "a": 22,
        "b": 11,
        "c": 77,
        "id": 1,
        "f": 55
    },
    {
        "b": 11,
        "id": 3,
        "d": 44,
        "c": 88,
        "a": 22
    }
]
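
To try the regrouping step without touching any files, here is a minimal in-memory sketch of the same idea. The names merge_by_id and sample are hypothetical and only used for illustration; the data is copied from input.json in the question. It assumes every object has an "id" key, and it passes sort_keys=True to json.dumps() so the printed key order is the same on every run.

```python
import json
from collections import defaultdict


def merge_by_id(records):
    """Merge dicts that share the same "id" into one dict per id.

    Hypothetical helper for illustration; assumes every record has an "id" key.
    """
    merged = defaultdict(dict)
    for record in records:
        # Later records overwrite earlier values if a key repeats for the same id.
        merged[record["id"]].update(record)
    return list(merged.values())


# Same data as input.json in the question.
sample = [
    {"id": 1, "a": 22, "b": 11},
    {"id": 1, "e": 44, "c": 77, "f": 55, "d": 66},
    {"id": 3, "b": 11, "a": 22},
    {"id": 3, "d": 44, "c": 88},
]

result = merge_by_id(sample)

# sort_keys=True makes the key order in the serialized output deterministic.
print(json.dumps(result, indent=4, sort_keys=True))
```

Sorting the keys is optional. It only affects how the file looks, not the data, since JSON objects are unordered by definition.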
alecxe answered Oct 01 '22 00:10