Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python flatten multilevel/nested JSON

Tags:

I am trying to convert JSON to CSV file, that I can use for further analysis. Issue with my structure is that I have quite some nested dict/lists when I convert my JSON file.

I tried to use pandas json_normalize(), but it only flattens first level.

import json import pandas as pd from pandas.io.json import json_normalize from cs import CloudStack  api_key = xxxx secret = xxxx endpoint = xxxx  cs = CloudStack(endpoint=endpoint,                 key=api_key,                 secret=secret)  virtual_machines = cs.virtMach()  test = json_normalize(virtual_machines["virtualmachine"])  test.to_csv("test.csv", sep="|", index=False) 

Any idea how to flatter whole JSON file, so I can create single line input to CSV file for single (in this case virtual machine) entry? I have tried couple of solutions posted here, but my result was always only first level was flattened.

This is sample JSON (in this case, I still get "securitygroup" and "nic" output as JSON format:

{     "count": 13,     "virtualmachine": [         {             "id": "1082e2ed-ff66-40b1-a41b-26061afd4a0b",             "name": "test-2",             "displayname": "test-2",             "securitygroup": [                 {                     "id": "9e649fbc-3e64-4395-9629-5e1215b34e58",                     "name": "test",                     "tags": []                 }             ],             "nic": [                 {                     "id": "79568b14-b377-4d4f-b024-87dc22492b8e",                     "networkid": "05c0e278-7ab4-4a6d-aa9c-3158620b6471"                 },                 {                     "id": "3d7f2818-1f19-46e7-aa98-956526c5b1ad",                     "networkid": "b4648cfd-0795-43fc-9e50-6ee9ddefc5bd"                     "traffictype": "Guest"                 }             ],             "hypervisor": "KVM",             "affinitygroup": [],             "isdynamicallyscalable": false         }     ] } 
like image 794
Bostjan Avatar asked Jul 16 '18 10:07

Bostjan


People also ask

How do you flatten a JSON in Python?

There are many ways to flatten JSON. There is one recursive way and another by using the json-flatten library. Now we can flatten the dictionary array by a recursive approach which is quite easy to understand. The recursive approach is a bit slower than using the json-flatten library.

What does flattening JSON mean?

Use the Flatten JSON Objects extension to convert a nested data layer object into a new object with only one layer of key/value pairs. For more advanced options in converting a data layer, see the Data Layer Converter.


1 Answers

I used the following function (details can be found here):

def flatten_data(y):     out = {}      def flatten(x, name=''):         if type(x) is dict:             for a in x:                 flatten(x[a], name + a + '_')         elif type(x) is list:             i = 0             for a in x:                 flatten(a, name + str(i) + '_')                 i += 1         else:             out[name[:-1]] = x      flatten(y)     return out 

This unfortunately completely flattens whole JSON, meaning that if you have multi-level JSON (many nested dictionaries), it might flatten everything into single line with tons of columns.

What I used, in the end, was json_normalize() and specified structure that I required. A nice example of how to do it that way can be found here.

like image 148
Bostjan Avatar answered Oct 19 '22 22:10

Bostjan