Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I convert JSON to CSV?

People also ask

Why JSON Cannot convert to CSV?

As mentioned in the previous answers the difficulty in converting json to csv is because a json file can contain nested dictionaries and therefore be a multidimensional data structure verses a csv which is a 2D data structure.

How do I convert a JSON file to Excel?

On the spreadsheet window, in Excel's ribbon at the top, click the “Data” tab. On the “Data” tab, from the “Get & Transform Data” section, select Get Data > From File > From JSON. You will see your computer's standard “Import” window. Here, open the folder where your JSON file is located.


With the pandas library, this is as easy as using two commands!

df = pd.read_json()

read_json converts a JSON string to a pandas object (either a series or dataframe). Then:

df.to_csv()

Which can either return a string or write directly to a csv-file. See the docs for to_csv.

Based on the verbosity of previous answers, we should all thank pandas for the shortcut.

For unstructured JSON see this answer.


First, your JSON has nested objects, so it normally cannot be directly converted to CSV. You need to change that to something like this:

{
    "pk": 22,
    "model": "auth.permission",
    "codename": "add_logentry",
    "content_type": 8,
    "name": "Can add log entry"
},
......]

Here is my code to generate CSV from that:

import csv
import json

x = """[
    {
        "pk": 22,
        "model": "auth.permission",
        "fields": {
            "codename": "add_logentry",
            "name": "Can add log entry",
            "content_type": 8
        }
    },
    {
        "pk": 23,
        "model": "auth.permission",
        "fields": {
            "codename": "change_logentry",
            "name": "Can change log entry",
            "content_type": 8
        }
    },
    {
        "pk": 24,
        "model": "auth.permission",
        "fields": {
            "codename": "delete_logentry",
            "name": "Can delete log entry",
            "content_type": 8
        }
    }
]"""

x = json.loads(x)

f = csv.writer(open("test.csv", "wb+"))

# Write CSV Header, If you dont need that, remove this line
f.writerow(["pk", "model", "codename", "name", "content_type"])

for x in x:
    f.writerow([x["pk"],
                x["model"],
                x["fields"]["codename"],
                x["fields"]["name"],
                x["fields"]["content_type"]])

You will get output as:

pk,model,codename,name,content_type
22,auth.permission,add_logentry,Can add log entry,8
23,auth.permission,change_logentry,Can change log entry,8
24,auth.permission,delete_logentry,Can delete log entry,8

I am assuming that your JSON file will decode into a list of dictionaries. First we need a function which will flatten the JSON objects:

def flattenjson(b, delim):
    val = {}
    for i in b.keys():
        if isinstance(b[i], dict):
            get = flattenjson(b[i], delim)
            for j in get.keys():
                val[i + delim + j] = get[j]
        else:
            val[i] = b[i]
            
    return val

The result of running this snippet on your JSON object:

flattenjson({
    "pk": 22, 
    "model": "auth.permission", 
    "fields": {
      "codename": "add_message", 
      "name": "Can add message", 
      "content_type": 8
    }
  }, "__")

is

{
    "pk": 22, 
    "model": "auth.permission", 
    "fields__codename": "add_message", 
    "fields__name": "Can add message", 
    "fields__content_type": 8
}

After applying this function to each dict in the input array of JSON objects:

input = map(lambda x: flattenjson( x, "__" ), input)

and finding the relevant column names:

columns = [x for row in input for x in row.keys()]
columns = list(set(columns))

it's not hard to run this through the csv module:

with open(fname, 'wb') as out_file:
    csv_w = csv.writer(out_file)
    csv_w.writerow(columns)

    for i_r in input:
        csv_w.writerow(map(lambda x: i_r.get(x, ""), columns))

I hope this helps!


JSON can represent a wide variety of data structures -- a JS "object" is roughly like a Python dict (with string keys), a JS "array" roughly like a Python list, and you can nest them as long as the final "leaf" elements are numbers or strings.

CSV can essentially represent only a 2-D table -- optionally with a first row of "headers", i.e., "column names", which can make the table interpretable as a list of dicts, instead of the normal interpretation, a list of lists (again, "leaf" elements can be numbers or strings).

So, in the general case, you can't translate an arbitrary JSON structure to a CSV. In a few special cases you can (array of arrays with no further nesting; arrays of objects which all have exactly the same keys). Which special case, if any, applies to your problem? The details of the solution depend on which special case you do have. Given the astonishing fact that you don't even mention which one applies, I suspect you may not have considered the constraint, neither usable case in fact applies, and your problem is impossible to solve. But please do clarify!


A generic solution which translates any json list of flat objects to csv.

Pass the input.json file as first argument on command line.

import csv, json, sys

input = open(sys.argv[1])
data = json.load(input)
input.close()

output = csv.writer(sys.stdout)

output.writerow(data[0].keys())  # header row

for row in data:
    output.writerow(row.values())