
Converting JSON into newline delimited JSON in Python

My goal is to use Python to convert a JSON file into a format that can be uploaded from Cloud Storage into BigQuery (as described here).

I have tried using the newlineJSON package for the conversion, but I receive the following error:

JSONDecodeError: Expecting value or ']': line 2 column 1 (char 5)

Does anyone have the solution to this?

Here is the sample JSON code:

[{
    "key01": "value01",
    "key02": "value02",
    ...
    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",
    ...
    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",
    ...
    "keyN": "valueN"
}
]

And here's the existing Python script:

import newlinejson as nlj

with nlj.open(url_samplejson, json_lib="simplejson") as src_:
    with nlj.open(url_convertedjson, "w") as dst_:
        for line_ in src_:
            dst_.write(line_)

Fxs7576 asked Jul 12 '18 08:07



3 Answers

The answer with jq is really useful, but if you still want to do it with Python (as the question suggests), you can use the built-in json module.

import json
from io import StringIO
in_json = StringIO("""[{
    "key01": "value01",
    "key02": "value02",

    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",

    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",

    "keyN": "valueN"
}
]""")

result = [json.dumps(record) for record in json.load(in_json)]  # the only significant line to convert the JSON to the desired format

print('\n'.join(result))

{"key01": "value01", "key02": "value02", "keyN": "valueN"}
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
{"key01": "value01", "key02": "value02", "keyN": "valueN"}

* I'm using StringIO and print here just to make a sample easier to test locally.
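
As background on why the conversion step is needed at all: a newline-delimited JSON reader treats every physical line as a complete JSON document, so the lines of a pretty-printed array are only fragments to it. That is probably what the error in the question comes from, though I have not traced it inside the newlineJSON package itself. The sketch below only uses the standard json module; the shortened sample data and the helper name parses_alone are made up for illustration.

```python
import json

# Shortened version of the sample from the question (assumed data).
pretty = '[{\n    "key01": "value01"\n},\n{\n    "key01": "value02"\n}\n]'


def parses_alone(line):
    """Return True if this single line is a complete JSON document."""
    try:
        json.loads(line)
        return True
    except json.JSONDecodeError:
        return False


# Each line of the pretty-printed array is a fragment, not a document.
fragment_results = [parses_alone(line) for line in pretty.splitlines()]

# After conversion, every line can be parsed on its own.
ndjson_lines = [json.dumps(record) for record in json.loads(pretty)]
converted_results = [parses_alone(line) for line in ndjson_lines]

print(any(fragment_results))   # False: no input line parses by itself
print(all(converted_results))  # True: every converted line does
```

So the file has to be parsed as a whole first (json.load) and only then written back one record per line.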

As an alternative, you can use the Python jq bindings to combine this with the other answer.

Oleh Rybalchenko answered Oct 19 '22 10:10


If you are willing to get out of Python, use jq:

$ cat a.json 
[{
    "key01": "value01",
    "key02": "value02",
    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",
    "keyN": "valueN"
},
{
    "key01": "value01",
    "key02": "value02",
    "keyN": "valueN"
}
]


$ cat a.json | jq -c '.[]'
{"key01":"value01","key02":"value02","keyN":"valueN"}
{"key01":"value01","key02":"value02","keyN":"valueN"}
{"key01":"value01","key02":"value02","keyN":"valueN"}

The iterator I used is '.[]' to go through the array, and -c puts each JSON object on a single line.
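
If you want the same compact, one-object-per-line layout from Python instead, a rough equivalent of iterating with '.[]' and printing with -c is to loop over the parsed list and pass separators=(",", ":") to json.dumps. This is only a sketch with made-up sample records, not something jq itself provides.

```python
import json

# Sample records standing in for the parsed contents of a.json (assumed data).
records = [
    {"key01": "value01", "key02": "value02", "keyN": "valueN"},
    {"key01": "value01", "key02": "value02", "keyN": "valueN"},
]

# separators=(",", ":") drops the spaces json.dumps adds by default,
# which matches the compact layout shown in the jq output above.
compact_lines = [json.dumps(record, separators=(",", ":")) for record in records]

print("\n".join(compact_lines))
```

One difference to be aware of: json.dumps escapes non-ASCII characters by default, so pass ensure_ascii=False if you want them written as-is.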

Resources:

  • https://stedolan.github.io/jq/manual/
  • https://github.com/stedolan/jq

Felipe Hoffa answered Oct 19 '22 09:10


This takes a JSON file and converts it into an ND-JSON file.

import json

# Load the whole JSON array into memory.
with open("results-20190312-113458.json", "r") as read_file:
    data = json.load(read_file)

# Write one JSON document per line (newline-delimited JSON).
with open("nd-processed.json", "w") as obj:
    for record in data:
        obj.write(json.dumps(record) + "\n")

Hope this helps someone.
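
The same idea can be wrapped in a small reusable function. This is a minimal sketch, assuming the input is a single top-level JSON array that fits in memory; the function name json_array_to_ndjson and the file names are hypothetical, and the demo writes to a temporary directory so it can be run as-is.

```python
import json
import os
import tempfile


def json_array_to_ndjson(src_path, dst_path):
    """Hypothetical helper: write each element of a JSON array on its own line."""
    with open(src_path, "r", encoding="utf-8") as src:
        data = json.load(src)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")
    with open(dst_path, "w", encoding="utf-8") as dst:
        for record in data:
            dst.write(json.dumps(record) + "\n")
    return len(data)


# Demo with throwaway files (paths and data are made up for illustration).
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "input.json")
dst_path = os.path.join(workdir, "output.ndjson")
with open(src_path, "w", encoding="utf-8") as f:
    json.dump([{"key01": "value01"}, {"key01": "value02"}], f, indent=4)

count = json_array_to_ndjson(src_path, dst_path)
with open(dst_path, encoding="utf-8") as f:
    output_lines = f.read().splitlines()

print(count)
print(output_lines)
```

The isinstance check is there because a file containing a single object rather than an array would otherwise be iterated key by key, which silently produces the wrong output.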

Saurav Joshi answered Oct 19 '22 08:10