
How to extract multiple JSON objects from one file?

I am very new to JSON files. Suppose I have a JSON file containing multiple JSON objects, such as the following:

{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes",
 "Code":[{"event1":"A","result":"1"},…]}
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No",
 "Code":[{"event1":"B","result":"1"},…]}
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No",
 "Code":[{"event1":"B","result":"0"},…]}
…

I want to extract all "Timestamp" and "Usefulness" values into a data frame:

    Timestamp    Usefulness
 0   20140101      Yes
 1   20140102      No
 2   20140103      No
 …

Does anyone know a general way to deal with such problems?

user6396 asked Jan 12 '15 17:01





2 Answers

Update: I wrote a solution that doesn't require reading the entire file in one go. It's too big for a Stack Overflow answer, but it can be found here: jsonstream.

You can use json.JSONDecoder.raw_decode to decode arbitrarily big strings of "stacked" JSON (so long as they fit in memory). raw_decode stops once it has parsed one valid object and returns that object together with the position in the string where the object ended. It's not documented, but you can pass this position back to raw_decode and it will start parsing again from there. Unfortunately, raw_decode doesn't accept strings that begin with whitespace, so before each call we need to search for the first non-whitespace character in the rest of the document.

from json import JSONDecoder, JSONDecodeError
import re

NOT_WHITESPACE = re.compile(r'[^\s]')

def decode_stacked(document, pos=0, decoder=JSONDecoder()):
    while True:
        match = NOT_WHITESPACE.search(document, pos)
        if not match:
            return
        pos = match.start()
        
        try:
            obj, pos = decoder.raw_decode(document, pos)
        except JSONDecodeError:
            # do something sensible if there's some error
            raise
        yield obj

s = """

{"a": 1}  


   [
1
,   
2
]


"""

for obj in decode_stacked(s):
    print(obj)

prints:

{'a': 1}
[1, 2]
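
To connect this back to the question, here is a minimal sketch that keeps only the "Timestamp" and "Usefulness" fields of each stacked object. The sample string is a hypothetical stand-in for the file contents: the entries elided with "…" in the question are simply left out, and decode_stacked is repeated in compact form so the block runs on its own.

```python
import json
import re

NOT_WHITESPACE = re.compile(r"\S")


def decode_stacked(document, pos=0, decoder=json.JSONDecoder()):
    """Yield each JSON value found back to back in `document`."""
    while True:
        # raw_decode rejects leading whitespace, so skip over it first.
        match = NOT_WHITESPACE.search(document, pos)
        if not match:
            return
        obj, pos = decoder.raw_decode(document, match.start())
        yield obj


# Hypothetical sample shaped like the question's file (elided entries omitted).
sample = """
{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes",
 "Code":[{"event1":"A","result":"1"}]}
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No",
 "Code":[{"event1":"B","result":"1"}]}
"""

# Keep only the two fields of interest, one dict per stacked object.
rows = [
    {"Timestamp": obj["Timestamp"], "Usefulness": obj["Usefulness"]}
    for obj in decode_stacked(sample)
]
print(rows)
```

If pandas is installed, passing this list to pandas.DataFrame(rows) should give the two-column data frame asked for in the question.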
Dunes answered Oct 12 '22 04:10



Use a JSON array, in the format:

[
{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes",
  "Code":[{"event1":"A","result":"1"},…]},
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No",
  "Code":[{"event1":"B","result":"1"},…]},
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No",
  "Code":[{"event1":"B","result":"0"},…]},
...
]

Then load it in your Python code:

import json

with open('file.json') as json_file:
    data = json.load(json_file)

Now data is a list of dictionaries, one for each object in the array.

You can access it easily, e.g.:

data[0]["ID"]
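
As a rough sketch of how the same list could be reduced to the two columns from the question: the text string below is a hypothetical stand-in for the contents of file.json (with the elided entries left out), and columns and table are names chosen only for illustration.

```python
import json

# Hypothetical stand-in for the contents of file.json (elided entries omitted).
text = """[
{"ID":"12345","Timestamp":"20140101","Usefulness":"Yes",
  "Code":[{"event1":"A","result":"1"}]},
{"ID":"1A35B","Timestamp":"20140102","Usefulness":"No",
  "Code":[{"event1":"B","result":"1"}]}
]"""

data = json.loads(text)  # same result as json.load on an open file

# Pick the wanted keys from every record, preserving file order.
columns = ("Timestamp", "Usefulness")
table = [[record[name] for name in columns] for record in data]

# Print one indexed row per record.
for index, row in enumerate(table):
    print(index, *row)
```

A list comprehension is enough here because json.load has already turned every object into a plain dict; a missing key would raise KeyError, which may or may not be what you want for incomplete records.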
dfranca answered Oct 12 '22 02:10
