I am very new to Json files. If I have a json file with multiple json objects such as following:
{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes",
"Code":[{"event1":"A","result":"1"},…]}
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No",
"Code":[{"event1":"B","result":"1"},…]}
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No",
"Code":[{"event1":"B","result":"0"},…]}
…
I want to extract all "Timestamp" and "Usefulness" into a data frames:
Timestamp Usefulness
0 20140101 Yes
1 20140102 No
2 20140103 No
…
Does anyone know a general way to deal with such problems?
JSON arrays can be of multiple data types. JSON array can store string , number , boolean , object or other array inside JSON array. In JSON array, values must be separated by comma. Arrays in JSON are almost the same as arrays in JavaScript.
JSONArray pdoInformation = new JSONArray(); JSONObject pDetail1 = new JSONObject(); JSONObject pDetail2 = new JSONObject(); JSONObject pDetail3 = new JSONObject(); pDetail1. put("productid", 1); pDetail1. put("qty", 3); pDetail1. put("listprice", 9500); pDetail2.
To extract multiple JSON objects from one file with Python, we put the JSON objects in a JSON array. Then we call json.load to parse the content of the JSON file. Then we open the file and parse the file into a list of dicts with
To Load and parse a JSON file with multiple JSON objects we need to follow below steps: Read the file line by line because each line contains valid JSON. i.e., read one JSON object at a time. Convert each JSON object into Python dict using a json.loads () Save this dictionary into a list called result jsonList. Let’ see the example now.
If you want to combine JSON files into a single file, you cannot just concatenate them since you, almost certainly, get a JSON syntax error. The only safe way to combine multiple files, is to read them into an array, which serializes to valid JSON.
However, this method only really works when the file is written as you have it -- with each object separated by a newline character. Below I wrote an example of a writer that separates an array of json objects and saves each one on a new line.
Update: I wrote a solution that doesn't require reading the entire file in one go. It's too big for a stackoverflow answer, but can be found here jsonstream
.
You can use json.JSONDecoder.raw_decode
to decode arbitarily big strings of "stacked" JSON (so long as they can fit in memory). raw_decode
stops once it has a valid object and returns the last position where wasn't part of the parsed object. It's not documented, but you can pass this position back to raw_decode
and it start parsing again from that position. Unfortunately, the Python json
module doesn't accept strings that have prefixing whitespace. So we need to search to find the first none-whitespace part of your document.
from json import JSONDecoder, JSONDecodeError
import re
NOT_WHITESPACE = re.compile(r'[^\s]')
def decode_stacked(document, pos=0, decoder=JSONDecoder()):
while True:
match = NOT_WHITESPACE.search(document, pos)
if not match:
return
pos = match.start()
try:
obj, pos = decoder.raw_decode(document, pos)
except JSONDecodeError:
# do something sensible if there's some error
raise
yield obj
s = """
{"a": 1}
[
1
,
2
]
"""
for obj in decode_stacked(s):
print(obj)
prints:
{'a': 1}
[1, 2]
Use a json array, in the format:
[
{"ID":"12345","Timestamp":"20140101", "Usefulness":"Yes",
"Code":[{"event1":"A","result":"1"},…]},
{"ID":"1A35B","Timestamp":"20140102", "Usefulness":"No",
"Code":[{"event1":"B","result":"1"},…]},
{"ID":"AA356","Timestamp":"20140103", "Usefulness":"No",
"Code":[{"event1":"B","result":"0"},…]},
...
]
Then import it into your python code
import json
with open('file.json') as json_file:
data = json.load(json_file)
Now the content of data is an array with dictionaries representing each of the elements.
You can access it easily, i.e:
data[0]["ID"]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With