i need to format the contents of a Json file in a certain format in a pandas DataFrame so that i can run pandassql to transform the data and run it through a scoring model.
file = C:\scoring_model\json.js (contents of 'file' are below)
{ "response":{ "version":"1.1", "token":"dsfgf", "body":{ "customer":{ "customer_id":"1234567", "verified":"true" }, "contact":{ "email":"[email protected]", "mobile_number":"0123456789" }, "personal":{ "gender": "m", "title":"Dr.", "last_name":"Muster", "first_name":"Max", "family_status":"single", "dob":"1985-12-23", } } }
I need the dataframe to look like this (obviously all values on same row, tried to format it best as possible for this question):
version | token | customer_id | verified | email | mobile_number | gender | 1.1 | dsfgf | 1234567 | true | [email protected] | 0123456789 | m | title | last_name | first_name |family_status | dob Dr. | Muster | Max | single | 23.12.1985
I have looked at all the other questions on this topic, have tried various ways to load Json file into pandas
`with open(r'C:\scoring_model\json.js', 'r') as f:` c = pd.read_json(f.read()) `with open(r'C:\scoring_model\json.js', 'r') as f:` c = f.readlines()
tried pd.Panel() in this solution Python Pandas: How to split a sorted dictionary in a column of a dataframe
with dataframe results from [yo = f.readlines()] thought about trying to split contents of each cell based on ("") and find a way to put the split contents into different columns but no luck so far. Your expertise is greatly appreciated. Thank you in advance.
Flatten a JSON object: var flatten = (function (isArray, wrapped) { return function (table) { return reduce("", {}, table); }; function reduce(path, accumulator, table) { if (isArray(table)) { var length = table.
Python has built in functions that easily imports JSON files as a Python dictionary or a Pandas dataframe. Use pd. read_json() to load simple JSONs and pd. json_normalize() to load nested JSONs.
If you load in the entire json as a dict (or list) e.g. using json.load
, you can use json_normalize
:
In [11]: d = {"response": {"body": {"contact": {"email": "[email protected]", "mobile_number": "0123456789"}, "personal": {"last_name": "Muster", "gender": "m", "first_name": "Max", "dob": "1985-12-23", "family_status": "single", "title": "Dr."}, "customer": {"verified": "true", "customer_id": "1234567"}}, "token": "dsfgf", "version": "1.1"}} In [12]: df = pd.json_normalize(d) In [13]: df.columns = df.columns.map(lambda x: x.split(".")[-1]) In [14]: df Out[14]: email mobile_number customer_id verified dob family_status first_name gender last_name title token version 0 [email protected] 0123456789 1234567 true 1985-12-23 single Max m Muster Dr. dsfgf 1.1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With