Nested Json to pandas DataFrame with specific format

I need to load the contents of a JSON file into a pandas DataFrame in a specific layout so that I can run pandasql to transform the data and then run it through a scoring model.

file = C:\scoring_model\json.js (contents of 'file' are below)

    {
      "response": {
        "version": "1.1",
        "token": "dsfgf",
        "body": {
          "customer": {
            "customer_id": "1234567",
            "verified": "true"
          },
          "contact": {
            "email": "[email protected]",
            "mobile_number": "0123456789"
          },
          "personal": {
            "gender": "m",
            "title": "Dr.",
            "last_name": "Muster",
            "first_name": "Max",
            "family_status": "single",
            "dob": "1985-12-23"
          }
        }
      }
    }

I need the DataFrame to look like this (all values belong on a single row; the table is split in two here only to fit the question):

    version | token | customer_id | verified | email             | mobile_number | gender |
    1.1     | dsfgf | 1234567     | true     | [email protected] | 0123456789    | m      |

    title | last_name | first_name | family_status | dob
    Dr.   | Muster    | Max        | single        | 23.12.1985

I have looked at the other questions on this topic and have tried various ways to load the JSON file into pandas:

    with open(r'C:\scoring_model\json.js', 'r') as f:
        c = pd.read_json(f.read())

    with open(r'C:\scoring_model\json.js', 'r') as f:
        c = f.readlines()

I also tried pd.Panel(), as suggested in the answer to "Python Pandas: How to split a sorted dictionary in a column of a dataframe".

With the DataFrame built from [yo = f.readlines()], I thought about splitting the contents of each cell on ("") and putting the pieces into separate columns, but no luck so far. Your expertise is greatly appreciated. Thank you in advance.

asked Dec 17 '15 18:12

figgy

1 Answer

If you load the entire JSON document as a dict (or list), e.g. using json.load, you can use json_normalize:

    In [11]: d = {"response": {"body": {"contact": {"email": "[email protected]", "mobile_number": "0123456789"},
        ...:                                 "personal": {"last_name": "Muster", "gender": "m", "first_name": "Max",
        ...:                                              "dob": "1985-12-23", "family_status": "single", "title": "Dr."},
        ...:                                 "customer": {"verified": "true", "customer_id": "1234567"}},
        ...:                        "token": "dsfgf", "version": "1.1"}}

    In [12]: df = pd.json_normalize(d)

    In [13]: df.columns = df.columns.map(lambda x: x.split(".")[-1])

    In [14]: df
    Out[14]:
                   email  mobile_number  customer_id  verified         dob  family_status  first_name  gender  last_name  title  token  version
    0  [email protected]     0123456789      1234567      true  1985-12-23         single         Max       m     Muster    Dr.  dsfgf      1.1
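To go from the file to the exact layout asked for in the question, something along these lines might work. It is only a sketch under a few assumptions: an in-memory `io.StringIO` stands in for `open(r'C:\scoring_model\json.js')` so the snippet runs anywhere, the e-mail value is the redacted placeholder from the question, the `wanted` list is just the column order from the desired table, and the `dob` reformatting to `DD.MM.YYYY` is inferred from the sample row rather than stated anywhere.

```python
import io
import json

import pandas as pd

# Stand-in for: open(r'C:\scoring_model\json.js', 'r')
# (assumption: the real file holds this same structure as valid JSON)
raw = io.StringIO("""
{"response": {"version": "1.1", "token": "dsfgf", "body": {
    "customer": {"customer_id": "1234567", "verified": "true"},
    "contact": {"email": "[email protected]", "mobile_number": "0123456789"},
    "personal": {"gender": "m", "title": "Dr.", "last_name": "Muster",
                 "first_name": "Max", "family_status": "single",
                 "dob": "1985-12-23"}}}}
""")

data = json.load(raw)          # plain nested dict
df = pd.json_normalize(data)   # one row, columns like 'response.body.contact.email'

# Keep only the last path segment of each dotted column name.
df.columns = [col.split(".")[-1] for col in df.columns]

# Column order taken from the desired table in the question.
wanted = ["version", "token", "customer_id", "verified", "email",
          "mobile_number", "gender", "title", "last_name", "first_name",
          "family_status", "dob"]
df = df[wanted]

# Reformat the ISO date string as DD.MM.YYYY (still a string afterwards).
df["dob"] = pd.to_datetime(df["dob"], format="%Y-%m-%d").dt.strftime("%d.%m.%Y")

print(df.to_string(index=False))
```

Note that `json.load` is strict, so the file has to be valid JSON (balanced braces, no trailing commas). Also, keeping only the last path segment is safe here because every leaf key is unique; if two nested objects shared a key name, the columns would collide and a longer suffix would be needed.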

answered Sep 20 '22 10:09

Andy Hayden


Tags:

python

json

format

pandas

nested
