Pandas Dataframe to Nested JSON

Tags: python, json, pandas, dataframe

I am trying to convert a Pandas Dataframe to a JSON object. My Dataframe contains data in the following format:

     student        date  grade   course
0  Student_1  2017-06-25     93  ENGLISH
1  Student_2  2017-06-25     83  ENGLISH
2  Student_1  2017-06-25     93     MATH
3  Student_2  2017-06-25     83     MATH
4  Student_1  2017-06-26     90     MATH
5  Student_2  2017-06-26     85     MATH
6  Student_1  2017-06-26     96  ENGLISH
7  Student_2  2017-06-26     99  ENGLISH

I want to convert it to a JSON object in the following format:

[
    {'ENGLISH': [
      {
        'date' : '2017-06-25',
        'Student_1' : 93,
        'Student_2' : 83
      },

      {
        'date' : '2017-06-26',
        'Student_1' : 96,
        'Student_2' : 99
      }]
   },

    {'MATH': [
      {
        'date' : '2017-06-25',
        'Student_1' : 93,
        'Student_2' : 83
      },

      {
        'date' : '2017-06-26',
        'Student_1' : 90,
        'Student_2' : 85
      }]
    }
]

A plain .to_json() call does not produce this nesting. Is there any way to create the JSON object in the required format with pandas?


asked Jun 25 '17 20:06

Nishant Roy

2 Answers

Try this:

file.csv:

student,date,grade,course
0,Student_1,2017-06-25,93,ENGLISH
1,Student_2,2017-06-25,83,ENGLISH
2,Student_1,2017-06-25,93,MATH
3,Student_2,2017-06-25,83,MATH
4,Student_1,2017-06-26,90,MATH
5,Student_2,2017-06-26,85,MATH
6,Student_1,2017-06-26,96,ENGLISH
7,Student_2,2017-06-26,99,ENGLISH
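
Note that the header row in file.csv names four columns while each data row has five fields. As far as I know, pandas.read_csv handles this case by using the first field of each row as the index, so the four names still line up with the right columns. A small sketch to check this, using io.StringIO and a shortened two-row copy of the file (both are stand-ins for illustration, not part of the answer):

```python
import io

import pandas as pd

# Shortened stand-in for file.csv: four header names, five fields per data row.
csv_text = """student,date,grade,course
0,Student_1,2017-06-25,93,ENGLISH
1,Student_2,2017-06-25,83,ENGLISH
"""

df = pd.read_csv(io.StringIO(csv_text))

# The extra leading field becomes the index rather than a data column.
print(df.columns.tolist())
print(df.index.tolist())
```

If the file instead had a matching name (or an empty name) for the leading field, passing index_col=0 would be the explicit way to get the same frame.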

Execute:

from collections import defaultdict

import json
import pandas as pd


df = pd.read_csv('file.csv')

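# Maps each course to a list of per-date records.
json_doc = defaultdict(list)
for _, data in df.iterrows():
    key = data.course
    # Reuse the record for this date if the course already has one.
    for elt in json_doc[key]:
        if elt["date"] == data.date:
            elt[data.student] = data.grade
            break
    else:
        # No record for this date yet (the loop never hit break): start one.
        values = {'date': data.date, data.student: data.grade}
        json_doc[key].append(values)

print(json.dumps(json_doc, indent=4))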

Output:

{
    "ENGLISH": [
        {
            "date": "2017-06-25",
            "Student_1": 93,
            "Student_2": 83
        },
        {
            "date": "2017-06-26",
            "Student_1": 96,
            "Student_2": 99
        }
    ],
    "MATH": [
        {
            "date": "2017-06-25",
            "Student_1": 93,
            "Student_2": 83
        },
        {
            "date": "2017-06-26",
            "Student_1": 90,
            "Student_2": 85
        }
    ]
}
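
The question asks for a list of single-key objects, one per course, rather than one dictionary keyed by course. A minimal sketch of that last reshaping step, assuming a dictionary shaped like json_doc above (the literal here is a shortened stand-in and as_list is a name chosen for illustration):

```python
import json

# Stand-in for the json_doc built above, shortened to one date per course.
json_doc = {
    "ENGLISH": [{"date": "2017-06-25", "Student_1": 93, "Student_2": 83}],
    "MATH": [{"date": "2017-06-25", "Student_1": 93, "Student_2": 83}],
}

# Wrap each course in its own single-key dict to match the requested layout.
as_list = [{course: records} for course, records in json_doc.items()]

print(json.dumps(as_list, indent=4))
```

The course order follows the insertion order of the dictionary, so courses appear in the order they were first seen in the data.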

answered Oct 10 '22 09:10

glegoux

You can first define a function that turns one (course, date) sub-group into a dict, apply it to every group, and then collect the resulting dicts into one list per course, giving a single dictionary that can be serialized to JSON.

def f(x):
    # One record per (course, date) group: the date plus a student -> grade mapping.
    return {'date': x.date.iloc[0], **dict(zip(x.student, x.grade))}

(
    df.groupby(['course','date'])
      .apply(f)
      .groupby(level=0)
      .apply(lambda x: x.tolist())
      .to_dict()
)

Output:

{'ENGLISH': [{'Student_1': 93, 'Student_2': 83, 'date': '2017-06-25'},
  {'Student_1': 96, 'Student_2': 99, 'date': '2017-06-26'}],
 'MATH': [{'Student_1': 93, 'Student_2': 83, 'date': '2017-06-25'},
  {'Student_1': 90, 'Student_2': 85, 'date': '2017-06-26'}]}

answered Oct 10 '22 10:10

Allen
