How to merge two JSON files with pandas

I'm trying to write a Python script that merges two JSON files, for example:

First file: students.json

{"John Smith":{"age":16, "id": 1}, ...., "Paul abercom":{"age":18, "id": 764}}

Second file: teacher.json

{"Agathe Magesti":{"age":36, "id": 765}, ...., "Tom Ranliver":{"age":54, "id": 801}}

First, so as not to lose any information, I modify the files to add the status of each person, like this:

{"John Smith":{"age":16, "id": 1, "status":"student"}, ...., "Paul abercom":{"age":18, "id": 764, "status":"student"}}

{"Agathe Magesti":{"age":36, "id": 765, "status":"teacher"}, ...., "Tom Ranliver":{"age":54, "id": 801, "status":"teacher"}}

To do that, I wrote the following code:

import json

import pandas as pd

# Each person is a column; add a "status" row to tag everyone in the file.
type_student = pd.read_json('student.json')
type_student.loc["status"] = "student"
type_student.to_json("testStudent.json")

type_teacher = pd.read_json('teacher.json')
type_teacher.loc["status"] = "teacher"
type_teacher.to_json("testTeacher.json")

# Reload the tagged files as plain dictionaries.
with open("testStudent.json") as data_file:
    data_student = json.load(data_file)
with open("testTeacher.json") as data_file:
    data_teacher = json.load(data_file)

I want to merge data_student and data_teacher and write the result to a JSON file, but I can only use the standard library, pandas, numpy and scipy.

After some tests I realized that some teachers are also students, which can be a problem for the merge.

mel asked Feb 06 '16 00:02


2 Answers

It looks like your JSON files contain "objects" as top-level structures. These map to Python dictionaries, so this should be easy using just Python: update the first dictionary with the second.

import json

with open("mel1.json") as fo:
    data1 = json.load(fo)

with open("mel2.json") as fo:
    data2 = json.load(fo)

data1.update(data2)

with open("melout.json", "w") as fo:
    json.dump(data1, fo)
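
Note that update replaces an entry when the same name appears in both files, which matters here because some teachers are also students. A minimal sketch of one way to keep both roles, assuming it is acceptable to store the status as a list (the helper name merge_people and the sample records are made up for illustration):

```python
import json


def merge_people(students, teachers):
    """Merge two name -> record dicts, keeping every status a person has."""
    merged = {}
    for status, people in (("student", students), ("teacher", teachers)):
        for name, record in people.items():
            if name in merged:
                # Same name in both inputs: keep the first record, add the role.
                merged[name]["status"].append(status)
            else:
                # Copy the record so the input dictionaries stay untouched.
                merged[name] = dict(record, status=[status])
    return merged


# Made-up sample data standing in for the two loaded files.
students = {
    "John Smith": {"age": 16, "id": 1},
    "Tom Ranliver": {"age": 54, "id": 801},
}
teachers = {
    "Agathe Magesti": {"age": 36, "id": 765},
    "Tom Ranliver": {"age": 54, "id": 801},
}

merged = merge_people(students, teachers)
print(json.dumps(merged, indent=2))
```

This assumes that a name shared by both files refers to the same person; if two different people can share a name, the id would be a safer key.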
Keith answered Oct 09 '22 23:10


You should concatenate the two data frames before converting to JSON:

# type_teacher and type_student are the DataFrames from the question
pd.concat([type_teacher, type_student], axis=1).to_json()
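
A slightly fuller sketch of this approach, assuming the frames are the type_teacher and type_student DataFrames from the question (built here from made-up in-memory records instead of the files). Because some teachers are also students, the concatenated frame can contain the same name twice; as far as I know, to_json with its default orientation rejects duplicate column labels, so the sketch keeps only the first occurrence of each name. That is one possible policy, not the only one.

```python
import json

import pandas as pd

# Made-up records standing in for student.json and teacher.json.
type_student = pd.DataFrame({
    "John Smith": {"age": 16, "id": 1},
    "Tom Ranliver": {"age": 54, "id": 801},
})
type_student.loc["status"] = "student"

type_teacher = pd.DataFrame({
    "Agathe Magesti": {"age": 36, "id": 765},
    "Tom Ranliver": {"age": 54, "id": 801},
})
type_teacher.loc["status"] = "teacher"

# Names are column labels, so axis=1 places the people side by side.
merged = pd.concat([type_teacher, type_student], axis=1)

# Keep the first occurrence of each name (here: the teacher record).
merged = merged.loc[:, ~merged.columns.duplicated()]

merged_json = merged.to_json()
print(json.dumps(json.loads(merged_json), indent=2))
```

Passing a path, for example merged.to_json("merged.json"), writes a file instead of returning a string; the file name is only an example.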
Régis B. answered Oct 09 '22 22:10