
Pandas json_normalize and null values in JSON

I have this sample JSON

{
    "name":"John",
    "age":30,
    "cars": [
        { "name":"Ford", "models":[ "Fiesta", "Focus", "Mustang" ] },
        { "name":"BMW", "models":[ "320", "X3", "X5" ] },
        { "name":"Fiat", "models":[ "500", "Panda" ] }
    ]
 }

When I need to convert the JSON to a pandas DataFrame, I use the following code:

import json
from pandas.io.json import json_normalize
from pprint import pprint

with open('example.json', encoding="utf8") as data_file:
    data = json.load(data_file)
normalized = json_normalize(data['cars'])

This code works well, but when some of the cars are empty (null values), I'm not able to run json_normalize.

Example of json

{
    "name":"John",
    "age":30,
    "cars": [
        { "name":"Ford", "models":[ "Fiesta", "Focus", "Mustang" ] },
        null,
        { "name":"Fiat", "models":[ "500", "Panda" ] }
    ]
 }

Error that was thrown

AttributeError: 'NoneType' object has no attribute 'keys'

I tried to ignore errors in json_normalize, but it didn't help:

normalized = json_normalize(data['cars'], errors='ignore')

How should I handle null values in JSON?

Jozef Cechovsky asked May 18 '17 14:05



1 Answer

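You can replace the null entries in cars with empty dicts before normalizing to prevent this error. Since data['cars'] is a plain Python list after json.load (not a pandas Series), use a list comprehension rather than .apply:

data['cars'] = [{} if car is None else car for car in data['cars']]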
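
As a rough sketch of the two usual ways to deal with null entries (fill them with empty dicts, or drop them), assuming pandas 1.0 or later where json_normalize is exposed as pd.json_normalize. The inline JSON string stands in for example.json, and the variable names are only illustrative:

```python
import json

import pandas as pd

# Inline copy of the question's JSON, standing in for example.json.
raw = """
{
    "name": "John",
    "age": 30,
    "cars": [
        {"name": "Ford", "models": ["Fiesta", "Focus", "Mustang"]},
        null,
        {"name": "Fiat", "models": ["500", "Panda"]}
    ]
}
"""
data = json.loads(raw)  # JSON null becomes Python None

# Option 1: keep one row per list entry by turning None into an empty dict.
# The row for the null entry ends up with missing values in every column.
filled = [car if car is not None else {} for car in data["cars"]]
with_gaps = pd.json_normalize(filled)

# Option 2: drop the null entries entirely before normalizing.
present = [car for car in data["cars"] if car is not None]
without_gaps = pd.json_normalize(present)

print(with_gaps)
print(without_gaps)
```

Filling keeps row positions aligned with the original list, which may matter if you join back on the index; dropping gives a DataFrame without all-missing rows.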
vozman answered Sep 17 '22 13:09
