Convert JSON file to Pandas dataframe

I would like to convert a JSON to Pandas dataframe.

My JSON looks like:

{ 
   "country1":{ 
      "AdUnit1":{ 
         "floor_price1":{ 
            "feature1":1111,
            "feature2":1112
         },
         "floor_price2":{ 
            "feature1":1121
         }
      },
      "AdUnit2":{ 
         "floor_price1":{ 
            "feature1":1211
         },
         "floor_price2":{ 
            "feature1":1221
         }
      }
   },
   "country2":{ 
      "AdUnit1":{ 
         "floor_price1":{ 
            "feature1":2111,
            "feature2":2112
         }
      }
   }
}

I read the file from GCP using this code:

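# Imports assumed from the Datalab API that the calls below belong to
from google.datalab import Context
import google.datalab.storage as storage

project = Context.default().project_id
sample_bucket_name = 'my_bucket'
sample_bucket_path = 'gs://' + sample_bucket_name
print('Object: ' + sample_bucket_path + '/json_output.json')

sample_bucket = storage.Bucket(sample_bucket_name)
sample_bucket.create()
sample_bucket.exists()

sample_object = sample_bucket.object('json_output.json')
list(sample_bucket.objects())
# Named json_ so it does not shadow the standard json module
json_ = sample_object.read_stream()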

My goal is to get a Pandas dataframe which looks like this:

[Image: the desired dataframe]

I tried using json_normalize, but didn't succeed.

How do I convert JSON data to pandas?

You can convert JSON to a pandas DataFrame using the json_normalize(), read_json() and from_dict() functions. Some of these methods are also used to extract data from JSON files and store it as a DataFrame. JSON stands for JavaScript Object Notation. It is commonly used for sharing data between servers and web applications.
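As a rough sketch of what json_normalize() does with fully nested dictionaries like the one in the question: it returns a single wide row whose column names are the dotted key paths, which may be why it did not seem to work at first. Melting that row and splitting the paths gives a long table. The column names below (country, ad_unit, floor_price, feature, value) are assumptions for illustration, not names from the question.

```python
import pandas as pd

# A trimmed copy of the nested structure from the question.
data = {
    "country1": {
        "AdUnit1": {
            "floor_price1": {"feature1": 1111, "feature2": 1112},
            "floor_price2": {"feature1": 1121},
        }
    },
    "country2": {"AdUnit1": {"floor_price1": {"feature1": 2111}}},
}

# json_normalize flattens nested dicts into ONE row; each column name is the
# key path joined with "." (e.g. "country1.AdUnit1.floor_price1.feature1").
wide = pd.json_normalize(data)

# Turn the single wide row into one row per leaf value.
long = wide.melt(var_name="path", value_name="value")

# Split the dotted path into its four levels (column names are illustrative).
levels = ["country", "ad_unit", "floor_price", "feature"]
long[levels] = long["path"].str.split(".", expand=True)
df = long[levels + ["value"]]
print(df)
```

Splitting on "." only works as long as no key contains a dot itself; json_normalize() takes a sep argument if another separator is needed.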

Can pandas read a JSON file?

The pandas read_json() function can be used to read a JSON file or string into a DataFrame. It supports several JSON layouts through the orient parameter. JSON is short for JavaScript Object Notation, a widely used format for exchanging data between systems or web applications.
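A minimal sketch of read_json() with the orient parameter, assuming the JSON is laid out as a list of records (one object per row). The sample data is made up for illustration. Note that this layout is flat, unlike the nested JSON in the question.

```python
from io import StringIO

import pandas as pd

# JSON laid out as a list of records: one object per row.
records_json = (
    '[{"country": "country1", "value": 1111},'
    ' {"country": "country2", "value": 2111}]'
)

# Wrapping the text in StringIO avoids passing a literal JSON string directly,
# which newer pandas versions warn about.
df_records = pd.read_json(StringIO(records_json), orient="records")
print(df_records)
```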

Can you convert JSON to Python?

If you have a JSON string, you can parse it with the json.loads() method. For a JSON object, the result is a Python dictionary.
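For example, a small sketch using only the standard library. The bytes literal stands in for whatever was read from the file; it is an illustrative assumption, not the real payload.

```python
import json

# Stand-in for the raw content read from storage (illustrative only).
raw = b'{"country1": {"AdUnit1": {"floor_price1": {"feature1": 1111}}}}'

# json.loads accepts str, bytes or bytearray and returns plain Python objects.
parsed = json.loads(raw)

print(type(parsed).__name__)
print(parsed["country1"]["AdUnit1"]["floor_price1"]["feature1"])
```

Once the data is a plain dict, any of the flattening approaches in the answers can be applied to it.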

Nested JSONs are always quite tricky to handle correctly.

A few months ago, I figured out a way to provide a "universal answer" using the beautifully written flatten_json_iterative_solution (originally linked here), which iteratively unpacks each level of a given JSON.
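Since the linked function is not reproduced here, the following is only a sketch of what an iterative flattener could look like, not the original code. The name flatten_iteratively and the "_" separator are assumptions; this version only unpacks dicts and leaves lists as leaf values.

```python
def flatten_iteratively(nested, sep="_"):
    """Flatten nested dicts into {joined_key_path: leaf_value} without recursion."""
    flat = {}
    # Each stack entry is (key_path_so_far, value_still_to_unpack).
    stack = [((), nested)]
    while stack:
        path, value = stack.pop()
        if isinstance(value, dict):
            # Push children in reverse so they are popped in original order.
            for key, child in reversed(list(value.items())):
                stack.append((path + (str(key),), child))
        else:
            flat[sep.join(path)] = value
    return flat


json_ = {"country1": {"AdUnit1": {"floor_price1": {"feature1": 1111, "feature2": 1112}}}}
flat = flatten_iteratively(json_)
print(flat)
```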

Then one can simply turn the result into a pandas Series and then a DataFrame, like so:

df = pd.Series(flatten_json_iterative_solution(dict(json_))).to_frame().reset_index()

[Image: intermediate DataFrame result]

A simple transformation then splits the index into the column names you asked for:

df[["index", "col1", "col2", "col3", "col4"]] = df['index'].apply(lambda x: pd.Series(x.split('_')))

[Image: final result]
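One thing to watch with the split on "_": keys such as floor_price1 contain an underscore themselves, so each flattened key breaks into five parts rather than four. A quick check in plain Python, where the "|" separator is just an assumed alternative that does not occur in the keys:

```python
key = "country1_AdUnit1_floor_price1_feature1"

# Splitting on every "_" also cuts "floor_price1" in two.
parts = key.split("_")
print(parts)

# With a separator that never occurs inside a key, the four levels survive.
safe_key = "country1|AdUnit1|floor_price1|feature1"
safe_parts = safe_key.split("|")
print(safe_parts)
```

If the flattening step lets you choose the separator, picking one that cannot appear in a key keeps each level in its own column.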

You could use this:

def flatten_dict(d):
    """ Returns list of lists from given dictionary """
    l = []
    for k, v in sorted(d.items()):
        if isinstance(v, dict):
            flatten_v = flatten_dict(v)
            for my_l in reversed(flatten_v):
                my_l.insert(0, k)

            l.extend(flatten_v)

        elif isinstance(v, list):
            for l_val in v:
                l.append([k, l_val])

        else:
            l.append([k, v])

    return l

This function receives a dictionary (including nesting where values could also be lists) and flattens it to a list of lists.

Then you can simply do:

df = pd.DataFrame(flatten_dict(my_dict))

Where my_dict is your JSON object. Taking your example, what you get when you run print(df) is:

          0        1             2         3     4
0  country1  AdUnit1  floor_price1  feature1  1111
1  country1  AdUnit1  floor_price1  feature2  1112
2  country1  AdUnit1  floor_price2  feature1  1121
3  country1  AdUnit2  floor_price1  feature1  1211
4  country1  AdUnit2  floor_price2  feature1  1221
5  country2  AdUnit1  floor_price1  feature1  2111
6  country2  AdUnit1  floor_price1  feature2  2112

And when you create the dataframe, you can name your columns and index.
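For instance, a short sketch with a few of the rows from the output above. The column names are made up for illustration; use whatever fits your data.

```python
import pandas as pd

# A few rows in the shape produced by flatten_dict.
rows = [
    ["country1", "AdUnit1", "floor_price1", "feature1", 1111],
    ["country1", "AdUnit1", "floor_price1", "feature2", 1112],
    ["country2", "AdUnit1", "floor_price1", "feature1", 2111],
]

# Column names are illustrative assumptions.
columns = ["country", "ad_unit", "floor_price", "feature", "value"]
named = pd.DataFrame(rows, columns=columns)

# Optionally move the four key levels into a MultiIndex.
indexed = named.set_index(columns[:-1])
print(indexed)
```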

Tags: python, json, pandas, dataframe

Alexandr Fruman

2 Answers

Luc Bertin

Zionsof
