I want to flatten a JSON column in a pandas DataFrame

I have an input dataframe df which is as follows:

id  e
1   {"k1":"v1","k2":"v2"}
2   {"k1":"v3","k2":"v4"}
3   {"k1":"v5","k2":"v6"}

I want to "flatten" the column 'e' so that my resultant dataframe is:

id  e.k1    e.k2
1   v1  v2
2   v3  v4
3   v5  v6

How can I do this? I tried using json_normalize but did not have much success.

How do I flatten nested JSON in a data frame?

pandas has a built-in function, json_normalize(), that flattens simple to moderately nested, semi-structured JSON into a flat table. Its data parameter accepts a dict or a list of dicts.
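As a rough illustration of that flattening, here is a minimal sketch. The sample records and the nested "address" key are made up for the example and do not come from the question.

```python
import pandas as pd

# Hypothetical sample records; the nested "address" dict is illustrative only.
records = [
    {"id": 1, "name": "a", "address": {"city": "X", "zip": "111"}},
    {"id": 2, "name": "b", "address": {"city": "Y", "zip": "222"}},
]

# Nested keys become column names joined by the separator (sep="." by default),
# e.g. "address.city" and "address.zip".
flat = pd.json_normalize(records)
print(flat.columns.tolist())
```

Each nested level adds one more dotted segment to the column name, which is why this works best on shallow structures.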

How do I convert JSON data to Pandas?

You can convert JSON to a pandas DataFrame with read_json(). Pass it a JSON string, or a path or file-like object, since the same function also reads JSON files. It takes several parameters; the relevant one here is orient, which specifies the expected layout of the JSON string.
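A small sketch of read_json() with orient, assuming the JSON is a list of row objects (orient="records"). The sample string is invented for illustration. Recent pandas releases warn when a literal JSON string is passed directly, so the sketch wraps it in io.StringIO, which should also work on older versions.

```python
from io import StringIO

import pandas as pd

# Hypothetical JSON text: a list of row objects, i.e. the "records" layout.
json_text = '[{"id": 1, "k1": "v1"}, {"id": 2, "k1": "v3"}]'

# Wrap the string in a file-like object and tell pandas how it is laid out.
df_from_json = pd.read_json(StringIO(json_text), orient="records")
print(df_from_json)
```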

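Here is a way to use json_normalize(). It has been available as pandas.json_normalize() since pandas 1.0; the older pandas.io.json.json_normalize import path is deprecated.

import pandas as pd

# Expand each dict in "e" into its own columns, prefix them with "e.",
# join them back onto df by index, then drop the original "e" column.
df = df.join(pd.json_normalize(df["e"].tolist()).add_prefix("e.")).drop(["e"], axis=1)
print(df)
# Columns derived from "e" (any other columns, such as id, are kept as well):
#  e.k1 e.k2
#0   v1   v2
#1   v3   v4
#2   v5   v6

However, if your column is actually a str and not a dict, then you'd first have to parse it using json.loads():

import json
import pandas as pd

# Parse each JSON string into a dict before normalizing.
df = df.join(pd.json_normalize(df['e'].map(json.loads).tolist()).add_prefix('e.'))\
    .drop(['e'], axis=1)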
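Both cases can be combined in one place. The following is a minimal sketch, not part of the original answer: flatten_json_column is a hypothetical helper name, and it assumes every value in the column is either a dict or a valid JSON string (no missing values). It also copies the original index onto the flattened frame, because json_normalize() returns a fresh 0..n-1 index and a plain join could misalign rows if df has a different index.

```python
import json

import pandas as pd


def flatten_json_column(df, column, sep="."):
    """Hypothetical helper: expand a column of dicts or JSON strings into columns."""
    # Parse JSON strings; leave values that are already dicts untouched.
    parsed = [json.loads(v) if isinstance(v, str) else v for v in df[column]]
    flat = pd.json_normalize(parsed, sep=sep).add_prefix(f"{column}{sep}")
    # json_normalize returns a new RangeIndex, so reuse the original index
    # before joining to keep rows aligned on a non-default index.
    flat.index = df.index
    return df.drop(columns=[column]).join(flat)


# The question's data, stored as JSON strings, with a deliberately
# non-default index to show that alignment is preserved.
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "e": [
            '{"k1":"v1","k2":"v2"}',
            '{"k1":"v3","k2":"v4"}',
            '{"k1":"v5","k2":"v6"}',
        ],
    },
    index=[10, 20, 30],
)

result = flatten_json_column(df, "e")
print(result)
```

The result should contain the columns id, e.k1 and e.k2, which matches the layout asked for in the question.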

Tags: python, json, pandas, normalize

Answered by pault
