I have an input dataframe df which is as follows:
id  e
1   {"k1":"v1","k2":"v2"}
2   {"k1":"v3","k2":"v4"}
3   {"k1":"v5","k2":"v6"}
I want to "flatten" the column 'e' so that my resultant dataframe is:
id  e.k1  e.k2
1   v1    v2
2   v3    v4
3   v5    v6
How can I do this? I tried using json_normalize but did not have much success.
Pandas has a built-in function, json_normalize(), that flattens simple to moderately nested semi-structured JSON into a flat table. Its data parameter accepts a dict or a list of dicts.
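As a minimal sketch of that behaviour, assuming pandas 1.0 or later (where json_normalize is available at the top level) and using made-up records purely for illustration:

```python
import pandas as pd

# Hypothetical records; the nested "address" dict is only here to show flattening.
records = [
    {"id": 1, "address": {"city": "Oslo", "zip": "0150"}},
    {"id": 2, "address": {"city": "Bergen", "zip": "5003"}},
]

# Nested keys are joined to their parent key with "." by default (the sep parameter).
flat = pd.json_normalize(records)
columns = list(flat.columns)
print(columns)
```

Each nested key becomes its own column named parent.child, which is the same naming scheme the question asks for.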
You can convert a JSON string to a pandas DataFrame with read_json(). Pass the JSON string to the function (recent pandas versions prefer the string wrapped in io.StringIO). It takes several parameters; the one that matters here is orient, which specifies the expected layout of the JSON string. The same function also reads JSON files into a DataFrame.
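A small sketch of read_json() with orient, assuming a hypothetical JSON string laid out as a list of row objects (the "records" orientation):

```python
from io import StringIO

import pandas as pd

# Hypothetical JSON string in "records" orientation: a list of row objects.
json_text = '[{"id": 1, "k1": "v1"}, {"id": 2, "k1": "v3"}]'

# Wrapping the string in StringIO makes it a file-like object; recent pandas
# versions warn when a literal JSON string is passed directly.
table = pd.read_json(StringIO(json_text), orient="records")
print(table)
```

Note that read_json() parses a whole JSON document into a table; it does not by itself expand a column of JSON values inside an existing DataFrame.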
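Here is a way to use json_normalize() (available as pandas.json_normalize since pandas 1.0; the older pandas.io.json import path is deprecated):
from pandas import json_normalize
# Expand each dict in "e" into columns, prefix them with "e.", then drop the original column.
df = df.join(json_normalize(df["e"].tolist()).add_prefix("e.")).drop(["e"], axis=1)
print(df)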
#   id e.k1 e.k2
#0   1   v1   v2
#1   2   v3   v4
#2   3   v5   v6
However, if your column actually contains str values and not dict values, you first have to parse it with json.loads():
import json
# Parse each JSON string into a dict before normalizing.
df = df.join(
    json_normalize(df['e'].map(json.loads).tolist()).add_prefix('e.')
).drop(['e'], axis=1)
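Putting both cases together, here is a self-contained sketch that rebuilds the question's dataframe and flattens 'e' whether it holds dicts or JSON strings. The helper name flatten_json_column is hypothetical (it is not part of pandas), and pandas 1.0 or later is assumed:

```python
import json

import pandas as pd


def flatten_json_column(frame, column):
    """Expand a column of dicts (or JSON strings) into prefixed columns."""
    # Parse JSON strings; leave values that are already dicts untouched.
    parsed = frame[column].map(lambda v: json.loads(v) if isinstance(v, str) else v)
    expanded = pd.json_normalize(parsed.tolist()).add_prefix(f"{column}.")
    # json_normalize returns a fresh 0..n-1 index, so reuse the original index
    # to keep the join aligned even when frame is not indexed 0..n-1.
    expanded.index = frame.index
    return frame.drop(columns=[column]).join(expanded)


df = pd.DataFrame({
    "id": [1, 2, 3],
    "e": ['{"k1":"v1","k2":"v2"}', '{"k1":"v3","k2":"v4"}', '{"k1":"v5","k2":"v6"}'],
})
result = flatten_json_column(df, "e")
print(result)
```

Reassigning the index is a design choice on top of the one-liner: a plain join matches rows by index label, so it only lines up when df still has its default 0..n-1 index.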