I went through the entire Polars documentation but couldn't find anything that converts nested JSON into a DataFrame.
test = {
"name": "Ravi",
"Subjects": {
"Maths": 92,
"English": 94,
"Hindi": 98
}
}
json_normalize in pandas would convert this to a DataFrame with the columns name, Subjects.Maths, Subjects.English and Subjects.Hindi. Is this possible in Polars? I tried all the functions I could find, but they always throw an error because they don't understand the nested structure.
For a simple JSON-like dictionary, you can use a dictionary comprehension to wrap each value in a list, so that each key becomes a column with a single row.
Below is an example:
import polars as pl
grades = {
"name": "Ravi",
"Subjects": {
"Maths": 92,
"English": 94,
"Hindi": 98
}}
grades_with_list = {key:[value] for key, value in grades.items()}
pl.DataFrame(grades_with_list)
# Output
shape: (1, 2)
┌──────┬────────────┐
│ name ┆ Subjects │
│ ---  ┆ ---        │
│ str  ┆ struct[3]  │
╞══════╪════════════╡
│ Ravi ┆ {92,94,98} │
└──────┴────────────┘
# You can also unnest the Subjects column to get a separate column for each subject.
pl.DataFrame(grades_with_list).unnest('Subjects')
# Output
shape: (1, 4)
┌──────┬───────┬─────────┬───────┐
│ name ┆ Maths ┆ English ┆ Hindi │
│ ---  ┆ ---   ┆ ---     ┆ ---   │
│ str  ┆ i64   ┆ i64     ┆ i64   │
╞══════╪═══════╪═════════╪═══════╡
│ Ravi ┆ 92    ┆ 94      ┆ 98    │
└──────┴───────┴─────────┴───────┘
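Note that unnest keeps the inner field names (Maths, English, Hindi) and only goes one level deep per call. If you want pandas-style dotted names, or the nesting is deeper, one option is to flatten the dictionary in plain Python before building the DataFrame. The following is a minimal sketch, not part of Polars: the helper name flatten and its parent and sep parameters are made up for this illustration, and it only handles nested dicts (lists are left as-is).

```python
from typing import Any


def flatten(record: dict[str, Any], parent: str = "", sep: str = ".") -> dict[str, Any]:
    """Flatten nested dicts into one level, joining key paths with `sep`.

    Hypothetical helper for illustration; not a Polars function.
    """
    flat: dict[str, Any] = {}
    for key, value in record.items():
        # Build the full column name from the path walked so far.
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the current path as prefix.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


grades = {"name": "Ravi", "Subjects": {"Maths": 92, "English": 94, "Hindi": 98}}
flat = flatten(grades)
print(flat)
# {'name': 'Ravi', 'Subjects.Maths': 92, 'Subjects.English': 94, 'Subjects.Hindi': 98}
```

The flattened dictionary can then be handed to Polars as a single row, for example with pl.DataFrame([flat]).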
A simple version of json_normalize() was added in Polars v1.0. It produces the same dotted column names as pandas:
pl.json_normalize(test)
shape: (1, 4)
┌──────┬────────────────┬──────────────────┬────────────────┐
│ name ┆ Subjects.Maths ┆ Subjects.English ┆ Subjects.Hindi │
│ ---  ┆ ---            ┆ ---              ┆ ---            │
│ str  ┆ i64            ┆ i64              ┆ i64            │
╞══════╪════════════════╪══════════════════╪════════════════╡
│ Ravi ┆ 92             ┆ 94               ┆ 98             │
└──────┴────────────────┴──────────────────┴────────────────┘