viewing nested JSON data into a pandas dataframe

Tags:

I have now added the current problem onto GitHib. Please find the URL for the repo. I have included a Jupyter notebook that also explains the problem. Thanks guys.

https://github.com/simongraham/dataExplore.git

I am currently working with nutritional data for a project, where the data is in raw JSON format, and I want to use python and pandas to obtain an understandable data frame. I understand that this is an easy task when the JSON is not nested. Here I would use:

nutrition = pd.read_json('data')

However I have nested information and I am finding it very difficult to get it into a reasonable data frame. The JSON format is as follows, where the nutritionNutrients element itself is a nested element. The nest for this element will describe the nutritional contents for a variety of different things such as alcohol and bcfa, as included. I have only included a sample because this is a large data file.

  [
        {
            "vcNutritionPortionId": "478d1905-f264-4d9b-ab76-0ed4252193fd",
            "vcNutritionId": "2476378b-79ee-4857-a81d-489661a039a1",
            "vcUserId": "cc51145b-5a70-4344-9b55-1a4455f0a9d2",
            "vcPortionId": "1",
            "vcPortionName": "1 average pepper",
            "vcPortionSize": "20",
            "ftEnergyKcal": 5.2,
            "vcPortionUnit": "g",
            "dtConsumedDate": "2016-05-04T00:00:00",
            "nutritionNutrients": [
                {
                    "vcNutritionPortionId": "478d1905-f264-4d9b-ab76-0ed4252193fd",
                    "vcNutrient": "alcohol",
                    "ftValue": 0,
                    "vcUnit": "g",
                    "nPercentRI": 0,
                    "vcTrafficLight": ""
                },
                {
                    "vcNutritionPortionId": "478d1905-f264-4d9b-ab76-0ed4252193fd",
                    "vcNutrient": "bcfa",
                    "ftValue": 0,
                    "vcUnit": "g",
                    "nPercentRI": 0,
                    "vcTrafficLight": ""
                },
                {
                    "vcNutritionPortionId": "478d1905-f264-4d9b-ab76-0ed4252193fd",
                    "vcNutrient": "biotin",
                    "ftValue": 0,
                    "vcUnit": "µg",
                    "nPercentRI": 0,
                    "vcTrafficLight": ""
                },
                ...
            ]
        }
    ]

Any help would be appreciated.

Thanks.

.... ....

Now that I have found out how to solve this problem using json_normalize, I return the same problem, but this time my code is nested twice. Ie:

[
{
...
}
[,
"nutritionPortions": [
    {
        "vcNutritionPortionId": "478d1905-f264-4d9b-ab76-0ed4252193fd",
        "vcNutritionId": "2476378b-79ee-4857-a81d-489661a039a1",
        "vcUserId": "cc51145b-5a70-4344-9b55-1a4455f0a9d2",
        "vcPortionId": "1",
        "vcPortionName": "1 average pepper",
        "vcPortionSize": "20",
        "ftEnergyKcal": 5.2,
        "vcPortionUnit": "g",
        "dtConsumedDate": "2016-05-04T00:00:00",
        "nutritionNutrients": [
            {
                "vcNutritionPortionId": "478d1905-f264-4d9b-ab76-0ed4252193fd",
                "vcNutrient": "alcohol",
                "ftValue": 0,
                "vcUnit": "g",
                "nPercentRI": 0,
                "vcTrafficLight": ""
            },
            {
                "vcNutritionPortionId": "478d1905-f264-4d9b-ab76-0ed4252193fd",
                "vcNutrient": "bcfa",
                "ftValue": 0,
                "vcUnit": "g",
                "nPercentRI": 0,
                "vcTrafficLight": ""
            },
            {
                "vcNutritionPortionId": "478d1905-f264-4d9b-ab76-0ed4252193fd",
                "vcNutrient": "biotin",
                "ftValue": 0,
                "vcUnit": "µg",
                "nPercentRI": 0,
                "vcTrafficLight": ""
            },
            ...
           }
          ]
        }
      ]

When I have a JSON consisting of only nutrition data I can use:

nutrition = (pd.io
   .json
   .json_normalize((data, ['nutritionPortions']), 'nutritionNutrients',
        ['vcNutritionId','vcUserId','vcPortionId','vcPortionName','vcPortionSize',
         'ftEnergyKcal','vcPortionUnit','dtConsumedDate'])
)

However, my data does not only contain nutrition information. For example it will contain activity information, and therefore the nutrition information is nested with "nutrtitionPortions" at the start. Let's assume that all other columns are not nested and they are represented by "Activity" and "Wellbeing".

If I use the code:

nutrition = (pd.io
   .json
   .json_normalize(data, ['nutritionPortions'])
)

I will return to the original problem where "nutritionNutrients" is nested, but I am having no success then obtaining the corresponding data frame.

Thanks

556

asked Jun 23 '16 22:06

simongraham

1 Answers

UPDATE: this should work for your kaidoData.json file:

df = (pd.io
        .json
        .json_normalize(data[0]['ionPortions'], 'nutritionNutrients',
            ['vcNutritionId','vcUserId','vcPortionId','vcPortionName','vcPortionSize',
             'dtCreatedDate','dtUpdatedDate','nProcessingStatus',
             'vcPortionUnit','dtConsumedDate'
            ]
        )
)

PS i don't know what is wrong with the 'ftEnergyKcal' - it throws me:

KeyError: 'ftEnergyKcal'

maybe it's missing in some sections

OLD answer:

use json_normalize():

(pd.io
   .json
   .json_normalize(l, 'nutritionNutrients',
        ['vcNutritionId','vcUserId','vcPortionId','vcPortionName','vcPortionSize',
         'ftEnergyKcal','vcPortionUnit','dtConsumedDate'])
)

demo:

In [107]: (pd.io
   .....:    .json
   .....:    .json_normalize(l, 'nutritionNutrients',
   .....:         ['vcNutritionId','vcUserId','vcPortionId','vcPortionName','vcPortionSize',
   .....:          'ftEnergyKcal','vcPortionUnit','dtConsumedDate'])
   .....: )
Out[107]:
   ftValue  nPercentRI vcNutrient vcNutritionPortionId vcTrafficLight        ...        vcPortionSize  \
0        0           0    alcohol  478d1905-f264-4d...                       ...                   20
1        0           0       bcfa  478d1905-f264-4d...                       ...                   20
2        0           0     biotin  478d1905-f264-4d...                       ...                   20

         vcNutritionId vcPortionId ftEnergyKcal     vcPortionName
0  2476378b-79ee-48...           1          5.2  1 average pepper
1  2476378b-79ee-48...           1          5.2  1 average pepper
2  2476378b-79ee-48...           1          5.2  1 average pepper

[3 rows x 14 columns]

where l is your list (parsed JSON)

192

answered Sep 28 '22 02:09

MaxU - stop WAR against UA

Related questions
                            
                                Postgres SELECT COUNT ... FOR UPDATE
                            
                                Dot Product on 8086 and DSP microprocessor
                            
                                How do I play .wav files in C++ on Linux?
                            
                                SQL Server 2016 _can't start mirror
                            
                                How to run Windows instance on EC2 from Jenkins?
                            
                                How can I send a "Message" to another process in .Net, without using SendMessage?
                            
                                React bootstrap table with tooltip
                            
                                Mock dependencies with Mocha and Typescript
                            
                                How to change icon color in Xamarin.Forms xaml page?
                            
                                Angular 2: Input variable in Component does not update once called
                            
                                Produce same result as CryptoJS.enc.Base64 using PHP
                            
                                How can I fill out void* C pointer in Go?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With