Convert nested DataFrame with sorted unique values, to a nested Dictionary in Python

Question

I'm trying to take a nested DataFrame and convert it to a nested Dictionary.

Here is my original DataFrame with the following unique values:

input: df.head(5)

output:

    reviewerName                                  title    reviewerRatings
0        Charles       Harry Potter Book Seven News:...                3.0
1      Katherine       Harry Potter Boxed Set, Books...                5.0
2           Lora       Harry Potter and the Sorcerer...                5.0
3           Cait       Harry Potter and the Half-Blo...                5.0
4          Diane       Harry Potter and the Order of...                5.0

input: len(df['reviewerName'].unique())

output: 66130

Each of the 66130 unique "reviewerName" values can occur in several rows (e.g. "Charles" occurs 3 times). I therefore set "reviewerName" as the outer index level and "title" as the inner index level of a new nested DataFrame, so that each reviewer maps to their titles and each title maps to its "reviewerRatings" value.

input: df = df.set_index(['reviewerName', 'title']).sort_index()

output:

                                                       reviewerRatings
    reviewerName                               title
         Charles    Harry Potter Book Seven News:...               3.0
                    Harry Potter and the Half-Blo...               3.5
                    Harry Potter and the Order of...               4.0
       Katherine    Harry Potter Boxed Set, Books...               5.0
                    Harry Potter and the Half-Blo...               2.5
                    Harry Potter and the Order of...               5.0
...
230898 rows x 1 columns

As a follow up to the first question, I tried to convert the nested DataFrame to a nested Dictionary.

The nested DataFrame above has "reviewerRatings" as its only column and ("reviewerName", "title") as a two-level index. When I run the df.to_dict() method below, the output has the form {"reviewerRatings": {(reviewerName, title): reviewerRatings}}:

input: df.to_dict()

output:

{'reviewerRatings': 
 {
  ('Charles', 'Harry Potter Book Seven News:...'): 3.0, 
  ('Charles', 'Harry Potter and the Half-Blo...'): 3.5, 
  ('Charles', 'Harry Potter and the Order of...'): 4.0,   
  ('Katherine', 'Harry Potter Boxed Set, Books...'): 5.0, 
  ('Katherine', 'Harry Potter and the Half-Blo...'): 2.5, 
  ('Katherine', 'Harry Potter and the Order of...'): 5.0,
 ...}
}
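The tuple keys follow from how to_dict() treats the index: with its default orientation it maps each column name to {index label: value}, and every label of a two-level index is a tuple. A minimal sketch of that behaviour, assuming a hypothetical three-row sample (the titles "Book A" and "Book B" are placeholders, not rows from the real data):

```python
import pandas as pd

# Hypothetical sample shaped like the question's data.
sample = pd.DataFrame({
    "reviewerName": ["Charles", "Charles", "Katherine"],
    "title": ["Book A", "Book B", "Book A"],
    "reviewerRatings": [3.0, 3.5, 5.0],
})

# Two-level index, built the same way as in the question.
indexed = sample.set_index(["reviewerName", "title"]).sort_index()

# Default to_dict(): column name -> {index label -> value}.
# Each MultiIndex label is a (reviewerName, title) tuple.
flat = indexed.to_dict()
print(flat)
```

The outer key is the column name and the inner keys are the index tuples, which matches the shape shown above.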

My desired output, however, has the form {reviewerName: {title: reviewerRating}}, which is exactly how the nested DataFrame is sorted:

{'Charles': 
 {'Harry Potter Book Seven News:...': 3.0, 
  'Harry Potter and the Half-Blo...': 3.5, 
  'Harry Potter and the Order of...': 4.0},   
 'Katherine':
 {'Harry Potter Boxed Set, Books...': 5.0, 
  'Harry Potter and the Half-Blo...': 2.5, 
  'Harry Potter and the Order of...': 5.0},
...}

Is there any way to manipulate the nested DataFrame or the nested Dictionary so that the result has the form {reviewerName: {title: reviewerRating}}?

Thanks!

jezrael · Accepted Answer

Use groupby with a lambda function to build one dictionary per reviewerName, then convert the resulting Series with to_dict:

print (df)
  reviewerName                             title  reviewerRatings
0      Charles  Harry Potter Book Seven News:...              3.0
1      Charles  Harry Potter Boxed Set, Books...              5.0
2      Charles  Harry Potter and the Sorcerer...              5.0
3    Katherine  Harry Potter and the Half-Blo...              5.0
4    Katherine  Harry Potter and the Order of...              5.0

# Select the columns with a list (double brackets); recent pandas
# versions reject the bare ['title','reviewerRatings'] form.
d = (df.groupby('reviewerName')[['title','reviewerRatings']]
       .apply(lambda x: dict(x.values))  # (title, rating) rows -> dict
       .to_dict())                       # Series of dicts -> nested dict
print (d)

{
    'Charles': {
        'Harry Potter Book Seven News:...': 3.0,
        'Harry Potter Boxed Set, Books...': 5.0,
        'Harry Potter and the Sorcerer...': 5.0
    },
    'Katherine': {
        'Harry Potter and the Half-Blo...': 5.0,
        'Harry Potter and the Order of...': 5.0
    }
}
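The answer above starts from the flat DataFrame. If the DataFrame already carries the two-level index from the question, a similar result can be reached by grouping on the outer index level. This is only a sketch under assumptions: the three-row sample and its "Book A" / "Book B" titles are hypothetical, and each (reviewerName, title) pair is assumed to occur once (with duplicates, only the last rating would survive in the dictionary).

```python
import pandas as pd

# Hypothetical sample shaped like the question's data.
sample = pd.DataFrame({
    "reviewerName": ["Charles", "Charles", "Katherine"],
    "title": ["Book A", "Book B", "Book A"],
    "reviewerRatings": [3.0, 3.5, 5.0],
})
indexed = sample.set_index(["reviewerName", "title"]).sort_index()

# Group the ratings Series by the outer index level, drop that level
# from each group so only "title" remains as the index, then turn
# each group into a {title: rating} dictionary.
nested = {
    name: ratings.droplevel("reviewerName").to_dict()
    for name, ratings in indexed["reviewerRatings"].groupby(level="reviewerName")
}
print(nested)
```

Because the comprehension iterates over the groups in index order, the reviewer keys come out in the same sorted order as the nested DataFrame.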

Tags: python, dictionary, pandas, dataframe, nested

Asked by Mick · 1 answer by jezrael
