I have a dataframe: <pre class="prettyprint"><code> Name Sub_Marks 0 Tom {'Maths': 30, 'English': 40, 'Science': 35} 1 Harry {'Maths': 35, 'English': 30, 'Science': 25} 2 Nick {'Maths': 32, 'English': 23, 'Science': 20} </code></pre> I need to explode the dictionary into multiple rows. E.g. <pre class="prettyprint"><code> Name Subject Marks 0 Tom Maths 30 1 Tom English 40 2 Tom Science 35 3 Harry Maths 35 4 Harry English 30 5 Harry Science 25 6 Nick Maths 32 7 Nick English 23 8 Nick Science 20 </code></pre> I know we can explode list in dataframe. Is there any way to do it with dictionary?

We can create a list of tuples of <code>Name</code>, <code>Marks</code> and <code>Subjects</code> by iterating over the values of dataframe inside a list comprehension, then we can create a new dataframe from this list of tuples <pre class="prettyprint"><code>out = pd.DataFrame([(n, k, v) for (n, d) in df.values for k, v in d.items()]) out.columns = ['Name', 'Subject', 'Marks'] </code></pre> Alternative <code>pandas</code> based approach <pre class="prettyprint"><code>m = pd.DataFrame([*df['Sub_Marks']], df.index).stack()\ .rename_axis([None,'Subject']).reset_index(1, name='Marks') out = df[['Name']].join(m) </code></pre> <hr> <pre class="prettyprint"><code>>>> out Name Subject Marks 0 Tom Maths 30 1 Tom English 40 2 Tom Science 35 3 Harry Maths 35 4 Harry English 30 5 Harry Science 25 6 Nick Maths 32 7 Nick English 23 8 Nick Science 20 </code></pre>

Alternate method via <code>explode</code> - <pre class="prettyprint"><code>df['Marks'] = df['Sub_Marks'].apply(lambda x: x.values()) df = df.apply(pd.Series.explode).rename(columns = {'Sub_Marks': 'Subject'}) </code></pre> You might wanna use <code>ast.literal_eval</code> first (if above method doesn't work) - <pre class="prettyprint"><code>import ast df['Sub_Marks'] = df['Sub_Marks'].apply(ast.literal_eval) </code></pre>

You can use <code>.apply()</code> with <code>pd.Series()</code> to 'explode' the dictionary into columns and then use <code>.melt()</code> to transform the columns into <code>Subject</code> and <code>Marks</code> columns, as follows: <pre class="prettyprint"><code>(df.drop(columns='Sub_Marks') .join(df.apply(lambda x: pd.Series(x['Sub_Marks']), axis=1)) .melt(id_vars='Name', value_vars=['Maths', 'English', 'Science'], var_name='Subject', value_name='Marks') .sort_values('Name') ).reset_index(drop=True) </code></pre> You can also use <code>pd.DataFrame()</code> together with <code>to_list()</code> to 'explode' the dictionary: <pre class="prettyprint"><code>(df.join(pd.DataFrame(df.pop('Sub_Marks').to_list())) .melt(id_vars='Name', value_vars=['Maths', 'English', 'Science'], var_name='Subject', value_name='Marks') .sort_values('Name') ).reset_index(drop=True) </code></pre> Output: <pre class="prettyprint"><code> Name Subject Marks 0 Harry Maths 35 1 Harry English 30 2 Harry Science 25 3 Nick Maths 32 4 Nick English 23 5 Nick Science 20 6 Tom Maths 30 7 Tom English 40 8 Tom Science 35 </code></pre>

Pandas explode dictionary to rows

Tags:

python

pandas

dataframe

I have a dataframe:

    Name                                    Sub_Marks
0    Tom  {'Maths': 30, 'English': 40, 'Science': 35}
1  Harry  {'Maths': 35, 'English': 30, 'Science': 25}
2   Nick  {'Maths': 32, 'English': 23, 'Science': 20}

I need to explode the dictionary into multiple rows. E.g.

    Name  Subject  Marks
0    Tom    Maths     30
1    Tom  English     40
2    Tom  Science     35
3  Harry    Maths     35
4  Harry  English     30
5  Harry  Science     25
6   Nick    Maths     32
7   Nick  English     23
8   Nick  Science     20

I know we can explode list in dataframe. Is there any way to do it with dictionary?

295

asked Apr 30 '21 15:04

Asif Iqbal

Video Answer

4 Answers

We can create a list of tuples of Name, Marks and Subjects by iterating over the values of dataframe inside a list comprehension, then we can create a new dataframe from this list of tuples

out = pd.DataFrame([(n, k, v) for (n, d) in df.values for k, v in d.items()])
out.columns = ['Name', 'Subject', 'Marks']

Alternative pandas based approach

m = pd.DataFrame([*df['Sub_Marks']], df.index).stack()\
      .rename_axis([None,'Subject']).reset_index(1, name='Marks')

out = df[['Name']].join(m)

>>> out

    Name  Subject  Marks
0    Tom    Maths     30
1    Tom  English     40
2    Tom  Science     35
3  Harry    Maths     35
4  Harry  English     30
5  Harry  Science     25
6   Nick    Maths     32
7   Nick  English     23
8   Nick  Science     20

answered Oct 21 '22 15:10

Shubham Sharma

You can extract only the values from the dictionaries and then expand it to multiple columns like this:

data = {'Name' : ['Tom', "Harry", "Nick"], "Sub_Marks" : [{'Maths': 30, 'English': 40, 'Science': 35},{'Maths': 35, 'English': 42, 'Science': 31},{'Maths': 20, 'English': 14, 'Science': 65}]}
df = pd.DataFrame(data)

df[['Maths','English', 'Science']] = df['Sub_Marks'].apply(pd.Series)
df.drop(columns=['Sub_Marks'], inplace=True)
df = df.set_index('Name').stack().reset_index()
df.columns = ['Name', 'Subject', 'Marks']

answered Oct 21 '22 14:10

Aditya

Alternate method via explode -

df['Marks'] = df['Sub_Marks'].apply(lambda x: x.values())
df = df.apply(pd.Series.explode).rename(columns = {'Sub_Marks': 'Subject'})

You might wanna use ast.literal_eval first (if above method doesn't work) -

import ast
df['Sub_Marks'] = df['Sub_Marks'].apply(ast.literal_eval)

answered Oct 21 '22 13:10

Nk03

You can use .apply() with pd.Series() to 'explode' the dictionary into columns and then use .melt() to transform the columns into Subject and Marks columns, as follows:

(df.drop(columns='Sub_Marks')
   .join(df.apply(lambda x: pd.Series(x['Sub_Marks']), axis=1))
   .melt(id_vars='Name', value_vars=['Maths', 'English', 'Science'], var_name='Subject', value_name='Marks')
   .sort_values('Name')
).reset_index(drop=True)

You can also use pd.DataFrame() together with to_list() to 'explode' the dictionary:

(df.join(pd.DataFrame(df.pop('Sub_Marks').to_list()))
   .melt(id_vars='Name', value_vars=['Maths', 'English', 'Science'], var_name='Subject', value_name='Marks')
   .sort_values('Name')
).reset_index(drop=True)

Output:

    Name  Subject  Marks
0  Harry    Maths     35
1  Harry  English     30
2  Harry  Science     25
3   Nick    Maths     32
4   Nick  English     23
5   Nick  Science     20
6    Tom    Maths     30
7    Tom  English     40
8    Tom  Science     35