
Convert a Pandas DataFrame to a dictionary

I have a DataFrame with four columns and want to convert it to a Python dictionary. The elements of the first column should be the keys, and the elements of the other columns in the same row should be the values.

DataFrame:

    ID   A   B   C
0   p    1   3   2
1   q    4   3   2
2   r    4   0   9  

Output should be like this:

Dictionary:

{'p': [1,3,2], 'q': [4,3,2], 'r': [4,0,9]}
Prince Bhatti asked Nov 03 '14 14:11


People also ask

How do I convert a pandas DataFrame to a dictionary?

To convert a pandas DataFrame to a dictionary, use the to_dict() method. Its orient argument defaults to 'dict', so when no orient is specified the result has the form {column -> {index -> value}}.

Can we convert DataFrame to dictionary in Python?

Yes. The to_dict() method converts a DataFrame into a dictionary whose values are dictionaries, lists, or Series, depending on the orient parameter. orient is a string ('dict', 'list', 'series', 'split', 'records', or 'index') that determines the layout of the result and the type of its values.


2 Answers

The to_dict() method sets the column names as dictionary keys so you'll need to reshape your DataFrame slightly. Setting the 'ID' column as the index and then transposing the DataFrame is one way to achieve this.

to_dict() also accepts an 'orient' argument which you'll need in order to output a list of values for each column. Otherwise, a dictionary of the form {index: value} will be returned for each column.

These steps can be done with the following line:

>>> df.set_index('ID').T.to_dict('list')
{'p': [1, 3, 2], 'q': [4, 3, 2], 'r': [4, 0, 9]}
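For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the DataFrame from the question and applies the line above. The variable name `result` is only illustrative. One assumption worth stating: all value columns here are integers, so transposing does not change their dtype; with mixed dtypes the transposed frame would hold generic objects.

```python
import pandas as pd

# Rebuild the DataFrame from the question
df = pd.DataFrame({
    'ID': ['p', 'q', 'r'],
    'A': [1, 4, 4],
    'B': [3, 3, 0],
    'C': [2, 2, 9],
})

# 'ID' becomes the index, the transpose turns each ID into a column,
# and orient='list' collects each column's values into a list
result = df.set_index('ID').T.to_dict('list')
print(result)
```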

In case a different dictionary format is needed, here are examples of the possible orient arguments. Consider the following simple DataFrame:

>>> df = pd.DataFrame({'a': ['red', 'yellow', 'blue'], 'b': [0.5, 0.25, 0.125]})
>>> df
        a      b
0     red  0.500
1  yellow  0.250
2    blue  0.125

Then the options are as follows.

dict - the default: column names are keys, values are dictionaries of index:data pairs

>>> df.to_dict('dict')
{'a': {0: 'red', 1: 'yellow', 2: 'blue'},
 'b': {0: 0.5, 1: 0.25, 2: 0.125}}

list - keys are column names, values are lists of column data

>>> df.to_dict('list')
{'a': ['red', 'yellow', 'blue'],
 'b': [0.5, 0.25, 0.125]}

series - like 'list', but values are Series

>>> df.to_dict('series')
{'a': 0       red
      1    yellow
      2      blue
      Name: a, dtype: object,
 'b': 0    0.500
      1    0.250
      2    0.125
      Name: b, dtype: float64}

split - splits columns/data/index as keys with values being column names, data values by row and index labels respectively

>>> df.to_dict('split')
{'columns': ['a', 'b'],
 'data': [['red', 0.5], ['yellow', 0.25], ['blue', 0.125]],
 'index': [0, 1, 2]}

records - each row becomes a dictionary where key is column name and value is the data in the cell

>>> df.to_dict('records')
[{'a': 'red', 'b': 0.5},
 {'a': 'yellow', 'b': 0.25},
 {'a': 'blue', 'b': 0.125}]

index - like 'records', but a dictionary of dictionaries with keys as index labels (rather than a list)

>>> df.to_dict('index')
{0: {'a': 'red', 'b': 0.5},
 1: {'a': 'yellow', 'b': 0.25},
 2: {'a': 'blue', 'b': 0.125}}
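To tie the 'index' orient back to the original question, the sketch below applies it to the question's DataFrame after setting 'ID' as the index. This keeps the column names alongside each value, which may or may not be what you want; the names `by_id` and `as_lists` are illustrative only.

```python
import pandas as pd

# The DataFrame from the question
df = pd.DataFrame({
    'ID': ['p', 'q', 'r'],
    'A': [1, 4, 4],
    'B': [3, 3, 0],
    'C': [2, 2, 9],
})

# One inner dictionary per ID, keyed by column name
by_id = df.set_index('ID').to_dict('index')
print(by_id)

# Dropping the column names gives the list form asked for in the question
as_lists = {key: list(row.values()) for key, row in by_id.items()}
print(as_lists)
```

No transpose is needed with this orient, at the cost of an extra step if plain lists are required.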
Alex Riley answered Oct 24 '22 09:10


If you need a dictionary like:

{'red': 0.5, 'yellow': 0.25, 'blue': 0.125}

from a two-column DataFrame like:

        a      b
0     red  0.500
1  yellow  0.250
2    blue  0.125

the simplest way is:

dict(df.values)

A working snippet:

import pandas as pd
df = pd.DataFrame({'a': ['red', 'yellow', 'blue'], 'b': [0.5, 0.25, 0.125]})
# Each row of df.values is a (key, value) pair, so dict() can consume it directly
print(dict(df.values))
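As far as I can tell, dict(df.values) only works when the DataFrame has exactly two columns, because dict() expects pairs; with more columns it raises a ValueError. The sketch below compares it with two alternatives that name the key and value columns explicitly. The variable names are illustrative, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({'a': ['red', 'yellow', 'blue'], 'b': [0.5, 0.25, 0.125]})

# Relies on the frame having exactly two columns (key, value)
from_values = dict(df.values)

# Pairs up two named columns, so extra columns in df do not matter
from_zip = dict(zip(df['a'], df['b']))

# Uses column 'a' as the index and converts the 'b' Series to a dict
from_series = df.set_index('a')['b'].to_dict()

print(from_values)
print(from_zip)
print(from_series)
```

All three give the same mapping here; the explicit versions are a reasonable choice when the DataFrame may gain columns later.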
Muhammad Moiz Ahmed answered Oct 24 '22 09:10