Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas DataFrame from Dictionary with Lists

Tags:

python

pandas

I have an API that returns a single row of data as a Python dictionary. Most of the keys have a single value, but some of the keys have values that are lists (or even lists-of-lists or lists-of-dictionaries).

When I throw the dictionary into pd.DataFrame to try to convert it to a pandas DataFrame, it throws a "Arrays must be the same length" error. This is because it cannot process the keys which have multiple values (i.e. the keys which have values of lists).

How do I get pandas to treat the lists as 'single values'?

As a hypothetical example:

data = { 'building': 'White House', 'DC?': True,
         'occupants': ['Barack', 'Michelle', 'Sasha', 'Malia'] }

I want to turn it into a DataFrame like this:

ix   building         DC?      occupants
0    'White House'    True     ['Barack', 'Michelle', 'Sasha', 'Malia']
like image 499
Conway Avatar asked Nov 03 '15 16:11

Conway


People also ask

Can we create DataFrame from dictionary of lists?

It is the most commonly used pandas object. Creating pandas data-frame from lists using dictionary can be achieved in multiple ways. Let's discuss different ways to create a DataFrame one by one. With this method in Pandas, we can transform a dictionary of lists into a dataframe.

How do I make a pandas DataFrame from a list of dictionaries?

Use pd. DataFrame. from_dict() to transform a list of dictionaries to pandas DatFrame. This function is used to construct DataFrame from dict of array-like or dicts.

Can we create DataFrame from list and dictionary in Python?

When we create dataframe from a list of dictionaries, matching keys will be the columns and corresponding values will be the rows of the dataframe. If there is no matching values and columns in the dictionary, then NaN value will be inserted in the resulted dataframe. For example, Python3.

When we create DataFrame from dictionary of list then list becomes?

On Initialising a DataFrame object with this kind of dictionary, each item (Key / Value pair) in dictionary will be converted to one column i.e. key will become Column Name and list in the value field will be the column data i.e.


3 Answers

This works if you pass a list (of rows):

In [11]: pd.DataFrame(data)
Out[11]:
    DC?     building occupants
0  True  White House    Barack
1  True  White House  Michelle
2  True  White House     Sasha
3  True  White House     Malia

In [12]: pd.DataFrame([data])
Out[12]:
    DC?     building                         occupants
0  True  White House  [Barack, Michelle, Sasha, Malia]
like image 127
Andy Hayden Avatar answered Oct 17 '22 13:10

Andy Hayden


This turns out to be very trivial in the end

data = { 'building': 'White House', 'DC?': True, 'occupants': ['Barack', 'Michelle', 'Sasha', 'Malia'] }
df = pandas.DataFrame([data])
print df

Which results in:

    DC?     building                         occupants
0  True  White House  [Barack, Michelle, Sasha, Malia]
like image 6
Chinmay Kanchi Avatar answered Oct 17 '22 13:10

Chinmay Kanchi


Solution to make dataframe from dictionary of lists where keys become a sorted index and column names are provided. Good for creating dataframes from scraped html tables.

d = { 'B':[10,11], 'A':[20,21] }
df = pd.DataFrame(d.values(),columns=['C1','C2'],index=d.keys()).sort_index()
df

    C1  C2
A   20  21
B   10  11
like image 2
BSalita Avatar answered Oct 17 '22 13:10

BSalita