
How to create a Pandas DataFrame from dictionary of dataframes?

I have a dictionary of dataframes that all have the same columns and data structure. I want to essentially 'union' all of these into a single dataframe, where the dictionary keys become another column. The dictionary, df_list, looks like this:

{'A' :       col1  col2  col3
       001   val1  val2  val3
       002   val3  val4  val5

 'B' :       col1  col2  col3
       001   val1  val2  val3
       002   val3  val4  val5

...and so on

but I want:

key  Col1  Col2  Col3
A    val1  val2  val3
A    val4  val5  val6
B    val1  val2  val3
B    val4  val5  val6

I tried using pd.DataFrame.from_dict(), but either I am not using it correctly or I need something else.

final_df = pd.DataFrame.from_dict(df_list)

but I get: ValueError: If using all scalar values, you must pass an index

When I try passing an index, I get a single column back instead of a dataframe.

asked Jun 17 '19 20:06 by user3486773




1 Answer

This should do it:

import pandas as pd

df1 = pd.DataFrame({
    "col1":['val1','val3'],
    "col2":['val2','val3'],
    "col3":['val3','val5']
})


df2 = pd.DataFrame({
    "col1":['val7','val3'],
    "col2":['val2','val3'],
    "col3":['val3','val5']
})

pd_dct = {"A": df1, "B": df2}

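# add the dictionary key to each DataFrame as a new 'key' column
# (this modifies df1 and df2 in place)
for key, frame in pd_dct.items():
    frame['key'] = key

# stack the DataFrames row-wise into a single DataFrame
df = pd.concat(pd_dct.values())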
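
Note that the loop above writes the key column into df1 and df2 themselves. If that side effect is unwanted, a minimal sketch along the following lines leaves the originals untouched. It assumes the same sample data as above; DataFrame.assign returns a copy with the extra column, ignore_index=True gives the result a fresh 0..n-1 index, and the name combined is only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "col1": ["val1", "val3"],
    "col2": ["val2", "val3"],
    "col3": ["val3", "val5"],
})
df2 = pd.DataFrame({
    "col1": ["val7", "val3"],
    "col2": ["val2", "val3"],
    "col3": ["val3", "val5"],
})
pd_dct = {"A": df1, "B": df2}

# assign() returns a new DataFrame, so df1 and df2 are not modified
combined = pd.concat(
    [frame.assign(key=key) for key, frame in pd_dct.items()],
    ignore_index=True,  # renumber the rows instead of repeating 0, 1, 0, 1
)
print(combined)
```

The key column ends up last here; reorder the columns afterwards if it should come first.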

Alternatively, starting from the original dictionary (before the loop above has added a key column to each DataFrame), this can be done in one line:

pd.concat(pd_dct, axis=0).reset_index(level=0).rename({'level_0':'key'}, axis=1)
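
The rename step can probably be skipped by naming the key level up front. The sketch below assumes a fresh, unmodified dictionary (no key column added yet) and uses the names argument of pd.concat to label the outer index level, which reset_index(level="key") then turns into a column. The variable name result is only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "col1": ["val1", "val3"],
    "col2": ["val2", "val3"],
    "col3": ["val3", "val5"],
})
df2 = pd.DataFrame({
    "col1": ["val7", "val3"],
    "col2": ["val2", "val3"],
    "col3": ["val3", "val5"],
})
pd_dct = {"A": df1, "B": df2}

# passing the dict makes its keys the outer index level; names=["key"] labels that level
result = pd.concat(pd_dct, names=["key"]).reset_index(level="key")
print(result)
```

The original row labels are kept as the index; chain .reset_index(drop=True) if a fresh index is wanted.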
answered Oct 27 '22 16:10 by Ian