How to convert a list of Pydantic BaseModels to Pandas Dataframe

I can't seem to find any built-in way of simply converting a list of Pydantic BaseModels to a Pandas Dataframe.

from pydantic import BaseModel
import pandas as pd

class SomeModel(BaseModel):
    col1: str
    col2: str

data = [SomeModel(**{'col1': 'foo', 'col2': 'bar'})] * 10
pd.DataFrame(data)

Output

>>         0            1
>> 0  (col1, foo)  (col2, bar)
>> 1  (col1, foo)  (col2, bar)
>> ...

Here the field names end up inside the cells as (name, value) tuples instead of becoming column headers. A workaround is to do the following:

pd.DataFrame([model.dict() for model in data])

Output

>>    col1 col2
>> 0  foo  bar
>> 1  foo  bar
>> ...

However this method is a bit slow for larger amounts of data. Is there a faster way?
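
For context, the tuple output above seems to come from the fact that iterating over a Pydantic model yields (field name, value) pairs, which pandas then treats as row values. A minimal sketch, reusing the SomeModel class from the question (the variable names m, pairs and as_dict are only for illustration):

```python
from pydantic import BaseModel


class SomeModel(BaseModel):
    col1: str
    col2: str


m = SomeModel(col1="foo", col2="bar")

# Iterating a model yields (field_name, value) tuples,
# which is what ends up in the DataFrame cells.
pairs = list(m)
print(pairs)  # [('col1', 'foo'), ('col2', 'bar')]

# The same pairs can be fed to dict() to get a shallow mapping.
as_dict = dict(m)
print(as_dict)  # {'col1': 'foo', 'col2': 'bar'}
```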

Jerry, asked May 15 '20 08:05




2 Answers

A quick-and-dirty profiling run yields the following timings:

from pydantic import BaseModel
import pandas as pd
from fastapi.encoders import jsonable_encoder

class SomeModel(BaseModel):
    col1: int
    col2: str

data = [SomeModel(col1=1,col2="foo"),SomeModel(col1=2,col2="bar")]*4*10**5

import cProfile

cProfile.run( 'pd.DataFrame([s.dict() for s in data])' ) # around 8.2s
cProfile.run( 'pd.DataFrame(jsonable_encoder(data))' ) # around 30.8s
cProfile.run( 'pd.DataFrame([s.__dict__ for s in data])' ) # around 1.7s
cProfile.run( 'pd.DataFrame([dict(s) for s in data])' ) # around 3s
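
One caveat worth checking before picking the fastest variant: as far as I can tell, s.__dict__ and dict(s) are shallow, so nested models stay as model objects in the cells, while .dict() converts them recursively. A small sketch under that assumption; Inner, Outer and to_plain_dict are hypothetical names, and the helper only exists because Pydantic v2 renamed .dict() to .model_dump():

```python
from pydantic import BaseModel
import pandas as pd


class Inner(BaseModel):
    x: int


class Outer(BaseModel):
    name: str
    inner: Inner


def to_plain_dict(model):
    # Hypothetical helper: use .model_dump() on Pydantic v2, .dict() on v1.
    dump = getattr(model, "model_dump", None) or model.dict
    return dump()


rows = [Outer(name="a", inner=Inner(x=1)), Outer(name="b", inner=Inner(x=2))]

# Shallow: the "inner" column holds Inner instances.
shallow = pd.DataFrame([r.__dict__ for r in rows])

# Recursive: the "inner" column holds plain dicts.
deep = pd.DataFrame([to_plain_dict(r) for r in rows])

print(type(shallow.loc[0, "inner"]).__name__)  # Inner
print(type(deep.loc[0, "inner"]).__name__)     # dict
```

For flat models like the SomeModel above, the two approaches should give the same frame, so the shallow one is probably fine there.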
camo, answered Nov 12 '22 17:11


Not sure if it's faster, but FastAPI exposes jsonable_encoder, which essentially performs the same transformation on an arbitrarily nested structure of BaseModel instances:

import pandas as pd
from fastapi.encoders import jsonable_encoder

# data is the list of models from the question
pd.DataFrame(jsonable_encoder(data))
patricksurry, answered Nov 12 '22 17:11