Using the PeeWee ORM I have the following query:
query = DataModel.select().where(DataModel.field == "value")
Is there any way to convert query into a pandas DataFrame without iterating over all the values? I'm looking for a more "Pythonic" way of doing this.
Assuming query is of type peewee.SelectQuery, you could do:
df = pd.DataFrame(list(query.dicts()))
EDIT: As Nicola points out below, you're now able to do pd.DataFrame(query.dicts()) directly.
Just in case someone finds this useful, I was searching for the same conversion but in Python 3. Inspired by @toto_tico's previous answer, this is what I came up with:
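import pandas
import peewee

def data_frame_from_peewee_query(query: peewee.Query) -> pandas.DataFrame:
    # Reuse the connection of the database the query is bound to (private attribute).
    connection = query._database.connection()  # noqa
    # query.sql() returns the SQL string together with its bound parameters.
    sql, params = query.sql()
    return pandas.read_sql_query(sql, connection, params=params)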
Checked with Python 3.9.6, pandas==1.3.2 and peewee==3.14.4, using peewee.SqliteDatabase.
The following is a more efficient way, because it avoids building a list and then passing it to the pandas DataFrame. It also has the side benefit of preserving the order of the columns:
sql, params = query.sql()  # params holds the values bound to the placeholders in sql
df = pd.read_sql(sql, database.connection(), params=params)
You need direct access to the peewee database, which in the quickstart tutorial corresponds to:
db = SqliteDatabase('people.db')
Of course, you can also create your own connection to the database.
Drawback: be careful when the query joins tables that share column names; e.g. if both tables have an id column, it will appear twice in the DataFrame. Make sure to rename or drop those columns before continuing.
If you are using a peewee proxy:
import peewee as pw
database_proxy = pw.Proxy()
then the connection is here:
database_proxy.obj.connection()