I have a pandas DataFrame with a column of IDs. I need to run another SQL query whose WHERE clause is dictated by all of the IDs in that column.
Ex:
df1 = DataFrame({'IDs' : [1,2,3,4,5,6]})
query = """ Select id, SUM(revenue) AS revenue WHERE id IN (***want df1['IDs'] here***) Group by 1"""
df2 = my_database.select_dataframe(query)
Pandasql can query both pandas DataFrames and Series. Its sqldf function takes two inputs: the SQL query string, and the environment in which to look up the tables, usually the result of globals() or locals().
The WHERE clause can be used with SQL statements like INSERT, UPDATE, SELECT, and DELETE to filter records and perform various operations on the data.
The IN operator allows you to specify multiple values in a WHERE clause. The IN operator is a shorthand for multiple OR conditions.
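To make the "shorthand for multiple OR conditions" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table t and its contents are made up for illustration and are not part of the question.

```python
import sqlite3

# Hypothetical in-memory table with ids 1..6, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1, 7)])

# Same filter written with IN and with chained OR conditions.
with_in = conn.execute(
    "SELECT id FROM t WHERE id IN (2, 4, 6) ORDER BY id"
).fetchall()
with_or = conn.execute(
    "SELECT id FROM t WHERE id = 2 OR id = 4 OR id = 6 ORDER BY id"
).fetchall()

print(with_in == with_or)  # both queries select the same rows
```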
One simple way to iterate over the columns of a pandas DataFrame is a for loop: loop over the column labels and look up each column with the get-item syntax ([]). The values attribute extracts a column's elements as an array, and tolist() converts them to a list.
Convert the series to a string:
id_list = ','.join([str(x) for x in df1['IDs'].tolist()])
id_list
'1,2,3,4,5,6'
And then insert it into the query string:
qry = "Select id, SUM(revenue) AS revenue WHERE id IN (%s) Group by 1" % id_list
qry
'Select id, SUM(revenue) AS revenue WHERE id IN (1,2,3,4,5,6) Group by 1'
For this to work for me, I had to surround the list items with single quotes:
id_list = ','.join(["'" + str(x) + "'" for x in df1['IDs'].tolist()])
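If the database driver supports placeholders, another option is to let it do the quoting instead of formatting the values into the string yourself. Below is a minimal sketch, assuming a DB-API driver that uses ? placeholders (sqlite3 here; other drivers may use %s). The sales table, its rows, and the build_in_query helper are hypothetical and only stand in for the real database.

```python
import sqlite3

def build_in_query(ids):
    """Return (sql, params) with one '?' placeholder per ID."""
    ids = list(ids)
    if not ids:
        raise ValueError("need at least one ID; 'IN ()' is not valid SQL")
    placeholders = ",".join("?" for _ in ids)
    sql = (
        "SELECT id, SUM(revenue) AS revenue "
        "FROM sales "  # hypothetical table name
        f"WHERE id IN ({placeholders}) "
        "GROUP BY id"
    )
    return sql, ids

# Hypothetical in-memory table standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 10.0), (1, 5.0), (2, 7.5), (9, 100.0)],
)

ids = [1, 2, 3, 4, 5, 6]  # with pandas: df1['IDs'].tolist()
sql, params = build_in_query(ids)
rows = conn.execute(sql + " ORDER BY id", params).fetchall()
print(rows)  # [(1, 15.0), (2, 7.5)]
```

The driver handles quoting for both numeric and string IDs, so no manual single quotes are needed. Very long ID lists can hit a driver's limit on the number of parameters, in which case the list would have to be split into chunks.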