I am trying to execute queries on a large duckdb connection that takes user inputs, some inputs may end up taking too long and I am having trouble figuring out how to time out the queries in the event they take >2 min to conserve resources
Signal is not an option for me since I am not using the main thread
I can not utilize a lot of common libraries that use pickle since duckdb connections are not pickleable
Using concurrent futures has odd behavior for me where the timeout will occur, but then my grpc server response will not continue
def query_timeout(self, request, dev_oid):
with concurrent.futures.ThreadPoolExecutor() as executor:
future = executor.submit(query_duckdb, request, self.duckdb_client)
try:
# Get the result within timeout (10 seconds)
result = future.result(timeout=10)
# Return the result
return result
except concurrent.futures.TimeoutError:
logger.info(f"Query Timeout on request: {request}")
return "Timed out", None
above is the way I tried to implement that
Any solution is viable etc. using grpc server timeouts, python timeouts, or duckdb settings
You can interrupt all pending operations of a connection using con.interrupt. You can find the docs here. Below is a simple example of how to use this with threading.Timer.
It will throw an duckdb.duckdb.InterruptException if the interrupt is triggered.
import duckdb
import threading
con = duckdb.connect()
sql = "SELECT * FROM range(1000000000)" # example long query
# schedule an interrupt in 10s
timeout_sec = 10
timer = threading.Timer(timeout_sec, con.interrupt)
timer.start()
res = con.execute(sql).fetchall()
timer.cancel() # cancel if finished in time
print(f"Query executed successfully, fetched {len(res)} rows.")
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With