I am doing testing against AWS Redshift, and to replicate real world scenarios I need my test queries to not be cached so as not to give a false picture of performance. Is there any way for me to clear the Redshift cache between query runs?
When a user submits a query, Amazon Redshift checks the results cache for a valid, cached copy of the query results. If a match is found in the result cache, Amazon Redshift uses the cached results and doesn't run the query. Result caching is transparent to the user. Result caching is turned on by default.
To delete rows in a Redshift table, use the DELETE FROM statement: DELETE FROM products WHERE product_id=1; The WHERE clause is optional, but you'll usually want it, unless you really want to delete every row from the table.
Dataset size – A higher volume of data in the cluster can slow query performance, because more rows need to be scanned and redistributed. You can mitigate this effect by regularly vacuuming and archiving data, and by using a predicate to restrict the query dataset.
For Redshift Spectrum, Amazon Redshift manages all the computing infrastructure, load balancing, planning, scheduling, and execution of your queries on data stored in Amazon S3.
I believe you can disable the cache for your testing sessions by setting enable_result_cache_for_session to off.
From the documentation:
If enable_result_cache_for_session is off, Amazon Redshift ignores the results cache and executes all queries when they are submitted.
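As a rough sketch of how this could look in a test session: the setting applies only to the current session, so it would need to be issued on each connection your test harness opens. The table name my_table is a placeholder, not something from the question. The last query assumes you can read the SVL_QLOG system view, whose source_query column is NULL when a query was not answered from the result cache.

```sql
-- Turn off the result cache for this session only.
SET enable_result_cache_for_session TO off;

-- Run the query under test (my_table is a hypothetical table name).
SELECT COUNT(*) FROM my_table;

-- Optional check: source_query is NULL for queries that were actually
-- executed rather than served from the result cache.
SELECT query, source_query, elapsed, substring
FROM svl_qlog
ORDER BY starttime DESC
LIMIT 5;
```

Note that this only bypasses the result cache. Repeated runs may still be faster than a first run for other reasons, so it is worth discarding the first execution of each query when measuring.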