
Long runtime when a query is executed for the first time in Redshift

I noticed that the first time I run a query on Redshift, it takes 3-10 seconds. When I run the same query again, even with different arguments in the WHERE condition, it runs fast (0.2 sec). The query I am talking about runs on a table of ~1M rows, on 3 integer columns.

Is this huge difference in execution times caused by the fact that Redshift compiles the query the first time it's run, and then reuses the compiled code?

If yes, how can I keep this cache of compiled queries warm at all times?

One more question: given queryA and queryB, assume queryA was compiled and executed first. How similar must queryB be to queryA for the execution of queryB to reuse the code compiled for queryA?

diemacht Avatar asked Nov 21 '13 07:11



1 Answer

The answer to the first question is yes. Amazon Redshift compiles code for the query and caches it. The compiled code is shared across sessions in a cluster, so the same query, even with different parameters and in a different session, runs faster on later executions because it skips the compilation overhead.

Amazon also recommends using the result of the second execution of a query when benchmarking.
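To make that benchmarking advice concrete, here is a minimal sketch of timing a query several times and keeping the first run separate from the later ones. It is only an illustration: `time_query` and `run_query` are hypothetical names, and `run_query` stands for any zero-argument callable that executes your SQL through whatever database driver you use. The `fake_query` function merely simulates a one-time compilation delay so the sketch runs without a cluster.

```python
import time
from typing import Callable, List, Tuple


def time_query(run_query: Callable[[], object], runs: int = 3) -> Tuple[float, List[float]]:
    """Run the query `runs` times and return (first_run_seconds, later_run_seconds).

    `run_query` is a hypothetical callable wrapping your own driver call.
    The first timing includes any one-time compilation cost; the later
    timings are the ones to use as benchmark numbers.
    """
    if runs < 2:
        raise ValueError("need at least 2 runs to separate the first run from the rest")
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings[0], timings[1:]


if __name__ == "__main__":
    # Stand-in for a real query: sleeps longer on the first call only,
    # imitating a compile step followed by cached executions.
    state = {"compiled": False}

    def fake_query() -> None:
        if not state["compiled"]:
            time.sleep(0.05)  # simulated one-time compilation
            state["compiled"] = True
        time.sleep(0.005)  # simulated execution

    first, later = time_query(fake_query, runs=3)
    print(f"first run: {first:.3f}s, later runs: {[round(t, 3) for t in later]}")
```

With a real connection you would pass something like a small function that executes the statement and fetches the rows, and report the later timings rather than the first one.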

The answer to this question, along with more details, is in the documentation at the following link: http://docs.aws.amazon.com/redshift/latest/dg/c-compiled-code.html

Masashi M Avatar answered Nov 03 '22 02:11
