
AWS Redshift: should DISTKEY / SORTKEY columns be compressed?

I have a question about column compression on AWS Redshift. We are currently checking how much we can improve performance by choosing an appropriate diststyle, sort keys, and column compression.

If my understanding is correct, column compression can help reduce I/O cost. I ran "analyze compression table_name;", and Redshift mostly suggests 'zstd' or 'lzo' as the compression encoding for our columns.

Generally speaking, should the columns set as DISTKEY/SORTKEY also be compressed like the other columns?

I'm totally new to Redshift, so any advice would be appreciated.

Sincerely.

Sachiko asked Dec 18 '22 20:12


1 Answer

The DISTKEY column can be compressed, but the first SORTKEY column should be left uncompressed (ENCODE raw). If you have a compound sort key with multiple columns, the other sort key columns can be compressed.
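As a minimal sketch of that rule, the table below leaves the leading sort key column raw and compresses everything else, including the DISTKEY column and the second sort key column. The table and column names (sales_events, event_time, customer_id, and so on) are hypothetical, and zstd is used only because it is one of the encodings the question mentions; your own "analyze compression" output may suggest something different.

```sql
-- Hypothetical table: names and column types are assumptions for illustration.
CREATE TABLE sales_events (
    event_time   TIMESTAMP      ENCODE raw,   -- first sort key column: left uncompressed
    customer_id  BIGINT         ENCODE zstd,  -- DISTKEY and second sort key column: compressed
    product_id   INTEGER        ENCODE zstd,
    amount       DECIMAL(12,2)  ENCODE zstd
)
DISTSTYLE KEY
DISTKEY (customer_id)
COMPOUND SORTKEY (event_time, customer_id);

-- Check which encoding, distkey and sortkey settings each column ended up with.
-- PG_TABLE_DEF only lists tables in schemas on your search_path.
SELECT "column", type, encoding, distkey, sortkey
FROM pg_table_def
WHERE tablename = 'sales_events';
```

In the PG_TABLE_DEF output, the encoding column shows "none" for a raw column, and the sortkey column shows each column's position in the sort key (0 for columns that are not part of it).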

Also, I generally recommend using a commonly filtered date/timestamp column (if one exists) as the first column in a compound sort key.

Finally, if you are joining very large tables, try using the same dist and sort keys on both tables so that Redshift can use a faster merge join.
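To illustrate the join advice, here is a rough sketch with two hypothetical tables (orders and order_items) that share the join column as both DISTKEY and SORTKEY. All names are assumptions for illustration. Whether the planner actually picks a merge join also depends on other factors, such as how well sorted the tables currently are, so treat EXPLAIN as the way to confirm it rather than as a guarantee.

```sql
-- Hypothetical tables: both are distributed and sorted on the join column.
CREATE TABLE orders (
    order_id     BIGINT  ENCODE raw,   -- first (and only) sort key column: uncompressed
    customer_id  BIGINT  ENCODE zstd,
    order_date   DATE    ENCODE zstd
)
DISTKEY (order_id)
SORTKEY (order_id);

CREATE TABLE order_items (
    order_id    BIGINT   ENCODE raw,   -- same dist and sort key as orders
    product_id  INTEGER  ENCODE zstd,
    quantity    INTEGER  ENCODE zstd
)
DISTKEY (order_id)
SORTKEY (order_id);

-- Inspect the plan: look for a Merge Join step instead of a Hash Join.
EXPLAIN
SELECT o.order_id, o.order_date, i.product_id, i.quantity
FROM orders o
JOIN order_items i ON o.order_id = i.order_id;
```

If the plan still shows a hash join after loading data, the tables may have a large unsorted region; running VACUUM and ANALYZE on both tables and checking the plan again is a reasonable next step.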

Joe Harris answered Jan 13 '23 11:01