I have a question about column compression on AWS Redshift. We're currently checking how much we can improve performance with an appropriate DISTSTYLE, sort keys, and column compression.
If my understanding is correct, column compression helps reduce I/O cost. I ran "analyze compression table_name;", and for most of our columns Redshift suggests 'zstd' or 'lzo' as the compression encoding.
Generally speaking, should the columns set as DISTKEY/SORTKEY also be compressed like the other columns?
I'm totally new to Redshift, and any advice would be appreciated.
Sincerely.
The DISTKEY column can be compressed, but the first SORTKEY column should be left uncompressed (ENCODE raw). If you have multiple sort keys (a compound sort key), the other sort key columns can be compressed.
Also, I generally recommend using a commonly filtered date/timestamp column (if one exists) as the first column in a compound sort key.
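To make the encoding advice concrete, here is a minimal sketch of a table definition. The table and column names (sales_events, event_date, customer_id, and so on) are hypothetical and not from your schema, and zstd is used only because ANALYZE COMPRESSION suggested it for most of your columns; adjust to whatever it recommends per column.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE sales_events (
    event_date   DATE          ENCODE raw  NOT NULL,  -- first sort key column: left uncompressed
    customer_id  BIGINT        ENCODE zstd NOT NULL,  -- DISTKEY and second sort key column: compressed
    event_type   VARCHAR(32)   ENCODE zstd,
    amount       DECIMAL(12,2) ENCODE zstd
)
DISTSTYLE KEY
DISTKEY (customer_id)
COMPOUND SORTKEY (event_date, customer_id);
```

Here event_date leads the compound sort key on the assumption that most of your queries filter on a date range.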
Finally, if you are joining between very large tables try using the same dist and sort keys on both tables so Redshift can use a faster merge join.
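As a rough sketch of that layout, assume two hypothetical tables, orders and order_items, joined on order_id. As far as I know, the planner only chooses a merge join when the join column is both the distribution key and the sort key on both tables and the tables are mostly sorted, so treat this as something to verify with EXPLAIN rather than a guarantee.

```sql
-- Hypothetical tables sharing the same dist key and sort key (order_id).
CREATE TABLE orders (
    order_id    BIGINT ENCODE raw NOT NULL,   -- sort key column: left uncompressed
    order_date  DATE   ENCODE zstd,
    customer_id BIGINT ENCODE zstd
)
DISTSTYLE KEY
DISTKEY (order_id)
SORTKEY (order_id);

CREATE TABLE order_items (
    order_id  BIGINT  ENCODE raw NOT NULL,    -- same dist key and sort key as orders
    item_id   BIGINT  ENCODE zstd,
    quantity  INTEGER ENCODE zstd
)
DISTSTYLE KEY
DISTKEY (order_id)
SORTKEY (order_id);

-- Inspect the plan and look for a "Merge Join" step instead of a "Hash Join".
EXPLAIN
SELECT o.order_id, SUM(i.quantity) AS total_quantity
FROM orders o
JOIN order_items i ON i.order_id = o.order_id
GROUP BY o.order_id;
```

If the plan still shows a hash join after loading data, running VACUUM and ANALYZE on both tables and checking the plan again is a reasonable next step.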