in the Snowflake Docs it says:
What is meant with the second step?
Let's take the example table t1 shown in the link. In this example table I use the following query:
SELECT * FROM t1
WHERE
Date = ‚11/3‘ AND
Name = ‚C‘
Because of the Date = ‚11/3‘ it would only scan micro partitions 2, 3 and 4. Because of the Name = 'C' it can prune even more and only scan micro-partions 2 and 4.
So in the end only micro-partitions 2 and 4 would be scanned.
But where does the second step come into play? What is meant with prune by column within the remaining micro partitions?
Does it mean, that only rows 4, 5 and 6 on micro-partition 2 and row 1 on micro-partition 4 are scanned, because date is my clustering key and is sorted so you can prune even further with the date?
So in the end only 4 rows would be scanned?
But where does the second step come into play? What is meant with prune by column within the remaining micro partitions?
Benefits of Micro-partitioning:
Columns are stored independently within micro-partitions, often referred to as columnar storage.
This enables efficient scanning of individual columns; only the columns referenced by a query are scanned.
It is recommended to avoid SELECT *
and specify required columns explicitly.
It simply means to only select the columns that are required for the query. So in your example it would be:
SELECT col_1, col_2 FROM t1
WHERE
Date = ‚11/3‘ AND
Name = ‚C‘
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With