Snowflake query pruning by Column

Question

in the Snowflake Docs it says:

First, prune micro-partitions that are not needed for the query.
Then, prune by column within the remaining micro-partitions.

What is meant with the second step?

Let's take the example table t1 shown in the link. In this example table I use the following query:

SELECT * FROM t1
WHERE
Date = ‚11/3‘ AND
Name = ‚C‘

Because of the Date = ‚11/3‘ it would only scan micro partitions 2, 3 and 4. Because of the Name = 'C' it can prune even more and only scan micro-partions 2 and 4.

So in the end only micro-partitions 2 and 4 would be scanned.

But where does the second step come into play? What is meant with prune by column within the remaining micro partitions?

Does it mean, that only rows 4, 5 and 6 on micro-partition 2 and row 1 on micro-partition 4 are scanned, because date is my clustering key and is sorted so you can prune even further with the date?

So in the end only 4 rows would be scanned?

Lukasz Szozda · Accepted Answer

But where does the second step come into play? What is meant with prune by column within the remaining micro partitions?

Benefits of Micro-partitioning:

Columns are stored independently within micro-partitions, often referred to as columnar storage.

This enables efficient scanning of individual columns; only the columns referenced by a query are scanned.

It is recommended to avoid SELECT * and specify required columns explicitly.

Simon D · Answer

It simply means to only select the columns that are required for the query. So in your example it would be:

SELECT col_1, col_2 FROM t1
WHERE
Date = ‚11/3‘ AND
Name = ‚C‘

Snowflake query pruning by Column

Tags:

snowflake-cloud-data-platform

Schnurres

2 Answers

Lukasz Szozda

Simon D

Recent Activity

Donate For Us

Snowflake query pruning by Column

Tags:

snowflake-cloud-data-platform

Schnurres

2 Answers

Lukasz Szozda

Simon D

Related questions

Recent Activity

Donate For Us