 

Incremental updates to a Transformed Table

I am trying to set up an ELT pipeline into Snowflake, and it involves a transformation after loading.

This transformation currently creates or replaces a table using data queried from a source table in Snowflake, after performing some manipulation of JSON data.

My question is: is `CREATE OR REPLACE TABLE` every time the transformation runs the proper way of doing it, or is there a way to update the data in the transformed table incrementally?
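For context, the full-refresh transform described above might look something like the following sketch. The table and column names (`raw_events`, `transformed_events`, a VARIANT column `raw`) are hypothetical stand-ins, since the question does not show the actual query:

```sql
-- Hypothetical shape of the current full-refresh transform:
-- rebuild the whole target table from the JSON source on every run.
CREATE OR REPLACE TABLE transformed_events AS
SELECT
    raw:"id"::NUMBER                    AS id,
    raw:"payload"."ts"::TIMESTAMP_NTZ   AS event_ts,
    raw:"payload"."amount"::FLOAT       AS amount
FROM raw_events;
```

Every run re-reads and rewrites the entire table, which is what the incremental approaches below try to avoid.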

Any advice would be greatly appreciated!

Thanks!

Asked Nov 21 '25 12:11 by Heng Zhi Feng


1 Answer

You can insert into the load (source) table and put a stream on it. The stream tells you which rows, or ranges of rows, need to be "reviewed", and you can then upsert those into the output transform table.
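A minimal sketch of that setup, assuming a source table named `raw_events` (the name is illustrative):

```sql
-- A stream records row-level changes (inserts, updates, deletes)
-- to the source table since it was last consumed.
CREATE OR REPLACE STREAM raw_events_stream ON TABLE raw_events;

-- After the next load, the stream exposes only the new/changed rows:
SELECT * FROM raw_events_stream;
```

Reading a stream inside a successful DML statement (such as the MERGE below) advances its offset, so each batch of changes is processed exactly once.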

That is, if you are doing something like "daily aggregates" and this batch contains data for the last four days, you read only the "last four days" of data from the source (versus a full read), then aggregate and upsert via the MERGE command. With this model you save on reads, aggregation, and writes.
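The daily-aggregate upsert described above could be sketched like this, again with hypothetical table and column names (`raw_events`, `raw_events_stream`, `daily_totals`):

```sql
-- Re-aggregate only the days touched by this batch (taken from the
-- stream), then upsert the recomputed totals into the target table.
MERGE INTO daily_totals AS t
USING (
    SELECT event_date, SUM(amount) AS total
    FROM raw_events
    WHERE event_date IN (SELECT DISTINCT event_date FROM raw_events_stream)
    GROUP BY event_date
) AS s
ON t.event_date = s.event_date
WHEN MATCHED THEN UPDATE SET t.total = s.total
WHEN NOT MATCHED THEN INSERT (event_date, total) VALUES (s.event_date, s.total);
```

Note that the re-aggregation reads all source rows for the affected days (not just the stream's rows), so partial days are recomputed correctly.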

We have also used high-water-mark tables to track the last data seen, and/or the lowest value in the current batch.
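A high-water-mark table is an alternative to streams: record how far you have processed, and on each run transform only rows beyond that mark. A sketch under assumed names (`etl_watermarks`, `raw_events` with a `loaded_at` column, `transformed_events`):

```sql
-- Hypothetical watermark table: one row per tracked source table.
CREATE TABLE IF NOT EXISTS etl_watermarks (
    table_name     STRING,
    last_loaded_at TIMESTAMP_NTZ
);

-- Transform only rows newer than the recorded high-water mark.
INSERT INTO transformed_events
SELECT raw:"id"::NUMBER, raw:"payload"."ts"::TIMESTAMP_NTZ
FROM raw_events
WHERE loaded_at > (SELECT MAX(last_loaded_at)
                   FROM etl_watermarks
                   WHERE table_name = 'raw_events');

-- Advance the mark after a successful run.
UPDATE etl_watermarks
SET last_loaded_at = (SELECT MAX(loaded_at) FROM raw_events)
WHERE table_name = 'raw_events';
```

This trades the stream's exactly-once bookkeeping for a simple timestamp comparison, which works well when rows arrive roughly in `loaded_at` order.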

Answered Nov 23 '25 10:11 by Simeon Pilgrim


