
Will dbt support temp table creation, like create table #temp1 as select * from tab1, or does it only work the CTE way?

Tags:

dbt

I found a way to handle temp tables in dbt: create them all in a pre-hook, then select from the final temp table in the model body, outside the pre-hook. I tested this and it works fine, and it cut the run time from more than 20 minutes to 1 minute. The one problem is that the temp tables don't appear in the lineage graph in the dbt docs. Is there any way to handle temp tables, other than a pre-hook, that keeps the lineage in the docs?

Niks A asked Jul 20 '20 18:07



1 Answer

You're right in thinking that dbt does not support temporary tables. That's because temporary tables only persist in a single session, and dbt opens one connection/session per thread. Therefore any temporary tables created on one thread would not be visible to a model running on a different thread.
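To make the session point concrete, here is a minimal sketch of what two threads would see. It assumes Postgres-style syntax and a hypothetical temp table named temp1; other warehouses spell this differently (the #temp1 form in the question, for example).

```sql
-- Session A (say, the connection used by dbt thread 1)
create temporary table temp1 as
select 1 as id;

select * from temp1;  -- works: temp1 exists in this session

-- Session B (a different connection, say dbt thread 2)
select * from temp1;  -- fails: temp1 was never created in this session
```

Since dbt decides which thread runs which model, a downstream model can't rely on landing in the session that created the temp table.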

It sounds like CTEs are a performance drag for you though — out of interest, which warehouse are you using?

You've identified two workarounds, and there's another one worth discussing:

Option 1: Materialize your model as CTEs using the ephemeral materialization (docs)

Pros:

  • The models show up in the lineage graph
  • You can re-use these transformations in multiple downstream models by ref-ing them
  • You can test and document these models

Cons:

  • At some point there is a performance degradation with too many stacked CTEs (especially on PostgreSQL versions before 12, where CTEs are an optimization fence)
  • Compiled SQL can be harder to debug
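As a rough sketch of Option 1, an ephemeral model and a model that uses it could look like the following. The file and model names are made up for illustration, and it assumes tab1 is itself a dbt model (if it's a raw table, you'd point at it with source() instead).

```sql
-- models/tab1_filtered.sql (hypothetical model name)
{{ config(materialized='ephemeral') }}

select *
from {{ ref('tab1') }}


-- models/final_model.sql (hypothetical model name)
-- At compile time dbt inlines tab1_filtered here as a CTE,
-- so nothing is created in the warehouse for it.
select *
from {{ ref('tab1_filtered') }}
```

Because the downstream model goes through ref, the dependency is recorded and appears in the lineage graph.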

Option 2: Use pre-hooks to create temp tables

I would generally recommend against this — you can't test or document your models, and they won't be in the lineage graph (as you've noted).
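For reference, the pre-hook approach described in the question probably looks something like this sketch. The model name is hypothetical, the temp table syntax is Postgres-style, and whether the hook and the model body share a session depends on your adapter, so treat it as an illustration only.

```sql
-- models/final_model.sql (hypothetical model name)
{{ config(
    materialized='table',
    pre_hook=[
      "create temporary table temp1 as select * from tab1"
    ]
) }}

-- temp1 and tab1 are referenced by their raw names rather than through
-- ref or source, so dbt has no node for them and records no dependency.
select *
from temp1
```

That raw reference is exactly why nothing upstream of this model appears in the docs.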

Option 3: Materialize these models as tables in a separate schema, and drop the schema at the end of a run

I think Michael's suggestion is a good one! I'd tweak it just a little bit:

  1. Use the schema config to materialize a model in a separate schema:

     {{ config(
       materialized='table',
       schema='my_temporary_schema'
     ) }}

  2. Then, at the end of a run, use an on-run-end hook (docs) to drop that schema. In your dbt_project.yml:

     on-run-end: "drop schema my_temporary_schema cascade"
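One thing to check before relying on the hook: with dbt's default schema naming, a custom schema is appended to your target schema, so the models may actually land in something like <target schema>_my_temporary_schema rather than my_temporary_schema. Assuming you haven't overridden that behavior, a small macro (the file and macro names here are hypothetical) can build the matching name:

```sql
-- macros/drop_temporary_schema.sql (hypothetical file and macro name)
-- Assumes the default naming: <target schema>_<custom schema>.
{% macro drop_temporary_schema() %}
    drop schema if exists {{ target.schema }}_my_temporary_schema cascade
{% endmacro %}
```

You would then call it from dbt_project.yml with on-run-end: "{{ drop_temporary_schema() }}". The if exists clause is there so the hook doesn't fail on a run that never built the schema; check that your warehouse accepts it.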

Pros:

  • All the benefits of Option 1
  • Sounds like it might be more performant than using CTEs

Cons:

  • Make sure you don't have any views that depend on tables in that schema. They might get dropped when you run a drop cascade command, which introduces fragility into your project.

Claire Carroll answered Nov 06 '22 11:11