Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using dbt and Meltano, how can I prevent multiple dbt job runs from conflicting with each other?

Tags:

dbt

meltano

When running dbt jobs in Meltano, dbt run jobs may collide with each other if run out of a triggered context - for instance, when an on-demand job collides with a scheduled job or a CI-based job.

If dbt run operates on the same tables at the same time, this generally causes a crash and sometimes a data quality issue if the same insert is performed twice on a single target table.

Any way to prevent run collisions, using either Meltano functionality or native dbt functionality?

like image 664
aaronsteers Avatar asked Dec 16 '21 17:12

aaronsteers


1 Answers

One way is to generate a lock in your target database. Here's an example for MSSQL.

I chose a on-run-start: hook. This hook attempts to grab a lock for dbt that lasts for the duration of the DB session.

dbt_project.yml

on-run-start: 
  - "{{ application_lock() }}" #Be sure only one dbt project runs at a time

Macro: application_lock.sql

{% macro application_lock() %}
    
DECLARE 
    @result int,
    @msg varchar(max);

EXEC @result = sp_getapplock @resource = 'dbt', @lockMode = 'Exclusive', @locktimeout = '1200000', @lockowner = 'Session';
IF @result not in (0, 1)
    BEGIN
    SET @msg = 'Result from sp_getapplock:'+ CAST(@result as varchar(max))+'. Lock could not be taken after waiting 20 minutes. Another DBT process probably has the applock, try again later.';
    THROW 51000, @msg, 1
    END

{% endmacro %}
like image 151
visch Avatar answered Oct 12 '22 11:10

visch