I am new to dbt and am trying to build a Docker container in which I can run dbt commands directly. I have a file (envs.sh) that exports environment variables and looks like:
export DB_HOST="secret"
export DB_PWD="evenabiggersecret"
My packages.yml looks like:
packages:
  - package: fishtown-analytics/dbt_utils
    version: 0.6.2
I structured my Dockerfile like:
FROM fishtownanalytics/dbt:0.19.0b1
# Define working directory
WORKDIR /usr/app/profile/
ENV DBT_DIR /usr/app
ENV DBT_PROFILES_DIR /usr/app
# Load ENV Vars
COPY ./dbt ${DBT_DIR}
# Load env variables and install packages
COPY envs.sh envs.sh
RUN . ./envs.sh \
&& dbt deps # Exporting envs to avoid profile not found errors when install deps
However, when I run dbt run inside the Docker container I get the error: 'dbt_utils' is undefined. When I manually run dbt deps it seems to fix the issue and dbt run succeeds. Am I missing something when I am originally installing the dependencies?
Update:
In other words, running dbt deps when building the Docker image seems to have no effect. So I have to run it manually (when I do docker run, for example) before I can start doing my workflows. This issue does not happen when I use a Python image (not the image from fishtown-analytics).
Because the base image in the Dockerfile (fishtownanalytics/dbt:0.19.0b1) includes a VOLUME declaration for /usr/app, any changes that later build steps make inside that directory are discarded (see the Dockerfile reference notes on VOLUME). Because the working directory is under /usr/app, the modules that are downloaded and installed by the RUN dbt deps command in the Dockerfile are discarded rather than added to the final image. The Python image doesn't have the same VOLUME declaration, so it doesn't cause the same issue.
To get around this you can change the working directory to something other than the declared volume name (e.g., /usr/dbt).
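As a rough sketch of that change, the question's Dockerfile could be adapted as follows. The /usr/dbt path is only the example name from above, and the layout otherwise mirrors the question's, so treat the paths as assumptions rather than requirements.

```dockerfile
FROM fishtownanalytics/dbt:0.19.0b1

# Assumption: /usr/dbt is an arbitrary directory outside the base image's
# declared VOLUME (/usr/app), so files written during the build are kept.
WORKDIR /usr/dbt/profile/
ENV DBT_DIR /usr/dbt
ENV DBT_PROFILES_DIR /usr/dbt

# Copy the dbt project into the new location
COPY ./dbt ${DBT_DIR}

# Load env variables and install packages at build time
COPY envs.sh envs.sh
RUN . ./envs.sh \
    && dbt deps
```

With this layout, the modules downloaded by dbt deps should end up in the image, so dbt run would no longer need a manual dbt deps first.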
Running dbt deps is a necessary step in preparing your dbt environment, so you should feel fine invoking dbt deps in the Dockerfile prior to dbt run.
I think, however, your intention is getting lost in the RUN instruction on the last line: either the last-line RUN command should be converted to a CMD instruction, or you could perform a RUN dbt deps by itself beforehand. (See this question for more detail on the differences between RUN and CMD.)
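To illustrate the CMD variant, here is a minimal sketch of what the end of the Dockerfile might look like. It assumes the base image may define its own ENTRYPOINT, so the sketch overrides it with a shell; the exact command chain is an illustration, not something taken from the question.

```dockerfile
# Sketch only: replaces the final RUN of the question's Dockerfile.
# Assumption: the base image may define its own ENTRYPOINT, so it is
# overridden here to let a shell run several commands in sequence.
ENTRYPOINT ["/bin/sh", "-c"]

# Runs each time a container starts, not at image build time
CMD [". ./envs.sh && dbt deps && dbt run"]
```

The trade-off is that dbt deps then runs on every container start instead of once at build time.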
And, for what it's worth: dbt Cloud, the hosted SaaS build environment for dbt, also runs dbt deps as one of its standard steps for all dbt build jobs, meaning it executes at run time, every time, similar to Docker's CMD.