Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Referencing table values from md file in dbt yaml files

I have a markdown file with a table in it and I have another dbt schema.yml file which is used to serve and generate docs. Traditionally I enter the table name and column name and description of the column name in schema.yml but now that I think of I'd want to reference the column names from the md document into the yaml file rather than manually entering.

This is how my doc.md file looks like

{% docs column_description %}

| COLUMN\_NAME                   | DESCRIPTION                                                               |
| ------------------------------ | ------------------------------------------------------------------------- |
| cycle\_id                      | Customer cycle\_id(todays start sleep time to next days start sleep time) |
| user\_id                       | Customers user\_id                                                        |
| yes\_alcohol                   | User consumed alcohol or not                                              |
| blank\_alcohol                 | User did not answer or user answered "No"                                 |
                    


{% enddocs %}

And, currently, this is how my schema.yml file looks

version: 2

models:

- name: journal_pivot_dev
    description: One for for each journal entry of within one customer cycle
    columns:
      - name: cycle_id
        description:  Customer cycleid
        tests: &not_null
          - not_null:
              severity: warn
      - name: user_id
        description: customer userid
        tests:
          - not_null:
              severity: warn
      - name: yes_alcohol
        description: User consumed alcohol or not
        tests: &values_accepted
          - accepted_values:
              severity: warn
              values: [0,1]
      - name: blank_alcohol
        description: User did not answer          
        tests: *values_accepted 

What I tried:

version: 2
    
    models:
    
    - name: journal_pivot_dev
        description: One for for each journal entry of within one customer cycle
        columns:
          - name: cycle_id
            description:  '{{ doc(column_description") }}'
            tests: &not_null
              - not_null:
                  severity: warn
          - name: user_id
            description: '{{ doc(column_description") }}'
            tests:
              - not_null:
                  severity: warn
          - name: yes_alcohol
            description: '{{ doc(column_description") }}'
            tests: &values_accepted
              - accepted_values:
                  severity: warn
                  values: [0,1]
          - name: blank_alcohol
            description: '{{ doc(column_description") }}'          
            tests: *values_accepted 

But when I do that the description is not rendering to just cycle_id but it is giving me the whole table in the md file.

enter image description here

I am expecting something like this

enter image description here

like image 324
NAB0815 Avatar asked Oct 15 '22 19:10

NAB0815


1 Answers

TLDR: what you are trying to do is not currently possible given that function.

When you call the {{ docs('filename.md') }} function in dbt right now, the code that is being executed (see doc() from dbtlabs' github) is just importing the compiled version of the entire markdown file (after making sure that it is a valid, non-zero length markdown of course).

Arguable, this could be better indicated as such in the docs themselves: https://docs.getdbt.com/reference/resource-properties/description#use-a-docs-block-in-a-description


Let's talk alternatives.

How about we create each of your docs and try to import them into a single "aggregate" doc?

So, using a brand new dbt project my-project let's see:

New project layout:

    | analysis
    | data
    | macros
    | models
         | docs
            | agg_table.md
            | table_item_1.md
            | table_item_2.md
            | table_item_3.md
         | examples
            | schema.yml
            | my_first_dbt_model.sql
            | my_first_dbt_model.sql
     | .gitignore
     | dbt_project.yml

Where schema.yml is something like:


version: 2

models:
    - name: my_first_dbt_model
      description: '{{ doc("agg_table") }}'
      columns:
          - name: id
            description: '{{ doc("table_item_1") }}'
            tests:
                - unique
                - not_null

So testing based on this model, I tried to use at least these options:

Which resulted in variations on:

Where agg_table.md is like:

{% docs agg_table %}

### Agg Table

| COLUMN\_NAME                   | DESCRIPTION                                                                   |
| ------------------------------ | ----------------------------------------------------------------------------- |
| table_item_1                   | {{ docs("table_item_1") }}                                                  |
| table_item_2                   | {{ docs("table_item_2") }}                                                  |
| table_item_3                   | {{ docs("table_item_3") }}                                                  |
| table_item_4                   | {{ docs("table_item_4") }}                                                  |
                    
{% enddocs %}

or

{% docs agg_table %}

### Agg Table

| COLUMN\_NAME                   | DESCRIPTION                                                                              |
| ------------------------------ | ---------------------------------------------------------------------------------------- |
| table_item_1                   | [first_model_description](#!/model/model.my_new_project.my_first_dbt_model#description)  |
| table_item_2                   | [first_model_columns](#!/model/model.my_new_project.my_first_dbt_model#columns)          |
| table_item_3                   | [table_item_doc](#!/docs/docs.my_new_project.table_item_3)                               |
| table_item_4                   | [table_item_doc](#!/docs/docs.my_new_project.table_item_4#description)                   |
                    

{% enddocs %}

But so far it looks like dbt is just completely missing the step of pre-processing markdown documents for jinja macros / references. The links are just pure HTML and not actually "compiling" any content.

Example result: markdown links

Good feature to request on a github issue though.

like image 133
sgdata Avatar answered Oct 20 '22 06:10

sgdata