Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Bigquery dbt_external_tables External Data Configuration

I need some help when using the dbt_external_tables package.

I realized that in the csv I have in GCS some lines appear to have line breaks and this is causing some issues when trying to query the table created by the macro.

Sometimes when doing this configuration of the external table manually the BigQuery UI has two options: Allow jagged rows (CSV) Allow quoted newlines (CSV) true I usually put those options in true and sometimes the issues are solved. I don't know how to do this using the dbt_external_tables.

This is important as I am receiving this errors when trying to query the table created by dbt "Error while reading table: kpi-process.file_csv.History, error message: CSV table references column position 9, but line starting at position:10956 contains only 7 columns. "

like image 514
Tomas Conticello Avatar asked Mar 16 '26 20:03

Tomas Conticello


1 Answers

The dbt-external-tables package supports passing a dictionary of options for BigQuery external tables, which maps to the options documented here. In your case, it sounds like you want to turn on allow_jagged_rows and allow_quoted_newlines, so you can specify them like so:

version: 2
sources:
 - name: my_external_source
   tables:
     - name: my_external_table
       location: 'gs://bucket/path/*'
       options:
         format: csv
         allow_jagged_rows: true
         allow_quoted_newlines: true

And dbt will template a DDL statement accordingly:

create or replace external statement my_external_source.my_external_table
options (
  format = 'csv',
  allow_jagged_rows = true,
  allow_quoted_newlines = true,
  uris = ['gs://bucket/path/*']
)
like image 81
Jeremy Cohen Avatar answered Mar 19 '26 02:03

Jeremy Cohen



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!