Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Specifiying escape characters when importing into BigQuery. "Error: Data between close double quote (") and field separator."

I am attempting to import a large amount of .csv files into Google BigQuery.

One of my String columns sometimes contains an array which contains double quotes that are escaped with \'s. Example:

...,"{\"ex1\":\"somestuffhere\"\,\"ex2\":\"somestuffhere\"\,\"ex3\":\"somestuffhere\"}",...

I believe the issue is caused by BigQuery by default expecting double quotes to be escaped with another double quote - "". I have experimented sending different parameters to the --quote argument but it does not function as I was hoping.

The current command line load command I am attempting:

bq load --noreplace --skip_leading_rows=1 --allow_jagged_rows='true' --ignore_unknown_values='true' --quote='"' --source_format=CSV TestDataset.cmdLineTest gs://s3_data_transfers/Events_10422/LIVE/* myschema.json

Is there anyway I can get the platform to accept these rows? Or am I forced to investigate preprocessing steps? This table will be updated every day based on new csv's being sent, so ideally trying to keep preprocessing steps to a minimum.

Thanks!

like image 390
Eric Keen Avatar asked Nov 07 '22 21:11

Eric Keen


1 Answers

I think this issue is due to --quote argument that you have already pointed out. Since " is the default value to quote CSV data sections, it is incoherent if your data contains ". I have been testing with some similar data (with \") and BigQuery returns the same error:

Error: Data between close double quote (") and field separator.

The solution for this is to specify the --quote flag to another character (in this way, the incoherence will be solved). In my case, it worked with '. The following command worked for me:

bq --location=US load --autodetect --source_format=CSV --quote "'" MY_DATASET.MY_TABLE MY_CSV_FILE

Make sure that your data does not contain ', Otherwise you should find another character for that or set it to empty string as documentation states. Refer to this document for details about this flag.

like image 53
enle lin Avatar answered Dec 04 '22 10:12

enle lin