
Redshift COPY command delimiter not found

I'm trying to load some text files into Redshift. They are tab-delimited, except that there is no tab after the final value in each row. That's causing a "Delimiter not found" error. I only see a way to set the field delimiter in the COPY statement, not a way to set a row delimiter. Any ideas that don't involve processing all my files to add a tab to the end of each row?

Thanks

Erik Darling asked Feb 18 '14 18:02



3 Answers

I don't think the problem is a missing <tab> at the end of each line. Are you sure that ALL lines have the correct number of fields?

Run the query:

select le.starttime, le.filename, d.query, d.line_number, d.colname, d.value,
       le.raw_line, le.err_reason
from stl_loaderror_detail d
join stl_load_errors le on d.query = le.query
order by le.starttime desc
limit 100;

to get the full error report. It shows the name of the file that contains errors, the offending line number, and the error details.

This will help to find where the problem lies.
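If the report is noisy, a narrower variation may help. This is only a sketch: it reads stl_load_errors alone, assumes the err_reason text contains "Delimiter not found" as quoted in the question, and the limit of 20 rows is arbitrary.

```sql
-- Show only recent "Delimiter not found" failures.
-- filename and err_reason are fixed-width char columns, so trim the padding.
select starttime,
       trim(filename)   as filename,
       line_number,
       trim(colname)    as colname,
       trim(raw_line)   as raw_line,
       trim(err_reason) as err_reason
from stl_load_errors
where err_reason like '%Delimiter not found%'
order by starttime desc
limit 20;
```

Comparing raw_line with the table definition should show whether the row really has fewer fields than the table has columns.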

Tomasz Tybulewicz answered Sep 20 '22 15:09


You can get the Delimiter not found error if a row has fewer columns than expected. Some CSV generators may just output a single quote at the end if the last columns are null.

To solve this, you can add the FILLRECORD option to the Redshift COPY command.
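As a rough sketch, a COPY for tab-delimited files with FILLRECORD might look like the following. The table name, S3 path, and IAM role are placeholders (assumptions, not values from the question), so substitute your own.

```sql
-- Placeholder table, bucket, and role names: replace with your own.
COPY my_schema.my_table
FROM 's3://my_bucket/my/folder/'
IAM_ROLE 'arn:aws:iam::my_role:role/my_redshift_role'
DELIMITER '\t'   -- fields are separated by tabs
FILLRECORD;      -- missing trailing columns are loaded as NULLs or empty strings
```

FILLRECORD only covers contiguous columns missing at the end of a record, so it will not help when a field is missing in the middle of a row.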

Madhava Carrillo answered Sep 19 '22 15:09


As I understand it, the Delimiter not found error can also be caused by an incorrectly specified COPY command, in particular by omitting the data format parameters: https://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html

In my case I was trying to load Parquet data with this expression:

COPY my_schema.my_table
FROM 's3://my_bucket/my/folder/'
IAM_ROLE 'arn:aws:iam::my_role:role/my_redshift_role'
REGION 'my-region-1';

and I found the Delimiter not found error message when I looked in the system table stl_load_errors. Specifying that the data is Parquet, like this:

COPY my_schema.my_table
FROM 's3://my_bucket/my/folder/'
IAM_ROLE 'arn:aws:iam::my_role:role/my_redshift_role'
FORMAT AS PARQUET;

solved my problem and I was able to correctly load the data.

Vzzarr answered Sep 21 '22 15:09