 

Skip bad records in Redshift data load

I am trying to load data into AWS Redshift using the following command:

copy venue from 's3://mybucket/venue' credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>' delimiter '\t'; 

but the data load is failing. When I checked the query section for that specific load, I noticed it failed with "Bad UTF8 hex sequence: a4 (error 3)".

Is there a way to skip bad records when loading data into Redshift?

asked May 12 '14 by roy

1 Answer

Yes, you can use the maxerror parameter. This example will allow up to 250 bad records to be skipped (the errors are written to stl_load_errors):

copy venue from 's3://mybucket/venue' credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>' delimiter '\t' maxerror as 250;
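
If you want to see which rows were skipped and why, you can query the stl_load_errors system table after the load. A minimal sketch (adjust the filter and limit to your case):

-- Inspect rows rejected by recent COPY commands.
-- stl_load_errors records the file, line number, column, and reason for each rejected row.
select starttime, filename, line_number, colname, err_reason, raw_line
from stl_load_errors
order by starttime desc
limit 250;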
answered Oct 03 '22 by mike_pdb