During the ETL we do the following operations:
begin transaction;
drop table if exists target_tmp;
create table target_tmp like target;
insert into target_tmp select * from source_a inner join source_b on ...;
analyze table target_tmp;
drop table target;
alter table target_tmp rename to target;
commit;
The SQL command is performed by AWS Data Pipeline, if this is important.
However, the pipelines sometimes fail with the following error:
ERROR: table 111566 dropped by concurrent transaction
Redshift supports serializable isolation. Does one of the commands break isolation?
Yes that works, but if generating the temp table takes a while you can expect to see that error for other queries while it runs. You could try generating the temp table in a separate transaction (transaction may not be needed unless you worry about updates to the source tables). Then do a quick rotation of the table names so there is much less time for contention:
-- generate target_tmp first then
begin;
alter table target rename to target_old;
alter table target_tmp rename to target;
commit;
drop table target_old;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With