How to unload only 1000 rows from Redshift tables

I am creating a copy of a production Redshift database at a development level. I know how to unload data from my production instance/cluster to S3, then copy that data into my development instance/cluster, but only if I unload all the data at once. What I would like to do instead is to copy just 1000 or so rows from each of my tables, to cut down on space and transfer time between my Redshift instances.

e.g.

UNLOAD ('SELECT * FROM myschema.mytable LIMIT 1000') TO 's3://my-bucket' CREDENTIALS etcetcetc

Is there a way to do this LIMIT with UNLOAD, or am I going to have to switch to a bulk-insert-style paradigm?

EDIT: I am programmatically unloading and copying a bunch of tables, so I don't want to hard-code any key-based limits in case we add new tables or change table structures, etc.

asked Aug 21 '15 by adavea

1 Answer

While "LIMIT" is not part of the actual "UNLOAD" command, the Redshift documentation on UNLOAD provides a few alternatives:

Limit Clause

The SELECT query cannot use a LIMIT clause in the outer SELECT. For example, the following UNLOAD statement will fail:

unload ('select * from venue limit 10') 
to 's3://mybucket/venue_pipe_' credentials 
'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'; 

Instead, use a nested LIMIT clause. For example:

unload ('select * from venue where venueid in 
(select venueid from venue order by venueid desc limit 10)') 
to 's3://mybucket/venue_pipe_' credentials 
'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>';

Note that the nested LIMIT approach requires a key column to order by (venueid above), which may not suit your edit about avoiding hard-coded key-based limits. Alternatively, you could populate a table using SELECT…INTO or CREATE TABLE AS with a LIMIT clause, then unload from that table.
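
For example, here is a minimal sketch of the CREATE TABLE AS route, reusing the table and bucket names from your question; the mytable_sample staging name is just a placeholder I've introduced:

-- stage a 1000-row sample (CTAS may use LIMIT, unlike UNLOAD's outer SELECT)
create table myschema.mytable_sample as
select * from myschema.mytable limit 1000;

-- unload the sample to S3
unload ('select * from myschema.mytable_sample')
to 's3://my-bucket/mytable_sample_'
credentials
'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>';

-- drop the staging table once the files have landed in S3
drop table myschema.mytable_sample;

Because this doesn't depend on any particular key column, it can be generated programmatically for every table, which fits the requirement in your edit. The COPY into the development cluster then works the same as for a full unload; only the S3 prefix changes.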

answered Jan 01 '23 by Mark Hildreth