Automate Redshift truncate/delete of data after a retention period

Tags:

node.js

amazon-web-services

amazon-redshift

I have a Redshift table that stores a lot of data. Every weekend I go into Workbench and manually TRUNCATE the last week of data that I no longer need. I have to run this by hand:

DELETE FROM tableName WHERE created_date BETWEEN timeStamp1 AND timeStamp2;

Is there a way to give the table an expiration policy that removes this data every Sunday for me?
If not, is there a way to automate the delete every 7 days, for example with a shell script or a cron job in Node.js?

293

asked Sep 24 '17 07:09

Piqué

1 Answer

No, Amazon Redshift has no built-in ability to run commands on a regular schedule. You could, however, run a script on another system that connects to Redshift and runs the command.

For example, a cron job could call psql to connect to Redshift and execute the command. This could be done in a one-line script.
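Since the question asks about Node.js, here is a minimal sketch of what such a cron-driven script might look like. The table and column names come from the question; the helper names (`retentionCutoff`, `buildDelete`, `psqlArgs`, `runPurge`), the "older than 7 days" rule, and the connection fields are assumptions for illustration, and the password is assumed to come from `PGPASSWORD` or `~/.pgpass`.

```javascript
const { execFileSync } = require('node:child_process');

// Timestamp `days` days before `now`, formatted as 'YYYY-MM-DD HH:MM:SS' (UTC).
function retentionCutoff(now, days) {
  const cutoff = new Date(now.getTime() - days * 24 * 60 * 60 * 1000);
  return cutoff.toISOString().slice(0, 19).replace('T', ' ');
}

// Same DELETE as in the question, but relative to a computed cutoff.
// Assumes `table` and `column` are trusted identifiers, not user input.
function buildDelete(table, column, cutoff) {
  return `DELETE FROM ${table} WHERE ${column} < '${cutoff}';`;
}

// Command-line arguments for psql; `conn` holds placeholder connection details.
function psqlArgs(conn, sql) {
  return ['-h', conn.host, '-p', String(conn.port), '-U', conn.user,
          '-d', conn.database, '-c', sql];
}

// Run the purge once; cron decides when this gets called.
function runPurge(conn, now = new Date()) {
  const sql = buildDelete('tableName', 'created_date', retentionCutoff(now, 7));
  execFileSync('psql', psqlArgs(conn, sql), { stdio: 'inherit' });
}
```

A crontab entry such as `0 3 * * 0 node /path/to/purge.js` (the path is hypothetical) would run a script that calls `runPurge` at 03:00 every Sunday. Redshift listens on port 5439 by default.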

Alternatively, you could configure an AWS Lambda function to connect to Redshift and execute the command. (You would need to write the function yourself, but there are libraries that make this easier.) Then, you would configure Amazon CloudWatch Events to trigger the Lambda function on the desired schedule (e.g., once a week).
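A rough sketch of such a Lambda function in Node.js follows. It assumes a PostgreSQL-compatible client object with `connect()`, `query(text, values)` and `end()` methods (node-postgres's `Client` has this shape); `makeHandler` and `createClient` are hypothetical names, and the client is injected so the sketch does not hard-code a driver or credentials.

```javascript
// Build a Lambda-style async handler that deletes rows older than the
// retention period. `createClient` is a factory supplied by the caller,
// for example `() => new Client(config)` with node-postgres.
function makeHandler(createClient, retentionDays = 7) {
  return async function handler() {
    const client = createClient();
    await client.connect();
    try {
      const cutoffMs = Date.now() - retentionDays * 24 * 60 * 60 * 1000;
      // 'YYYY-MM-DD HH:MM:SS' in UTC
      const cutoff = new Date(cutoffMs).toISOString().slice(0, 19).replace('T', ' ');
      const result = await client.query(
        'DELETE FROM tableName WHERE created_date < $1',
        [cutoff]
      );
      // rowCount is whatever the driver reports for the DELETE.
      return { deleted: result.rowCount };
    } finally {
      // Always close the connection, even if the query fails.
      await client.end();
    }
  };
}
```

The returned function would be exported as the Lambda handler, and a CloudWatch Events rule with a schedule expression such as `rate(7 days)` or `cron(0 3 ? * SUN *)` would trigger it weekly.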

A common strategy is to store data in separate tables per time period (e.g., a month, but in your case it would be a week). Then, define a view that combines several tables. To delete a week of data, simply drop the table that contains that week of data, create a new table for this week's data, then update the view to point to the new table but not the old table.
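The weekly rotation could be sketched as a function that generates the SQL statements to run. Everything named here is hypothetical: the `events` view, the `events_YYYY_MM_DD` table naming, and an `events_template` table that supplies the column definitions. The view is replaced before the old table is dropped, on the assumption that a regular view would otherwise block the drop.

```javascript
// Weekly table name, e.g. 'events_2017_09_24' for the week starting 2017-09-24.
function weeklyTableName(base, weekStart) {
  const day = weekStart.toISOString().slice(0, 10); // 'YYYY-MM-DD' in UTC
  return `${base}_${day.replace(/-/g, '_')}`;
}

// Statements for one rotation. `keepWeeks` lists the week-start dates the
// view should cover, oldest first, ending with the new week.
// `expiredWeek` is the week whose table is dropped.
function rotationStatements(base, keepWeeks, expiredWeek) {
  const kept = keepWeeks.map((week) => weeklyTableName(base, week));
  const newest = kept[kept.length - 1];
  const selects = kept.map((table) => `SELECT * FROM ${table}`).join(' UNION ALL ');
  return [
    `CREATE TABLE IF NOT EXISTS ${newest} (LIKE ${base}_template);`,
    `CREATE OR REPLACE VIEW ${base} AS ${selects};`,
    `DROP TABLE IF EXISTS ${weeklyTableName(base, expiredWeek)};`,
  ];
}
```

Dropping a whole table avoids leaving deleted rows behind, which is the main appeal of this layout over a weekly `DELETE`.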

By the way...

Your example uses the DELETE command, which is not the same as the TRUNCATE command.

TRUNCATE removes all data from a table. It is an efficient way to completely empty a table.

DELETE is good for removing part of a table but it simply marks rows as deleted. The data still occupies space on disk. Therefore, it is recommended that you VACUUM the table after deleting a significant quantity of data.
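To make the difference concrete, here is a small sketch that returns the statements each approach would run. The function names are hypothetical, the cutoff literal is only an example, and `VACUUM DELETE ONLY` is one of several vacuum modes; which mode suits a given table is a judgment call.

```javascript
// Empty the whole table in one step.
function emptyTable(table) {
  return [`TRUNCATE ${table};`];
}

// Remove only rows older than `cutoff`, then reclaim the space they occupied.
// Assumes `table` and `column` are trusted identifiers, not user input.
function deleteOlderThan(table, column, cutoff) {
  return [
    `DELETE FROM ${table} WHERE ${column} < '${cutoff}';`,
    `VACUUM DELETE ONLY ${table};`,
  ];
}

// Example: statements for the table in the question.
const weeklyCleanup = deleteOlderThan('tableName', 'created_date', '2017-09-17 00:00:00');
```

The statements should be sent one at a time outside an explicit transaction, because Redshift does not allow `VACUUM` inside a transaction block.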

105

answered Sep 17 '22 16:09

John Rotenstein
