Is there any support for running Athena queries on a schedule? We want to query some data daily, and dump a summarized CSV file, but it would be best if this happened on an automated schedule.
There are several ways to schedule an Athena query. If you're using Athena in an ETL pipeline, use AWS Step Functions to create the pipeline and schedule the query. On a Linux machine, use crontab to schedule the query. Alternatively, use an AWS Glue Python shell job to run the query through the Athena boto3 API, and then define a schedule for the AWS Glue job.
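As a rough illustration of the boto3 route, here is a minimal sketch of a helper that a Glue Python shell job (or a script run from crontab) could call. The function name `run_athena_query`, its parameters, and the polling limits are assumptions made for this example; only `start_query_execution` and `get_query_execution` come from the boto3 Athena client. The client is passed in as an argument, so the sketch does not decide how credentials or regions are configured.

```python
import time


def run_athena_query(client, sql, database, output_location,
                     poll_seconds=2.0, max_polls=150):
    """Start an Athena query, wait for it to finish, return the S3 result path.

    `client` is expected to behave like boto3.client("athena").
    The name and defaults of this helper are illustrative only.
    """
    start = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        # Athena writes the result set as a CSV object under this S3 prefix.
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = start["QueryExecutionId"]

    for _ in range(max_polls):
        execution = client.get_query_execution(
            QueryExecutionId=query_id
        )["QueryExecution"]
        state = execution["Status"]["State"]

        if state == "SUCCEEDED":
            return execution["ResultConfiguration"]["OutputLocation"]
        if state in ("FAILED", "CANCELLED"):
            reason = execution["Status"].get(
                "StateChangeReason", "no reason given"
            )
            raise RuntimeError(
                f"Query {query_id} ended in state {state}: {reason}"
            )
        # Still QUEUED or RUNNING: wait before asking again.
        time.sleep(poll_seconds)

    raise TimeoutError(f"Query {query_id} did not finish in time")
```

In a real job you would presumably create the client with `boto3.client("athena")` and pass your own database name and S3 output prefix. Since Athena stores query results as CSV in the output location, the returned path should point at the summarized file.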
Open the Amazon Athena console at https://console.aws.amazon.com/athena/. In the left navigation pane, choose Workflows. In the Execute multiple queries tile, choose Get started. In the Get started dialog box, choose Deploy a sample project, and then choose Continue.
Built on Presto, runs standard SQL: Amazon Athena uses Presto with ANSI SQL support and works with a variety of standard data formats, including CSV, JSON, ORC, Avro, and Parquet.
Amazon Athena can be accessed via the AWS Management Console, an API, or an ODBC or JDBC driver. Using the ODBC or JDBC driver, you can programmatically run queries and add tables or partitions.
Schedule an AWS Lambda task to kick this off, or use a cron job on one of your servers.
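For the Lambda option, a handler along these lines might be enough. It is only a sketch: the environment variable names (`ATHENA_QUERY`, `ATHENA_DATABASE`, `ATHENA_OUTPUT`) and the optional `client` parameter are hypothetical choices for this example, and the handler simply starts the query and returns its ID rather than waiting for the result.

```python
import os


def build_request(env):
    """Build start_query_execution arguments from environment-style settings.

    The variable names used here are hypothetical, chosen for this sketch.
    """
    return {
        "QueryString": env["ATHENA_QUERY"],
        "QueryExecutionContext": {"Database": env["ATHENA_DATABASE"]},
        # Athena writes the CSV result under this S3 prefix.
        "ResultConfiguration": {"OutputLocation": env["ATHENA_OUTPUT"]},
    }


def lambda_handler(event, context, client=None):
    """Start the configured Athena query and return its execution ID."""
    if client is None:
        # Imported lazily so the module can be loaded without boto3 installed.
        import boto3
        client = boto3.client("athena")

    response = client.start_query_execution(**build_request(os.environ))
    return {"QueryExecutionId": response["QueryExecutionId"]}
```

The daily trigger itself would be configured outside the code, for example with a scheduled rule that invokes the function once a day. Because the handler does not wait for the query, the function can finish quickly while Athena writes the CSV to the output location on its own.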