Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Use external table redshift spectrum defined in glue data catalog

I have a table defined in Glue data catalog that I can query using Athena. As there is some data in the table that I want to use with other Redshift tables, can I access the table defined in Glue data catalog?

What will be the create external table query to reference the table definition in Glue catalog?

like image 412
Abhay Dubey Avatar asked Jan 10 '18 06:01

Abhay Dubey


People also ask

Can Redshift use glue data catalog?

An Amazon Redshift external schema references an external database in an external data catalog. You can create the external database in Amazon Redshift, in Amazon Athena, in AWS Glue Data Catalog, or in an Apache Hive metastore, such as Amazon EMR.

What is Redshift Spectrum external table?

The external table statement defines the table columns, the format of your data files, and the location of your data in Amazon S3. Redshift Spectrum scans the files in the specified folder and any subfolders.

How do you use the Spectrum in Redshift?

Getting started with Redshift Spectrum step by stepCreate an IAM role for Amazon Redshift. Step 2: Associate the IAM role with your cluster. Step 3: Create an external schema and an external table. Step 4: Query your data in Amazon S3.


1 Answers

From AWS (Creating External Schemas),

create external schema athena_schema from data catalog 
database 'sampledb' 
iam_role 'arn:aws:iam::123456789012:role/MySpectrumRole' 
region 'us-east-2';

This creates a schema athena_schema that points to the sampledb database in Athena / Glue.

You need to grant appropriate access to the IAM role you specify: the Redshift cluster needs to be able to assume the role, and the role needs access to Glue.

like image 134
Kirk Broadhurst Avatar answered Sep 20 '22 19:09

Kirk Broadhurst