AWS Glue crawler - partition keys types

I am using Spark to write files to S3 in ORC format, and Athena to query this data.

I am using the following partition keys:

s3://bucket/company=1123/date=20190207

When I run the Glue crawler on the bucket, everything works as expected except for the types of the partition keys.

The crawler registers them in the catalog as type string instead of int.

Is there a configuration option to set the default type of the partition keys?

I know the types can be changed manually afterwards, with the crawler config set to "Add new columns only".

asked Feb 07 '19 13:02

Alex Stanovsky

1 Answer

Glue crawlers always treat partition keys as type string, and unfortunately there is no configuration option to change this behavior.
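
The manual change mentioned in the question can be scripted. The following is only a sketch, assuming boto3 and that the crawler is set to add new columns only so it does not overwrite the change on the next run. The helper names `build_table_input` and `retype_partition_keys` are made up for illustration. `update_table` rejects the read-only fields that `get_table` returns, so the sketch copies only the fields that `TableInput` accepts (the list below may be incomplete for newer API versions).

```python
import copy

# Fields of a Glue get_table response that update_table accepts in TableInput.
# Read-only fields such as DatabaseName or CreateTime must be left out.
# Assumption: this list covers the common fields but may not be exhaustive.
TABLE_INPUT_KEYS = {
    "Name", "Description", "Owner", "LastAccessTime", "LastAnalyzedTime",
    "Retention", "StorageDescriptor", "PartitionKeys", "ViewOriginalText",
    "ViewExpandedText", "TableType", "Parameters", "TargetTable",
}


def build_table_input(table, new_types):
    """Return a TableInput dict with the given partition keys retyped.

    table: the "Table" dict from glue.get_table
    new_types: mapping of partition key name to Glue type, e.g. {"company": "int"}
    """
    table_input = {
        key: copy.deepcopy(value)
        for key, value in table.items()
        if key in TABLE_INPUT_KEYS
    }
    for partition_key in table_input.get("PartitionKeys", []):
        if partition_key["Name"] in new_types:
            partition_key["Type"] = new_types[partition_key["Name"]]
    return table_input


def retype_partition_keys(database, table_name, new_types):
    """Fetch the table from the Glue catalog and write it back with new key types."""
    import boto3  # imported lazily so build_table_input works without AWS access

    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName=database, Name=table_name)["Table"]
    glue.update_table(
        DatabaseName=database,
        TableInput=build_table_input(table, new_types),
    )
```

With placeholder database and table names, the call for the layout in the question would be `retype_partition_keys("my_db", "my_table", {"company": "int", "date": "int"})`.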

answered Oct 10 '22 01:10

Yuriy Bondaruk

Tags:

amazon-s3

amazon-athena

aws-glue

aws-glue-data-catalog
