I am trying to read csv data from s3 bucket and creating a table in AWS Athena. My table when created was unable to skip the header information of my CSV file.
Query Example :
CREATE EXTERNAL TABLE IF NOT EXISTS table_name ( `event_type_id`
string, `customer_id` string, `date` string, `email` string )
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH
SERDEPROPERTIES ( "separatorChar" = "|", "quoteChar" = "\"" )
LOCATION 's3://location/'
TBLPROPERTIES ("skip.header.line.count"="1");
skip.header.line.count doesn't seem to work. But this does not work out. I think Aws has some issue with this.Is there any other way that I could get through this?
This is what works in Redshift:
You want to use table properties ('skip.header.line.count'='1')
Along with other properties if you want, e.g. 'numRows'='100'
.
Here's a sample:
create external table exreddb1.test_table
(ID BIGINT
,NAME VARCHAR
)
row format delimited
fields terminated by ','
stored as textfile
location 's3://mybucket/myfolder/'
table properties ('numRows'='100', 'skip.header.line.count'='1');
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With