Below is the hive table i have created:
CREATE EXTERNAL TABLE Activity (
column1 type, </br>
column2 type
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/exttable/';
In my HDFS location /exttable, i have lot of CSV files and each CSV file also contain the header row. When i am doing select queries, the result contains the header row as well.
Is there any way in HIVE where we can ignore the header row or first line ?
CSV and spreadsheet content rules. Each row in the file must contain the same number of cells. This rule also applies to the header row. The first row must contain column headers.
An external table is stored on HDFS or any storage compatible with HDFS, because we want to use the data outside of Hive. Thus, Hive is not responsible for managing the storage of the external table. Tables can be stored on an external location for instance on a cloud platform like google cloud or AWS.
you can now skip the header count in hive 0.13.0.
tblproperties ("skip.header.line.count"="1");
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With