
How to get existing Hive table delimiter

Tags: hadoop, hive

Is there any way to find out the delimiter of an existing Hive table? I tried describe extended, but it didn't help. I have searched a lot and have not found an answer yet.

Rengasamy asked Aug 12 '15 09:08



3 Answers

The other answers are correct in that they show the field delimiter when it is something other than the default. However, I don't see it in the output when the table uses the default delimiter, which is the Control-A character (\001 in octal, ASCII code 1).
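For illustration only, here is a small Python sketch of how a line written with the default delimiter can be split into fields. The helper name split_default_row and the sample row are hypothetical; the only fact taken from Hive is that the default field delimiter is Control-A.

```python
# Control-A, the default Hive field delimiter for text files.
# "\x01" (hex) and "\001" (octal) are the same single character.
CTRL_A = "\x01"


def split_default_row(line):
    """Split one text line that uses the default Control-A field delimiter."""
    # Drop the trailing newline (the default record delimiter) first.
    return line.rstrip("\n").split(CTRL_A)


# Hypothetical row with two columns: id and name.
sample_row = "1" + CTRL_A + "alice\n"
print(split_default_row(sample_row))
```

Because Control-A is a non-printing character, it is easy to miss when you look at the raw file in a terminal or editor.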

Camilo Martinez answered Oct 13 '22 19:10


Try running "show create table <table_name>"; the ROW FORMAT clause in its output shows the delimiter.

Amar answered Oct 13 '22 18:10


I can see the delimiter by using the describe extended command.

example:

hive> create table difdelimiter (id int, name string) 
      row format delimited 
      fields terminated by ',';

hive> describe extended difdelimiter; 
OK
id                      int
name                    string

Detailed Table Information  Table(tableName:difdelimiter, dbName:default, owner:cloudera, createTime:1439375349, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id, type:int, comment:null), FieldSchema(name:name, type:string, comment:null)], location:hdfs://quickstart.cloudera:8020/user/hive/warehouse/difdelimiter, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=,, field.delim=,}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{transient_lastDdlTime=1439375349}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
Time taken: 0.154 seconds, Fetched: 4 row(s)

The delimiter information is in the serdeInfo parameters:

parameters:{serialization.format=,, field.delim=,}
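As a rough sketch of how this lookup could be automated, the Python snippet below pulls the SerDe parameters out of the text that describe extended prints. The function name serde_parameters and the regular expressions are my own assumptions, not part of Hive, and the sample string is trimmed from the output above. It assumes the parameters appear as "key=value" pairs separated by ", " inside serdeInfo, and it will not cope with a delimiter that itself contains "})".

```python
import re


def serde_parameters(describe_output):
    """Return the serdeInfo parameters from describe extended output as a dict."""
    # Find the parameters:{...} block that belongs to serdeInfo:SerDeInfo(...).
    match = re.search(
        r"serdeInfo:SerDeInfo\(.*?parameters:\{(.*?)\}\)",
        describe_output,
        re.DOTALL,
    )
    if match is None:
        return {}
    params = {}
    # Split only at ", " that is followed by another "key=", so that a comma
    # used as the delimiter value is not mistaken for a separator.
    for pair in re.split(r", (?=[\w.]+=)", match.group(1)):
        key, sep, value = pair.partition("=")
        if sep:
            params[key] = value
    return params


# Sample trimmed from the describe extended output shown above.
sample = (
    "serdeInfo:SerDeInfo(name:null, "
    "serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, "
    "parameters:{serialization.format=,, field.delim=,}), bucketCols:[]"
)
print(serde_parameters(sample).get("field.delim"))
```

If field.delim is missing from the parameters, get returns None, which is a hint that the table may be using the default delimiter.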

Adding another example, as requested in the comments:

hive> create table tb3 (id int, name string) row format delimited fields terminated by '/t';
OK
Time taken: 0.09 seconds
hive> describe extended tb3;
OK
id                      int                                         
name                    string                                      

Detailed Table Information  Table(tableName:tb3, dbName:default, owner:cloudera, createTime:1439377591, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id, type:int, comment:null), FieldSchema(name:name, type:string, comment:null)], location:hdfs://quickstart.cloudera:8020/user/hive/warehouse/tb3, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=/t, field.delim=/t}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{transient_lastDdlTime=1439377591}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE) 
Time taken: 0.125 seconds, Fetched: 4 row(s)

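Again, the delimiter shows up in the serdeInfo parameters:

parameters:{serialization.format=/t, field.delim=/t}

Note that '/t' in this example is a literal forward slash followed by the letter t, not a tab character; a tab would be written '\t'.
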
NPKR answered Oct 13 '22 19:10