I've Googled a ton but haven't found it anywhere. Does that mean Hive can support arbitrarily large string data, as long as the cluster allows it? If so, where can I find out the largest string size my cluster can support?
Thanks in advance!
By default, Hive's column metadata does not specify a maximum length for STRING columns. The ODBC driver exposes a DefaultStringColumnLength parameter, whose default value is 255.
Hive supports three string data types: STRING, VARCHAR, and CHAR. STRING is a variable-length sequence of characters; string literals can be written with either single quotes ( ' ) or double quotes ( " ).
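As a minimal HiveQL sketch of the three types (the table and column names here are made up for illustration):

```sql
-- Hypothetical table illustrating Hive's three string types.
CREATE TABLE example_strings (
  free_text  STRING,        -- no declared length limit
  code       VARCHAR(100),  -- values longer than 100 characters are truncated
  flag       CHAR(1)        -- fixed length, padded to 1 character
);
```

Only VARCHAR and CHAR take a length in the declaration; STRING deliberately has none.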
The current documentation for Hive lists STRING as a valid data type, distinct from VARCHAR and CHAR. See the official Apache doc here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Strings
It wasn't immediately apparent to me that STRING was indeed its own type, but if you scroll down you'll see several cases where it's used distinctly from the others.
While perhaps not authoritative, this page indicates the max length of a STRING is 2 GB: http://www.folkstalk.com/2011/11/data-types-in-hive.html
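One way to see that Hive tracks STRING as its own type is to inspect a table's recorded schema with DESCRIBE (the table name below is hypothetical, assumed to have one column of each string type):

```sql
-- Assumes a table example_strings with STRING, VARCHAR, and CHAR columns exists.
DESCRIBE example_strings;
-- The output lists each column's recorded type (e.g. string, varchar(100), char(1)),
-- showing STRING reported as a distinct type with no declared length.
```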