
Does Hive have a String split function?

Tags: hadoop, hive



Yes, Hive has a split function based on regular expressions. It's not listed in the tutorial, but it is in the language manual on the wiki:

split(string str, string pat)
   Split str around pat (pat is a regular expression) 

In your case, the delimiter "|" is a regular-expression metacharacter (alternation), so it must be escaped and written as "\\|".
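
To see why the escape matters, here is a minimal sketch in Java, on the assumption that Hive passes pat to Java's regex engine (its regex functions are documented as using Java regex syntax). The class name PipeSplitDemo and the helper splitAll are made up for illustration and are not part of Hive. An unescaped | is an alternation between two empty patterns, so it matches between characters instead of at the literal pipe.

```java
import java.util.Arrays;

public class PipeSplitDemo {
    // Hypothetical helper: limit -1 keeps trailing empty strings in the result.
    static String[] splitAll(String str, String pat) {
        return str.split(pat, -1);
    }

    public static void main(String[] args) {
        // The Java literal "\\|" is the two-character regex \| (a literal pipe).
        System.out.println(Arrays.toString(splitAll("aa|bb", "\\|")));

        // A bare | is an empty alternation; it matches at every position,
        // so the string is broken into single characters.
        System.out.println(Arrays.toString(splitAll("aa|bb", "|")));
    }
}
```

If the backslashes are consumed before the pattern reaches the regex engine (for example by a shell or client quoting layer, which is only a guess at what can go wrong), the second call is effectively what runs.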


Another interesting use case for split in Hive: suppose a column ipname in the table holds a value such as "abc11.def.ghft.com" and you want to pull out "abc11":

SELECT split(ipname,'[\.]')[0] FROM tablename;
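
The same extraction can be sketched in Java, again assuming Hive's split follows Java regex rules. HostLabelDemo and firstLabel are hypothetical names used only for this example. Inside a character class the dot is already literal, so [.] is enough.

```java
public class HostLabelDemo {
    // Hypothetical helper mirroring split(ipname, '[.]')[0]:
    // split on literal dots and take the first piece.
    static String firstLabel(String host) {
        return host.split("[.]", -1)[0];
    }

    public static void main(String[] args) {
        System.out.println(firstLabel("abc11.def.ghft.com")); // prints abc11
    }
}
```

A value with no dot at all comes back unchanged, because the split then yields a single-element array.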

Just a clarification on the answer given by Bkkbrad.

I tried this suggestion and it did not work for me.

For example,

split('aa|bb','\\|')

produced:

["","a","a","|","b","b",""]

But,

split('aa|bb','[|]')

produced the desired result:

["aa","bb"]

Including the metacharacter '|' inside the square brackets causes it to be interpreted literally, as intended, rather than as a metacharacter.

For elaboration of this behaviour of regexp, see: http://www.regular-expressions.info/charclass.html
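
As a rough illustration of the character-class approach, the Java sketch below (assuming Hive's split behaves like Java's regex split; CharClassSplitDemo and splitAll are invented names) shows that a class needs no backslashes at all, which sidesteps any question of how many escape layers sit between the query and the regex engine. Pattern.quote is a Java-side way to get the same effect and is shown only for comparison; it is not a Hive function.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class CharClassSplitDemo {
    // Hypothetical helper: limit -1 keeps trailing empty strings.
    static String[] splitAll(String str, String pat) {
        return str.split(pat, -1);
    }

    public static void main(String[] args) {
        // Inside [...] the pipe is an ordinary character.
        System.out.println(Arrays.toString(splitAll("aa|bb", "[|]")));

        // Pattern.quote wraps the delimiter so it is matched literally.
        System.out.println(Arrays.toString(splitAll("aa|bb", Pattern.quote("|"))));

        // Several metacharacters can share one class: split on '.', '|' or '*'.
        System.out.println(Arrays.toString(splitAll("a.b|c*d", "[.|*]")));
    }
}
```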