
How to replace characters in Hive?

Tags:

hadoop

hive

I have a string column, description, in a Hive table that may contain tab characters ('\t'). These characters break some views when Hive is connected to an external application. Is there a simple way to remove all tab characters from that column? I could run a simple Python program to do it, but I would prefer a solution inside Hive.

user1745713 asked Aug 06 '13 21:08



People also ask

How do I replace a character in Hive?

Use the powerful regexp_replace function to replace characters.

How do you replace values in Hive?

Use the nvl() function in Hive to replace all NULL values in a column with a default value. For example, replace NULLs with -1, 0, or any other number for an integer column, or with an empty string for a string column.

How do I replace multiple characters in a string in Hive?

To replace several different substrings with the same new string, list them in the regexp_replace pattern separated by the pipe character (|). Every substring that matches any of the alternatives is replaced with the given new string.


1 Answer

The regexp_replace UDF does this. Below is its definition and usage from the Apache Hive wiki.

regexp_replace(string INITIAL_STRING, string PATTERN, string REPLACEMENT): 

This returns the string resulting from replacing all substrings in INITIAL_STRING that match the Java regular expression syntax defined in PATTERN with instances of REPLACEMENT.

e.g.: regexp_replace("foobar", "oo|ar", "") returns fb
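
Applied to the tab characters in the question, a minimal sketch might look like the following. The table name my_table and the view name my_table_clean are hypothetical; description is the column from the question. The pattern is written as '\\t' so that, after Hive unescapes the string literal, the Java regex engine receives \t and matches a tab.

```sql
-- Sketch only: my_table and my_table_clean are assumed names, not from the question.

-- Return the description column with every tab character removed.
SELECT regexp_replace(description, '\\t', '') AS description
FROM my_table;

-- Optionally expose the cleaned column through a view, so the external
-- application never sees the raw tabs. Use ' ' instead of '' as the
-- replacement if the tabs separate words.
CREATE VIEW my_table_clean AS
SELECT regexp_replace(description, '\\t', ' ') AS description
FROM my_table;
```

Whether to replace tabs with an empty string or a single space depends on the data; an empty string joins the words on either side of the tab.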

user2637464 answered Sep 23 '22 21:09
