Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Concatenate multiple columns into one in hive

I need to concatenate column values into a single column. I have column names in a variable as colnames=col1,col2,col3 . I am writing the below query from a unix shell and calling the hive. But when I do this, I am getting only the column names concatenated not the values of those columns.

select concat('regexp_replace("${colnames}",",","^")) as result from table;

I would like the output as:

ABCD^10^XYZ

(ABCD,10,XYZ are the column values)

like image 884
akm Avatar asked Jul 06 '18 13:07

akm


1 Answers

Use concat_ws function to concatenate values with ^ as a delimiter.

Example with constants:

hive> select concat_ws('^','ABCD','10', 'XYZ');
OK
ABCD^10^XYZ

Command with column names after shell variable substitution should look like this:

 select concat_ws('^',col1,col2,col3) as result from table;

In the shell it will look like this:

colnames=col1,col2,col3
hive -e "select concat_ws('^',${colnames}) as result from table"

If columns are not string, wrap them with cast as string using shell, this will allow concat_ws work with strings and not-string columns.

Example

colnames=col1,col2,col3
colnames2=$(echo "cast( $colnames as string)" | sed "s/,/ as string), cast( /g")
echo "$colnames2"

Output:

cast( col1 as string), cast( col2 as string), cast( col3 as string)

Use new variable to pass to hive as in the previous example.

like image 161
leftjoin Avatar answered Oct 26 '22 21:10

leftjoin