Is there any way to do kind of reverse thing for explode() function in Apache Hive.
Let's say I have a table in this form id int, description string, url string, ...
And from this table I would like to create table which looks like id int, json string
where in json
column stored all other columns as json. "description":"blah blah", "url":"http:", ...
Use concat_ws function to concatenate values with ^ as a delimiter. If columns are not string, wrap them with cast as string using shell, this will allow concat_ws work with strings and not-string columns.
You can use || operator to concatenate two or more strings. The result of the operation is always a string. If an operand is a number, it is implicitly converted to string before concatenation. If an operand is NULL, it is treated as an empty string '' in the concatenation.
As previously mentioned, the stringr package is part of the Tidyverse packages which also includes packages such as tidyr and the unite() function. In the next section, we are going to merge two columns in R using the unite() function as well.
In the Hive documentation you can find the following: ALTER TABLE table_name [PARTITION partition_spec] CHANGE [COLUMN] col_old_name col_new_name column_type [COMMENT col_comment] [FIRST|AFTER column_name] [CASCADE|RESTRICT];
Hive has access to some string operations which can be used to combine multiple columns into one column
SELECT id, CONCAT(CONCAT("(", CONCAT_WS(", ", description, url)), ")") as descriptionAndUrl
FROM originalTable
This is obviously going to get complicated fast for combining many columns into valid JSON. If this is one-of and you know that all of the JSON strings will have the same properties you might get away with just CONCAT for your purposes.
The "right" way to do it would be to write a User Defined Function which takes a list of columns and spits out a JSON string. This will be much more maintainable if you ever need to add columns or do the same thing to other tables.
It's likely someone has already written one you can use, so you should look around. Unfortunately the [JSON related UDFs provided by Hive]https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-get_json_object) work from JSON strings, they don't make them.
You can concatenate string variables using CONCAT_WS in HIve
SELECT CONCAT_WS('-','string1','string2','string3') FROM TABLE
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With