In my Hive table, the session field is a string in a format like:
ip-sessionID-userID
or area-sessionID-userID
There are 3 or 4 fields separated by "-", but userID is always the last one.
I want to select userID, but how do I access the last field? In Python, there's something like:
arr[-1]
But how can I achieve this in Hive? The following SQL does not seem to be correct:
select split(session,"\-")[-1] as user from my_table;
Thanks!
The last element is simply the element at the index equal to the length of the array minus 1. If the length is 4, the last element is arr[3].
In Java, for example, with an ArrayList you get the first element by calling get(index) with index = 0, and the last element by calling get(index) with index = size - 1.
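In Hive terms, the same "size minus 1" idea might look like the sketch below. It assumes a Hive version that accepts a non-constant expression as an array index; older releases reject this, in which case one of the other approaches in this thread is needed. `my_table` and `session` are the names from the question, and the alias `user_id` is only illustrative.

```sql
-- Sketch: pick the last element with index = size - 1.
-- Assumption: the Hive version in use allows non-constant array indexes.
SELECT split(session, '-')[size(split(session, '-')) - 1] AS user_id
FROM my_table;
```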
reverse(split(reverse(session), '-')[0])
Although this might be a bit more expensive than the regex solution ;)
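Spelled out as a full query against the table from the question, the reverse trick might look like the sketch below. Reversing the string turns the last field into the first one, so the constant index `[0]` picks it up, and the outer `reverse` restores the original character order. The alias `user_id` and the sample value in the comments are illustrative only.

```sql
-- Walkthrough for a hypothetical value 'ip-sess-u7':
--   reverse(session)   -> '7u-sses-pi'
--   split(..., '-')    -> ['7u', 'sses', 'pi']
--   [0]                -> '7u'
--   reverse(...)       -> 'u7'
SELECT reverse(split(reverse(session), '-')[0]) AS user_id
FROM my_table;
```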
This is because non-constant expressions for array indexes are not supported in Hive.
There are other ways to solve your problem:
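Use regexp_extract, anchoring the pattern to the end of the string so that it captures the last field without the leading hyphen, such as:
select regexp_extract(session, '([^-]+)$', 1) as user from my_table;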
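One way to sanity-check a regex for this is to run it on literal strings before pointing it at the table. The sketch below uses a pattern anchored with `$`, so the capture group is the run of non-hyphen characters at the very end of the string. The sample values are made up for illustration, and the queries assume a Hive version that allows `SELECT` without a `FROM` clause.

```sql
-- '([^-]+)$' captures everything after the final hyphen.
-- Hypothetical 3-field value:
SELECT regexp_extract('10.0.0.1-s42-u7', '([^-]+)$', 1) AS user_id;
-- Hypothetical 4-field value:
SELECT regexp_extract('10.0.0.1-north-s42-u7', '([^-]+)$', 1) AS user_id;
```

Both queries should yield the trailing field, regardless of how many fields precede it.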
Write a custom Hive function (UDF): examples and documentation can be found in the Hive documentation.