I am using hive 2.11 (hive context) with Apache spark 1.6.2. I am observing a unexpected behavior with hive substr function.
Documentation says that:-
substr(string|binary A, int start, int len) substring(string|binary A, int start, int len) Returns the substring or slice of the byte array of A starting from start position with length len. For example, substr('foobar', 4, 1) results in 'b' (see [http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_substr]).
My Observation:-
1. select substr('Sakila' ,0,3) from <table>; => returns 'Sak'
2. select substr('Sakila' ,1,3) from <table>; => returns 'Sak'
So my question is:-
I tried executing above query here: http://demo.gethue.com/beeswax/#query but got the same result.
INSTR function in Apache Hive helps in finding the position of a substring in a string. It returns only the first first occurrence of the given input. Returns null if either of the arguments are null and returns 0 if the substring could not be found in the string. By default, the first character in string has index 1.
The SUBSTR or SUBSTRING function returns a part of the source string from the start position with the specified length of characters. If the length is not given, then it returns from the start position to the end of the string.
As you can see in the foobar example, the index of the first letter is 1.
If you have a nonpositive number m, your substring starts m positions 'left' of 1.
Example:
select substring('abcdefgh',-1,3)
h
From this perspective, you could see that 0 positions left of 1 is still 1, and therefore the commands give the same result.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With