I came across a piece of code in Apache Hive like regexp_extract(input, '[0-9]*', 0). Can someone please explain what this code does? Thanks
From the Hive language manual (UDF section): it returns the string extracted using the pattern. For example, regexp_extract('foothebar', 'foo(.*?)(bar)', 2) returns bar.
The index parameter is the capture group, an integer that can take the following values:
0: the entire match, foothebar
1: the first group, the
2: the second group, bar
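To make the group numbering concrete, here is a minimal sketch in Python. It is not Hive code: Python's re module stands in for the Java regex engine that Hive uses, on the assumption that the two agree for a pattern this simple.

```python
import re

# Python's re module is used as a stand-in for Hive's regex engine here.
match = re.search(r"foo(.*?)(bar)", "foothebar")

whole = match.group(0)   # index 0: the entire match
first = match.group(1)   # index 1: the first parenthesised group
second = match.group(2)  # index 2: the second parenthesised group

print(whole, first, second)
```

Groups are numbered by the position of their opening parenthesis, counting from the left.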
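In your example, regexp_extract(input, '[0-9]*', 0), you are asking for the whole match of the pattern against the column identified by input. Because [0-9]* also matches zero digits, the first match always sits at the very start of the string, so you get the leading digits, or an empty string if the value does not start with a digit.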
Here are a few examples:
regexp_extract('9eleven', '[0-9]*', 0) -> returns 9
regexp_extract('9eleven', '[0-9]*', 1) -> query fails
regexp_extract('911test', '[0-9]*', 0) -> returns 911
regexp_extract('911test', '[0-9]*', 1) -> query fails
regexp_extract('eleven', '[0-9]*', 0) -> returns empty string
regexp_extract('test911', '[0-9]*', 0) -> returns empty string
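If you want to reproduce these results outside Hive, the following Python sketch emulates the behavior. The helper regexp_extract_sketch is a hypothetical name, not part of Hive, and it assumes that a failed search yields an empty string and that asking for a group the pattern does not define raises an error. The [0-9]+ variant is an added illustration of how to pick up digits that are not at the start of the string.

```python
import re

def regexp_extract_sketch(subject: str, pattern: str, index: int) -> str:
    """Hypothetical stand-in for Hive's regexp_extract, not Hive code."""
    match = re.search(pattern, subject)
    if match is None:
        return ""  # assumption: no match yields an empty string
    return match.group(index)  # raises IndexError if the group does not exist

# [0-9]* matches the run of digits at position 0.
leading = regexp_extract_sketch("911test", "[0-9]*", 0)

# At position 0 of 'test911' the pattern matches zero digits, so the
# search stops there with an empty match and never reaches '911'.
trailing = regexp_extract_sketch("test911", "[0-9]*", 0)

# [0-9]+ needs at least one digit, so the search moves on to '911'.
anywhere = regexp_extract_sketch("test911", "[0-9]+", 0)

# The pattern has no parentheses, so group 1 does not exist.
try:
    regexp_extract_sketch("9eleven", "[0-9]*", 1)
    missing_group_fails = False
except IndexError:
    missing_group_fails = True

print(repr(leading), repr(trailing), repr(anywhere), missing_group_fails)
```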