
hive regexp_extract weirdness

I am having a problem with regexp_extract.

I am querying a tab-delimited file, and the column I'm checking contains strings that look like this:

abc.def.ghi

Now, if I do:

select distinct regexp_extract(name, '[^.]+', 0) from dummy;

The MapReduce job runs successfully, and I get "abc" from index 0.

But now, if I want to get "def" from index 1:

select distinct regexp_extract(name, '[^.]+', 1) from dummy;

Hive fails with:

2011-12-13 23:17:08,132 Stage-1 map = 0%,  reduce = 0%
2011-12-13 23:17:28,265 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201112071152_0071 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

The log file says:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row

Am I doing something fundamentally wrong here?

Thanks, Mario
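
A likely explanation, offered as a sketch rather than a confirmed diagnosis of the stack trace above: the third argument of regexp_extract selects a capture group within a single match, not the n-th match in the string. Index 0 returns the whole match, and '[^.]+' contains no parenthesized groups, so asking for group 1 would fail at runtime. Assuming the column values always have at least two dot-separated parts, either of the following should return "def" for abc.def.ghi. The table and column names are the ones used in the question.

```sql
-- Option 1: wrap the wanted segment in a capture group and ask for group 1.
-- The backslash is doubled because the Hive string literal consumes one level of escaping.
select distinct regexp_extract(name, '^[^.]+\\.([^.]+)', 1) from dummy;

-- Option 2: split on a literal dot and take the second element (arrays are 0-based).
select distinct split(name, '\\.')[1] from dummy;
```

When the goal is simply the n-th dot-separated segment, split is usually the simpler choice. The regexp_extract form is more useful when the segment has to match a particular pattern.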