
Use inline(ARRAY<STRUCT[,STRUCT]>) in Hive

Tags: hive, hiveql

Has anyone been able to use this function? I have tried pretty much every combination I can think of to get it to work.

This is the array of structs I am trying to use with inline:

[{"position":1,"price":124.0,"card_pos":"External","clicked":0},
 {"position":2,"price":94.78,"card_pos":"Cbox","clicked":0},
 {"position":3,"price":94.77,"card_pos":"External","clicked":0}] 

This works nicely:

select iq.*, iq.card.position as position, 
iq.card.price as price,iq.card.card_pos as card_pos, 
iq.card.clicked as clicked 
from
(
  select *
  from 
  hsim.im_metasearch
  LATERAL VIEW explode(cards) card as card
) iq

It's kind of annoying that I can't make the inline function work. The documentation on the Hive Wiki is very vague about how this function should be used.

We are on Hive 0.10 (CDH4.6), and the inline function is definitely part of our distribution.

If someone has a concrete example of how to use it, please let me know.

I have tried a couple of different syntaxes:

select *
from 
hsim.im_metasearch
Lateral view inline(cards) as(position,price,card_pos,clicked)

select *
from 
hsim.im_metasearch
Lateral view inline(cards) card as (position,price,card_pos,clicked)

I've also tried putting it in the select, without success. Thank you.

asked Aug 01 '14 20:08 by chilucas


1 Answer

Here is an example of how I have (successfully) used inline. Suppose we have a dataset such as

id    |    num
---------------
1          2.0
1          4.0
2          5.0
1          7.0
1          8.0
2          8.0
1          3.0
1          5.0
1          6.0
3          7.0

If you run the query

select histogram_numeric(num, 3)
from table

you will get a histogram grouped into 3 bins represented as an array of structs.

[{'x':2.5, 'y':2.0}, {'x':5.0, 'y':4.0}, {'x':7.5, 'y':4.0}]

Most people would want to view this in some sort of table form, hence the inline function. So we could do

select inline(histogram_numeric(num, 3))
from table

This gives

x    |    y
-------------
2.5      2.0
5.0      4.0
7.5      4.0
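
For the table in the question, the same function can also go in a LATERAL VIEW. What follows is only a sketch. It assumes cards is an array of structs with the four fields shown in the question, and the aliases t and c are made up for illustration. The column aliases after AS are listed without parentheses, which is the main difference from the attempts in the question.

```sql
-- Sketch only: assumes hsim.im_metasearch.cards is an
-- ARRAY<STRUCT<position, price, card_pos, clicked>> column.
-- The aliases t and c are arbitrary names chosen for this example.
SELECT c.position, c.price, c.card_pos, c.clicked
FROM hsim.im_metasearch t
LATERAL VIEW inline(t.cards) c AS position, price, card_pos, clicked;
```

When a table-generating function such as inline is used directly in the SELECT list, as in the histogram query, Hive does not allow any other expressions next to it. The LATERAL VIEW form does not have that restriction, so other columns of t can be added to the select list.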

Hope this helps.

answered Oct 11 '22 15:10 by o-90