How to query struct array with Hive (get_json_object) or json serde

I am trying to query the following example JSON file stored on HDFS:

{
    "tag1": "1.0",
    "tag2": "blah",
    "tag3": "blahblah",
    "tag4": {
        "tag4_1": [{
                "tag4_1_1": [{
                        "tag4_1_1_1": {
                            "Addr": {
                                "Addr1": "blah",
                                "City": "City",
                                "StateProvCd": "NY",
                                "PostalCode": "99999"
                            }
                        }
                    }, {
                        "tag4_1_1_1": {
                            "Addr": {
                                "Addr1": "blah2",
                                "City": "City2",
                                "StateProvCd": "NY",
                                "PostalCode": "99999"
                            }
                        }
                    }
                ]
            }
        ]
    }
}

I used the following statement to create an external table over the data:

CREATE  EXTERNAL TABLE DB.hv_table
(
  tag1 string
, tag2 string
, tag3 string
, tag4 struct<tag4_1:ARRAY<struct<tag4_1_1:ARRAY<struct<tag4_1_1_1:struct<Addr:struct<
                Addr1:string
                , City:string
                , StateProvCd:string
                , PostalCode:string>>>>>>>
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' 
LOCATION 'HDFS/location';

Ideally, I want a query that returns all of the data, along these lines:

select tag1, tag2, tag3, tag4(all data) from DB.hv_table;

Can someone provide me an example of how I can query without writing it in the following manner:

select tag1, tag2, tag3
, tag4.tag4_1[0].tag4_1_1[0].tag4_1_1_1.Addr.Addr1 as Addr1 
, tag4.tag4_1[0].tag4_1_1[0].tag4_1_1_1.Addr.City as City 
, tag4.tag4_1[0].tag4_1_1[0].tag4_1_1_1.Addr.StateProvCd as StateProvCd 
, tag4.tag4_1[0].tag4_1_1[0].tag4_1_1_1.Addr.PostalCode as PostalCode 
from DB.hv_table

Most importantly, I would like to avoid specifying the array element index. In my example, I can only target the first element of each array (tag4_1 and tag4_1_1). I would like to target every element if possible.

asked Jul 10 '17 19:07

DatWunGuy102

1 Answer

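I found a really good blog post on ThornyDev. Following it, I created a table with a single string column that holds each raw JSON record, then extracted the fields with get_json_object: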

CREATE EXTERNAL TABLE IF NOT EXISTS DB.dummyTable (jsonBlob STRING)
LOCATION 'pathOfYourFiles';

SELECT 
get_json_object(jsonBlob, '$.tag1') AS tag1
,get_json_object(jsonBlob, '$.tag2') AS tag2
,get_json_object(jsonBlob, '$.tag3') AS tag3
,get_json_object(jsonBlob, '$.tag4.tag4_1.tag4_1_1.tag4_1_1_1.Addr.Addr1') AS Addr1
,get_json_object(jsonBlob, '$.tag4.tag4_1.tag4_1_1.tag4_1_1_1.Addr.City') AS City
,get_json_object(jsonBlob, '$.tag4.tag4_1.tag4_1_1.tag4_1_1_1.Addr.StateProvCd') AS StateProvCd
,get_json_object(jsonBlob, '$.tag4.tag4_1.tag4_1_1.tag4_1_1_1.Addr.PostalCode') AS PostalCode
FROM DB.dummyTable;
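
For the JsonSerDe table from the question, one way to avoid hard-coding [0] is to flatten each array with LATERAL VIEW explode. The following is a minimal sketch, assuming DB.hv_table is defined with the nested struct schema shown in the question; the aliases h, o, i, outer_item, and inner_item are illustrative names, not part of the original post.

```sql
-- Sketch only: assumes DB.hv_table has the nested struct schema from the question.
-- Each LATERAL VIEW explode turns one array level into rows, so no index is needed.
SELECT h.tag1
     , h.tag2
     , h.tag3
     , inner_item.tag4_1_1_1.Addr.Addr1       AS Addr1
     , inner_item.tag4_1_1_1.Addr.City        AS City
     , inner_item.tag4_1_1_1.Addr.StateProvCd AS StateProvCd
     , inner_item.tag4_1_1_1.Addr.PostalCode  AS PostalCode
FROM DB.hv_table h
LATERAL VIEW explode(h.tag4.tag4_1) o AS outer_item          -- one row per tag4_1 element
LATERAL VIEW explode(outer_item.tag4_1_1) i AS inner_item;   -- one row per tag4_1_1 element
```

Each address then comes back as its own row. explode skips rows whose array is empty or NULL; LATERAL VIEW OUTER keeps those rows with NULL in the exploded columns. Selecting the tag4 column by itself returns the whole nested struct.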

The get_json_object approach works well for me, though I still want to try json_tuple and see how it performs compared with get_json_object.
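
As a starting point for that comparison, here is a hedged sketch of json_tuple against the same single-column table. json_tuple reads only top-level keys, but it extracts all requested keys in one call, and tag4 comes back as one JSON string, which is close to the tag4(all data) result the question asks for. The aliases d and t are illustrative.

```sql
-- Sketch only: assumes DB.dummyTable(jsonBlob STRING) holds one JSON record per row.
-- json_tuple is a table-generating function, so it is used through LATERAL VIEW.
SELECT t.tag1
     , t.tag2
     , t.tag3
     , t.tag4   -- the whole nested tag4 object, returned as a JSON string
FROM DB.dummyTable d
LATERAL VIEW json_tuple(d.jsonBlob, 'tag1', 'tag2', 'tag3', 'tag4') t
  AS tag1, tag2, tag3, tag4;
```

To reach fields below the top level, t.tag4 can be passed to get_json_object or to a second LATERAL VIEW json_tuple.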

answered Sep 25 '22 14:09

DatWunGuy102

Tags: json, hql, hive, hive-serde
