Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to implement generic Oracle DECODE function in BigQuery?

I'm looking into implementing the Oracle DECODE function as a UDF.

Below is the outward functionaliy https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions040.htm

Below is the outward functionality and syntax of a decode in Oracle:

Oracle:

DECODE( <expr> , <search1> , <result1> [ , <search2> , <result2> ... ] [ , <default> ] )

SELECT product_id,
       DECODE (warehouse_id, 1, 'Southlake', 
                             2, 'San Francisco', 
                             3, 'New Jersey', 
                             4, 'Seattle',
                                'Non domestic') 
       "Location of inventory" FROM inventories;

Primarily, with BigQuery UDFs SQL or JavaScript, with BigQuery UDFs, when you define the UDF function you need to know the number of parameters you are accepting and typing. When you define the SQL UDF function, you can also accept an array of type anything, but I am not sure if it can work and SQL UDF can perform what we want with an array. It seems based on the Javascript UDF documentation all parameters are named and typed and known up front.

Is there a way to accomplish this using a BigQuery UDF, it has to be dynamic like Oracle decode and fit any scenario you put in front of it not static knowing what you are decoding

like image 347
Mikhail Berlyant Avatar asked Apr 29 '26 06:04

Mikhail Berlyant


1 Answers

Below is for BigQuery Standard SQL

CREATE TEMP FUNCTION DECODE(expr ANY TYPE, map ANY TYPE, `default` ANY TYPE ) AS ((
  IFNULL((SELECT result FROM UNNEST(map) WHERE search = expr), `default`)
));  

You can see how it works using below example

#standardSQL
CREATE TEMP FUNCTION DECODE(expr ANY TYPE, map ANY TYPE, `default` ANY TYPE ) AS ((
  IFNULL((SELECT result FROM UNNEST(map) WHERE search = expr), `default`)
));
WITH `project.dataset.inventories` AS (
  SELECT 1 product_id, 4 warehouse_id UNION ALL
  SELECT 2, 2 UNION ALL
  SELECT 3, 5
)
SELECT product_id, warehouse_id,
  DECODE(warehouse_id, 
    [STRUCT<search INT64, result STRING>
      (1,'Southlake'),
      (2,'San Francisco'),
      (3,'New Jersey'),
      (4,'Seattle')
    ], 'Non domestic') AS `Location_of_inventory`
FROM `project.dataset.inventories`

with result

Row product_id  warehouse_id    Location_of_inventory    
1   1           4               Seattle  
2   2           2               San Francisco    
3   3           5               Non domestic       

Another example of use is:

#standardSQL
CREATE TEMP FUNCTION DECODE(expr ANY TYPE, map ANY TYPE, `default` ANY TYPE ) AS ((
  IFNULL((SELECT result FROM UNNEST(map) WHERE search = expr), `default`)
));
WITH `project.dataset.inventories` AS (
  SELECT 1 product_id, 4 warehouse_id UNION ALL
  SELECT 2, 2 UNION ALL
  SELECT 3, 5
), map AS (
  SELECT 1 search, 'Southlake' result UNION ALL                               
  SELECT 2, 'San Francisco' UNION ALL                               
  SELECT 3, 'New Jersey' UNION ALL                               
  SELECT 4, 'Seattle'                           
)

SELECT product_id, warehouse_id,
  DECODE(warehouse_id, kv, 'Non domestic') AS `Location_of_inventory`
FROM `project.dataset.inventories`,
(SELECT ARRAY_AGG(STRUCT(search, result)) AS kv FROM map) arr

with the same output

Update to address - "for a reusable UDF, not having to name the fields makes it closer to Oracle's implementation."

CREATE TEMP FUNCTION DECODE(expr ANY TYPE, map ANY TYPE, `default` ANY TYPE ) AS (
  IFNULL((
    SELECT result FROM (
      SELECT NULL AS search, NULL AS result UNION ALL SELECT * FROM UNNEST(map) 
    )
    WHERE search = expr
  ), `default`)
);

So now - previous examples can be used w/o explicit naming fields as in example below

#standardSQL
CREATE TEMP FUNCTION DECODE(expr ANY TYPE, map ANY TYPE, `default` ANY TYPE ) AS (
  IFNULL((
    SELECT result FROM (
      SELECT NULL AS search, NULL AS result UNION ALL SELECT * FROM UNNEST(map) 
    )
    WHERE search = expr
  ), `default`)
);
WITH `project.dataset.inventories` AS (
  SELECT 1 product_id, 4 warehouse_id UNION ALL
  SELECT 2, 2 UNION ALL
  SELECT 3, 5
)
SELECT product_id, warehouse_id,
  DECODE(warehouse_id, 
    [ (1,'Southlake'),
      (2,'San Francisco'),
      (3,'New Jersey'),
      (4,'Seattle')
    ], 'Non domestic') AS `Location_of_inventory`
FROM `project.dataset.inventories`

still with same output as before

like image 122
Mikhail Berlyant Avatar answered May 03 '26 19:05

Mikhail Berlyant



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!