Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

QueryString parsing in BigQuery

I have a column in BigQuery that contains a URL querystring, eg a=1&c=1. I'd like to reference some of these in my query, eg. filtering with something like WHERE querystring.c = 1.

My plan is to convert the querystring to JSON, and then use JSON_EXTRACT. I figured I could write a UDF to convert the querystring to JSON, but I can't work out anyway to import the node.js querystring package into my UDF to make this easy.

Is it possible to import a node.js core library into a UDF, and if so, how? Alternatively is there a better way to achieve what I'm trying to do?

like image 809
Ben Dowling Avatar asked Mar 09 '26 05:03

Ben Dowling


1 Answers

Alternatively is there a better way to achieve what I'm trying to do?

I think - Yes - using JS UDF is expensive resource wise and has some limitations. Using SQL UDF is less expensive and if you want you can transform below by using SQL UDF - but at least below gives you idea of "alternative" approach

For BigQuery Standard SQL

#standardSQL
WITH yourTable AS (
  SELECT 1 AS id, 'a=1&c=1' AS querystring UNION ALL
  SELECT 2, 'c=2&b=3'
)
SELECT 
  id, 
  querystring,
  SPLIT(kv, '=')[SAFE_OFFSET(0)] AS key,
  SPLIT(kv, '=')[SAFE_OFFSET(1)] AS value 
FROM yourTable, UNNEST(SPLIT(querystring, '&')) AS kv   

above allows you to "extract" all key-value pairs like below

id  querystring key value    
2   c=2&b=3     b   3    
1   a=1&c=1     c   1    
1   a=1&c=1     a   1    
2   c=2&b=3     c   2    

So now you can use them in WHERE clause like below

#standardSQL
WITH yourTable AS (
  SELECT 1 AS id, 'a=1&c=1' AS querystring UNION ALL
  SELECT 2, 'c=2&b=3'
)
SELECT 
  id, 
  querystring,
FROM yourTable, UNNEST(SPLIT(querystring, '&')) AS kv
WHERE SPLIT(kv, '=')[SAFE_OFFSET(0)] = 'c' 
AND SPLIT(kv, '=')[SAFE_OFFSET(1)] = '1'

this gives below result

id  querystring  
1   a=1&c=1    

Note: this is just quick and abstract illustration of approach - I hope you will adjust/adopt it to your specific case

Below is example of transforming above to use SQL UDF

#standardSQL
CREATE TEMPORARY FUNCTION parse(qs STRING, key STRING) AS (
  (SELECT SPLIT(kv, '=')[SAFE_OFFSET(1)] FROM UNNEST(SPLIT(qs, '&')) AS kv WHERE SPLIT(kv, '=')[SAFE_OFFSET(0)] = key )
);
WITH yourTable AS (
  SELECT 1 AS id, 'a=1&c=1' AS querystring UNION ALL
  SELECT 2, 'c=2&b=3'
)
SELECT 
  id, 
  querystring
FROM yourTable
WHERE parse(querystring, 'c') = '1'

Note: usually querystring do not have keys duplication - so case of having dups is not addressed - but easy to if needed :o)

but it doesn't decode any encoded components, so my values will still contain things like %20. Any suggestion on that?

#standardSQL
CREATE TEMPORARY FUNCTION parse(qs STRING, key STRING) AS (
  (SELECT SPLIT(kv, '=')[SAFE_OFFSET(1)] FROM UNNEST(SPLIT(qs, '&')) AS kv WHERE SPLIT(kv, '=')[SAFE_OFFSET(0)] = key )
);
CREATE TEMP FUNCTION decode(str STRING)
RETURNS STRING
LANGUAGE js AS """
  if (str == null) return null;
  try {
    return decodeURIComponent(str);
  } catch (e) {
    return str;
  }
""";
WITH yourTable AS (
  SELECT 1 AS id, 'a=1&c=1&d=a%20b%20c' AS querystring UNION ALL
  SELECT 2, 'c=2&b=3'
)
SELECT 
  id,
  querystring,
  decode(parse(querystring, 'd')) as d
FROM yourTable
WHERE parse(querystring, 'c') = '1'

result is

id  querystring             d    
--  -------------------     -----
1   a=1&c=1&d=a%20b%20c     a b c    
like image 146
Mikhail Berlyant Avatar answered Mar 11 '26 23:03

Mikhail Berlyant



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!