Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Error parsing JSON: more than one document in the input (Redshift to Snowflake SQL)

I'm trying to convert a query from Redshift to Snowflake SQL.

The Redshift query looks like this:

SELECT 
                cr.creatives as creatives
                , JSON_ARRAY_LENGTH(cr.creatives) as creatives_length
                , JSON_EXTRACT_PATH_TEXT(JSON_EXTRACT_ARRAY_ELEMENT_TEXT (cr.creatives,0),'previewUrl') as preview_url
FROM campaign_revisions cr

The Snowflake query looks like this:

SELECT 
                cr.creatives as creatives
                , ARRAY_SIZE(TO_ARRAY(ARRAY_CONSTRUCT(cr.creatives))) as creatives_length
                , PARSE_JSON(PARSE_JSON(cr.creatives)[0]):previewUrl as preview_url
FROM campaign_revisions cr

It seems like JSON_EXTRACT_PATH_TEXT isn't converted correctly, as the Snowflake query results in error:

Error parsing JSON: more than one document in the input

cr.creatives is formatted like this:

"[{""previewUrl"":""https://someurl.com/preview1.png"",""device"":""desktop"",""splitId"":null,""splitType"":null},{""previewUrl"":""https://someurl.com/preview2.png"",""device"":""mobile"",""splitId"":null,""splitType"":null}]"
like image 534
Milka Avatar asked Nov 06 '22 00:11

Milka


1 Answers

It seems to me that you are not working with valid JSON data inside Snowflake.

Please review your file format used for the copy into command.

If you open the "JSON" text provided in a text editor , note that the information is not parsed or formatted as JSON because of the quoting you have. Once your issue with double quotes / escaped quotes is handled, you should be able to make good progress

Proper JSON on Left || Original Data on Right Invalid JSON vs Valid JSON

If you are not inclined to reload your data, see if you can create a Javascript User Defined Function to remove the quotes from your string, then you can use Snowflake to process the variant column.

The following code is working POJO that can be used to remove the doublequotes for you.

var textOriginal = '[{""previewUrl"":""https://someurl.com/preview1.png"",""device"":""desktop"",""splitId"":null,""splitType"":null},{""previewUrl"":""https://someurl.com/preview2.png"",""device"":""mobile"",""splitId"":null,""splitType"":null}]';


function parseText(input){
    var a = input.replaceAll('""','\"');
    a = JSON.parse(a);
    return  a;
}


x = parseText(textOriginal);
console.log(x);

See Output from JS Function Here

like image 144
john.da.costa Avatar answered Nov 11 '22 04:11

john.da.costa