I am trying to debug a function not made by myself (dms2dd). I made my own test function (see below) and boiled my problem down to a specific row/value.
If I run the following query:
SELECT "Lat", "Long", test_dolf("Lat"), test_dolf("Long") FROM pawikan WHERE "Lat" IS NOT NULL AND "Long" IS NOT NULL ORDER BY index LIMIT 1 OFFSET 29130
I get the following output:
'N6° 6' 9.4824"';'E118° 26' 49.1172'' ';'9.4824';'49.1172'
which is exactly what I expect. But with the following query:
SELECT "Lat", "Long", CAST(test_dolf("Lat") as numeric), test_dolf("Long") FROM pawikan WHERE "Lat" IS NOT NULL AND "Long" IS NOT NULL ORDER BY index LIMIT 1 OFFSET 29130
I get the error
ERROR: invalid input syntax for type numeric: ""
SQL state: 22P02
The error suggests that the varchar value I was trying to cast to numeric is empty, but as you can see from the previous query, it isn't. It is just a valid numeric varchar. Actually, if I copy-paste the value and run:
SELECT CAST('9.4824' AS numeric);
It totally works and the query actually results in a valid numeric. Even more, if I store the results of the first query in an intermediary table with:
SELECT "Lat", "Long", test_dolf("Lat") as lat_sec, test_dolf("Long") as long_sec INTO dms2dd_test FROM pawikan WHERE "Lat" IS NOT NULL AND "Long" IS NOT NULL ORDER BY index LIMIT 11 OFFSET 29120
and then issue a
SELECT CAST(long_sec as numeric), CAST(lat_sec AS numeric) FROM dms2dd_test;
it totally works. Even this works just fine:
SELECT test_dolf(E'N6° 6\' 9.4824"')::numeric as lat_sec
So what is going wrong here? It looks like in the second query where I cast to numeric, a different value is passed to my function, but I tested the sort column (index) and it contains only unique bigints.
This is the code for the test_dolf function:
CREATE OR REPLACE FUNCTION public.test_dolf(strdegminsec character varying)
RETURNS varchar AS
$BODY$
DECLARE
i numeric;
intDmsLen numeric; -- Length of original string
strCompassPoint Char(1);
strNorm varchar(16) = ''; -- Will contain normalized string
strDegMinSecB varchar(100);
blnGotSeparator integer; -- Keeps track of separator sequences
arrDegMinSec varchar[]; -- TYPE stringarray is table of varchar(2048) ;
strChr Char(1);
BEGIN
strDegMinSec := regexp_replace(replace(strdegminsec,E'\'\'','"'),' "([0-9]+)',E' \\1"');
-- Remove leading and trailing spaces
strDegMinSecB := REPLACE(strDegMinSec,' ','');
intDmsLen := Length(strDegMinSecB);
blnGotSeparator := 0; -- Not in separator sequence right now
-- Loop over string, replacing anything that is not a digit or a
-- decimal separator with
-- a single blank
FOR i in 1..intDmsLen LOOP
-- Get current character
strChr := SubStr(strDegMinSecB, i, 1);
-- either add character to normalized string or replace
-- separator sequence with single blank
If strpos('0123456789,.', strChr) > 0 Then
-- add character but replace comma with point
If (strChr <> ',') Then
strNorm := strNorm || strChr;
Else
strNorm := strNorm || '.';
End If;
blnGotSeparator := 0;
ElsIf strpos('neswNESW',strChr) > 0 Then -- Extract Compass Point if present
strCompassPoint := strChr;
Else
-- ensure only one separator is replaced with a blank -
-- suppress the rest
If blnGotSeparator = 0 Then
strNorm := strNorm || ' ';
blnGotSeparator := 0;
End If;
End If;
End Loop;
-- Split normalized string into array of max 3 components
arrDegMinSec := string_to_array(strNorm, ' ');
return arrDegMinSec[3];
End
$BODY$
LANGUAGE plpgsql IMMUTABLE
COST 100;
I figured out what the problem was. It looks like postgresql, even though I do a LIMIT and OFFSET, still calls the functions in the select for other rows outside that frame.
I figured this out by putting the code that raised the exception inside my function and catching the resulting error, and raising a NOTICE error when that exception occurs (see function below, specifically the BEGIN EXCEPTION END block at the end of the function). The notice is displayed as a warning, but doesn't cause the code execution to stop. All of a sudden it turned out that the function wasn't just called for the row that I was expecting it to be called for, but also for a whole bunch of other rows. This is totally not what I expected, and for me kind of counter intuitive, but I guess it is how postgresql is supposed to work.
Since catching exceptions is quite expensive in postgresql, I guess I need to add a test which prevents the exception in the first place (I could test the length of arrDegMinSec
and the value of items 1-3 of that array and return NULL in case of invalid values.
CREATE OR REPLACE FUNCTION public.test_dolf(strdegminsec character varying)
RETURNS numeric AS
$BODY$
DECLARE
i numeric;
intDmsLen numeric; -- Length of original string
strCompassPoint Char(1);
strNorm varchar(16) = ''; -- Will contain normalized string
strDegMinSecB varchar(100);
blnGotSeparator integer; -- Keeps track of separator sequences
arrDegMinSec varchar[]; -- TYPE stringarray is table of varchar(2048) ;
strChr Char(1);
retval numeric;
BEGIN
strDegMinSec := regexp_replace(replace(strdegminsec,E'\'\'','"'),' "([0-9]+)',E' \\1"');
-- Remove leading and trailing spaces
strDegMinSecB := REPLACE(strDegMinSec,' ','');
intDmsLen := Length(strDegMinSecB);
blnGotSeparator := 0; -- Not in separator sequence right now
-- Loop over string, replacing anything that is not a digit or a
-- decimal separator with
-- a single blank
FOR i in 1..intDmsLen LOOP
-- Get current character
strChr := SubStr(strDegMinSecB, i, 1);
-- either add character to normalized string or replace
-- separator sequence with single blank
If strpos('0123456789,.', strChr) > 0 Then
-- add character but replace comma with point
If (strChr <> ',') Then
strNorm := strNorm || strChr;
Else
strNorm := strNorm || '.';
End If;
blnGotSeparator := 0;
ElsIf strpos('neswNESW',strChr) > 0 Then -- Extract Compass Point if present
strCompassPoint := strChr;
Else
-- ensure only one separator is replaced with a blank -
-- suppress the rest
If blnGotSeparator = 0 Then
strNorm := strNorm || ' ';
blnGotSeparator := 0;
End If;
End If;
End Loop;
-- Split normalized string into array of max 3 components
arrDegMinSec := string_to_array(strNorm, ' ');
BEGIN
retval := arrDegMinSec[3]::numeric;
return retval;
EXCEPTION
WHEN SQLSTATE '22P02' THEN
RAISE NOTICE 'Incorrect value %', strDegMinSec;
RETURN NULL;
END;
End
$BODY$
LANGUAGE plpgsql IMMUTABLE
COST 100;
EDIT
Courtesy of @michel.milezzi another solution that doesn't require modification of the function is to change the function call in the query to
CAST(NULLIF(test_dolf("Lat"), '') as numeric)
And indeed as @abelisto suggests, I could also have put the query in a subquery and only casting it to numeric in the main query like this:
SELECT "Lat", "Long", CAST(test_dolf("Lat") as numeric), test_dolf("Long") FROM (SELECT * FROM pawikan WHERE "Lat" IS NOT NULL AND "Long" IS NOT NULL ORDER BY index LIMIT 1 OFFSET 29130) as t
This would indeed have prevented the problem, which would have indeed simplified the debugging process.
That being said, I was going to modify the function anyway (to make it more robust for dirty data) so for me that was in this case the best solution.
The error your getting is the following:
ERROR: invalid input syntax for type numeric: ""
So, it's trying to cast a empty string to a numeric value. What about using NULLIF function to address this?
SELECT "Lat", "Long", CAST(NULLIF(test_dolf("Lat"), '') as numeric), test_dolf("Long") FROM pawikan WHERE "Lat" IS NOT NULL AND "Long" IS NOT NULL ORDER BY index LIMIT 1 OFFSET 29130;
Also you might want to see execution plan to understand this issue. What may be happening is the LIMIT
and OFFSET
is executed just after the casting. This explain why you're not seeing the row with empty string.
Oops, I should read your answer before posting this. Anyway you still can use NULLIF
to workaround your issue.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With