Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Redshift Python UDFs Varchar limits

I have successfully created a Python UDF that accepts a varchar value from a table and extracts a substring of that value based on a regex.

The max size of that varchar column in the DDL is set to be 20000 bytes, and in some occasions the UDF outputs an error when I try to call it:

ERROR: Value too long for character type

Detail: 
  -----------------------------------------------
  error:  Value too long for character type
  code:      8001
  context:   Value too long for type character varying(256)
  query:     1127608
  location:  funcs_string.hpp:390
  process:   query0_73 [pid=25345]
  -----------------------------------------------

Is the case that the max varchar limit for a UDF input variable is set to 256 bytes? If yes, is this something that I can change in my code?

Thanks,

like image 725
and_apo Avatar asked Dec 04 '22 02:12

and_apo


2 Answers

Turns out that you have to specify the length in your function's parameter if it is more than 256, which is the default value.. So I fixed this using something similar to :

CREATE OR REPLACE FUNCTION f_xxxxxx(val VARCHAR(20000)) RETURNS VARCHAR(20000) 
IMMUTABLE AS $$
<python function here>
$$ LANGUAGE plpythonu;
like image 147
and_apo Avatar answered Dec 31 '22 01:12

and_apo


Are you using Text or nvarchar? Redshift internally does not support these data types and converts them to varchar(256).

More details can be found here

like image 32
Paladin Avatar answered Dec 31 '22 01:12

Paladin