Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Invalid digits on Redshift

I'm trying to load some data from stage to relational environment and something is happening I can't figure out.

I'm trying to run the following query:

SELECT
  CAST(SPLIT_PART(some_field,'_',2) AS BIGINT) cmt_par
FROM
  public.some_table;

The some_field is a column that has data with two numbers joined by an underscore like this:

some_field -> 38972691802309_48937927428392

And I'm trying to get the second part.

That said, here is the error I'm getting:

[Amazon](500310) Invalid operation: Invalid digit, Value '1', Pos 0, 
Type: Long 
Details: 
 -----------------------------------------------
  error:  Invalid digit, Value '1', Pos 0, Type: Long 
  code:      1207
  context:   
  query:     1097254
  location:  :0
  process:   query0_99 [pid=0]
  -----------------------------------------------;

Execution time: 2.61s
Statement 1 of 1 finished

1 statement failed.

It's literally saying some numbers are not valid digits. I've already tried to get the exactly data which is throwing the error and it appears to be a normal field like I was expecting. It happens even if I throw out NULL fields.

I thought it would be an encoding error, but I've not found any references to solve that. Anyone has any idea?

Thanks everybody.

like image 971
Maurício Borges Avatar asked Mar 19 '18 21:03

Maurício Borges


2 Answers

I just ran into this problem and did some digging. Seems like the error Value '1' is the misleading part, and the problem is actually that these fields are just not valid as numeric.

In my case they were empty strings. I found the solution to my problem in this blogpost, which is essentially to find any fields that aren't numeric, and fill them with null before casting.

select cast(colname as integer) from
(select
 case when colname ~ '^[0-9]+$' then colname
 else null
 end as colname
 from tablename);

Bottom line: this Redshift error is completely confusing and really needs to be fixed.

like image 104
szeitlin Avatar answered Oct 06 '22 12:10

szeitlin


Hmmm. I would start by investigating the problem. Are there any non-digit characters?

SELECT some_field
FROM public.some_table
WHERE SPLIT_PART(some_field, '_', 2) ~ '[^0-9]';

Is the value too long for a bigint?

SELECT some_field
FROM public.some_table
WHERE LEN(SPLIT_PART(some_field, '_', 2)) > 27

If you need more than 27 digits of precision, consider a decimal rather than bigint.

like image 20
Gordon Linoff Avatar answered Oct 06 '22 12:10

Gordon Linoff