Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find all rows that have Non-ASCII in a specific column in SnowFlake

I'm trying to find all rows in my table that have Non-ASCII characters (one or more) in a specific column in Snowflake. I've tried to use the query below. Although, it didn't give me the results I was looking for.

SELECT *
FROM table_name
WHERE column_name REGEXP '[^a-zA-Z0-9]; 

In addition, I have tried this query as well:

SELECT *
  FROM tableName 
 WHERE columnToCheck <> CONVERT(columnToCheck USING ASCII)

This gave me error because I can't use USING in Snowflake.

Just to elaborate a bit more, I'm looking to find rows with characters like Ã, Ä, Ç, Ô, ÿ, etc. To be clear, these are only examples I'm trying to find all rows where there are one or more cases of non-ASCII characters.

like image 338
QMan5 Avatar asked Nov 18 '25 04:11

QMan5


1 Answers

If you just want to detect whether or not a column has non-ASCII, you can do it like this:

set MY_STRING1 = 'Hello, world';
set MY_STRING2 = 'Ãbcde';

select regexp_instr($MY_STRING1,'[^[:ascii:]]') > 0 as HAS_NON_ASCII;
select regexp_instr($MY_STRING2,'[^[:ascii:]]') > 0 as HAS_NON_ASCII;

If you want to replace diacritics with the closest ASCII match you can use this approach:

How to replace character accented characters in Snowflake?

like image 62
Greg Pavlik Avatar answered Nov 20 '25 19:11

Greg Pavlik