
How can I find non-ASCII characters in MySQL?

People also ask

How do I remove non-ascii characters in MySQL?

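UPDATE tablename SET columnToCheck = REPLACE(CONVERT(columnToCheck USING ascii), '?', '') WHERE ... CONVERT turns every character that has no ASCII equivalent into ?, and REPLACE then strips those question marks. Be aware that this also strips any genuine question marks already in the column, and that on a UTF-8 encoded database (utf8mb4 is the default in MySQL 8.0) it deletes all characters with accents, which could be undesirable.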
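The side effect of the replace-the-question-marks approach is easy to see outside the database. The following is only a Python sketch that roughly mimics the SQL above; the function names are hypothetical, and Python's "replace" error handler is assumed to stand in for MySQL's conversion to ?.

```python
def strip_via_question_marks(text: str) -> str:
    """Roughly mimic REPLACE(CONVERT(col USING ascii), '?', '')."""
    # Characters with no ASCII equivalent become '?' (one per character).
    converted = text.encode("ascii", errors="replace").decode("ascii")
    # Every '?' is then removed, including ones that were in the data already.
    return converted.replace("?", "")


def strip_non_ascii_only(text: str) -> str:
    """Drop only the characters that cannot be encoded as ASCII."""
    return text.encode("ascii", errors="ignore").decode("ascii")


sample = "Caf\u00e9 open?"
print(strip_via_question_marks(sample))  # Caf open
print(strip_non_ascii_only(sample))      # Caf open?
```

The first function loses the genuine question mark at the end of the sample, which is the same data loss the UPDATE statement would cause.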

How do I find ascii characters in SQL?

If you ever need to find the ASCII code for a given character when using SQL Server, the T-SQL ASCII() function is probably what you need. The ASCII() function returns the ASCII code value of the leftmost character of a character expression.


MySQL provides comprehensive character set management that can help with this kind of problem.

SELECT whatever
  FROM tableName 
 WHERE columnToCheck <> CONVERT(columnToCheck USING ASCII)

The CONVERT(col USING charset) function turns unconvertible characters into replacement characters (question marks). For any row that contains such characters, the converted text therefore no longer equals the original, and the <> comparison selects that row.

See the MySQL manual for more discussion: https://dev.mysql.com/doc/refman/8.0/en/charset-repertoire.html

You can use any character set name you wish in place of ASCII. For example, to find out which characters won't render correctly in code page 1257 (Lithuanian, Latvian, Estonian), use CONVERT(columnToCheck USING cp1257).
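The convert-and-compare idea can be tried on individual strings before running it against a table. This is a minimal Python sketch, not MySQL itself: the function name is hypothetical, and Python's codecs with errors="replace" are assumed to behave like CONVERT(... USING charset) for this purpose.

```python
def differs_after_conversion(text: str, charset: str = "ascii") -> bool:
    """Return True if text contains characters the charset cannot represent."""
    # Unrepresentable characters are replaced with '?', as CONVERT does.
    converted = text.encode(charset, errors="replace").decode(charset)
    # Any replacement makes the round-tripped text differ from the original.
    return converted != text


print(differs_after_conversion("plain text"))         # False
print(differs_after_conversion("na\u00efve"))         # True
print(differs_after_conversion("\u4e2d", "cp1257"))   # True
```

As in the SQL version, swapping the charset argument changes which characters count as unconvertible.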


If you define ASCII as all characters that have a decimal value of 0 to 127 (0x00 to 0x7F), you can find rows whose column contains non-ASCII characters using the following query:

SELECT * FROM TABLE WHERE NOT HEX(COLUMN) REGEXP '^([0-7][0-9A-F])*$';

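This was the most comprehensive query I could come up with. It works because HEX() renders each byte as two hexadecimal digits, and every byte in the ASCII range has a first digit between 0 and 7.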
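The same hex-based test can be reproduced in a few lines of Python to check what it accepts and rejects. This is only a sketch: the function name is hypothetical, and it assumes the column is stored as UTF-8, so the hex string is built from the UTF-8 bytes.

```python
import re

# Each byte is two hex digits; ASCII bytes (0x00-0x7F) start with 0-7.
ASCII_HEX = re.compile(r"([0-7][0-9A-F])*")


def is_ascii_by_hex(text: str) -> bool:
    """Return True if every byte of the UTF-8 encoding is in the ASCII range."""
    hex_bytes = text.encode("utf-8").hex().upper()
    # fullmatch plays the role of the ^...$ anchors in the SQL pattern.
    return ASCII_HEX.fullmatch(hex_bytes) is not None


print(is_ascii_by_hex("abc"))      # True
print(is_ascii_by_hex("\u00e9"))   # False (encoded as C3 A9)
```

Rows for which this check is False correspond to the rows the NOT ... REGEXP query returns.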


It depends exactly what you're defining as "ASCII", but I would suggest trying a variant of a query like this:

SELECT * FROM tableName WHERE columnToCheck REGEXP '[^A-Za-z0-9]';

That query will return all rows where columnToCheck contains any non-alphanumeric characters. Note that the negation belongs inside the brackets: NOT REGEXP '[A-Za-z0-9]' would match only rows that contain no alphanumeric characters at all. If you have other characters that are acceptable, add them to the negated character class in the regular expression. For example, if periods, commas, and hyphens are OK, change the query to:

SELECT * FROM tableName WHERE columnToCheck REGEXP '[^A-Za-z0-9.,-]';

The most relevant page of the MySQL documentation is probably 12.5.2 Regular Expressions.
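To check an allow-list before using it in a query, the idea can be sketched in Python: flag a string as soon as it contains one character outside the accepted set. The names below are hypothetical, and the accepted set (letters, digits, period, comma, hyphen) is taken from the example above.

```python
import re

# Negated class: matches any single character that is NOT in the allow-list.
# The hyphen is placed last so it is read literally, not as a range.
DISALLOWED = re.compile(r"[^A-Za-z0-9.,-]")


def has_disallowed_chars(text: str) -> bool:
    """Return True if text contains at least one character outside the allow-list."""
    return DISALLOWED.search(text) is not None


print(has_disallowed_chars("abc-1.2,3"))   # False
print(has_disallowed_chars("a b"))         # True (space is not allowed)
print(has_disallowed_chars("caf\u00e9"))   # True
```

Extending the allow-list means adding characters inside the brackets, exactly as in the SQL pattern.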


This is probably what you're looking for:

select * from TABLE where COLUMN regexp '[^ -~]';

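It should return all rows where COLUMN contains non-ASCII characters (or non-printable ASCII characters such as newline), because the class [^ -~] matches any character outside the range from space (0x20) to tilde (0x7E).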
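The behaviour of that character class, including the fact that it also catches control characters, can be confirmed with a short Python sketch. The function name is hypothetical; the pattern is the one from the query.

```python
import re

# Anything outside space (0x20) through tilde (0x7E), i.e. outside printable ASCII.
NOT_PRINTABLE_ASCII = re.compile(r"[^ -~]")


def has_non_printable_ascii(text: str) -> bool:
    """Return True if text contains a non-ASCII or non-printable character."""
    return NOT_PRINTABLE_ASCII.search(text) is not None


print(has_non_printable_ascii("Hello, world!"))  # False
print(has_non_printable_ascii("line\nbreak"))    # True (newline is a control character)
print(has_non_printable_ascii("\u00fcber"))      # True
```

If newlines or tabs are legitimate in the data, they would need to be added to the class before negating it.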