How do I find all rows of a PostgreSQL table that contain characters in some Unicode range, such as Cyrillic characters?
Solution: Consider a table called spatial_ref_sys with columns srid, auth_name, auth_srid, srtext, and proj4text. SELECT * FROM spatial_ref_sys WHERE srtext LIKE '%\ /%'; Tricks like this are sometimes useful when searching for special characters in a database.
One of the interesting features of PostgreSQL is its ability to handle Unicode characters. In SQL Server, storing non-English characters requires the NVARCHAR or NCHAR data type. In PostgreSQL, the varchar data type itself stores both English and non-English characters.
PostgreSQL also accepts “escape” string constants, which are an extension to the SQL standard. An escape string constant is specified by writing the letter E (upper or lower case) just before the opening single quote, e.g., E'foo' .
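As a small illustration of escape string constants, the query below uses \uXXXX escapes to produce the first and last characters of the basic Cyrillic alphabet range. It is a sketch that assumes a database with UTF8 encoding; the column aliases are made up for the example.

```sql
-- Assumes a UTF8-encoded database.
-- Inside E'...', each \uXXXX escape is replaced by the character at that code point.
SELECT E'\u0410' AS first_char,  -- U+0410, Cyrillic capital letter A
       E'\u044F' AS last_char;   -- U+044F, Cyrillic small letter ya
```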
Unlike Oracle, PostgreSQL doesn't support an NVARCHAR data type and doesn't offer support for UTF-16.
Figured it out! For Cyrillic:
SELECT * FROM "items" WHERE (title SIMILAR TO '%[\u0410-\u044f]%')
I got the range from http://symbolcodes.tlt.psu.edu/bylanguage/cyrillicchart.html. The characters have hex entities &#x0410; (А) to &#x044F; (я), which are also my numbers above.
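For comparison, here is a minimal sketch of the same search written with the POSIX regex operator ~ and an escape string constant. The sample rows are hypothetical and a UTF8-encoded database is assumed. Note that the range U+0410 to U+044F does not include Ё (U+0401) or ё (U+0451); the second query widens the range to the whole Cyrillic block, U+0400 to U+04FF.

```sql
-- Hypothetical sample data; the table mirrors the "items" table used above.
CREATE TEMP TABLE items (id int, title text);
INSERT INTO items VALUES
    (1, 'Hello'),
    (2, 'Привет'),
    (3, 'Mixed текст');

-- E'...' expands the \uXXXX escapes to real characters, so the regex
-- receives the bracket expression [А-я].
SELECT id, title FROM items WHERE title ~ E'[\u0410-\u044F]';

-- Wider variant covering the full Cyrillic block, including Ё and ё.
SELECT id, title FROM items WHERE title ~ E'[\u0400-\u04FF]';
```

With this sample data, both queries should match the rows with id 2 and 3. Unlike SIMILAR TO, the ~ operator matches anywhere in the string, so no surrounding % wildcards are needed.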
If you install the pgpcre extension, you can use this expression:
SELECT * FROM items WHERE title ~ pcre '\p{Cyrillic}';