Does anyone know an easy way to find Unicode characters that look similar to ASCII characters? An example is "CYRILLIC SMALL LETTER DZE (ѕ)". I'd like to do a search and replace for such characters. By similar I mean visually similar to a human reader: you can't see a difference by looking at it.
As noted by other commenters, Unicode normalisation ("compatibility characters") isn't going to help you here, as you aren't looking for official equivalences but for similarities in glyphs (letter shapes). (The linked Unicode Technical Report is still worth reading, though, as it is extremely well written.)
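To see why normalisation doesn't cover this case, here is a minimal Python sketch using the standard `unicodedata` module. The helper name `nfkc_changes` is just something I made up for illustration. It compares a character that does have a compatibility mapping (the fullwidth "ｓ") with the Cyrillic dze from the question, which has none.

```python
import unicodedata

def nfkc_changes(ch):
    """Return True if NFKC normalisation rewrites the character."""
    return unicodedata.normalize("NFKC", ch) != ch

fullwidth_s = "\uff53"   # FULLWIDTH LATIN SMALL LETTER S
cyrillic_dze = "\u0455"  # CYRILLIC SMALL LETTER DZE, the example from the question

for ch in (fullwidth_s, cyrillic_dze):
    folded = unicodedata.normalize("NFKC", ch)
    print(unicodedata.name(ch), "->", unicodedata.name(folded))
```

The fullwidth letter folds to a plain ASCII "s", while the dze comes back unchanged, because looking alike is not an equivalence that normalisation knows about.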
If I were you, to spare yourself the tedious work of assembling a list of characters, I'd search for resources on homograph attacks. This is a method of maliciously misleading web users by displaying URLs containing domain names in which some letters have been replaced with visually similar letters. Another Unicode Technical Report, on security, contains a section on the problem. There is also -- and that may be what you most need -- a "confusables" table. Here's another article covering mainly punctuation marks, some of which are ASCII, that have visually similar counterparts in the non-ASCII code tables.
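For the search and replace itself, something along these lines might work once you have the table. This is only a rough sketch: I'm assuming the table uses a semicolon-separated layout (source code points, target code points, a type field, then a `#` comment), the two `SAMPLE` lines are illustrative stand-ins rather than the real file, and the function names `parse_confusables` and `replace_lookalikes` are made up.

```python
def parse_confusables(lines):
    """Build a {source: target} dict from lines in the assumed table layout."""
    table = {}
    for line in lines:
        # Drop the trailing comment, then skip blank and comment-only lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) < 2:
            continue
        # Each field is one or more space-separated hexadecimal code points.
        source = "".join(chr(int(cp, 16)) for cp in fields[0].split())
        target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
        table[source] = target
    return table

def replace_lookalikes(text, table):
    """Swap non-ASCII characters for an ASCII lookalike when the table has one."""
    result = []
    for ch in text:
        target = table.get(ch)
        # Leave ASCII input alone and ignore mappings to non-ASCII targets.
        if ch.isascii() or target is None or not target.isascii():
            result.append(ch)
        else:
            result.append(target)
    return "".join(result)

# Illustrative sample lines in the assumed layout, not the real table.
SAMPLE = [
    "# source ; target ; type # comment",
    "0455 ;\t0073 ;\tMA\t# CYRILLIC SMALL LETTER DZE, LATIN SMALL LETTER S",
    "0430 ;\t0061 ;\tMA\t# CYRILLIC SMALL LETTER A, LATIN SMALL LETTER A",
]

table = parse_confusables(SAMPLE)
print(replace_lookalikes("\u0455p\u0430m", table))
```

ASCII characters are deliberately left untouched, since a table of this kind may also map ASCII characters onto each other (digit one and lowercase L, say), and you presumably only want to fold the foreign lookalikes.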
What I do hope is that you aren't asking the question to construct such an attack.