Find similar ASCII character in Unicode

Question

Does anyone know an easy way to find Unicode characters that look similar to ASCII characters? An example is "CYRILLIC SMALL LETTER DZE (ѕ)". I'd like to do a search and replace for such characters. By similar I mean similar to a human reader: you can't see a difference by looking at it.

chryss · Accepted Answer

As noted by other commenters, Unicode normalisation ("compatibility characters") isn't going to help you here, as you aren't looking for official equivalences but for similarities in glyphs (letter shapes). (The linked Unicode Technical Report is still worth reading, though, as it is extremely well written.)
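To see why normalisation doesn't cover this case, here is a minimal Python sketch using the standard `unicodedata` module. The helper name `nfkc_changes` is my own, chosen for illustration. It compares the Cyrillic DZE from the question with a fullwidth Latin "s", which does have an official compatibility mapping.

```python
import unicodedata

def nfkc_changes(ch):
    """Return True if NFKC normalisation maps ch to something else."""
    return unicodedata.normalize("NFKC", ch) != ch

dze = "\u0455"          # CYRILLIC SMALL LETTER DZE, looks like Latin "s"
fullwidth_s = "\uff53"  # FULLWIDTH LATIN SMALL LETTER S

for ch in (dze, fullwidth_s):
    # Print the official character name and whether NFKC rewrites it.
    print(unicodedata.name(ch), nfkc_changes(ch))
```

The fullwidth letter is folded to a plain "s" because Unicode defines it as a compatibility variant. The Cyrillic letter is left alone because it is a distinct letter that merely shares a shape.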

If I were you, to spare yourself the tedious work of assembling a list of characters by hand, I'd search for resources on homograph attacks. This is a method of maliciously misleading web users by displaying URLs containing domain names in which some letters have been replaced with visually similar letters. Another Unicode Technical Report, on security, contains a section on the problem. There is also, and that may be what you most need, a "confusables" table. Here's another article covering mainly punctuation marks, some of which are ASCII, that have visually similar counterparts in the non-ASCII code tables.
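As a rough sketch of how such a table could drive the search and replace, the Python below parses lines in the semicolon-separated style I'd expect from a confusables table (source code points, then target code points, with `#` starting a comment) and builds a translation map. The line format, the hand-written `SAMPLE` entries, and the function names are assumptions for illustration. Check them against the real file before relying on this.

```python
def parse_confusables(lines):
    """Build a str.translate() table from confusables-style lines.

    Assumed line format: "<source hex> ; <target hex ...> ; <type> # comment".
    Only single non-ASCII source characters with pure-ASCII targets are kept.
    """
    table = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        if len(fields) < 2 or not fields[0] or not fields[1]:
            continue
        source = "".join(chr(int(cp, 16)) for cp in fields[0].split())
        target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
        if len(source) == 1 and not source.isascii() and target.isascii():
            table[ord(source)] = target
    return table

# Hand-written sample entries for illustration, not copied from the real table.
SAMPLE = [
    "# source ; target ; type",
    "0455 ;\t0073 ;\tMA\t# CYRILLIC SMALL LETTER DZE -> s",
    "0430 ;\t0061 ;\tMA\t# CYRILLIC SMALL LETTER A -> a",
    "043E ;\t006F ;\tMA\t# CYRILLIC SMALL LETTER O -> o",
]

TABLE = parse_confusables(SAMPLE)

def to_ascii_lookalikes(text, table=TABLE):
    """Replace every mapped look-alike character with its ASCII counterpart."""
    return text.translate(table)

print(to_ascii_lookalikes("p\u0430\u0455\u0455w\u043erd"))
```

With a downloaded copy of the table you would pass the open file to `parse_confusables` instead of `SAMPLE`. Opening it with `encoding="utf-8-sig"` is a sensible precaution in case the file starts with a byte order mark.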

What I do hope is that you aren't asking the question to construct such an attack.

Tags: replace, unicode, ascii, similarity, fuzzy

DrDol
