Visually-identical characters in Unicode

Tags: unicode, similarity

I want to find the characters that are visually identical to a given Unicode character. I know how to find the canonical or compatibility decomposition of a character, but these do not give me what I want. I want characters that are visually identical (not merely similar), where the only permitted difference is size.

For example, I want pairs such as (s, S) or (S, S) (whose code points are different). I do not want (ß, β) or (e, é).

Any suggestions? Thanks.

asked Nov 06 '12 23:11

Bahar S

1 Answer

For a particular character, you could start from annotations in the code charts in the Unicode standard. The annotations often refer to other characters for various reasons, including similarity or identity of shape. But the annotations are not meant to cover everything.

You could also draw your character at http://shapecatcher.com/ and ask it to recognize it. You often get a long list of visually similar alternatives.

As @TedHopp writes in his comment, visual identity is font-dependent. For example, “s” and “S” need not be identical in shape; in most fonts, they are not – the basic form is the same, but there are various differences in stroke width variation, curvature, serifs, etc. However, some characters can be expected to be visually identical in any font that contains them, such as Latin capital A, Greek capital alpha Α, and Cyrillic capital А.
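As a small illustration of why decompositions do not help here, the following sketch uses Python's standard `unicodedata` module to show that the three capital letters mentioned above are separate characters that normalization leaves untouched. The fullwidth letter is an extra example, not taken from the question, added only for contrast.

```python
import unicodedata

# Latin capital A, Greek capital alpha, Cyrillic capital A:
# usually drawn identically, but three different code points.
lookalikes = ["\u0041", "\u0391", "\u0410"]

for ch in lookalikes:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# Compatibility normalization (NFKC) does not merge them,
# because none of them decomposes into another.
nfkc_forms = {unicodedata.normalize("NFKC", ch) for ch in lookalikes}
print(len(nfkc_forms))

# For contrast: FULLWIDTH LATIN CAPITAL LETTER A (U+FF21) does have a
# compatibility decomposition, so NFKC maps it to plain "A".
fullwidth_a = "\uFF21"
print(unicodedata.normalize("NFKC", fullwidth_a) == "A")
```

So normalization only catches pairs that the standard explicitly links by decomposition. Cross-script lookalikes such as these need a separate data source.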

You did not specify the purpose of the study, but you might be doing something that the Unicode Consortium has already carried out to some extent. See UTR #36, Unicode Security Considerations, which also contains references to related work, including UTS #39, Unicode Security Mechanisms. The latter provides confusables.txt, "Recommended confusable mapping for IDN" (i.e., intended for a particular context, but it may be of interest for other purposes as well).
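A minimal sketch of how such a mapping file might be used follows. It assumes the data lines have the layout `source ; target ; type # comment`, with code points written in hexadecimal. The two sample lines are written by hand in that layout for illustration rather than copied from the file, and the function names `parse_confusables` and `lookalikes_of` are hypothetical.

```python
from collections import defaultdict

# Hand-written sample in the assumed "source ; target ; type # comment" layout.
SAMPLE = """\
# illustrative excerpt, not the real file
0391 ; 0041 ; MA # GREEK CAPITAL LETTER ALPHA -> LATIN CAPITAL LETTER A
0410 ; 0041 ; MA # CYRILLIC CAPITAL LETTER A -> LATIN CAPITAL LETTER A
"""

def parse_confusables(text):
    """Group source strings under the target string they map to."""
    groups = defaultdict(set)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) < 2:
            continue
        # Each field is one or more space-separated hex code points.
        source = "".join(chr(int(cp, 16)) for cp in fields[0].split())
        target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
        groups[target].add(source)
    return groups

def lookalikes_of(ch, groups):
    """Return the other members of the group containing ch, if any."""
    for target, sources in groups.items():
        members = sources | {target}
        if ch in members:
            return members - {ch}
    return set()

groups = parse_confusables(SAMPLE)
for other in sorted(lookalikes_of("\u0391", groups)):
    print(f"U+{ord(other):04X}")
```

To run this against the real data, you would pass the contents of a downloaded confusables.txt to `parse_confusables` instead of `SAMPLE`. Keep in mind that the file also lists characters that are merely similar, so the results would still need filtering for strict visual identity.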

answered Oct 13 '22 13:10

Jukka K. Korpela
