Similar looking UTF8 characters for ASCII

Tags:

ascii

utf-8

non-ascii-characters

extended-ascii

I'm looking for a table that maps ASCII characters to non-ASCII Unicode characters that look the same. I know that whether they look the same also depends on the font, but something generic to start with is enough.

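>>> # Python 3
>>> a = 'H'  # ASCII: LATIN CAPITAL LETTER H (U+0048)
>>> b = 'Н'  # non-ASCII: CYRILLIC CAPITAL LETTER EN (U+041D)
>>> a == b
False
>>> ' '.join(format(ord(x), 'b') for x in a)  # code point in binary, not the UTF-8 bytes
'1001000'
>>> ' '.join(format(ord(x), 'b') for x in b)
'10000011101'
>>> a = 'P'  # ASCII: LATIN CAPITAL LETTER P (U+0050)
>>> b = 'Ρ'  # non-ASCII: GREEK CAPITAL LETTER RHO (U+03A1)
>>> a == b
False
>>> ' '.join(format(ord(x), 'b') for x in a)
'1010000'
>>> ' '.join(format(ord(x), 'b') for x in b)
'1110100001'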
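A quick way to see why the pairs above compare unequal is to print each character's code point and official Unicode name instead of its binary value. This is only a sketch using the standard-library `unicodedata` module; the helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(ch):
    """Return the code point (as U+XXXX) and the Unicode name of one character."""
    # The second argument to unicodedata.name is a fallback for unnamed characters.
    return f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")


# The two pairs from the question, written as escapes so they cannot be confused.
pairs = [("H", "\u041d"), ("P", "\u03a1")]
for ascii_ch, lookalike in pairs:
    print(describe(ascii_ch), describe(lookalike))
```

The names make the difference visible even when the glyphs are indistinguishable in the current font.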


asked Oct 22 '17 07:10

ddofborg

1 Answer

This is a very useful tool: it shows you all the characters that look similar to the ones you enter, and you can decide whether they are REALLY similar enough for you :)

https://unicode.org/cldr/utility/confusables.jsp?a=test&r=None

Some other resources:

This is called Visual Spoofing
Python Package to detect confusables
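If you only need something generic to start with, a small lookup table may already be enough. The sketch below is an assumption-laden illustration, not the official data: the `LOOKALIKES` table is a hand-picked subset of Cyrillic and Greek capitals, and the function names `skeleton` and `find_lookalikes` are hypothetical. For real use, the complete mapping is published by Unicode as the confusables data that the tool above is built on.

```python
# Hand-picked, illustrative subset only; NOT the official Unicode confusables data.
LOOKALIKES = {
    "\u0410": "A",  # CYRILLIC CAPITAL LETTER A
    "\u0415": "E",  # CYRILLIC CAPITAL LETTER IE
    "\u041d": "H",  # CYRILLIC CAPITAL LETTER EN
    "\u0420": "P",  # CYRILLIC CAPITAL LETTER ER
    "\u0391": "A",  # GREEK CAPITAL LETTER ALPHA
    "\u0397": "H",  # GREEK CAPITAL LETTER ETA
    "\u03a1": "P",  # GREEK CAPITAL LETTER RHO
}


def skeleton(text):
    """Replace every known lookalike with its ASCII counterpart."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in text)


def find_lookalikes(text):
    """Return (index, character, ASCII counterpart) for each known lookalike."""
    return [(i, ch, LOOKALIKES[ch]) for i, ch in enumerate(text) if ch in LOOKALIKES]


suspicious = "\u041dELLO"           # starts with CYRILLIC CAPITAL LETTER EN
print(skeleton(suspicious))          # HELLO
print(find_lookalikes(suspicious))   # one hit at index 0
```

Comparing `skeleton(a) == skeleton(b)` then tells you whether two strings are visually equivalent as far as this small table knows; anything missing from the table passes through unchanged.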


answered Sep 21 '22 07:09

ddofborg
