Character Set Special Characters

1 Answer

Is iso-8859-1 a proper subset of utf-8?

The character repertoire of ISO-8859-1 (the first 256 characters of Unicode) is a proper subset of that of UTF-8 (every Unicode character).

However, the characters U+0080 to U+00FF are encoded differently in the two encodings.

ISO-8859-1 assigns each of these characters a single byte from 80 to FF.
UTF-8 encodes the same characters as two-byte sequences C2 80 to C3 BF.
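
To see the difference concretely, here is a minimal Python sketch that encodes a few sample characters both ways. The helper name `hex_bytes` and the choice of samples are just for illustration.

```python
# Encode the same characters with both encodings and show the resulting bytes.
samples = ["A", "\u0080", "é", "ÿ"]

def hex_bytes(text: str, encoding: str) -> str:
    """Return the encoded bytes as space-separated uppercase hex."""
    return text.encode(encoding).hex(" ").upper()

for ch in samples:
    latin1 = hex_bytes(ch, "iso-8859-1")
    utf8 = hex_bytes(ch, "utf-8")
    print(f"U+{ord(ch):04X}  ISO-8859-1: {latin1:<3}  UTF-8: {utf8}")
```

ASCII characters such as "A" come out as the same single byte in both encodings; only the upper half (U+0080 to U+00FF) diverges.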

What about iso-8859-n?

These are 15 different encodings that contain a total of 614 distinct characters. Some of these characters occur in multiple "parts" of ISO 8859, and some don't. You'll have to be more specific.

I see that your question is tagged ISO-8859-2. The characters that are in -2 that aren't in -1 are:

ĂăĄąĆćČčĎďĐđĘęĚěĹĺĽľŁłŃńŇňŐőŔŕŘřŚśŞşŠšŢţŤťŮůŰűŹźŻżŽžˇ˘˙˛˝
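
If you want to reproduce a list like this yourself, one way (a sketch, assuming Python's built-in `iso-8859-1` and `iso-8859-2` codecs; the variable names are mine) is to decode every byte value with both codecs and take the set difference:

```python
# Decode all 256 byte values with each codec and compare the character sets.
all_bytes = bytes(range(256))
latin1_chars = set(all_bytes.decode("iso-8859-1"))
latin2_chars = set(all_bytes.decode("iso-8859-2"))

# Characters that ISO-8859-2 can represent but ISO-8859-1 cannot,
# sorted by Unicode code point.
only_in_latin2 = "".join(sorted(latin2_chars - latin1_chars))
print(only_in_latin2)
print(len(only_in_latin2), "characters")
```

Swapping in another codec name (for example `iso-8859-15`) should give the corresponding list for that part, as long as Python's codec for it defines every byte value.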

What about windows-1252?

Windows-1252 is just like ISO-8859-1 except that it replaces the rarely used control characters in the 0x80-0x9F range with printable characters. The characters that are in windows-1252 but not in ISO-8859-1 are:

ŒœŠšŸŽžƒˆ˜–—‘’‚“”„†‡•…‰‹›€™
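
The same comparison can be sketched in Python for windows-1252 (codec name `cp1252`). One assumption to note: Python's `cp1252` codec leaves five byte values (81, 8D, 8F, 90, 9D) unassigned, so the sketch skips bytes that fail to decode.

```python
all_bytes = bytes(range(256))
latin1_chars = set(all_bytes.decode("iso-8859-1"))
# errors="ignore" drops the five byte values that Python's cp1252 codec leaves unassigned.
cp1252_chars = set(all_bytes.decode("cp1252", errors="ignore"))

# Printable characters that windows-1252 adds in the 0x80-0x9F range.
only_in_cp1252 = "".join(sorted(cp1252_chars - latin1_chars))
print(only_in_cp1252)

# The same byte means different things in the two encodings.
print(repr(b"\x80".decode("cp1252")))      # the euro sign
print(repr(b"\x80".decode("iso-8859-1")))  # the control character U+0080
```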


answered Sep 30 '22 07:09

dan04


Tags:

utf-8

iso-8859-1

windows-1252

iso-8859-2

Sean Jezewski
