 

ASCII vs Unicode + UTF-8

Was reading Joel Spolsky's 'The Absolute Minimum' about character encoding. It is my understanding that ASCII is a Code-point + Encoding scheme, and in modern times, we use Unicode as the Code-point scheme and UTF-8 as the Encoding scheme. Is this correct?

asked Jan 23 '14 by Quest Monger


People also ask

Which is better ASCII or UTF-8?

All ASCII characters can be encoded in UTF-8 without any increase in storage (each takes a single byte in both). UTF-8 has the added benefit of supporting characters beyond the ASCII range.
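A quick Python sketch of this point (illustrative only, not part of the original answer):

    ascii_text = "hello"
    print(len(ascii_text.encode("utf-8")))   # 5 -- one byte per ASCII character
    print(len("héllo".encode("utf-8")))      # 6 -- 'é' needs two bytes in UTF-8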

Is Unicode same as UTF-8?

Unicode is a character set; UTF-8 is an encoding. Unicode is a list of characters, each with a unique number (its code point).
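To make the character-set-versus-encoding distinction concrete, a small Python sketch (illustrative only):

    ch = "€"
    print(ord(ch))                 # 8364 -- the Unicode code point, U+20AC
    print(ch.encode("utf-8"))      # b'\xe2\x82\xac' -- its three-byte UTF-8 encoding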

Which is better ASCII or Unicode?

Unicode represents far more characters than ASCII. ASCII uses a 7-bit range to encode just 128 distinct characters, while Unicode covers 154 written scripts.
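As a sketch of the difference (illustrative only): characters outside ASCII's 128 code points still have Unicode code points, but cannot be encoded as ASCII.

    print("A".encode("ascii"))     # b'A' -- within ASCII's 128 characters
    try:
        "α".encode("ascii")        # Greek alpha, U+03B1, has no ASCII code
    except UnicodeEncodeError as e:
        print(e)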

Is US ASCII the same as UTF-8?

ASCII is a subset of UTF-8, so all ASCII files are already valid UTF-8. The bytes in an ASCII file and the bytes that would result from "encoding it to UTF-8" are exactly the same, so no conversion is needed.
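A one-line Python check of that claim (illustrative sketch, not from the original answer):

    text = "plain ASCII text"
    print(text.encode("ascii") == text.encode("utf-8"))   # True -- identical bytes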


2 Answers

In modern usage, ASCII is effectively a subset of UTF-8 rather than a scheme of its own: UTF-8 is backwards compatible with ASCII.
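For illustration (a sketch, not part of the original answer), the backwards compatibility also holds in the decoding direction: bytes produced by an ASCII encoder decode unchanged with a UTF-8 decoder.

    ascii_bytes = "hello world".encode("ascii")
    print(ascii_bytes.decode("utf-8"))   # 'hello world'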

answered Oct 11 '22 by Remy Lebeau


Yes, except that UTF-8 is just one encoding scheme. Other encoding schemes include UTF-16 (with two different byte orders) and UTF-32. (To add to the confusion, a UTF-16 scheme is called "Unicode" in Microsoft software.)
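A small Python sketch of the same character under several encoding schemes (illustrative only):

    ch = "€"                             # U+20AC
    print(ch.encode("utf-8").hex())      # 'e282ac'   -- 3 bytes
    print(ch.encode("utf-16-le").hex())  # 'ac20'     -- 2 bytes, little-endian
    print(ch.encode("utf-16-be").hex())  # '20ac'     -- 2 bytes, big-endian
    print(ch.encode("utf-32-le").hex())  # 'ac200000' -- 4 bytes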

And, to be exact, the American National Standard that defines ASCII specifies a collection of characters and their coding as 7-bit quantities, without specifying a particular transfer encoding in terms of bytes. In the past, it was used in different ways, e.g. so that five ASCII characters were packed into one 36-bit storage unit, or so that 8-bit bytes used the extra bit for checking purposes (a parity bit) or for transfer control. But nowadays ASCII is used so that one ASCII character is encoded as one 8-bit byte with the most significant bit set to zero. This is the de facto standard encoding scheme and is implied in a large number of specifications, but strictly speaking it is not part of the ASCII standard.
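A quick Python check of that de facto convention (illustrative sketch only): every byte of ASCII-encoded text fits in 7 bits, i.e. the high bit is zero.

    data = "ASCII text".encode("ascii")
    print(all(byte < 0x80 for byte in data))   # True -- one character per byte, high bit zero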

answered Oct 12 '22 by Jukka K. Korpela