Unicode code point limit

1 Answer

UTF-8 has changed several times over its life, and it has been standardized in many specifications, most of which are now outdated. Most of the changes were introduced to improve compatibility with UTF-16 and to accommodate the ever-growing number of code points.

To make a long story short, UTF-8 was originally specified to allow code points of up to 31 bits, encoded in up to 6 bytes. With RFC 3629, this was reduced to a maximum of 4 bytes, which restricts UTF-8 to the range U+0000 to U+10FFFF, the same range that UTF-16 can represent.
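As a rough illustration of where that limit comes from, here is a minimal Python sketch. The helper names `utf8_length` and `utf16_reachable_max` are made up for this example and are not part of any library. It assumes the RFC 3629 byte-length boundaries and the fact that a UTF-16 surrogate pair carries 20 bits of payload.

```python
import sys

# Highest code point allowed by RFC 3629 (and reachable by UTF-16).
MAX_CODE_POINT = 0x10FFFF


def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 (per RFC 3629) needs for a code point.

    Hypothetical helper for illustration only.
    """
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError("outside the Unicode range U+0000..U+10FFFF")
    if 0xD800 <= code_point <= 0xDFFF:
        # Surrogates are reserved for UTF-16 and are not valid in UTF-8.
        raise ValueError("surrogate code points cannot be encoded")
    if code_point < 0x80:
        return 1
    if code_point < 0x800:
        return 2
    if code_point < 0x10000:
        return 3
    return 4


def utf16_reachable_max() -> int:
    """Highest code point a UTF-16 surrogate pair can address.

    A pair carries 10 + 10 = 20 payload bits, which index the
    2**20 code points that follow U+FFFF.
    """
    return 0xFFFF + 2**20


if __name__ == "__main__":
    for cp in (0x41, 0xE9, 0x20AC, 0x1F600, MAX_CODE_POINT):
        # Compare the helper with Python's own encoder.
        print(f"U+{cp:04X}: {utf8_length(cp)} byte(s),",
              len(chr(cp).encode("utf-8")), "from str.encode")
    print(hex(utf16_reachable_max()), hex(sys.maxunicode))
```

Python itself enforces the same ceiling: `chr(0x110000)` raises a `ValueError`, and `sys.maxunicode` reports `0x10FFFF` on current Python 3 versions.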

Wikipedia has some more information. The specification of the Universal Character Set is closely linked to the history of Unicode and its transformation formats (UTF).

149

answered Sep 18 '22 15:09

Holger Just



Tags:

character-encoding

unicode

user4344
