Does the Unicode Consortium intend to make UTF-16 run out of characters? [closed]

2 Answers

As of 2011, Unicode has assigned 109,449 characters and set aside 137,468 code points for private (application) use (6,400 + 131,068). That leaves room for over 860,000 unused characters: plenty for CJK Extension E (~10,000 chars) and 85 more sets just like it. So in the event of contact with the Ferengi culture, we should be ready.
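The arithmetic behind that estimate can be checked with a rough sketch. The assigned and private-use figures are the 2011 numbers quoted above; the surrogate and noncharacter counts are not in the answer and are included here as an assumption about what else should be excluded from the pool.

```python
# Back-of-the-envelope count of unassigned code points (2011 figures).
TOTAL_CODE_POINTS = 0x10FFFF + 1      # 17 planes x 65,536 = 1,114,112
SURROGATES = 0xDFFF - 0xD800 + 1      # 2,048 code points reserved for UTF-16 pairs
NONCHARACTERS = 66                    # assumption: excluded, never assignable
ASSIGNED_2011 = 109_449               # figure quoted in the answer
PRIVATE_USE = 6_400 + 131_068         # BMP private use + planes 15 and 16

remaining = (TOTAL_CODE_POINTS - SURROGATES - NONCHARACTERS
             - ASSIGNED_2011 - PRIVATE_USE)

# How many sets of roughly 10,000 characters (the size the answer
# attributes to CJK Extension E) would still fit.
cjk_e_sized_sets = remaining // 10_000

print(remaining, cjk_e_sized_sets)
```

Under these assumptions the result agrees with the answer: a little over 860,000 code points remain, enough for Extension E plus 85 more sets of the same size.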

In November 2003 the IETF restricted UTF-8 to end at U+10FFFF with RFC 3629, in order to match the constraints of the UTF-16 character encoding. A UTF-8 parser should therefore reject 5- and 6-byte sequences, which would overflow the UTF-16 range, as well as 4-byte sequences that decode to values greater than 0x10FFFF.
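As a small illustration, the sketch below first derives the UTF-16 ceiling from the surrogate ranges and then checks how Python's built-in strict UTF-8 decoder treats sequences on either side of it. The helper name `decodes_as_utf8` and the sample byte strings are made up for this example.

```python
# Highest code point a UTF-16 surrogate pair can express:
# 0x10000 + (high - 0xD800) * 0x400 + (low - 0xDC00)
max_utf16 = 0x10000 + ((0xDBFF - 0xD800) << 10) + (0xDFFF - 0xDC00)


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


highest_valid = b"\xf4\x8f\xbf\xbf"      # encodes U+10FFFF
just_past_limit = b"\xf4\x90\x80\x80"    # 4-byte form that would encode U+110000
five_byte = b"\xf8\x88\x80\x80\x80"      # 5-byte form allowed before RFC 3629

print(hex(max_utf16))
print(decodes_as_utf8(highest_valid))
print(decodes_as_utf8(just_past_limit))
print(decodes_as_utf8(five_byte))
```

The first sequence is accepted and the other two are rejected, which is the behaviour the RFC 3629 restriction describes.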

Please edit in here any character sets that threaten the Unicode code point limit, if they are over 1/3 the size of CJK Extension E (~10,000 chars):

CJK Extension E (~10,000 chars)
Ferengi culture characters (~5,000 chars)

answered Oct 12 '22 15:10

GlassGhost

At present, the Unicode standard doesn't define any characters above U+10FFFF, so you would be fine to code your app to reject characters above that point.
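A minimal sketch of such a check, assuming the app handles code points as plain integers. The function name is hypothetical, and rejecting the surrogate range as well is an extra assumption that goes beyond what this answer suggests.

```python
MAX_CODE_POINT = 0x10FFFF  # current upper bound of the Unicode code space


def is_acceptable_code_point(cp: int) -> bool:
    """Accept only values in range 0..0x10FFFF that are not surrogates."""
    if not 0 <= cp <= MAX_CODE_POINT:
        return False
    # Assumption: lone surrogates (U+D800..U+DFFF) are rejected too,
    # since they are not characters on their own.
    return not (0xD800 <= cp <= 0xDFFF)


print(is_acceptable_code_point(0x10FFFF))   # last code point in range
print(is_acceptable_code_point(0x110000))   # first value past the limit
```

If the limit ever moves, only `MAX_CODE_POINT` needs to change.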

Predicting the future is hard, but I think you're safe for the near term with this strategy. Honestly, even if Unicode extends past U+10FFFF in the distant future, it almost certainly won't be for mission-critical glyphs. Your app might not be compatible with the new Ferengi fonts that come out in 2063, but you can always fix it when it actually becomes an issue.

answered Oct 12 '22 14:10

StilesCrisis


Tags:

unicode

utf-8

utf-16
