What is the current modern term for "Multi-byte Character Set"

I was confused about this for quite a while:

Confusion on Unicode and Multibyte Articles

After reading the comments from all the contributors, plus:

Looking at an old article (from 2001): http://www.hastingsresearch.com/net/04-unicode-limitations.shtml, which describes Unicode as:

being a 16-bit character definition allowing a theoretical total of over 65,000 characters. However, the complete character sets of the world add up to over 170,000 characters.

and looking at a current "modern" article: http://en.wikipedia.org/wiki/Unicode

The most commonly used encodings are UTF-8 (which uses 1 byte for all ASCII characters, which have the same code values as in the standard ASCII encoding, and up to 4 bytes for other characters), the now-obsolete UCS-2 (which uses 2 bytes for all characters, but does not include every character in the Unicode standard), and UTF-16 (which extends UCS-2, using 4 bytes to encode characters missing from UCS-2).

It seems that in the VC2008 compilation options, the "Unicode" option under Character Set really means "Unicode encoded in UCS-2" (or UTF-16? I am not sure).

I tried to verify this by running the following code under VC2008:

#include <cstdio>    // getchar
#include <iostream>

int main()
{
    // Use Unicode encoded in UCS-2? (3 characters + terminating null)
    std::cout << sizeof(L"我爱你") << std::endl;
    // Use Unicode encoded in UCS-2? (3 characters + terminating null)
    std::cout << sizeof(L"abc") << std::endl;
    getchar();  // keep the console window open

    // Compiled with Character Set: Use Unicode Character Set.
    // Prints 8, 8

    // Compiled with Character Set: Use Multi-Byte Character Set.
    // Prints 8, 8
}

When compiled with the Unicode Character Set option, the outcome matches my assumption.

But what about Multi-byte Character Set? What does Multi-byte Character Set mean in the current "modern" world? :)

328

asked Mar 10 '10 03:03

Cheok Yan Cheng

1 Answer

http://en.wikipedia.org/wiki/Multi-byte_character_set

MBCS is a term used to denote a class of character encodings in which some characters cannot be represented with a single byte, hence multi-byte character set. To decode a string in such a format properly, you need a code page that tells you how the various byte combinations map to characters. Typical examples are the double-byte code pages for Japanese, Chinese and Korean, such as Shift JIS, GBK and Big5. ISO/IEC 8859, by comparison, defines a family of single-byte 8-bit code pages rather than MBCS, but according to Wikipedia, ISO stopped maintaining it in 2004, presumably to focus on Unicode.

So I guess the modern term for MBCS is "deprecated in favor of Unicode".

168

answered Oct 07 '22 01:10

MSN

Tags: c++, unicode, visual-c++, internationalization