What is the string terminator sequence for a UTF-16 string? EDIT: Let me rephrase the question in an attempt to clarify. How does the call to wcslen() work?

UTF-16 string terminator

2 Answers

Unicode does not define string terminators. Your environment or language does. For instance, C strings use 0x0 as a string terminator, whereas .NET's String class stores the length of the string as a separate value rather than relying on a terminator.

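To answer your second question, wcslen looks for a terminating L'\0' character. As I read it, that is a single wchar_t with value zero, so its length in bytes is sizeof(wchar_t) and depends on the compiler (commonly 2 bytes on Windows and 4 bytes on Linux). It will likely be the two-byte sequence 0x00 0x00 if you're using UTF-16 (encoding U+0000, 'NUL'), and those two bytes must form one aligned code unit rather than straddle two neighbouring ones.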
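To make the byte-level picture concrete, here is a minimal sketch in C. The helpers utf16_len and wide_terminator_size are hypothetical names introduced for illustration, and the sketch assumes the UTF-16 data is held in an array of uint16_t code units.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts 16-bit code units up to, but not including,
   the first code unit equal to 0x0000. The comparison is done per code
   unit, never per byte. */
size_t utf16_len(const uint16_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Hypothetical helper: size in bytes of the terminator that wcslen looks
   for. L'\0' has type wchar_t, so this equals sizeof(wchar_t) and is
   implementation-defined. */
size_t wide_terminator_size(void)
{
    return sizeof(L'\0');
}
```

For the code units { 0x0041, 0x4100, 0x0000 }, a little-endian machine stores the bytes 41 00 00 41 00 00. The two adjacent zero bytes in the middle belong to different code units, so utf16_len skips past them and stops only at the final 0x0000.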

answered Oct 21 '22 18:10

Michael Petrotta

7.24.4.6.1 The wcslen function (from the Standard)

...

   [#3] The wcslen function returns the number of wide
   characters that precede the terminating null wide character.

And the null wide character is L'\0', i.e. a wchar_t with value zero.
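The quoted behaviour can be sketched as a simple loop. This is an illustrative re-implementation under the hypothetical name my_wcslen, not how any particular C library actually implements wcslen (real versions are usually optimized).

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical re-implementation for illustration only: walks the string
   one wchar_t at a time and returns how many wide characters precede the
   terminating null wide character. */
size_t my_wcslen(const wchar_t *s)
{
    const wchar_t *p = s;
    while (*p != L'\0')
        p++;
    return (size_t)(p - s);
}
```

Calling my_wcslen(L"hello") should give the same result as wcslen(L"hello"). An embedded L'\0' ends the count early, just as it does with the standard function.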

answered Oct 21 '22 16:10

pmg


Tags: c, string, unicode, utf-16, unicode-string

Ray
