
How to convert a stream of bytes to another encoding?

Tags: c, winapi

I'm trying to convert a stream of bytes with the MultiByteToWideChar() WinAPI function.

The documentation says the function fails with ERROR_NO_UNICODE_TRANSLATION on incomplete strings (a multibyte-encoded string whose last character is missing its trail byte). How do I prevent this error? The only way that comes to mind is to hold back the last multibyte character of the input buffer (using IsDBCSLeadByteEx() to locate it) and not convert it until more data arrives.

Are there better ways to convert a stream of bytes?

Basilevs asked Nov 06 '22 10:11



1 Answer

It seems to me that you can just use CharNextExA to move to the next character position in the input stream. That way you can collect a run of complete characters and convert them together into a Unicode string with MultiByteToWideChar. Once you have the Unicode text fragment, you can convert it to another code page using WideCharToMultiByte.
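To make the chunking idea concrete, here is a minimal sketch of the boundary scan only, not a full converter. It assumes a double-byte (DBCS) code page and that each scan starts on a character boundary. The names complete_prefix_len, lead_byte_fn, win_is_lead and sjis_is_lead are hypothetical helpers made up for this illustration; on Windows the predicate would wrap IsDBCSLeadByteEx, while the Shift-JIS ranges are only a portable stand-in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns nonzero if b is a DBCS lead byte. */
typedef int (*lead_byte_fn)(unsigned char b, void *ctx);

/* Returns how many leading bytes of buf[0..len) form complete characters.
   Whatever follows (at most one dangling lead byte) should be kept and
   prepended to the next chunk before scanning again. */
size_t complete_prefix_len(const unsigned char *buf, size_t len,
                           lead_byte_fn is_lead, void *ctx)
{
    size_t i = 0;
    assert(buf != NULL || len == 0);
    while (i < len) {
        size_t step = is_lead(buf[i], ctx) ? 2 : 1;
        if (i + step > len)
            break;              /* lead byte whose trail byte has not arrived */
        i += step;
    }
    return i;
}

#ifdef _WIN32
#include <windows.h>
/* Windows predicate: ctx points at the code page identifier (a UINT). */
int win_is_lead(unsigned char b, void *ctx)
{
    return IsDBCSLeadByteEx(*(const UINT *)ctx, (BYTE)b) != 0;
}
#endif

/* Portable stand-in for illustration: Shift-JIS lead byte ranges. */
int sjis_is_lead(unsigned char b, void *ctx)
{
    (void)ctx;
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}
```

The returned length is what you would pass as cbMultiByte to MultiByteToWideChar. The scan has to begin at the start of the stream or right after a carried-over byte, since a byte in the middle of a buffer cannot be classified reliably on its own.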

UPDATED: I am sure that receiving the input stream is much slower than decoding the data with CharNextExA, MultiByteToWideChar and WideCharToMultiByte. For example, if you use buffers on the stack like WCHAR szBuffer[4096] and TCHAR szDestBuffer[4096], you will be able to decode 1K of input data very quickly. So I suppose the total running time of your whole program will be almost independent of the use of these three functions.

Moreover, I am not sure that you have any alternative. I don't know of any reliable way to start decoding the text either from the beginning or at the end of the text. Perhaps other people have other ideas...

Oleg answered Nov 12 '22 13:11
