 

Reverse wstring in C++

Tags: c++, utf-8, locale

I need to reverse a wstring. Here is my code:

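#include <algorithm>  // std::reverse
#include <iostream>
#include <string>
#include <locale>

int main() {
    std::wstring s;
    std::getline(std::wcin, s);

    // Print the numeric value of every element that was read.
    for (const auto &i : s) {
        std::wcout << static_cast<int>(i) << " ";
    }
    std::wcout << std::endl;

    std::wcout << s << std::endl;

    // Reverse element by element and print the result.
    std::reverse(s.begin(), s.end());
    std::wcout << s << std::endl;
    return 0;
}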

ASCII characters are encoded in 1 byte each, and I can easily reverse them:

echo -n "papa" | ./reverse
112 97 112 97
papa
apap

But when I enter Cyrillic text, whose characters are encoded in more than 1 byte each, I get this output:

echo -n "папа" | ./reverse
208 191 208 176 208 191 208 176
папа
�пап�

How can I properly reverse that string?

P.S. I'm using OS X.

0x1337 asked Apr 10 '26 01:04



1 Answer

Your system, OS X, uses UTF-8, so there is no reason for you to use wstring or wchar_t here. In fact, that is where the confusion comes from.

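When you call getline() with a wstring on OS X, it does not decode the input into wide characters at all. Each wchar_t is indeed four bytes wide, but it holds the same 0-255 value that a char in a regular "narrow" string would hold, that is, a single byte of the UTF-8 input. Each of your Cyrillic letters is encoded as two bytes in UTF-8, so piping "папа" to your program gives you a wstring of length 8: the C++ stream is not interpreting the bytes as UTF-8, but your terminal is (hence it looks like four characters in the terminal but 8 elements in C++). std::reverse then reverses those 8 elements one by one, which splits the two-byte sequences apart and leaves a stray byte at each end of the output (the � characters).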

A commenter on your question was right to point to this question: "How do I reverse a UTF-8 string in place?". That is really all you need once you realize that you aren't dealing with wide strings at all.
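For illustration, here is a minimal sketch of one common way to do that with a plain std::string. It is not taken from the linked question; the names reverse_utf8 and is_continuation are hypothetical, and the code assumes the input is valid UTF-8. The idea is to reverse all the bytes first and then reverse the bytes of each code point back into their original order:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: true for a UTF-8 continuation byte (bit pattern 10xxxxxx).
inline bool is_continuation(unsigned char byte) {
    return (byte & 0xC0) == 0x80;
}

// Hypothetical helper: reverses a string code point by code point.
// Assumption: `s` holds valid UTF-8.
std::string reverse_utf8(std::string s) {
    // Step 1: reverse every byte. Code points are now in the right order,
    // but the bytes inside each multi-byte code point are backwards
    // (continuation bytes first, lead byte last).
    std::reverse(s.begin(), s.end());

    // Step 2: walk the string and flip each code point's bytes back.
    auto first = s.begin();
    while (first != s.end()) {
        auto last = first;
        // Skip the continuation bytes that now precede the lead byte.
        while (last != s.end() &&
               is_continuation(static_cast<unsigned char>(*last))) {
            ++last;
        }
        // Include the lead byte (or the single ASCII byte) itself.
        if (last != s.end()) {
            ++last;
        }
        std::reverse(first, last);
        first = last;
    }
    return s;
}
```

With this, you would read the line with std::getline(std::cin, line) into a std::string and print reverse_utf8(line) with std::cout. Note that this reverses code points, not user-perceived characters: a letter followed by a combining accent would end up with the accent in front of it, so text with combining marks needs a more careful approach.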

John Zwinck answered Apr 12 '26 14:04



