I'm working on a project that needs to support Unicode and could end up being used in countries like Japan. We want to use std::string as the underlying type for string data in the data layer (see "Qt, MSVC, and /Zc:wchar_t- == I want to blow up the world" for why). The problem is that I'm not sure which conversion pair (to/from) to use so that we're fully compatible with anything the user might enter in the Qt layer.
A look at toStdString/fromStdString indicates that I'd have to call QTextCodec::setCodecForCStrings. The documentation for that function, though, indicates that I wouldn't want to do this for things like Japanese. This is the pair that I'd LIKE to use, though. Does someone know enough to explain how I'd set this up, if it's possible?
The other option that looks fairly certain to work is the toUtf8/fromUtf8 functions. Those would require a two-step approach though (QString to QByteArray, then QByteArray to std::string), so I'd prefer the other pair if possible.
Is there anything I've missed?
The QString documentation is a bit disconcerting here. The QTextCodec docs, however, don't seem to note any problems. Converting through a codec does give you a QByteArray rather than a std::string, but you can easily convert a QByteArray to a std::string. In fact, if you look at the source for QString you'll see that toStdString does exactly that:
inline std::string QString::toStdString() const
{ const QByteArray asc = toAscii(); return std::string(asc.constData(), asc.length()); }
toAscii in turn will use whatever codec has been set with QTextCodec::setCodecForCStrings.
Their warning about the Japanese codec is likely valid. However, if your OS itself is using the Japanese codec you likely won't have a problem, though I'm not entirely sure about that.
But you can avoid that problem. Simply set the C-string codec to UTF-8, and toStdString will then convert everything to UTF-8. In fact I'm going to do that with my code right now. :)
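In Qt 4 terms, the one-time setup is a call to QTextCodec::setCodecForCStrings(QTextCodec::codecForName("UTF-8")) at startup. To show why UTF-8 is a safe choice for the std::string layer, here is a rough, Qt-free sketch of the kind of conversion involved: UTF-16 code units (which is what QString stores internally) go to UTF-8 bytes and back without loss. The helper names utf16ToUtf8 and utf8ToUtf16 are made up for this illustration and are not Qt API, and the sketch assumes well-formed input rather than validating it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not Qt API): encode UTF-16 code units as UTF-8 bytes.
// Assumes well-formed input; an unpaired surrogate is encoded as-is, not rejected.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        // Combine a high/low surrogate pair into a single code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            char32_t lo = in[i + 1];
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        if (cp < 0x80) {            // 1 byte: ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {    // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {  // 3 bytes: most Japanese text lands here
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                    // 4 bytes: outside the Basic Multilingual Plane
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper (not Qt API): decode UTF-8 bytes back to UTF-16.
// Assumes valid UTF-8; malformed sequences are not detected in this sketch.
std::u16string utf8ToUtf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i++]);
        char32_t cp;
        int extra;  // number of continuation bytes that follow the lead byte
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        for (int k = 0; k < extra && i < in.size(); ++k, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
        if (cp < 0x10000) {
            out += static_cast<char16_t>(cp);
        } else {
            // Split back into a surrogate pair.
            cp -= 0x10000;
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return out;
}
```

One thing to keep in mind in the data layer: once the std::string holds UTF-8, size() counts bytes, not characters. A three-character Japanese word takes nine bytes, so anything that indexes or truncates by char needs to be careful not to cut a character in half.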