 

Create UTF-8 file in Qt

Tags: unicode, utf-8, qt4

I'm trying to create a UTF-8 encoded file in Qt.

#include <QtCore>

int main()
{
    QString unicodeString = "Some Unicode string";
    QFile fileOut("D:\\Temp\\qt_unicode.txt");
    if (!fileOut.open(QIODevice::WriteOnly | QIODevice::Text))
    {
        return -1;
    }

    QTextStream streamFileOut(&fileOut);
    streamFileOut.setCodec("UTF-8");
    streamFileOut << unicodeString;
    streamFileOut.flush();

    fileOut.close();

    return 0;
}

I thought that, since QString is Unicode by default and I set the codec of the output stream to UTF-8, my file would be UTF-8. But it's not, it's ANSI. What am I doing wrong? Is something wrong with my strings? Can you correct my code so that it creates a UTF-8 file? My next step will be to read an ANSI file and save it as a UTF-8 file, so I'll have to convert each string I read, but for now I just want to start with writing a file. Thank you.

Ondrej Vencovsky asked Jan 24 '11 09:01


2 Answers

2022 edit: what follows was true for Qt 4. In Qt 5 and later, QString treats C string literals as UTF-8 by default, so the advice about string literals below doesn't apply to the latest Qt versions.

Your code is absolutely correct. The only part that looks suspicious to me is this:

QString unicodeString = "Some Unicode string";

It looks suspicious because QString assumes the Latin-1 encoding by default when it is constructed from a C-style string literal. If you only intend to use accented Latin characters, you're probably fine, but anything else (Cyrillic, Chinese, Japanese, Hebrew...) will no longer work correctly. The best way to deal with this is to save your source file as UTF-8 and write this instead:

QString unicodeString = QString::fromUtf8("Some Unicode string");

This will work for any imaginable language. Using QObject::trUtf8() is even better as it gives you a lot of i18n capabilities.
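To make the Latin-1 issue concrete, here is a minimal sketch in plain standard C++ (deliberately not Qt, so it compiles without the Qt libraries). The helper name latin1ToUtf8 is made up for this illustration. It re-encodes a byte string as UTF-8 on the assumption that every input byte is one Latin-1 character, which is roughly what happens when a literal is decoded as Latin-1 and later written out as UTF-8.

```cpp
#include <string>

// Hypothetical helper: re-encode bytes as UTF-8, assuming each input byte
// is exactly one Latin-1 character (code point == byte value).
std::string latin1ToUtf8(const std::string &latin1)
{
    std::string utf8;
    for (unsigned char byte : latin1) {
        if (byte < 0x80) {
            // ASCII is identical in Latin-1 and UTF-8.
            utf8 += static_cast<char>(byte);
        } else {
            // Code points 0x80..0xFF need two bytes in UTF-8.
            utf8 += static_cast<char>(0xC0 | (byte >> 6));   // lead byte
            utf8 += static_cast<char>(0x80 | (byte & 0x3F)); // continuation byte
        }
    }
    return utf8;
}
```

If the input really is Latin-1, the result is correct: the single byte 0xE9 ("é") becomes 0xC3 0xA9. If the input is already UTF-8, for example the two bytes 0xD0 0x96 for "Ж", each byte is treated as a separate character and the output is the four bytes 0xC3 0x90 0xC2 0x96, which is the garbling described above. The same loop could also serve as a starting point for the "read ANSI, save as UTF-8" step, but only if "ANSI" means Latin-1; Windows code pages such as 1252 differ from Latin-1 in the 0x80 to 0x9F range.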

Edit

While it's true that you generate a correct UTF-8 file, if you want Notepad to recognize your file as UTF-8, it's a different story. You need to put a BOM in there. It can be done either as suggested in another answer, or here is another way:

streamFileOut.setGenerateByteOrderMark(true);
Sergei Tachenov answered Nov 09 '22 20:11


In my experience, a UTF-8 text file without a BOM can be created in Qt like this:

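// 'file' is a QFile and 'vcfline' and 'ctn' are QStrings, all declared elsewhere.
file.open(QIODevice::WriteOnly | QIODevice::Text);
QTextStream out(&file);
out.setCodec("UTF-8");               // encode the output as UTF-8
out.setGenerateByteOrderMark(false); // no BOM; set this before writing any data
vcfline = ctn;                       // assign some UTF-8 characters
out << vcfline;
file.close();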

The resulting file will be encoded as UTF-8 without a BOM.

user2006121 answered Nov 09 '22 21:11