Qt cpp - Clean way to write QString into text file

I need to find a clean and fast way to write a QString to a .csv file. I tried:

QString path= QCoreApplication::applicationDirPath() + QString("/myfile.csv");
QFile file(path);
QString mystring = "Hello, world!";    
if(!file.open(QIODevice::WriteOnly)){
        file.close();
    } else {
        QTextStream out(&file); out << mystring;
        file.close();
    }

But it writes "???????" to myfile.csv.

asked Mar 26 '18 09:03 by Ale


1 Answer

As Andrey hasn't reacted yet, I will step in and provide some background on the OP's issue:

From Qt doc. about QString:

The QString class provides a Unicode character string.

QString stores a string of 16-bit QChars, where each QChar corresponds one Unicode 4.0 character. (Unicode characters with code values above 65535 are stored using surrogate pairs, i.e., two consecutive QChars.)
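To make the remark about surrogate pairs concrete, here is a minimal sketch in plain C++ (no Qt) of how one code point maps to 16-bit units. The helper name encodeUtf16 is hypothetical and only for illustration, and the sketch assumes a valid code point (no range or surrogate checks):

```cpp
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-16 code units,
// i.e. the kind of 16-bit units a QString stores as QChars.
// Assumes cp is a valid code point (not validated here).
std::u16string encodeUtf16(char32_t cp)
{
  std::u16string units;
  if (cp < 0x10000) {
    // Code values up to 65535 fit into a single 16-bit unit.
    units.push_back(static_cast<char16_t>(cp));
  } else {
    // Above 65535: spread the remaining 20 bits over a surrogate pair.
    cp -= 0x10000;
    units.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));   // high surrogate
    units.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF))); // low surrogate
  }
  return units;
}
```

For example, encodeUtf16(U'H') yields the single unit 0x0048, while U+1F600 yields the pair 0xD83D 0xDE00.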

From Qt doc. about QTextStream::operator<<(const QString &string):

Writes the string string to the stream, and returns a reference to the QTextStream. The string is first encoded using the assigned codec (the default codec is QTextCodec::codecForLocale()) before it is written to the stream.

and about QTextStream in general:

Internally, QTextStream uses a Unicode based buffer, and QTextCodec is used by QTextStream to automatically support different character sets. By default, QTextCodec::codecForLocale() is used for reading and writing, but you can also set the codec by calling setCodec(). Automatic Unicode detection is also supported. When this feature is enabled (the default behavior), QTextStream will detect the UTF-16 or the UTF-32 BOM (Byte Order Mark) and switch to the appropriate UTF codec when reading. QTextStream does not write a BOM by default, but you can enable this by calling setGenerateByteOrderMark(true).

So, "???????" is probably Hello, World! encoded in UTF-16 or UTF-32, and the OP's viewing tool (which displayed "???????") either failed to detect this encoding or does not support it at all.

The hint of Andrey Semenov, to do instead:

file.write(mystring.toUtf8());

converts the QString contents to UTF-8, which

  • consists of bytes
  • is identical to ASCII for the first 128 code points (0 to 127).

QString::toUtf8() returns a QByteArray, and QFile::write(const QByteArray&) (inherited from QIODevice) writes these bytes to the file unchanged, without going through QTextStream or its codec.

Hello, World! consists only of characters available in the ASCII table (with codes < 128). Even if the OP's viewing tool expects e.g. Windows-1252, it will not notice the difference, because both encodings produce the same bytes for these characters. (I assume that a tool that cannot detect/process UTF-16 or UTF-32 probably cannot detect/process UTF-8 either.)
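The ASCII compatibility of UTF-8 can be checked with a small sketch in plain C++ (no Qt). The helpers encodeUtf8 and encodeUtf8String are hypothetical names for illustration, not Qt API, and the encoder assumes valid code points:

```cpp
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
// Assumes cp is a valid code point (not validated here).
std::string encodeUtf8(char32_t cp)
{
  std::string bytes;
  if (cp < 0x80) {
    // ASCII range: one byte, identical to the ASCII code.
    bytes.push_back(static_cast<char>(cp));
  } else if (cp < 0x800) {
    bytes.push_back(static_cast<char>(0xC0 | (cp >> 6)));
    bytes.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
  } else if (cp < 0x10000) {
    bytes.push_back(static_cast<char>(0xE0 | (cp >> 12)));
    bytes.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
    bytes.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
  } else {
    bytes.push_back(static_cast<char>(0xF0 | (cp >> 18)));
    bytes.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
    bytes.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
    bytes.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
  }
  return bytes;
}

// Hypothetical helper: encode a whole UTF-32 string as UTF-8.
std::string encodeUtf8String(const std::u32string &text)
{
  std::string bytes;
  for (char32_t cp : text) bytes += encodeUtf8(cp);
  return bytes;
}
```

Encoding U"Hello, World!" this way gives exactly the 13 ASCII bytes, whereas a character such as U+00E9 becomes the two bytes 0xC3 0xA9, which a Windows-1252 viewer would show as two different characters.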


By the way, to find out what encoding "???????" actually was, myfile.csv could be viewed with a hex-view tool. As the input is known, the encoding can probably be deduced from the output. (E.g. the text He is 0x48 0x65 in ASCII and UTF-8, but 0x48 0x00 0x65 0x00 in UTF-16LE and 0x00 0x48 0x00 0x65 in UTF-16BE.)
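The deduction from a hex dump could be sketched like this in plain C++ (no Qt). The function guessEncoding is a hypothetical helper; it assumes the file is known to start with the text He and has no BOM, and the UTF-32 patterns are my extension of the byte examples above:

```cpp
#include <algorithm>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical helper: given the first bytes of a file known to start with
// the text "He" (and without a BOM), guess the encoding used to write it.
std::string guessEncoding(const std::vector<unsigned char> &bytes)
{
  auto startsWith = [&bytes](std::initializer_list<unsigned char> prefix) {
    return bytes.size() >= prefix.size()
      && std::equal(prefix.begin(), prefix.end(), bytes.begin());
  };
  // Test the 4-byte UTF-32 patterns before the UTF-16 ones.
  if (startsWith({0x48, 0x00, 0x00, 0x00})) return "UTF-32LE";
  if (startsWith({0x00, 0x00, 0x00, 0x48})) return "UTF-32BE";
  if (startsWith({0x48, 0x00, 0x65, 0x00})) return "UTF-16LE";
  if (startsWith({0x00, 0x48, 0x00, 0x65})) return "UTF-16BE";
  if (startsWith({0x48, 0x65})) return "ASCII or UTF-8";
  return "unknown";
}
```

Feeding it the bytes 48 65 6c 6c 6f from a hex dump reports "ASCII or UTF-8"; plain ASCII text alone cannot distinguish those two.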


I tried to reproduce the issue with an MCVE.

testQTextStreamEncoding.cc:

#include <QtWidgets>

int main(int, char**)
{
  const QString qString = "Hello, World!";
  const QString qPath("testQTextStreamEncoding.txt");
  QFile qFile(qPath);
  if (qFile.open(QIODevice::WriteOnly)) {
    QTextStream out(&qFile); out << qString;
    qFile.close();
  }
  return 0;
}

and testQTextStreamEncoding.pro:

SOURCES = testQTextStreamEncoding.cc

QT += widgets

On cygwin, I did:

$ qmake-qt5

$ make
g++ -c -fno-keep-inline-dllexport -D_GNU_SOURCE -pipe -O2 -Wall -W -D_REENTRANT -DQT_NO_DEBUG -DQT_WIDGETS_LIB -DQT_GUI_LIB -DQT_CORE_LIB -I. -isystem /usr/include/qt5 -isystem /usr/include/qt5/QtWidgets -isystem /usr/include/qt5/QtGui -isystem /usr/include/qt5/QtCore -I. -I/usr/lib/qt5/mkspecs/cygwin-g++ -o testQTextStreamEncoding.o testQTextStreamEncoding.cc
g++  -o testQTextStreamEncoding.exe testQTextStreamEncoding.o   -lQt5Widgets -lQt5Gui -lQt5Core -lGL -lpthread 

$ ./testQTextStreamEncoding

$ hexdump.exe -C testQTextStreamEncoding.txt
00000000  48 65 6c 6c 6f 2c 20 57  6f 72 6c 64 21           |Hello, World!|
0000000d

$

So, it seems I cannot reproduce on my side what the OP described. I also tried it with the code compiled and run in VS2013:

$ rm testQTextStreamEncoding.txt ; ls testQTextStreamEncoding.txt
ls: cannot access 'testQTextStreamEncoding.txt': No such file or directory

(Compile, Run in VS2013)

$ hexdump.exe -C testQTextStreamEncoding.txt
00000000  48 65 6c 6c 6f 2c 20 57  6f 72 6c 64 21           |Hello, World!|
0000000d

$

Again, I cannot reproduce. It would be interesting to see how the OP produced "???????" and what it actually contained.

answered Oct 15 '22 19:10 by Scheff's Cat