I'm trying to figure out how QXmlStreamReader works for a C++ application I'm writing. The XML file I want to parse is a large dictionary with a convoluted structure and plenty of Unicode characters so I decided to try a small test case with a simpler document. Unfortunately, I hit a wall. Here's the example xml file:
<?xml version="1.0" encoding="UTF-8" ?>
<persons>
<person>
<firstname>John</firstname>
<surname>Doe</surname>
<email>[email protected]</email>
<website>http://en.wikipedia.org/wiki/John_Doe</website>
</person>
<person>
<firstname>Jane</firstname>
<surname>Doe</surname>
<email>[email protected]</email>
<website>http://en.wikipedia.org/wiki/John_Doe</website>
</person>
<person>
<firstname>Matti</firstname>
<surname>Meikäläinen</surname>
<email>[email protected]</email>
<website>http://fi.wikipedia.org/wiki/Matti_Meikäläinen</website>
</person>
</persons>
...and I'm trying to parse it using this code:
int main(int argc, char *argv[])
{
if (argc != 2) return 1;
QString filename(argv[1]);
QTextStream cout(stdout);
cout << "Starting... filename: " << filename << endl;
QFile file(filename);
bool open = file.open(QIODevice::ReadOnly | QIODevice::Text);
if (!open)
{
cout << "Couldn't open file" << endl;
return 1;
}
else
{
cout << "File opened OK" << endl;
}
QXmlStreamReader xml(&file);
cout << "Encoding: " << xml.documentEncoding().toString() << endl;
while (!xml.atEnd() && !xml.hasError())
{
xml.readNext();
if (xml.isStartElement())
{
cout << "element name: '" << xml.name().toString() << "'"
<< ", text: '" << xml.text().toString() << "'" << endl;
}
else if (xml.hasError())
{
cout << "XML error: " << xml.errorString() << endl;
}
else if (xml.atEnd())
{
cout << "Reached end, done" << endl;
}
}
return 0;
}
...then I get this output:
C:\xmltest\Debug>xmltest.exe example.xml
Starting... filename: example.xml
File opened OK
Encoding:
XML error: Encountered incorrectly encoded content.
What happened? This file couldn't be simpler and it looks consistent to me. With my original file I also get a blank entry for the encoding, the entries' names() are displayed, but alas, the text() is also empty. Any suggestions greatly appreciated, personally I'm thorougly mystified.
I'm answering this myself as this problem was related to three issues, two of which were brought up by the responses.
The relevant code section that works as expected now looks like this:
while (!xml.atEnd() && !xml.hasError())
{
xml.readNext();
if (xml.isStartElement())
{
QString name = xml.name().toString();
if (name == "firstname" || name == "surname" ||
name == "email" || name == "website")
{
cout << "element name: '" << name << "'"
<< ", text: '" << xml.readElementText()
<< "'" << endl;
}
}
}
if (xml.hasError())
{
cout << "XML error: " << xml.errorString() << endl;
}
else if (xml.atEnd())
{
cout << "Reached end, done" << endl;
}
The file is not UTF-8 encoded. Change the encoding to iso-8859-1 and it will parse without error.
<?xml version="1.0" encoding="iso-8859-1" ?>
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With