
How to optimize ASCII output with QTextStream

I'm currently writing out billions of binary records to ASCII files (ugh). I've got things working just fine, but I'd like to optimize the performance if I can. The problem is, the user is allowed to select any number of fields to output, so I can't know at compile-time which of 3-12 fields they'll include.

Is there a faster way to construct lines of ASCII text? As the sample code below shows, the types of the fields vary quite a bit, and I can't think of a way around the series of if() statements. The output ASCII file has one line per record, so I tried building each line from a template QString filled in with arg(), but that slowed things down by about 15%.

A faster solution doesn't have to use QTextStream, or necessarily write directly to the file, but the output is too large to hold the whole thing in memory.

Here's some sample code:

QFile outfile(outpath);
if(!outfile.open(QIODevice::WriteOnly | QIODevice::Text | QIODevice::Truncate))
{
    qWarning("Could not open ASCII for writing!");
    return false;
} else
{
    /* compute XYZ precision */
    int prec[3] = {0, 0, 0}; //these non-zero values are determined programmatically

    /* set up the writer */
    QTextStream out(&outfile);
    out.setRealNumberNotation(QTextStream::FixedNotation);
    out.setRealNumberPrecision(3);
    QString del(config.delimiter); //the user chooses the delimiter character (comma, tab, etc) - using QChar is slower since it has to be promoted to QString anyway

    /* write the header line */
    out << "X" << del << "Y" << del << "Z";
    if(config.fields & INTFIELD)
        out << del << "IntegerField";
    if(config.fields & DBLFIELD)
        out << del << "DoubleField";
    if(config.fields & INTFIELD2)
        out << del << "IntegerField2";
    if(config.fields & TRIPLEFIELD)
        out << del << "Tri1" << del << "Tri2" << del << "Tri3";
    out << "\n";

    /* write out the points */
    for(quint64 ptnum = 0; ptnum < numpoints; ++ptnum)
    {
        pt = points.at(ptnum);
        out.setRealNumberPrecision(prec[0]);
        out << pt->getXYZ(0);
        out.setRealNumberPrecision(prec[1]);
        out << del << pt->getXYZ(1);
        out.setRealNumberPrecision(prec[2]);
        out << del << pt->getXYZ(2);
        out.setRealNumberPrecision(3);
        if(config.fields & INTFIELD)
            out << del << pt->getIntValue();
        if(config.fields & DBLFIELD)
            out << del << pt->getDoubleValue();
        if(config.fields & INTFIELD2)
            out << del << pt->getIntValue2();
        if(config.fields & TRIPLEFIELD)
        {
            out << del << pt->getTriple(0);
            out << del << pt->getTriple(1);
            out << del << pt->getTriple(2);
        }
        out << "\n";
    } //end for every point
    outfile.close();
}
Phlucious asked Jun 14 '13 19:06


3 Answers

(This doesn't answer the profiler question. It tries to answer the original question, which is the performance issue.)

I would suggest avoiding QTextStream altogether in this case to see if that helps. The reason it might help is that QTextStream works on QString, which stores text as UTF-16 internally: everything you stream in is first converted to UTF-16 and then converted again to ASCII or UTF-8 when it is written out. That's two conversions you don't need.

Try using only the standard C++ std::ostringstream class instead. It's very similar to QTextStream and only minor changes are needed in your code. For example:

#include <sstream>

// ...

QFile outfile(outpath);
if (!outfile.open(QIODevice::WriteOnly | QIODevice::Text
                | QIODevice::Truncate))
{
    qWarning("Could not open ASCII for writing!");
    return false;
}

/* compute XYZ precision */
int prec[3] = {0, 0, 0};

std::ostringstream out;
out.precision(3);
std::fixed(out);
// I assume config.delimiter is a QChar.
char del = config.delimiter.toLatin1();

/* write the header line */
out << "X" << del << "Y" << del << "Z";
if(config.fields & INTFIELD)
    out << del << "IntegerField";
if(config.fields & DBLFIELD)
    out << del << "DoubleField";
if(config.fields & INTFIELD2)
    out << del << "IntegerField2";

if(config.fields & TRIPLEFIELD)
    out << del << "Tri1" << del << "Tri2" << del << "Tri3";
out << "\n";

/* write out the points */
for(quint64 ptnum = 0; ptnum < numpoints; ++ptnum)
{
    pt = points.at(ptnum);
    out.precision(prec[0]);
    out << pt->getXYZ(0);
    out.precision(prec[1]);
    out << del << pt->getXYZ(1);
    out.precision(prec[2]);
    out << del << pt->getXYZ(2);
    out.precision(3);
    if(config.fields & INTFIELD)
        out << del << pt->getIntValue();
    if(config.fields & DBLFIELD)
        out << del << pt->getDoubleValue();
    if(config.fields & INTFIELD2)
        out << del << pt->getIntValue2();
    if(config.fields & TRIPLEFIELD)
    {
        out << del << pt->getTriple(0);
        out << del << pt->getTriple(1);
        out << del << pt->getTriple(2);
    }
    out << "\n";

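    // Write out the data and empty the stream.
    // Take a single copy of the buffer: every str() call returns a new std::string.
    const std::string line = out.str();
    outfile.write(line.data(), static_cast<qint64>(line.size()));
    out.str("");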
}
outfile.close();
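
A variation on the loop above, as a rough sketch rather than a drop-in replacement: instead of handing one record at a time to write(), let the stream accumulate a batch of records and pass the text on every so often. To keep the sketch free of Qt and compilable on its own, it uses a hypothetical Record struct, a recordsPerFlush parameter, and a std::function sink in place of QFile::write(); none of these names come from the question, and whether batching helps at all should be measured on the real data.

```cpp
#include <cstddef>
#include <functional>
#include <ios>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type standing in for the question's point class.
struct Record {
    double x, y, z;
    int intValue;
};

// Formats all records, passing the accumulated text to `sink` after every
// `recordsPerFlush` records and once more for any remainder.
// Returns the number of times `sink` was called.
std::size_t writeBatched(const std::vector<Record>& records, char del,
                         std::size_t recordsPerFlush,
                         const std::function<void(const std::string&)>& sink)
{
    std::ostringstream out;
    out.precision(3);
    out << std::fixed;

    std::size_t pending = 0;  // records formatted since the last flush
    std::size_t flushes = 0;
    for (const Record& r : records) {
        out << r.x << del << r.y << del << r.z << del << r.intValue << '\n';
        if (++pending >= recordsPerFlush) {
            sink(out.str());  // one copy of the buffer per batch
            out.str("");      // empty the stream for the next batch
            pending = 0;
            ++flushes;
        }
    }
    if (pending > 0) {        // flush whatever is left over
        sink(out.str());
        ++flushes;
    }
    return flushes;
}
```

With a QFile, the sink could be a lambda that forwards the string's data() and size() to outfile.write(). Memory use stays bounded by the batch size, so this still fits the constraint that the whole output cannot be held in memory.
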
Nikos C. answered Nov 15 '22 08:11


Given that you are writing out billions of records, you might consider using the Boost.Spirit Karma library:

http://www.boost.org/doc/libs/1_54_0/libs/spirit/doc/html/spirit/karma.html

According to their benchmark it runs much faster than C++ streams and even sprintf with most compilers/libraries, including Visual C++ 2010:

http://www.boost.org/doc/libs/1_54_0/libs/spirit/doc/html/spirit/karma/performance_measurements/numeric_performance/format_performance.html

It will take some learning, but going by those measurements you should be rewarded with a significant speedup.
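
If adding Boost is not an option, C++17's std::to_chars from <charconv> follows a similar idea: it formats a number straight into a character buffer without going through a stream, a locale, or a format string. The sketch below is not Karma and makes no claim about how the two compare in speed. The helper names appendFixed and appendInt are made up for illustration, and the floating-point overload of std::to_chars needs a fairly recent standard library (for example GCC 11 or later).

```cpp
#include <charconv>
#include <cstddef>
#include <string>
#include <system_error>

// Hypothetical helper: appends `value` in fixed notation with `precision`
// decimals to `line`. Returns false if the number did not fit the buffer.
bool appendFixed(std::string& line, double value, int precision)
{
    char buf[512];
    const std::to_chars_result res = std::to_chars(
        buf, buf + sizeof buf, value, std::chars_format::fixed, precision);
    if (res.ec != std::errc())
        return false;
    line.append(buf, static_cast<std::size_t>(res.ptr - buf));
    return true;
}

// Hypothetical helper: appends a decimal integer to `line`.
bool appendInt(std::string& line, long long value)
{
    char buf[32];
    const std::to_chars_result res = std::to_chars(buf, buf + sizeof buf, value);
    if (res.ec != std::errc())
        return false;
    line.append(buf, static_cast<std::size_t>(res.ptr - buf));
    return true;
}
```

A record line would then be built by calling these helpers for each selected field, pushing the delimiter in between, and writing the std::string to the file once it is complete. The per-coordinate precision from the question maps directly onto the precision argument.
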

amdn answered Nov 15 '22 06:11


Use multiple cores (if available)! It seems to me that each point of your data is independent of the others. So you could split up the preprocessing using QtConcurrent::mappedReduced. e.g.:

  1. divide your data into a sequence of blocks consisting of N (e.g. 1000) points each,
  2. then let your mapFunction process each block into a memory buffer
  3. let the reduceFunction write the buffers to the file.

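Pass QtConcurrent::OrderedReduce | QtConcurrent::SequentialReduce as the reduce options, so that the buffers reach the file in their original order and only one thread writes at a time.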

This can be used in addition to the other optimizations!
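
The three steps can be sketched without Qt as follows. This is only an illustration of the idea, with std::async standing in for QtConcurrent::mappedReduced; the Point struct, the function names, and the blocksInFlight parameter are assumptions made for the example. Blocks are formatted in parallel in small waves and written in their original order, so at most blocksInFlight buffers exist at any time.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical point type standing in for the question's records.
struct Point {
    double x, y, z;
};

// "Map" step: format the points in [begin, end) into a private buffer.
std::string formatBlock(const std::vector<Point>& pts, std::size_t begin,
                        std::size_t end, char del)
{
    std::ostringstream out;
    out.precision(3);
    out << std::fixed;
    for (std::size_t i = begin; i < end; ++i)
        out << pts[i].x << del << pts[i].y << del << pts[i].z << '\n';
    return out.str();
}

// Formats up to `blocksInFlight` blocks concurrently, then performs the
// "reduce" step by writing their buffers to `sink` in the original order.
void writeParallel(const std::vector<Point>& pts, std::size_t blockSize,
                   std::size_t blocksInFlight, char del, std::ostream& sink)
{
    if (blockSize == 0 || blocksInFlight == 0)
        return;
    std::size_t next = 0;
    while (next < pts.size()) {
        std::vector<std::future<std::string>> wave;
        for (std::size_t b = 0; b < blocksInFlight && next < pts.size(); ++b) {
            const std::size_t begin = next;
            const std::size_t end = std::min(begin + blockSize, pts.size());
            wave.push_back(std::async(std::launch::async, [&pts, begin, end, del] {
                return formatBlock(pts, begin, end, del);
            }));
            next = end;
        }
        for (std::future<std::string>& f : wave)
            sink << f.get();  // sequential, ordered write
    }
}
```

Waiting for a whole wave before starting the next one is the simplest way to keep memory bounded; a real implementation would probably keep the workers busy while the previous buffers are being written.
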

Joachim answered Nov 15 '22 08:11