
Parsing through a csv file in Qt

Tags: c++, csv, qt, qt5, qfile

Is anyone familiar with how to parse a CSV file and put its contents into a string list? Right now I am reading the entire CSV file and splitting all of it into the string list. I am trying to figure out whether there is a way to get only the first column.

#include "searchwindow.h"

#include <QApplication>
#include <QStringList>
#include <QLineEdit>
#include <QCompleter>
#include <QHBoxLayout>
#include <QWidget>
#include <QLabel>

#include <QFile>
#include <QTextStream>


int main(int argc, char *argv[])
{
    QApplication a(argc, argv);

    QWidget *widget = new QWidget();
    QHBoxLayout *layout = new QHBoxLayout();

    QStringList wordList;

    QFile f("FlightParam.csv");
    if (f.open(QIODevice::ReadOnly))
    {
        //file opened successfully
        QString data;
        data = f.readAll();
        wordList = data.split(',');

        f.close();
    }

    QLabel *label = new QLabel("Select");
    QLineEdit *lineEdit = new QLineEdit;
    label->setBuddy(lineEdit);

    QCompleter *completer = new QCompleter(wordList);
    completer->setCaseSensitivity(Qt::CaseInsensitive); //Make caseInsensitive selection

    lineEdit->setCompleter(completer);

    layout->addWidget(label);
    layout->addWidget(lineEdit);

    widget->setLayout(layout);
    widget->showMaximized();

    return a.exec();
}
user3878223 asked Dec 05 '14 14:12


4 Answers

There you go:

FlightParam.csv

1,2,3,
4,5,6,
7,8,9,

main.cpp

#include <QFile>
#include <QStringList>
#include <QDebug>

int main()
{
    QFile file("FlightParam.csv");
    if (!file.open(QIODevice::ReadOnly)) {
        qDebug() << file.errorString();
        return 1;
    }

    QStringList wordList;
    while (!file.atEnd()) {
        QByteArray line = file.readLine();
        wordList.append(line.split(',').first());
    }

    qDebug() << wordList;

    return 0;
}

main.pro

TEMPLATE = app
TARGET = main
QT = core
SOURCES += main.cpp

Build and Run

qmake && make && ./main

Output

("1", "4", "7")
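
For readers who want to try the per-line idea outside Qt, here is a minimal sketch that uses only the C++ standard library. It is not part of the answer above: firstColumn is a hypothetical helper, std::getline stands in for QFile::readLine(), and, like the answer, it assumes that no field contains a quoted comma.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the answer): returns the text before the
// first comma of every non-empty line read from `in`.
std::vector<std::string> firstColumn(std::istream &in)
{
    std::vector<std::string> column;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r')
            line.pop_back();              // tolerate Windows line endings
        if (line.empty())
            continue;                     // skip blank lines
        // find() returns npos when there is no comma, so substr() then
        // keeps the whole line as the single field.
        column.push_back(line.substr(0, line.find(',')));
    }
    return column;
}
```

Feeding it a std::ifstream opened on FlightParam.csv, or a std::istringstream for a quick check, should yield one entry per row.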
lpapp answered Oct 13 '22 06:10


What you are looking for is the QTextStream class. It provides all kinds of interfaces for reading and writing text.

A simple example:

QStringList firstColumn;
QFile f1("h:/1.txt");
if (f1.open(QIODevice::ReadOnly)) { // only read if the file could be opened
    QTextStream s1(&f1);
    while (!s1.atEnd()) {
        const QString s = s1.readLine(); // reads one line, without its line terminator
        firstColumn.append(s.split(",").first()); // keeps only the text before the first ','
    }
    f1.close();
}

Alternatively, you can do something like this, which gives the same result:

// readAll() returns a QByteArray, so convert it to QString before splitting with a QRegExp
wordList = QString(f.readAll()).split(QRegExp("[\r\n]"), QString::SkipEmptyParts); // read the file and split it into lines
for (int i = 0; i < wordList.count(); i++)
    wordList[i] = wordList[i].split(",").first(); // replace each row with its first value only
f.close();
Shf answered Oct 13 '22 04:10


One might prefer to do it this way:

QStringList MainWindow::parseCSV(const QString &string)
{
    enum State {Normal, Quote} state = Normal;
    QStringList fields;
    QString value;

    for (int i = 0; i < string.size(); i++)
    {
        const QChar current = string.at(i);

        // Normal state
        if (state == Normal)
        {
            // Comma
            if (current == ',')
            {
                // Save field
                fields.append(value.trimmed());
                value.clear();
            }

            // Double-quote
            else if (current == '"')
            {
                state = Quote;
                value += current;
            }

            // Other character
            else
                value += current;
        }

        // In-quote state
        else if (state == Quote)
        {
            // Another double-quote
            if (current == '"')
            {
                // A double double-quote?
                if (i+1 < string.size() && string.at(i+1) == '"')
                {
                    value += '"';

                    // Skip a second quote character in a row
                    i++;
                }
                else
                {
                    state = Normal;
                    value += '"';
                }
            }

            // Other character
            else
                value += current;
        }
    }

    if (!value.isEmpty())
        fields.append(value.trimmed());

    // Quotes are left in until here; so when fields are trimmed, only whitespace outside of
    // quotes is removed.  The outermost quotes are removed here.
    for (int i=0; i<fields.size(); ++i)
        if (fields[i].length()>=1 && fields[i].left(1)=='"')
        {
            fields[i]=fields[i].mid(1);
            if (fields[i].length()>=1 && fields[i].right(1)=='"')
                fields[i]=fields[i].left(fields[i].length()-1);
        }

    return fields;
}
  • Powerful: correctly handles quoted material containing commas, doubled double quotes (which signify a literal double-quote character), and whitespace
  • Flexible: doesn't fail if the closing quote on the last field is forgotten, and handles more complicated CSV files; lets you process one line at a time without having to read the whole file into memory first
  • Simple: just drop this state machine into your code, right-click on the function name in Qt Creator and choose Refactor | Add private declaration, and you're good to go.
  • Performant: processes CSV lines accurately, and faster than doing RegEx look-aheads on each character
  • Convenient: requires no external library
  • Easy to read: the code is intuitive, in case you need to modify it.

Edit: I've finally gotten around to making this trim spaces before and after the fields. Neither whitespace nor commas inside quotes are trimmed; otherwise, all whitespace is trimmed from the start and end of a field. After puzzling over this for a while, I hit on the idea of leaving the quotes around the field, so that every field can simply be trimmed. That way, only whitespace before and after the quotes or the text is removed. A final step was then added to strip the quotes from fields that start and end with quotes.

Here is a more or less challenging test case:

QStringList sl=
{
    "\"one\"",
    "  \" two \"\"\"  , \" and a half  ",
    "three  ",
    "\t  four"
};

for (int i=0; i < sl.size(); ++i)
    qDebug() << parseCSV(sl[i]);

This corresponds to the file

"one"
 " two """  , " and a half  
three  
<TAB>  four

where <TAB> represents the tab character; and each line is fed into parseCSV() in turn. DON'T write .csv files like this!

Its output is as follows (qDebug() escapes quotes inside a string as \", wraps each string in quotes, and wraps the list in parentheses):

("one")
(" two \"", " and a half")
("three")
("four")

You can observe that the quote and the extra spaces were preserved inside the quote for item "two". In the malformed case for "and a half", the space before the quote, and those after the last word, were removed; but the others were not. Missing terminal spaces in this routine could be an indication of a missing terminal quote. Quotes in a field that don't start or end it are just treated as part of a string. A quote isn't removed from the end of a field if one doesn't start it. To detect an error here, just check for a field that starts with a quote, but doesn't end with one; and/or one that contains quotes but doesn't start and end with one, in the final loop.
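
The error check described above could be sketched roughly as follows. This is an illustration only, written against the standard library so that it compiles without Qt: checkField and FieldCheck are hypothetical names, and the function expects a field as it looks just before the final loop of parseCSV(), that is, trimmed but with its outer quotes still attached. It applies exactly the two rules named in the text and is a heuristic, not a full CSV validator.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type for the check sketched below.
enum class FieldCheck { Ok, MissingClosingQuote, StrayQuote };

// Assumes `field` is trimmed and still carries its outer quotes.
FieldCheck checkField(const std::string &field)
{
    if (field.find('"') == std::string::npos)
        return FieldCheck::Ok;                    // no quotes at all

    assert(!field.empty());                       // it contains at least one quote

    const bool startsWithQuote = field.front() == '"';
    const bool endsWithQuote   = field.size() >= 2 && field.back() == '"';

    if (startsWithQuote && !endsWithQuote)
        return FieldCheck::MissingClosingQuote;   // opened but never closed
    if (!startsWithQuote)
        return FieldCheck::StrayQuote;            // quote inside an unquoted field
    return FieldCheck::Ok;
}
```

In the Qt code, the same two comparisons could be made on fields[i] at the top of the final loop, before the outer quotes are removed.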

More than was needed for your test case, I know; but a solid general answer to the question, nonetheless - perhaps for others who find it.

Adapted from: https://github.com/hnaohiro/qt-csv/blob/master/csv.cpp

CodeLurker answered Oct 13 '22 06:10


Here is the code I usually use. I'm the author; consider it as-is, public domain. It has a similar feature set and concept to CodeLurker's code, except that the state machine is represented differently and the code is a bit shorter.

bool readCSVRow (QTextStream &in, QStringList *row) {

    static const int delta[][5] = {
        //  ,    "   \n    ?  eof
        {   1,   2,  -1,   0,  -1  }, // 0: parsing (store char)
        {   1,   2,  -1,   0,  -1  }, // 1: parsing (store column)
        {   3,   4,   3,   3,  -2  }, // 2: quote entered (no-op)
        {   3,   4,   3,   3,  -2  }, // 3: parsing inside quotes (store char)
        {   1,   3,  -1,   0,  -1  }, // 4: quote exited (no-op)
        // -1: end of row, store column, success
        // -2: eof inside quotes
    };

    row->clear();

    if (in.atEnd())
        return false;

    int state = 0, t;
    char ch;
    QString cell;

    while (state >= 0) {

        if (in.atEnd())
            t = 4;
        else {
            in >> ch;
            if (ch == ',') t = 0;
            else if (ch == '\"') t = 1;
            else if (ch == '\n') t = 2;
            else t = 3;
        }

        state = delta[state][t];

        switch (state) {
        case 0:
        case 3:
            cell += ch;
            break;
        case -1:
        case 1:
            row->append(cell);
            cell = "";
            break;
        }

    }

    if (state == -2)
        throw std::runtime_error("End-of-file found while inside quotes."); // needs #include <stdexcept>

    return true;

}
  • Parameter: in, a QTextStream.
  • Parameter: row, a QStringList that will receive the row.
  • Returns: true if a row was read, false if EOF.
  • Throws: std::runtime_error if an error occurs.

It parses Excel-style CSVs, handling quotes and doubled quotes appropriately, and allows newlines in fields. It handles Windows and Unix line endings properly as long as your file is opened with QFile::Text. I don't think Qt supports old-school Mac line endings, and this doesn't support binary-mode untranslated line endings, but for the most part this shouldn't be a problem these days.

Other notes:

  • Unlike CodeLurker's implementation this intentionally fails if EOF is hit inside quotes. If you change the -2's to -1's in the state table then it will be forgiving.
  • Parses x"y"z as xyz, wasn't sure what the rule for mid-string quotes was. I have no idea if this is correct.
  • Performance and memory characteristics the same as CodeLurker's (i.e. very good).
  • Does not support Unicode (reading into a char converts each character to ISO-8859-1), but changing to QChar should be trivial.
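
To make the first note concrete, here is a rough port of the same state table to the standard library, so it can be compiled without Qt. It is a sketch and not the author's code: readCsvRow and the forgiving flag are hypothetical, std::istream stands in for QTextStream, and setting the flag has the same effect as changing the -2 entries to -1. Unlike a file opened with QFile::Text, this version does not strip '\r' from Windows line endings.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical Qt-free port of the table-driven reader.
// forgiving == true treats EOF inside quotes as a normal end of row.
bool readCsvRow(std::istream &in, std::vector<std::string> &row, bool forgiving = false)
{
    static const int delta[5][5] = {
        //  ,    "   \n  other eof
        {   1,   2,  -1,   0,  -1 }, // 0: unquoted text (store char)
        {   1,   2,  -1,   0,  -1 }, // 1: column just stored
        {   3,   4,   3,   3,  -2 }, // 2: quote entered (no-op)
        {   3,   4,   3,   3,  -2 }, // 3: inside quotes (store char)
        {   1,   3,  -1,   0,  -1 }, // 4: quote exited (no-op)
    };
    const int eof = std::istream::traits_type::eof();

    row.clear();
    if (in.peek() == eof)
        return false;                     // nothing left to read

    int state = 0;
    std::string cell;
    while (state >= 0) {
        assert(state < 5);                // always a valid row of the table
        const int c = in.get();
        int t;
        if (c == eof)       t = 4;
        else if (c == ',')  t = 0;
        else if (c == '"')  t = 1;
        else if (c == '\n') t = 2;
        else                t = 3;

        state = delta[state][t];
        if (state == -2 && forgiving)
            state = -1;                   // accept the unterminated quote

        if (state == 0 || state == 3) {
            cell += static_cast<char>(c); // store the character
        } else if (state == 1 || state == -1) {
            row.push_back(cell);          // store the finished column
            cell.clear();
        }
    }

    if (state == -2)
        throw std::runtime_error("End-of-file found while inside quotes.");
    return true;
}
```

Called in a loop on a std::ifstream or std::istringstream, it returns one row per call until the input is exhausted.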

Example:

QFile csv(filename);
csv.open(QFile::ReadOnly | QFile::Text);

QTextStream in(&csv);
QStringList row;
while (readCSVRow(in, &row))
    qDebug() << row;
Jason C answered Oct 13 '22 05:10