Fastest way to populate QTableView from Pandas data frame

I'm very new to PyQt and I am struggling to populate a QTableView control.

My code is the following:

def data_frame_to_ui(self, data_frame):
    """
    Displays a pandas data frame into the GUI
    """
    list_model = QtGui.QStandardItemModel()
    i = 0
    for val in data_frame.columns:
        # for the list model
        if i > 0:
            item = QtGui.QStandardItem(val)
            list_model.appendRow(item)
        i += 1

    # for the table model
    table_model = QtGui.QStandardItemModel()

    # set table headers
    table_model.setHorizontalHeaderLabels([str(c) for c in data_frame.columns])

    # fill table model data
    for row_idx in range(10): #len(data_frame.values)
        row = list()
        for col_idx in range(data_frame.columns.size):
            val = QtGui.QStandardItem(str(data_frame.values[row_idx][col_idx]))
            row.append(val)
        table_model.appendRow(row)

    # set table model to table object

With this code I manage to populate a QListView, but the values I set on the QTableView are not displayed. You can also see that I truncated the rows to 10, because it takes forever to display the hundreds of rows of the data frame.

So, what is the fastest way to populate the table model from a pandas data frame?

Thanks in advance.

Asked Jul 17 '15 12:07 by Santi Peñate-Vera

1 Answer

Personally I would just create my own model class to make handling it somewhat easier.

For example:

import sys
from PyQt4 import QtCore, QtGui
Qt = QtCore.Qt

class PandasModel(QtCore.QAbstractTableModel):
    def __init__(self, data, parent=None):
        QtCore.QAbstractTableModel.__init__(self, parent)
        self._data = data

    def rowCount(self, parent=None):
        return len(self._data.values)

    def columnCount(self, parent=None):
        return self._data.columns.size

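    def data(self, index, role=Qt.DisplayRole):
        if index.isValid():
            if role == Qt.DisplayRole:
                # Return the cell under this index as display text
                return QtCore.QVariant(str(
                    self._data.values[index.row()][index.column()]))
        return QtCore.QVariant()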

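if __name__ == '__main__':
    application = QtGui.QApplication(sys.argv)
    view = QtGui.QTableView()
    # your_pandas_data is a placeholder for your own DataFrame
    model = PandasModel(your_pandas_data)
    view.setModel(model)

    view.show()
    sys.exit(application.exec_())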
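
One possible refinement, sketched here without any Qt imports so it can be run on its own: the model above evaluates `self._data.values` on every lookup, and for a frame with mixed column types pandas has to assemble a new array each time that attribute is accessed. Taking the array once in the constructor avoids that repeated work. The class and method names below (`FrameCells`, `cell_text`, `header_text`) are hypothetical and not part of PyQt or pandas; the idea is that `rowCount`, `columnCount`, `data` and a `headerData` override in `PandasModel` could delegate to something like this.

```python
import pandas as pd


class FrameCells:
    """Qt-free helper (hypothetical name) that serves cell and header text."""

    def __init__(self, data_frame):
        # Materialise the values once instead of on every cell lookup.
        self._values = data_frame.values
        self._columns = list(data_frame.columns)
        self._index = list(data_frame.index)

    def row_count(self):
        return self._values.shape[0]

    def column_count(self):
        return self._values.shape[1]

    def cell_text(self, row, column):
        # Text that a model's data() could return for Qt.DisplayRole.
        return str(self._values[row, column])

    def header_text(self, section, horizontal=True):
        # Column labels for horizontal headers, index labels for vertical ones.
        labels = self._columns if horizontal else self._index
        return str(labels[section])


if __name__ == "__main__":
    frame = pd.DataFrame({"name": ["a", "b"], "value": [1.5, 2.5]})
    cells = FrameCells(frame)
    print(cells.row_count(), cells.column_count())
    print(cells.header_text(1), cells.cell_text(0, 1))
```

The trade-off is that the cached array is a snapshot: if the data frame is modified afterwards, the helper (and any model built on it) would need to be recreated to show the new values.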

Answered Oct 02 '22 18:10 by Wolph
