
Most appropriate data structure for a CSV table?

Tags:

csv

delphi

I'm looking for advice on the most appropriate data structure for holding a CSV (comma-separated values) table in memory. It should cover both cases: a table with a header and a table without one. If the table contains a header, every field of every row is identified by a key->value pair, where the key is a name from the header and the value is the content of that field. If the table does not contain a header, the rows are either simply lists of strings or also key->value pairs with generated key names (like 'COL1', 'COL2', ... 'COLn').

I'm looking for the simplest (least code) and, at the same time, most generic solution.

I'm thinking about the following subclassing, but I doubt whether it's the right or an efficient way to implement this:

TCSV = class (TObjectList<TDictionary<string, string>>)
  ...
public
  constructor Create(fileName: string; header: Boolean; encoding: string = '';
                     delimiter: Char = ';'; quoteChar: Char = '"'); overload;
  ...
end;

It looks like I would have to keep the keys for every row of fields. What about TDictionary<string, TStringList>? Would that be a better solution?

David Unric asked Dec 16 '22 07:12


2 Answers

What about a TClientDataSet? It seems quite easy.

A simple guide on how to use TClientDataSet as an in-memory dataset can be found here.

GolezTrol answered Dec 29 '22 00:12


The structure you are proposing would mean having a TDictionary instance for every row in your CSV file, in essence duplicating the column names for every row. That seems like a bit of a waste.

I assume that with TDictionary<string, TStringList> you would fill each TStringList with the values from a single column. That could work, but it still won't be easy to iterate over all the columns of a single row.
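To make the row-iteration point concrete, here is a rough sketch of the column-wise layout. It is only an illustration: the program name, the sample values and the separate Header list are my own assumptions, and I use TObjectDictionary (a TDictionary descendant) instead of plain TDictionary just so the column lists are freed automatically. The unit names assume Delphi XE2 or later.

```delphi
program CsvColumnDictSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.Generics.Collections;

var
  Columns: TObjectDictionary<string, TStringList>;
  Header: TStringList; // keeps the column order; a dictionary does not guarantee it
  Col: string;
  Row: Integer;
begin
  Header := TStringList.Create;
  try
    // doOwnsValues: the dictionary frees the TStringList values it holds.
    Columns := TObjectDictionary<string, TStringList>.Create([doOwnsValues]);
    try
      Header.Add('COL1');
      Header.Add('COL2');
      for Col in Header do
        Columns.Add(Col, TStringList.Create);

      // Illustrative data: every CSV row has to be split across all column lists.
      Columns['COL1'].Add('alpha');
      Columns['COL2'].Add('1');
      Columns['COL1'].Add('beta');
      Columns['COL2'].Add('2');

      // Reading one row means visiting every column list at the same index.
      for Row := 0 to Columns[Header[0]].Count - 1 do
      begin
        for Col in Header do
          Write(Col, '=', Columns[Col][Row], ' ');
        Writeln;
      end;
    finally
      Columns.Free;
    end;
  finally
    Header.Free;
  end;
end.
```

Note that a second structure (Header here) is needed just to get the columns back in file order, and that nothing stops the column lists from ending up with different lengths.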

As GolezTrol suggests, TClientDataSet comes standard with Delphi, is very powerful and, being a dataset, is intended to be used with columnar data. Also, although it is a dataset, it does not require a database (connection) and is used in many applications for exactly the goal you are trying to achieve: an in-memory dataset.
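As a minimal sketch of what that could look like for CSV data: the helper name CreateCsvDataSet, the field size of 255 and the sample values below are assumptions made for illustration, and the CSV parsing itself is left out. The unit names assume Delphi XE2 or later (older versions use DB and DBClient), and depending on your setup TClientDataSet needs midas.dll available or MidasLib added to the uses clause.

```delphi
program CsvClientDataSetSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, Data.DB, Datasnap.DBClient;

// Hypothetical helper: builds an in-memory dataset with one string field
// per column name. The size of 255 is an arbitrary illustrative choice.
function CreateCsvDataSet(const ColumnNames: array of string): TClientDataSet;
var
  ColName: string;
begin
  Result := TClientDataSet.Create(nil);
  try
    for ColName in ColumnNames do
      Result.FieldDefs.Add(ColName, ftString, 255);
    Result.CreateDataSet; // creates the in-memory table; no connection involved
  except
    Result.Free;
    raise;
  end;
end;

var
  Data: TClientDataSet;
  I: Integer;
begin
  // For a file without a header, generated names such as COL1..COLn would go here.
  Data := CreateCsvDataSet(['COL1', 'COL2']);
  try
    // AppendRecord appends and posts one record; call it once per parsed CSV row.
    Data.AppendRecord(['alpha', '1']);
    Data.AppendRecord(['beta', '2']);

    // Iterate row by row, then over all fields of the current row.
    Data.First;
    while not Data.Eof do
    begin
      for I := 0 to Data.FieldCount - 1 do
        Write(Data.Fields[I].FieldName, '=', Data.Fields[I].AsString, ' ');
      Writeln;
      Data.Next;
    end;
  finally
    Data.Free;
  end;
end.
```

The column names are stored once in the field definitions, so both the header and the no-header case reduce to choosing the names passed to the helper.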

Marjan Venema answered Dec 28 '22 23:12