
Rigorous definition for CSV file reading/writing

Tags: c, csv

I have written my own CSV reader/writer in C to store records in a character column in an ODBC database. Unfortunately, I have discovered many edge cases that trip up my implementation, and I have concluded that my problem is that I never rigorously defined the rules for CSV. I've read RFC 4180, but it seems incomplete and does not resolve the ambiguities.

For example, should "" be considered an empty token or a double quote? Do quotes match outside-in or left to right? What do I do with an input string that has unmatched single quotes? The real mess begins when I have nested tokens, which doubles up the escaped quotation characters.

What I really need is a definitive CSV standard that I can implement in code. Every time I feel I have nailed every corner case, I find another one. I am sure this problem has been mulled over and solved many times by minds superior to mine. Has anyone written a rigorous definition of CSV that I can implement in code? I realise C is not the ideal language here, but I don't have a choice about the compiler at this stage, nor can I use a third-party library (unless it compiles as C90). Boost is not an option as my compiler doesn't support C++. I have contemplated ditching CSV for XML, but that seems like overkill for storing a few tokens in a 256-character database record. Has anyone made a definitive CSV spec?

Piers asked Jun 06 '13 03:06



1 Answer

There is no standard (see Wikipedia's article, in particular http://en.wikipedia.org/wiki/Comma-separated_values#Lack_of_a_standard), so in order to use CSV, you need to follow the general principle of being conservative in what you generate and liberal in what you accept. In particular:

  • Do not use quotation marks for blank fields. Simply write an empty field (two adjacent delimiters, or a delimiter in the first/last position of the line).
  • Quote any field containing a quotation mark, comma, or newline.
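The two rules above, plus a "liberal" reader, might be sketched in C90-style C roughly as follows. This is only an illustration under stated assumptions, not a definitive spec: the names `csv_write_field` and `csv_read_field` are hypothetical, and the lenient choices in the reader (quotes scanned strictly left to right, `""` at the start of a field read as an empty field, a stray quote inside an unquoted field kept literally, an unterminated quote running to the end of the string) are one reasonable convention rather than anything mandated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write one field into out (capacity cap bytes).
 * The field is quoted only if it contains a quote, comma, CR or LF;
 * embedded quotes are doubled. An empty field is written as nothing.
 * Returns the length written (excluding the NUL), or -1 if out is too small. */
int csv_write_field(const char *field, char *out, size_t cap)
{
    size_t n = 0;
    const char *p;
    int quote;

    assert(field != NULL && out != NULL);
    quote = strpbrk(field, "\",\r\n") != NULL;

    if (quote) {
        if (n + 1 >= cap) return -1;
        out[n++] = '"';
    }
    for (p = field; *p != '\0'; p++) {
        size_t need = (*p == '"') ? 2 : 1;      /* a quote is emitted twice */
        if (n + need >= cap) return -1;         /* keep room for the NUL */
        if (*p == '"') out[n++] = '"';
        out[n++] = *p;
    }
    if (quote) {
        if (n + 1 >= cap) return -1;
        out[n++] = '"';
    }
    if (n >= cap) return -1;
    out[n] = '\0';
    return (int)n;
}

/* Hypothetical helper: read one field starting at *pp, scanning left to right.
 * A quote opens a quoted field only when it is the first character of the field.
 * Inside quotes, "" is one literal quote and a single " closes the quoting.
 * Returns 1 if a comma followed (more fields), 0 if this was the last field
 * of the record, -1 if out is too small. On success *pp is advanced past the
 * field and its trailing comma; a line ending is left unconsumed. */
int csv_read_field(const char **pp, char *out, size_t cap)
{
    const char *p;
    size_t n = 0;
    int in_quotes = 0;

    assert(pp != NULL && *pp != NULL && out != NULL);
    if (cap == 0) return -1;
    p = *pp;
    if (*p == '"') { in_quotes = 1; p++; }
    for (; *p != '\0'; p++) {
        if (in_quotes && *p == '"') {
            if (p[1] == '"') p++;               /* escaped quote: keep one */
            else { in_quotes = 0; continue; }   /* closing quote: drop it */
        } else if (!in_quotes && (*p == ',' || *p == '\r' || *p == '\n')) {
            break;                              /* end of field */
        }
        if (n + 1 >= cap) return -1;
        out[n++] = *p;
    }
    out[n] = '\0';
    *pp = (*p == ',') ? p + 1 : p;
    return *p == ',';
}
```

A caller would typically loop with `do { r = csv_read_field(&p, buf, sizeof buf); /* use buf */ } while (r == 1);`. Because the writer never quotes an empty field and the reader treats both an empty unquoted field and `""` as empty, anything the writer produces reads back unchanged, while the reader still tolerates some input the writer would never generate.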
R.. GitHub STOP HELPING ICE answered Sep 23 '22 05:09