
Library for reading csv file in F#

Tags: csv, f#

I would like to read a CSV file and return its contents as a string list list (the F# equivalent of List<List<string>>):

let readCsv (filepath:string) : string list list =
//.......................

input file:
Quote1,Quote2,Quote3
"Hello,World","He said:""Yes""",Example

Output:
// Type: string list list
[["Quote1";"Quote2";"Quote3"];
 ["Hello,World"; "He said:\"Yes\""; "Example"]]

Input2:
1,2,3,4,5,6
7,8,9,10,11,12

Output2:
// Type: string list list
[["1";"2";"3";"4";"5";"6"];
 ["7";"8";"9";"10";"11";"12"]]

However, some NuGet packages, e.g. CsvHelper, FileHelpers and FSharp.Data, rely on defining a class to "capture" the data, or on defining a type by referring to a sample CSV file.

https://joshclose.github.io/CsvHelper/

http://www.filehelpers.net/example/QuickStart/ReadWriteRecordByRecord/

http://fsharp.github.io/FSharp.Data/index.html

For example:

// In C#, from the FileHelpers documentation
[DelimitedRecord(",")]
public class AbstractClass
{
    public string Quote1;
    public string Quote2;
    public string Quote3;
}

or

// From the FSharp.Data documentation
type AbstractType = CsvProvider<"../example.csv">

But the number of columns in the input file may change, so I cannot define a class or type in advance.

Of course, I could write a regular expression to break up the input file line by line, but I would like to know whether someone has already done this, or whether there is a standard library function for it.

Thank you.

CH Ben asked Jul 31 '17 09:07
1 Answer

If you use FSharp.Data, there's a CsvFile class which can read arbitrary CSV files without a predefined type.

For example:

let csv = CsvFile.Load(filename, hasHeaders = true)
csv.Rows
|> Seq.map (fun r -> (r.["Image"], float r.["Size"]))

This creates a sequence of tuples from the "Image" and "Size" columns.

csv.Headers is a string[] option containing the headers from the first line of the file.

let csv = CsvFile.Load(filename, hasHeaders = false)
csv.Rows
|> Seq.map (fun r -> r.Columns |> List.ofArray)
|> List.ofSeq

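This might be what you're after: with hasHeaders = false, the first line is returned as an ordinary row, so the result is a string list list covering every line of the file.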
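
To tie this back to the signature in the question, here is a minimal sketch as an F# script. It assumes the FSharp.Data package is pulled in with an `#r "nuget: ..."` directive (F# 5 or later); `toLists` and `sample` are hypothetical names introduced for illustration, and `readCsv` simply mirrors the stub from the question. It uses CsvFile.Parse on the question's first input so it can be tried without a file on disk.

```fsharp
#r "nuget: FSharp.Data"

open FSharp.Data

// Sample text copied from the first input in the question.
let sample = "Quote1,Quote2,Quote3\n\"Hello,World\",\"He said:\"\"Yes\"\"\",Example"

/// Converts the rows of a parsed CSV file into a string list list.
/// (Hypothetical helper, not part of FSharp.Data.)
let toLists (csv: CsvFile) : string list list =
    csv.Rows
    |> Seq.map (fun row -> List.ofArray row.Columns)
    |> List.ofSeq

/// Mirrors the signature from the question; hasHeaders = false keeps
/// the first line as an ordinary row instead of treating it as headers.
let readCsv (filepath: string) : string list list =
    use csv = CsvFile.Load(filepath, hasHeaders = false)
    toLists csv

// Parse the in-memory sample the same way a file would be read.
let rows = CsvFile.Parse(sample, hasHeaders = false) |> toLists
printfn "%A" rows
```

Because List.ofSeq forces the whole sequence before the CsvFile is disposed, the returned lists do not depend on the underlying reader staying open.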

marklam answered Oct 13 '22 11:10