Hi I'm looking to find the best way to read in a fixed width text file using F#. The file will be plain text, from one to a couple of thousand lines long and around 1000 characters wide. Each line contains around 50 fields, each with varying lengths. My initial thoughts were to have something like the following
type MyRecord = {
Name : string
Address : string
Postcode : string
Tel : string
}
let format = [
(0,10)
(10,50)
(50,7)
(57,20)
]
and read each line one by one, assigning each field by the format tuple(where the first item is the start character and the second is the number of characters wide).
Any pointers would be appreciated.
The hardest part is probably to split a single line according to the column format. It can be done something like this:
let splitLine format (line : string) =
format |> List.map (fun (index, length) -> line.Substring(index, length))
This function has the type (int * int) list -> string -> string list
. In other words, format
is an (int * int) list
. This corresponds exactly to your format
list. The line
argument is a string
, and the function returns a string list
.
You can map a list of lines like this:
let result = lines |> List.map (splitLine format)
You can also use Seq.map
or Array.map
, depending on how lines
is defined. Such a result
will be a string list list
, and you can now map over such a list to produce a MyRecord list
.
You can use File.ReadLines
to get a lazily evaluated sequence of strings from a file.
Please note that the above is only an outline of a possible solution. I left out boundary checks, error handling, and such. The above code may contain off-by-one errors.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With