This is my first time doing (a) functional programming and (b) F#.
Basically, there are a number of files on disk (n > 50); each file stores an instrument's readings, each with the timestamp at which that reading was taken. The problem is to get all the readings from all the files, sorted by timestamp.
NB: the files are huge, with over 10,000 entries per file.
File 1: <12:00, XXX> ; <15:30, XXX> ; <18:20, XXX> ;
File 2: <10:45, XXX> ; <16:20, XXX> ; <16:55, XXX> ;
File 3: <17:50, XXX> ;
The first n00b thing is to get all the entries across all the files in chunks of N and then use one of F#'s inbuilt sorting thingies. If we take things in chunks of "1" from each file, then File 3: <17:50, XXX> would end up out of order once the next chunk is taken. To deal with that, I intend to check the lowest and highest timestamp values in a chunk and test whether they lie within the previous or succeeding chunk's range.
Basically, I am still thinking in an imperative way (almost a decade of C does that). Recently I've toyed around with a producer-consumer approach using MailboxProcessor.
A question for experienced F# programmers: is there a more "functional" and better way to sort multi-file timestamps in parallel using F#?
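To make the chunking problem concrete, here is a minimal sketch that uses only the timestamps from the sample files above. The `chunk` helper and the choice to keep times as "HH:mm" strings are assumptions made for illustration, not part of any real implementation.

```fsharp
// Timestamps from the three sample files. "HH:mm" strings sort
// chronologically as plain text within a single day (an assumption).
let file1 = [ "12:00"; "15:30"; "18:20" ]
let file2 = [ "10:45"; "16:20"; "16:55" ]
let file3 = [ "17:50" ]
let files = [ file1; file2; file3 ]

// Hypothetical helper: take the i-th entry of every file that has one
// (a chunk of size 1 per file) and sort just that chunk.
let chunk i =
    files
    |> List.choose (fun f -> List.tryItem i f)
    |> List.sort

// Sort each chunk on its own, then concatenate the chunks in order.
let naive = [ 0 .. 2 ] |> List.collect chunk

printfn "%A" naive
printfn "globally sorted: %b" (naive = List.sort naive)
```

The first chunk ends with 17:50 from File 3, while the second chunk starts with 15:30, so the concatenated result is not globally sorted. That is exactly the case the range check is meant to catch.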
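Assuming the files aren't too large to fit in memory, you can do something like:

    open System.IO

    // `files` (a sequence of paths) and `parseTimestamp` are assumed to be defined elsewhere.
    seq {
        for path in files do
            yield! File.ReadAllLines(path)
    }
    |> Seq.map parseTimestamp
    |> Seq.sort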
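The answer leaves `parseTimestamp` undefined, and the question does not give the on-disk line format. As a rough sketch only, suppose each line looks like `HH:mm,reading`. Both that format and the parser below are assumptions. The same `Seq.map` and `Seq.sort` pipeline can then be tried on in-memory lines standing in for the three sample files:

```fsharp
open System

// Hypothetical parser: assumes each line looks like "HH:mm,reading".
// Returns a (timestamp, reading) tuple; F# tuples compare element by
// element, so sorting orders by timestamp first.
let parseTimestamp (line: string) =
    let parts = line.Split(',')
    TimeSpan.Parse(parts.[0].Trim()), parts.[1].Trim()

// In-memory stand-ins for the lines of the three sample files.
let sampleLines =
    [ "12:00,a"; "15:30,b"; "18:20,c"   // File 1
      "10:45,d"; "16:20,e"; "16:55,f"   // File 2
      "17:50,g" ]                       // File 3

let sorted =
    sampleLines
    |> Seq.map parseTimestamp
    |> Seq.sort
    |> Seq.toList

sorted |> List.iter (fun (t, r) -> printfn "%O %s" t r)
```

Because `Seq.sort` has to see every element before it can yield the first one, the whole data set is held in memory at once, which is why the "not too large" caveat matters.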