 

Parallel sorting of timestamps in F# - the functional way?

This is the first time I'm doing a) functional programming, b) in F#.

Basically, there are a number of files on disk (n > 50); each file stores an instrument's readings along with the timestamp at which each reading was taken. The problem is to get all the readings from all the files sorted by timestamp.

NB: the files are huge, with over 10,000 entries per file.

File 1: <12:00, XXX> ; <15:30, XXX> ; <18:20, XXX> ;

File 2: <10:45, XXX> ; <16:20, XXX> ; <16:55, XXX> ;

File 3: <17:50, XXX> ;

The first n00b thing is to get all the entries across all the files in chunks of N and then use one of F#'s built-in sorting thingies. If we take things in chunks of "1" from each file, then File 3: <17:50, XXX> would be unsorted when the next chunk is taken. To deal with that, I intend to check the lowest and highest timestamp values in a chunk and test whether they lie within the previous or succeeding chunk's ranges.

Basically, I am still thinking in an imperative way (almost a decade of C does that). Recently I've toyed around with a producer-consumer approach using MailboxProcessor.

A question for experienced F# programmers: is there a more "functional" and better way to sort multi-file timestamps in parallel using F#?

AruniRC asked Dec 04 '25 07:12



1 Answer

Assuming the files aren't too large you can do something like:

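open System.IO

// `files` holds the file paths; `parseTimestamp` is a function you supply
// that turns one line into a comparable value (e.g. timestamp * reading).
seq {
  for path in files do
    yield! File.ReadAllLines(path)   // every line of every file, in file order
}
|> Seq.map parseTimestamp
|> Seq.sort   // uses the natural ordering of the parsed values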
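
Since the question asks about doing this in parallel, here is a minimal self-contained sketch of the same read, parse, sort pipeline with the per-file work spread across cores via Array.Parallel.map. It is only an illustration under stated assumptions: the Reading type, the parseReading and sortReadings names, and the "<HH:mm, value>" one-reading-per-line layout are all hypothetical, since the real file format isn't given in the question.

```fsharp
open System
open System.Globalization
open System.IO

// Hypothetical record for one entry; the real reading type is not specified.
type Reading = { Time: TimeSpan; Value: string }

// Hypothetical parser, assuming each line looks like "<12:00, XXX>".
// Adjust this to the actual file layout.
let parseReading (line: string) : Reading =
    let inner = line.Trim().TrimStart('<').TrimEnd('>')
    let parts = inner.Split([| ',' |], 2)
    { Time = TimeSpan.Parse(parts.[0].Trim(), CultureInfo.InvariantCulture)
      Value = parts.[1].Trim() }

// Read and parse each file on the thread pool, then sort everything once.
// Like the snippet above, this holds all readings in memory.
let sortReadings (files: string[]) : Reading[] =
    files
    |> Array.Parallel.map (fun path ->
        File.ReadAllLines path
        |> Array.filter (fun l -> not (String.IsNullOrWhiteSpace l))
        |> Array.map parseReading)
    |> Array.concat
    |> Array.sortBy (fun r -> r.Time)
```

With 50+ files of roughly 10,000 entries each, the whole data set is well under a million records, so a single in-memory sort like this is likely to be enough; the chunking scheme from the question would only be needed if the data stopped fitting in memory.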
Daniel answered Dec 06 '25 03:12



