
Conditional sum in F#

Tags: record, pivot, f#

I have defined a record type for some client data in F# as follows:

  open System.IO

  type DataPoint = {
       date: string; 
       dr: string; 
       Group: string; 
       Product: string; 
       Book: int; 
       Revenue: int} with 
          static member fromFile file =
               file
               |> File.ReadLines
               |> Seq.skip 1 // skip the header
               |> Seq.map (fun s -> s.Split ',') // split each line into an array
               // create a record for each line
               |> Seq.map (fun a -> {date = string a.[0]; dr = string a.[1];
                                     Group = string a.[2]; Product = string a.[3];
                                     Book = int a.[4]; Revenue = int a.[5] })

  let pivot (file) = DataPoint.fromFile file
                     |> ??????????

For rows where date, dr, Group and Product are all equal, I want to sum the Book and Revenue entries, producing one pivoted row per combination. I assumed some kind of if/else statement would do: I suspect I need to start at the first data point, recursively add each matching row, and then delete the matched rows to avoid duplicates in the output.

Once I have done this, I will easily be able to write these pivoted rows to another CSV file.

Can anyone get me started?

asked Nov 20 '12 13:11 by Simon Hayward


2 Answers

Seq.groupBy and Seq.reduce are what you're looking for:

let pivot file = 
    DataPoint.fromFile file
    // Group rows that share the same (date, dr, Group, Product) key
    |> Seq.groupBy (fun dp -> dp.date, dp.dr, dp.Group, dp.Product)
    // Collapse each group into one record; groups are never empty, so Seq.reduce is safe
    |> Seq.map (snd >> Seq.reduce (fun acc dp -> 
                          { acc with Book = acc.Book + dp.Book; 
                                     Revenue = acc.Revenue + dp.Revenue }))
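To try the grouping without a CSV file at hand, here is a minimal sketch that runs the same groupBy/reduce pipeline over an in-memory list. The `pivotRows` name and the sample rows are made up for illustration, and the record type is a copy of the one in the question without the file loader.

```fsharp
// Record type mirroring the one in the question (file loader omitted).
type DataPoint =
    { date: string; dr: string; Group: string; Product: string
      Book: int; Revenue: int }

// Hypothetical helper: the same pipeline as above, but over any sequence of rows.
let pivotRows (rows: seq<DataPoint>) =
    rows
    |> Seq.groupBy (fun dp -> dp.date, dp.dr, dp.Group, dp.Product)
    // Groups produced by Seq.groupBy are never empty, so Seq.reduce cannot throw here.
    |> Seq.map (fun (_, group) ->
        group
        |> Seq.reduce (fun acc dp ->
            { acc with Book = acc.Book + dp.Book; Revenue = acc.Revenue + dp.Revenue }))
    |> Seq.toList

// Made-up sample rows: the first two share a key, the third does not.
let sample =
    [ { date = "2012-01-01"; dr = "Test"; Group = "A"; Product = "B"; Book = 1; Revenue = 10 }
      { date = "2012-01-01"; dr = "Test"; Group = "A"; Product = "B"; Book = 2; Revenue = 20 }
      { date = "2012-01-01"; dr = "Test"; Group = "B"; Product = "B"; Book = 5; Revenue = 50 } ]

let pivoted = pivotRows sample
```

With this sample, `pivoted` should hold two records: one for group "A" with the Book and Revenue values of the first two rows added together, and one for group "B" unchanged. No rows need to be deleted by hand, because each input row lands in exactly one group.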
answered Sep 20 '22 05:09 by pad


Quickly hacked up, should give you some idea:

// Sample data
let data = [
             {date    = "2012-01-01"
              dr      = "Test"
              Group   = "A" 
              Product = "B"
              Book    = 123
              Revenue = 123}
             {date    = "2012-01-01"
              dr      = "Test"
              Group   = "A"
              Product = "B"
              Book    = 123
              Revenue = 123}
             {date    = "2012-01-01"
              dr      = "Test"
              Group   = "B"
              Product = "B"
              Book    = 11
              Revenue = 123}]


let grouped = data |> Seq.groupBy(fun d -> (d.date, d.dr, d.Group, d.Product))
                   |> Seq.map (fun (k,v) -> (k, v |> Seq.sumBy (fun v -> v.Book), v |> Seq.sumBy (fun v -> v.Revenue)))

for g,books,revs in grouped do
   printfn "Books %A: %d" g books
   printfn "Revenues %A: %d" g revs

prints

Books ("2012-01-01", "Test", "A", "B"): 246
Revenues ("2012-01-01", "Test", "A", "B"): 246
Books ("2012-01-01", "Test", "B", "B"): 11
Revenues ("2012-01-01", "Test", "B", "B"): 123
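
The question also mentions writing the pivoted rows to another CSV file. A rough sketch of that step, assuming the grouped tuples have the shape `(key, books, revenues)` produced above: the `toCsvLine` and `writePivot` names, the header text and the output path are all hypothetical, and the fields are assumed to contain no commas, quotes or newlines, so no escaping is done.

```fsharp
open System.IO

// Format one grouped result ((date, dr, group, product), books, revenues) as a CSV line.
// Assumption: no field contains a comma, quote or newline, so nothing is escaped.
let toCsvLine (key: string * string * string * string, books: int, revs: int) =
    let (date, dr, grp, product) = key
    sprintf "%s,%s,%s,%s,%d,%d" date dr grp product books revs

// Hypothetical helper: write an assumed header line plus one line per pivoted row.
let writePivot (path: string) (rows: seq<(string * string * string * string) * int * int>) =
    let header = "date,dr,Group,Product,Book,Revenue"
    File.WriteAllLines(path, Seq.append [ header ] (Seq.map toCsvLine rows))

// Example use with the `grouped` sequence from above (the path is made up):
// writePivot "pivot.csv" grouped
```

If the real data can contain commas or quotes, a proper CSV library would be a safer choice than `sprintf`.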
answered Sep 18 '22 05:09 by Robert Jeppesen