
How do I call Enumerable.Join from F#?

Tags: c#, linq, f#

I have two sequences (of tuples) on which I need to do a join:

  • Seq 1: [(City1 * Pin1), (City2 * Pin2), (City1 * Pin3), (City1 * Pin4)]
  • Seq 2: [(Pin1 * ProductA), (Pin2 * ProductB), (Pin1 * ProductC), (Pin2 * ProductA)]

into the sequence (of tuples):

  • [(City1 * ProductA), (City2 * ProductB), (City1 * ProductC), (City2 * ProductA)...]

In C# I could do this using the LINQ Join extension method, like this:

seq1.Join(seq2, t => t.Item2, t => t.Item1,
    (t, u) => Tuple.Create(t.Item1, u.Item2))

How do I accomplish this in F#? I cannot find a join function in the Seq module.

SharePoint Newbie asked Sep 23 '10 10:09


3 Answers

Edit: Actually, you can just use LINQ:

> open System.Linq;;
> let ans = seq1.Join(seq2, (fun t -> snd t), (fun t -> fst t), (fun t u -> (fst t, snd u)));;
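
For reference, here is a self-contained sketch of the same call as a script. The sample data is modelled on the question, and the names cities, products and joined are illustrative only, not taken from the original posts. It assumes the assembly that provides System.Linq is referenced.

```fsharp
open System.Linq

// Hypothetical sample data: (city, pin) pairs and (pin, product) pairs.
let cities = [ ("City1", "Pin1"); ("City2", "Pin2"); ("City1", "Pin3"); ("City1", "Pin4") ]
let products = [ ("Pin1", "ProductA"); ("Pin2", "ProductB"); ("Pin1", "ProductC"); ("Pin2", "ProductA") ]

// Enumerable.Join is called as an extension method. Each F# lambda is
// converted to the Func delegate that the method expects.
let joined =
    cities.Join(products, (fun c -> snd c), (fun p -> fst p), (fun c p -> (fst c, snd p)))
    |> List.ofSeq

// Pins without a matching product (Pin3 and Pin4 here) produce no rows.
printfn "%A" joined
```

Join keeps the order of the first sequence and, for each of its elements, the order of the matches in the second sequence.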

Why not use F#'s native Seq functions? The Seq module documentation lists functions you can use instead of LINQ. Take Seq.map2, for example:

> Seq.map2 (fun a b -> (fst a, snd b)) seq1 seq2;;

val it : seq<string * string> =
  seq [("city1", "product1"); ("city2", "product2")]

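This should give you what you want when seq1 and seq2 are your first and second sequences and their elements already line up by pin, because Seq.map2 pairs elements by position rather than by key.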

Callum Rogers answered Oct 19 '22 23:10


F# Interactive session:

> let seq1 = seq [("city1", "pin1"); ("city2", "pin2")];;

val seq1 : seq<string * string> = [("city1", "pin1"); ("city2", "pin2")]

> let seq2 = seq [("pin1", "product1"); ("pin2", "product2")];;

val seq2 : seq<string * string> = [("pin1", "product1"); ("pin2", "product2")]

> Seq.zip seq1 seq2;;
val it : seq<(string * string) * (string * string)> =
  seq
    [(("city1", "pin1"), ("pin1", "product1"));
     (("city2", "pin2"), ("pin2", "product2"))]
> Seq.zip seq1 seq2 |> Seq.map (fun (x,y) -> (fst x, snd y));;
val it : seq<string * string> =
  seq [("city1", "product1"); ("city2", "product2")]

You should also be able to use LINQ methods on sequences. Just make sure the project references the assembly that provides System.Linq and that you have opened the namespace with open System.Linq.

UPDATE: for a more complex scenario, where items must be matched by pin rather than by position, you can use a sequence expression as follows:

open System

let seq1 = seq [("city1", "pin1"); ("city2", "pin2"); ("city1", "pin3"); ("city1", "pin4")]
let seq2 = seq [("pin1", "product1"); ("pin2", "product2"); ("pin1", "product3"); ("pin2", "product1")]

let joinSeq =
    seq {
        for x in seq1 do
            for y in seq2 do
                let city, pin = x
                let pin1, product = y
                if pin = pin1 then
                    yield (city, product)
    }

for (x, y) in joinSeq do
    printfn "%s: %s" x y

Console.ReadKey() |> ignore

Artem Koshelev answered Oct 19 '22 23:10


I think it is not entirely clear what result you are expecting, so the answers are a bit confusing. Your example could be interpreted in two ways (either as zipping or as joining), and the two are dramatically different.

  • Zipping: If you have two lists of the same length and you want to align corresponding items (e.g. 1st item from the first list with 1st item from the second list; 2nd item from the first list with 2nd item from the second list, etc.), then look at the answers that use either Seq.zip or Seq.map2.

    However, this only works if both lists are sorted by pin and the pins are unique. In that case you don't need Join at all; even in C#/LINQ you could use the Zip extension method.

  • Joining: If the lists may have different lengths, or the pins may be unsorted or not unique, then you need to write a real join. A simplified version of the code by Artem K would look like this:

    seq { for city, pin1 in seq1 do 
            for pin2, product in seq2 do 
              if pin1 = pin2 then yield city, product }
    

    This may be less efficient than Join in LINQ, because it loops through all the items in seq2 for every item in seq1, so the complexity is O(seq1.Length * seq2.Length). Enumerable.Join, by contrast, builds a hash-based lookup over the second sequence, so it is usually more efficient. Instead of using the Join method directly, I would probably define a little helper:

    open System.Linq
    module Seq = 
      // Key selectors come first so that the helper can be used with ||>
      let join k1 k2 (seq1:seq<_>) (seq2:seq<_>) =
        seq1.Join(seq2, (fun t -> k1 t), (fun t -> k2 t), (fun t u -> t, u)) 
    

    Then you can write something like this:

    (seq1, seq2) 
       ||> Seq.join snd fst 
       |> Seq.map (fun (t, u) -> fst t, snd u)
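
The hashing idea mentioned above can also be sketched without LINQ. This is a minimal illustration under assumptions, not how Enumerable.Join is actually implemented: hashJoin, cities and products are hypothetical names, and the index over the second sequence is built eagerly when the function is called.

```fsharp
// Hypothetical helper: index the inner sequence by key once, then probe
// the index once for every element of the outer sequence.
let hashJoin outerKey innerKey (outer: seq<'a>) (inner: seq<'b>) =
    // dict builds a hash table from key to the group of inner items with that key.
    let index = inner |> Seq.groupBy innerKey |> dict
    seq {
        for o in outer do
            match index.TryGetValue(outerKey o) with
            | true, matches ->
                for i in matches do
                    yield (o, i)
            | false, _ -> ()
    }

// Hypothetical sample data modelled on the question.
let cities = [ ("City1", "Pin1"); ("City2", "Pin2"); ("City1", "Pin3"); ("City1", "Pin4") ]
let products = [ ("Pin1", "ProductA"); ("Pin2", "ProductB"); ("Pin1", "ProductC"); ("Pin2", "ProductA") ]

let result =
    hashJoin snd fst cities products
    |> Seq.map (fun ((city, _), (_, product)) -> city, product)
    |> List.ofSeq

printfn "%A" result
```

Each element of the first sequence costs one dictionary lookup instead of a full pass over the second sequence.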
    

Finally, if you know that there is exactly one unique city for every product (the sequences have the same length and pins are unique in both of them), then you could just sort both sequences by pins and then use zip - this may be more efficient than using join (especially if you could keep the sequence sorted from some earlier operations).
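
The sort-then-zip approach described above might look like the following sketch. It assumes every pin appears exactly once in each sequence. The data and the names cityPins, pinProducts and paired are made up for illustration.

```fsharp
// Hypothetical data: every pin occurs exactly once on each side.
let cityPins = [ ("City2", "Pin2"); ("City1", "Pin1"); ("City3", "Pin3") ]
let pinProducts = [ ("Pin3", "ProductC"); ("Pin1", "ProductA"); ("Pin2", "ProductB") ]

// Sort both sides by pin so that matching items share a position, then zip.
let paired =
    Seq.zip (cityPins |> Seq.sortBy snd) (pinProducts |> Seq.sortBy fst)
    |> Seq.map (fun ((city, _), (_, product)) -> city, product)
    |> List.ofSeq

printfn "%A" paired
```

If the uniqueness assumption does not hold, the positions no longer line up and the result is silently wrong, so a real join is the safer default.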

Tomas Petricek answered Oct 19 '22 23:10