
Match with empty sequence

I'm learning F# and I've started to play around with both sequences and match expressions.

I'm writing a web scraper that's looking through HTML similar to the following and taking the last URL in a parent <span> with the paging class.

<html>
<body>
    <span class="paging">
        <a href="http://google.com">Link to Google</a>
        <a href="http://TheLinkIWant.com">The Link I want</a>
    </span>
</body>
</html>

My attempt to get the last URL is as follows:

type AnHtmlPage = FSharp.Data.HtmlProvider<"http://somesite.com">

let findMaxPageNumber (page:AnHtmlPage)= 
    page.Html.Descendants()
    |> Seq.filter(fun n -> n.HasClass("paging"))
    |> Seq.collect(fun n -> n.Descendants() |> Seq.filter(fun m -> m.HasName("a")))
    |> Seq.last
    |> fun n -> n.AttributeValue("href")

However, I run into problems when the class I'm searching for is absent from the page. In particular, I get an ArgumentException with the message: The input sequence was empty.

My first thought was to build another function that matched empty sequences and returned an empty string when the paging class wasn't found on a page.

let findUrlOrReturnEmptyString (span:seq<HtmlNode>) =
    match span with 
    | Seq.empty -> String.Empty      // <----- This is invalid
    | span -> span
    |> Seq.collect(fun (n:HtmlNode) -> n.Descendants() |> Seq.filter(fun m -> m.HasName("a")))
    |> Seq.last
    |> fun n -> n.AttributeValue("href")

let findMaxPageNumber (page:AnHtmlPage)= 
    page.Html.Descendants()
    |> Seq.filter(fun n -> n.HasClass("paging"))
    |> findUrlOrReturnEmptyString

My issue now is that Seq.empty is not a literal and cannot be used in a pattern. Most pattern-matching examples use the empty list [] in their patterns, so I'm wondering: how can I take a similar approach and match empty sequences?

JoshVarty asked Aug 11 '16 22:08


2 Answers

Use a guard clause:

match myseq with
| s when Seq.isEmpty s -> "empty"
| _ -> "not empty"
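
To try this outside the scraper, here is a minimal self-contained sketch. The function name `describe` and the sample inputs are made up for illustration and are not part of the question's code.

```fsharp
// Hypothetical helper: Seq.empty is a value, not a pattern,
// so the emptiness test goes in a `when` guard instead.
let describe (xs: seq<'a>) =
    match xs with
    | s when Seq.isEmpty s -> "empty"
    | _ -> "not empty"

printfn "%s" (describe (Seq.empty<int>))   // prints "empty"
printfn "%s" (describe (seq { 1 .. 3 }))   // prints "not empty"
```

Note that Seq.isEmpty has to start enumerating the sequence to answer, so with a lazy or expensive sequence the first element may be computed once for the guard and again by whatever consumes the sequence afterwards.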

Ilya Kharlamov answered Oct 12 '22 20:10


You can use a when guard to further qualify the case:

match span with
| sequence when Seq.isEmpty sequence -> String.Empty
| span ->
    // Keep the pipeline inside this branch so both branches return a string.
    span
    |> Seq.collect (fun (n: HtmlNode) ->
        n.Descendants()
        |> Seq.filter (fun m -> m.HasName("a")))
    |> Seq.last
    |> fun n -> n.AttributeValue("href")

As ildjarn points out, though, an if...then...else may be the more readable alternative in this case.
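
A rough sketch of that if...then...else form follows, next to a variant based on Seq.tryLast. To keep it runnable without FSharp.Data, it assumes the href values have already been extracted into a plain seq<string>; the function names lastOrEmptyIf and lastOrEmptyTry are hypothetical. Option.defaultValue needs FSharp.Core 4.1 or later.

```fsharp
// Assumption: `hrefs` stands in for the href values pulled out of the HtmlNode sequence.

// Explicit emptiness check with if...then...else.
let lastOrEmptyIf (hrefs: seq<string>) =
    if Seq.isEmpty hrefs then "" else Seq.last hrefs

// Seq.tryLast returns None for an empty sequence, so no separate check is needed.
let lastOrEmptyTry (hrefs: seq<string>) =
    hrefs
    |> Seq.tryLast
    |> Option.defaultValue ""

printfn "%s" (lastOrEmptyIf ["http://google.com"; "http://TheLinkIWant.com"])
printfn "%s" (lastOrEmptyTry ["http://google.com"; "http://TheLinkIWant.com"])
```

The Seq.tryLast variant may be worth considering in the scraper itself: placed after Seq.collect, it would also cover a page where the paging span exists but contains no links, a case the emptiness check on the span sequence does not catch.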

TeaDrivenDev answered Oct 12 '22 19:10