Debugging Seq.sumBy

Tags:

f#

sequences

I was trying to learn me some F# by looking at last year's AdventOfCode solutions. I came across this neat piece of code, which I cannot parse at all:

i 1|>Seq.sumBy(" (".IndexOf)

Note, I believe I understand the prior line (in the link):

let i n=System.IO.File.ReadAllText(sprintf "%s/input/input%d.txt"__SOURCE_DIRECTORY__ n)

Which creates a function i that takes an integer n and reads the file inputN.txt and returns it as a string. Therefore i 1 returns input1.txt as a string.

Then |> is just piping the string (or array of chars?) as the first param to the next function, which is Seq.sumBy

But then things start breaking down...

sumBy seems straightforward enough:

Returns the sum of the results generated by applying the function to each element of the list.

But the IndexOf of a string " (" has me baffled.

Now, I don't really want any fishes here, what I would like to know is this. As a newbie to this foreign language, as I learn to work more bits of F#, how can I take this piece of code and decompose it into smaller pieces to test it to figure out what is going on? It is driving me nuts that I have the solution, have google/so, and I still can't understand this code.

Can someone show me some smaller snippets so I can discover the answer myself?

Joshua Ball asked Dec 09 '16 19:12


2 Answers

So, we can break this into pieces.

i 1|>Seq.sumBy(" (".IndexOf)

You are correct about the i 1 section. This will read input1.txt, and give you the entire text as a string.

So, the first key here is that String implements IEnumerable<char> (char seq), which means that it's something that can be enumerated.
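One way to check this in FSI is to enumerate a short string directly. This is only a sketch: the sample string "(()" is made up for illustration and is not the real puzzle input.

```fsharp
// A string can be handed to any Seq function because it implements seq<char>.
let sample = "(()"   // hypothetical sample, not the real puzzle input

// Seq.toList walks the string one char at a time and collects the chars.
let chars = sample |> Seq.toList

// Seq.length counts the same elements, so it matches the string length.
let count = sample |> Seq.length

printfn "%A" chars
printfn "%d" count
```

Swapping Seq.toList for any other Seq function (Seq.map, Seq.filter, Seq.sumBy) works the same way: each element the function sees is a single char.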

Next, let's look at the portion inside of the parens:

" (".IndexOf

The first part is just a string: " (", and IndexOf is a method on string. It returns the zero-based index of a specific character, or -1 if the character does not occur in the string.

As it's a method, you can use it as a function - so " (".IndexOf can be thought of like:

(fun someChar -> 
              let str = " ("
              str.IndexOf(someChar))

--------- Stop here unless you want the full answer explained in detail --------

 

 

 

 

 

 

 

 

 

 

In this case, if the input character is ' ', it will return 0, if it's '(', it'll return 1, and if it's anything else, it will return -1.

Seq.sumBy takes each character of the input string, passes it to this function, and sums the results. This means that each input '(' will add 1, each input ' ' will add 0, and anything else will add -1 (which, in this case, means the ')' characters). A string like "()" will add 1, then add -1, resulting in 0, which matches the goal of the day 1 advent challenge.
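To try this without the input file, the same expression can be wrapped in a small function and fed short strings. This is a sketch under assumptions: the name sumOfIndexes and the sample strings are hypothetical, and the lambda spells out the char type so the IndexOf(char) overload is chosen explicitly.

```fsharp
// Hypothetical helper; the original one-liner has no named function.
// It sums, over every char of the input, that char's index in " (".
let sumOfIndexes (input: string) : int =
    input |> Seq.sumBy (fun (c: char) -> " (".IndexOf c)

printfn "%d" (sumOfIndexes "()")    // 1 + (-1) = 0
printfn "%d" (sumOfIndexes "(()")   // 1 + 1 + (-1) = 1
printfn "%d" (sumOfIndexes "( )")   // 1 + 0 + (-1) = 0
```

Replacing Seq.sumBy with Seq.map (and adding Seq.toList) shows the individual values before they are added up, which makes it easier to check one character at a time.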

Reed Copsey answered Oct 07 '22 20:10



FSI is your friend. I often use it to understand how functions can be broken down. If you paste the expression " (".IndexOf into FSI, at first glance it doesn't look like it helps:

> " (".IndexOf;;

  " (".IndexOf;;
  ^^^^^^^^^^^^

stdin(12,1): error FS0041: A unique overload for method 'IndexOf' could not be determined based on type information prior to this program point. A type annotation may be needed. Candidates: System.String.IndexOf(value: char) : int, System.String.IndexOf(value: string) : int

As you've already figured out, " (" is a string, and IndexOf is a method on string. In fact, there are quite a few overloads of that method, but only two with arity 1.

One of these overloads takes a char as input, and the other takes a string as input.

The expression " (".IndexOf is a function. It's the short form of fun x -> " (".IndexOf x.

You've also already identified that string implements char seq, so when you use the Seq module over it, you're looking at each element of the sequence. In this case, each element is a char, so the overload in use here must be the one that takes a char as input.
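One way to get past the FS0041 error when experimenting is to give the compiler the missing type information yourself. The sketch below assumes nothing beyond String.IndexOf; the names indexOfChar and indexOfString are hypothetical and only serve to tell the two overloads apart.

```fsharp
// Annotating the argument as char selects the IndexOf(char) overload.
let indexOfChar (c: char) : int = " (".IndexOf c

// Annotating it as string selects the IndexOf(string) overload instead.
let indexOfString (s: string) : int = " (".IndexOf s

printfn "%d" (indexOfChar '(')     // '(' is the second char of " ("
printfn "%d" (indexOfString " (")  // the whole string is found at its start
```

In the original one-liner no annotation is needed, because Seq.sumBy over a string already fixes the element type to char.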

Now that you've figured out which overload is in use, you can start to experiment with it in FSI:

> " (".IndexOf '(';;
val it : int = 1
> " (".IndexOf 'f';;
val it : int = -1
> " (".IndexOf 'o';;
val it : int = -1
> " (".IndexOf ' ';;
val it : int = 0

Apparently, the function looks for the index of each input char in " (", so every time you pass in '(' you get 1 (because it's zero-indexed), and when the input is ' ', the return value is 0. For all other values, the return value is -1.

An input string like "(foo bar)" is also a char seq. Instead of doing a sumBy, you can try to pipe it into Seq.map in order to understand how each of the elements is being translated:

> "(foo bar)" |> Seq.map (" (".IndexOf) |> Seq.toList;;
val it : int list = [1; -1; -1; -1; 0; -1; -1; -1; -1]

Now, Seq.map only translates, but Seq.sumBy takes all those numbers and adds them together:

> "(foo bar)" |> Seq.sumBy (" (".IndexOf);;
val it : int = -6

I still can't guess what the purpose is, but then, I've never seen the input string...

Mark Seemann answered Oct 07 '22 22:10
