I really don't seem to be understanding map and flatMap. What I am failing to understand is how a for-comprehension is a sequence of nested calls to map and flatMap. The following example is from Functional Programming in Scala:
    def bothMatch(pat: String, pat2: String, s: String): Option[Boolean] =
      for {
        f <- mkMatcher(pat)
        g <- mkMatcher(pat2)
      } yield f(s) && g(s)
translates to
    def bothMatch(pat: String, pat2: String, s: String): Option[Boolean] =
      mkMatcher(pat) flatMap (f =>
        mkMatcher(pat2) map (g => f(s) && g(s)))
The mkMatcher method is defined as follows:
    def mkMatcher(pat: String): Option[String => Boolean] =
      pattern(pat) map (p => (s: String) => p.matcher(s).matches)
And the pattern method is as follows:
    import java.util.regex._

    def pattern(s: String): Option[Pattern] =
      try {
        Some(Pattern.compile(s))
      } catch {
        case e: PatternSyntaxException => None
      }
It would be great if someone could shed some light on the rationale behind using map and flatMap here.
In Scala, the flatMap() method works like the map() method, except that the function you pass returns a collection for each item and the resulting inner collections are flattened into a single sequence. It can be described as a blend of the map method and the flatten method.
Scala offers a lightweight notation for expressing sequence comprehensions. Comprehensions have the form for (enumerators) yield e, where enumerators refers to a semicolon-separated list of enumerators. An enumerator is either a generator, which introduces new variables, or a filter.
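As a small sketch of the two enumerator kinds mentioned above (a generator and a filter), here is one comprehension next to the method calls it corresponds to. The object and value names are made up for illustration.

```scala
object FilterEnumerator {
  val xs: List[Int] = List(1, 2, 3, 4)

  // A generator (x <- xs) followed by a filter (if ...).
  val viaFor: List[Int] = for (x <- xs if x % 2 == 0) yield x * 10

  // The filter corresponds to a withFilter call ahead of the final map.
  val viaCalls: List[Int] = xs.withFilter(x => x % 2 == 0).map(x => x * 10)
}
```

Both values should hold the same list, since the second is just the first written out by hand.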
map() uses one-to-one mapping: its mapper function produces a single value for each input value. flatMap() uses one-to-many mapping: its mapper function produces multiple values (a stream of values) for each input value. Use the flatMap() method when the mapper function produces multiple values for each input value.
In Scala, the flatMap method is used on the collections and data structures of Scala. As the name suggests, it is the combination of two methods, map and flatten. If we use flatMap on any collection, it applies both map and flatten to the given collection.
TL;DR go directly to the final example
I'll try and recap.
Definitions
The for comprehension is a syntax shortcut to combine flatMap and map in a way that's easy to read and reason about.

Let's simplify things a bit and assume that every class that provides both aforementioned methods can be called a monad, and we'll use the symbol M[A] to mean a monad with an inner type A.
Examples
Some commonly seen monads include:
List[String], where M[X] = List[X] and A = String
Option[Int], where M[X] = Option[X] and A = Int
Future[String => Boolean], where M[X] = Future[X] and A = (String => Boolean)
map and flatMap
Defined in a generic monad M[A]
    /* Applies a transformation to the monad "content" while maintaining the
     * monad "external shape",
     * i.e. a List remains a List and an Option remains an Option,
     * but the inner type changes.
     */
    def map[B](f: A => B): M[B]

    /* Applies a transformation to the monad "content" by composing
     * this monad with an operation resulting in another monad instance
     * of the same type.
     */
    def flatMap[B](f: A => M[B]): M[B]
e.g.
    val list = List("neo", "smith", "trinity")

    // converts each character of the string to its corresponding code
    val f: String => List[Int] = s => s.map(_.toInt).toList

    list map f
    >> List(List(110, 101, 111), List(115, 109, 105, 116, 104), List(116, 114, 105, 110, 105, 116, 121))

    list flatMap f
    >> List(110, 101, 111, 115, 109, 105, 116, 104, 116, 114, 105, 110, 105, 116, 121)
for expression
Each line in the expression using the <- symbol is translated to a flatMap call, except for the last line, which is translated to a concluding map call, where the "bound symbol" on the left-hand side is passed as the parameter to the argument function (what we previously called f: A => M[B]):
    // The following ...
    for {
      bound <- list
      out <- f(bound)
    } yield out

    // ... is translated by the Scala compiler as ...
    list.flatMap { bound =>
      f(bound).map { out =>
        out
      }
    }

    // ... which can be simplified as ...
    list.flatMap { bound =>
      f(bound)
    }

    // ... which is just another way of writing:
    list flatMap f
A for-expression with only one <- is converted to a map call with the expression passed as argument:
    // The following ...
    for {
      bound <- list
    } yield f(bound)

    // ... is translated by the Scala compiler as ...
    list.map { bound =>
      f(bound)
    }

    // ... which is just another way of writing:
    list map f
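If you want to convince yourself that these forms are equivalent, a minimal sketch like the following should do. It reuses the list and f from the earlier example; the wrapping object and the value names are only there for illustration.

```scala
object ForDesugaring {
  val list: List[String] = List("neo", "smith", "trinity")

  // Same function as in the earlier example: each character becomes its code.
  val f: String => List[Int] = s => s.map(_.toInt).toList

  // Two generators: flatMap for the first, map for the last.
  val viaFor: List[Int] = for {
    bound <- list
    out   <- f(bound)
  } yield out

  // The same thing written out by hand.
  val viaCalls: List[Int] = list.flatMap { bound => f(bound).map { out => out } }

  // One generator: a plain map, so the inner lists stay nested.
  val nestedViaFor: List[List[Int]] = for { bound <- list } yield f(bound)
  val nestedViaMap: List[List[Int]] = list.map(f)
}
```

Comparing viaFor with viaCalls (and with list flatMap f), and nestedViaFor with nestedViaMap, should give true in each case.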
Now to the point
As you can see, the map operation preserves the "shape" of the original monad, so the same happens for the yield expression: a List remains a List with the content transformed by the operation in the yield.

On the other hand, each binding line in the for is just a composition of successive monads, which must be "flattened" to maintain a single "external shape".
Suppose for a moment that each internal binding were translated to a map call while the right-hand side remained the same A => M[B] function: you would end up with an extra layer of nesting, an M[M[B]], for each line in the comprehension.
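To make the nesting concrete, here is a rough sketch with Option. The parse helper is hypothetical and exists only for this illustration.

```scala
object NestedShape {
  // Hypothetical helper: parses a string to an Int if possible.
  def parse(s: String): Option[Int] =
    try Some(s.trim.toInt)
    catch { case _: NumberFormatException => None }

  // Using map for the outer binding as well nests the containers.
  val nested: Option[Option[Int]] =
    parse("20").map(a => parse("22").map(b => a + b))

  // Using flatMap for the outer binding keeps a single Option.
  val flat: Option[Int] =
    parse("20").flatMap(a => parse("22").map(b => a + b))
}
```

The first value has type Option[Option[Int]], the second Option[Int], which is the shape a for comprehension over Option gives you.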
The intent of the whole for syntax is to easily "flatten" the concatenation of successive monadic operations (i.e. operations that "lift" a value in a "monadic shape": A => M[B]), with the addition of a final map operation that possibly performs a concluding transformation.

I hope this explains the logic behind the choice of translation, which is applied in a mechanical way, that is: n flatMap nested calls concluded by a single map call.
A contrived illustrative example
Meant to show the expressiveness of the for syntax:
    case class Customer(value: Int)
    case class Consultant(portfolio: List[Customer])
    case class Branch(consultants: List[Consultant])
    case class Company(branches: List[Branch])

    def getCompanyValue(company: Company): Int = {

      val valuesList = for {
        branch     <- company.branches
        consultant <- branch.consultants
        customer   <- consultant.portfolio
      } yield (customer.value)

      valuesList reduce (_ + _)
    }
Can you guess the type of valuesList?

As already said, the shape of the monad is maintained through the comprehension, so we start with a List in company.branches, and must end with a List.

The inner type instead changes and is determined by the yield expression, which is customer.value: Int.

valuesList should be a List[Int].
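For reference, this is roughly what the three-generator comprehension looks like once written out as nested calls. The case classes are the ones from the example; the method names and the sample data are made up for illustration.

```scala
object CompanyDesugared {
  case class Customer(value: Int)
  case class Consultant(portfolio: List[Customer])
  case class Branch(consultants: List[Consultant])
  case class Company(branches: List[Branch])

  def valuesViaFor(company: Company): List[Int] =
    for {
      branch     <- company.branches
      consultant <- branch.consultants
      customer   <- consultant.portfolio
    } yield customer.value

  // Every generator but the last becomes a flatMap; the last becomes a map.
  def valuesViaCalls(company: Company): List[Int] =
    company.branches.flatMap { branch =>
      branch.consultants.flatMap { consultant =>
        consultant.portfolio.map { customer => customer.value }
      }
    }

  // Made-up sample data, only for illustration.
  val sample: Company = Company(List(
    Branch(List(Consultant(List(Customer(10), Customer(20))), Consultant(Nil))),
    Branch(List(Consultant(List(Customer(5)))))
  ))
}
```

Both methods return a List[Int] for the sample. As a side note, reduce throws on an empty list, so if a company may have no customers, summing with .sum (which gives 0 for an empty list) is the safer choice.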
I'm not a Scala mega mind so feel free to correct me, but this is how I explain the flatMap/map/for-comprehension saga to myself!
To understand for comprehensions and their translation to Scala's map / flatMap, we must take small steps and understand the composing parts: map and flatMap. "But isn't Scala's flatMap just map with flatten?" you ask yourself. If so, why do so many developers find it so hard to get a grasp of it, or of for-comprehension / flatMap / map? Well, if you just look at the signatures of Scala's map and flatMap, you see they return the same type M[B] and the functions they take work on the same input argument A (at least the first part of the function they take). If that's so, what makes the difference?
Our plan
map
flatMap
for comprehension

Scala's map
Scala's map signature:
map[B](f: (A) => B): M[B]
But a big part is missing when we look at this signature: where does this A come from? Our container holds items of type A, so it is important to look at this function in the context of the container, M[A]. Our container could be a List of items of type A, and our map function takes a function which transforms each item of type A to type B; it then returns a container of items of type B (that is, M[B]).
Let's write map's signature taking into account the container:
    M[A]: // We are in the M[A] context.
    map[B](f: (A) => B): M[B] // map takes a function which knows how to transform A to B, then bundles the results in M[B]
Note an extremely important fact about map: it automatically bundles the results in the output container M[B], and you have no control over that. Let us stress it again:

map chooses the output container for us, and it is going to be the same container as the source we work on, so for an M[A] container we get the same M container, only for B: M[B] and nothing else!

map does this containerization for us. We just give a mapping from A to B and it puts the result in the box of M[B] for us!

You see, you did not specify how to containerize the item; you only specified how to transform the internal items. And as we have the same container M for both M[A] and M[B], if you have a List[A] then you are going to get a List[B], and more importantly, map is doing it for you!
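A tiny sketch of that point, with names made up for illustration: the same A => B function is handed to two different containers, and each one gives back its own kind of box.

```scala
object MapKeepsShape {
  // A plain A => B function: it says nothing about containers.
  val length: String => Int = _.length

  val fromList: List[Int]   = List("neo", "smith").map(length)
  val fromSome: Option[Int] = Option("trinity").map(length)
  val fromNone: Option[Int] = Option.empty[String].map(length)
}
```

The List stays a List and the Option stays an Option; only the inner type goes from String to Int.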
Now that we have dealt with map, let's move on to flatMap.
Scala's flatMap
Let's see its signature:
flatMap[B](f: (A) => M[B]): M[B] // we need to show it how to containerize the A into M[B]
You see the big difference between map and flatMap: in flatMap we provide a function that does not just convert from A to B but also containerizes the result into M[B].
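Here is a minimal sketch of that difference using Option. The half function is hypothetical; what matters is that it returns a container itself, so it gets to decide between Some and None.

```scala
object FlatMapContainerizes {
  // Hypothetical helper: the function itself decides the container (Some or None).
  def half(n: Int): Option[Int] = if (n % 2 == 0) Some(n / 2) else None

  // map wraps whatever the function returns, so we get a box inside a box.
  val mapped: Option[Option[Int]] = Some(8).map(half)

  // flatMap trusts the function's own packaging and keeps a single box.
  val flatMapped: Option[Int] = Some(8).flatMap(half)
  val rejected: Option[Int]   = Some(7).flatMap(half)

  // Because the shape stays Option[Int], calls can be chained.
  val chained: Option[Int] = Some(8).flatMap(half).flatMap(half)
}
```

With map the function's Option ends up wrapped in another Option, while flatMap leaves the result as a plain Option[Int].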
Why do we care who does the containerization?
So why do we care so much whether the function passed to map/flatMap does the containerization into M[B], or map itself does the containerization for us?
You see, in the context of a for comprehension, several transformations are applied in sequence to the item provided in the for, so each step gives the next worker in our assembly line the ability to determine the packaging. Imagine an assembly line where each worker does something to the product and only the last worker packages it in a container. Welcome to flatMap; this is its purpose. If every step used map instead, each worker would also package the item when finished, so you would get containers inside containers.
The mighty for comprehension
Now let's look into your for comprehension, taking into account what we said above:
    def bothMatch(pat: String, pat2: String, s: String): Option[Boolean] =
      for {
        f <- mkMatcher(pat)
        g <- mkMatcher(pat2)
      } yield f(s) && g(s)
What have we got here:

mkMatcher returns a container (an Option).
The container holds a function: String => Boolean.
Each <- is translated to flatMap, except for the last one, which is translated to map.

f <- mkMatcher(pat) is first in the sequence (think assembly line). All we want from it is to take f out and pass it to the next worker in the assembly line. We leave it to the next worker (the next function) to determine the packaging of our item, which is why this step is a flatMap and only the last step is a map.

The last line, g <- mkMatcher(pat2), uses map because it is last in the assembly line. It can just do the final operation with map(g => ...), which pulls out g and uses the f that has already been pulled out of its container by the flatMap. Therefore we end up with:

    // First worker: pull f out and hand it to the next worker (which has access to f).
    // Do not package the result here; let the next worker determine the container.
    mkMatcher(pat) flatMap (f =>
      // Last worker: pull g out of its container, compute, and let map do the packaging.
      mkMatcher(pat2) map (g => f(s) && g(s)))

This packaging propagates all the way up and becomes our final container, an Option[Boolean]. Yay!
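If you want to check this yourself, here is a self-contained sketch that puts the pieces from the question together and keeps both spellings side by side. The names bothMatchFor and bothMatchCalls, and the wrapping object, are mine; pattern and mkMatcher follow the definitions in the question.

```scala
import java.util.regex.{Pattern, PatternSyntaxException}

object BothMatchDemo {
  // Compiles a regex, giving None when the pattern is not valid.
  def pattern(s: String): Option[Pattern] =
    try Some(Pattern.compile(s))
    catch { case _: PatternSyntaxException => None }

  // Wraps a compiled pattern as a String => Boolean function.
  def mkMatcher(pat: String): Option[String => Boolean] =
    pattern(pat).map(p => (s: String) => p.matcher(s).matches())

  // The for comprehension from the question.
  def bothMatchFor(pat: String, pat2: String, s: String): Option[Boolean] =
    for {
      f <- mkMatcher(pat)
      g <- mkMatcher(pat2)
    } yield f(s) && g(s)

  // The same logic written as explicit flatMap / map calls.
  def bothMatchCalls(pat: String, pat2: String, s: String): Option[Boolean] =
    mkMatcher(pat).flatMap(f => mkMatcher(pat2).map(g => f(s) && g(s)))
}
```

Both versions should agree on any input: Some(true) or Some(false) when both patterns compile, and None as soon as either pattern is invalid (for example "[").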