
Cats effect - parallel composition of independent effects

I want to combine multiple IO values that should run independently in parallel.

val io1: IO[Int] = ???
val io2: IO[Int] = ???

As I see it, I have two options:

  1. Use cats-effect's fibers with a fork-join pattern
    val parallelSum1: IO[Int] = for {
      fiber1 <- io1.start
      fiber2 <- io2.start
      i1 <- fiber1.join
      i2 <- fiber2.join
    } yield i1 + i2
    
  2. Use the Parallel instance for IO with parMapN (or one of its siblings, such as parTraverse, parSequence, or parTupled)
    val parallelSum2: IO[Int] = (io1, io2).parMapN(_ + _)
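
For reference, here is a minimal runnable sketch of both options. It assumes cats-effect 1.x or 2.x, where IO needs an implicit ContextShift both to start fibers and to obtain its Parallel instance. The object name ParallelSumSketch and the bodies of io1 and io2 are placeholders for illustration, not part of the question.

```scala
import cats.effect.{ContextShift, IO}
import cats.implicits._
import scala.concurrent.ExecutionContext

object ParallelSumSketch {
  // Assumption: cats-effect 1.x/2.x, where IO.start and IO's Parallel
  // instance both require a ContextShift[IO] in implicit scope.
  implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)

  // Placeholder effects standing in for the question's `???`.
  val io1: IO[Int] = IO(1)
  val io2: IO[Int] = IO(2)

  // Option 1: fork both effects as fibers, then join them.
  val parallelSum1: IO[Int] = for {
    fiber1 <- io1.start
    fiber2 <- io2.start
    i1 <- fiber1.join
    i2 <- fiber2.join
  } yield i1 + i2

  // Option 2: let the Parallel instance for IO combine the effects.
  val parallelSum2: IO[Int] = (io1, io2).parMapN(_ + _)

  def main(args: Array[String]): Unit = {
    println(parallelSum1.unsafeRunSync()) // prints 3
    println(parallelSum2.unsafeRunSync()) // prints 3
  }
}
```

One difference worth checking against the documentation for your version: as far as I know, parMapN cancels the remaining effect when one of them fails, whereas in the fork-join version a failure raised by fiber1.join leaves fiber2 running unless it is cancelled explicitly.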
    

I'm not sure about the pros and cons of each approach, or when I should choose one over the other. This becomes even trickier when abstracting over the effect type (tagless-final style):

def io1[F[_]]: F[Int] = ???
def io2[F[_]]: F[Int] = ???

def parallelSum1[F[_]: Concurrent]: F[Int] = for {
  fiber1 <- io1[F].start
  fiber2 <- io2[F].start
  i1 <- fiber1.join
  i2 <- fiber2.join
} yield i1 + i2

def parallelSum2[F[_], G[_]](implicit parallel: Parallel[F, G]): F[Int] =
  (io1[F], io2[F]).parMapN(_ + _)

The Parallel typeclass requires two type constructors, which makes it more cumbersome to use: it cannot be expressed as a context bound, and it introduces an additional, vague type parameter G[_].
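
To make the tagless-final versions concrete, here is a hedged sketch that fills in the missing pieces. It assumes Cats 2.0.x and cats-effect 2.0.x (or earlier), where Parallel still takes two type parameters. The Sync constraint on io1 and io2, their bodies, and the object name TaglessParallelSum are additions for illustration only; the question leaves them as `???`.

```scala
import cats.Parallel
import cats.effect.{Concurrent, ContextShift, IO, Sync}
import cats.effect.implicits._ // provides .start for any F with Concurrent
import cats.implicits._        // provides flatMap/map and parMapN syntax
import scala.concurrent.ExecutionContext

object TaglessParallelSum {
  // Assumption: a Sync constraint so the placeholder effects can be built.
  def io1[F[_]: Sync]: F[Int] = Sync[F].delay(1)
  def io2[F[_]: Sync]: F[Int] = Sync[F].delay(2)

  // Fork-join: only needs Concurrent[F], so a context bound is enough.
  def parallelSum1[F[_]: Concurrent]: F[Int] = for {
    fiber1 <- io1[F].start
    fiber2 <- io2[F].start
    i1 <- fiber1.join
    i2 <- fiber2.join
  } yield i1 + i2

  // Parallel: needs the second type constructor G and an explicit implicit.
  def parallelSum2[F[_], G[_]](implicit F: Sync[F], P: Parallel[F, G]): F[Int] =
    (io1[F], io2[F]).parMapN(_ + _)

  // Required by IO's Concurrent and Parallel instances in cats-effect 1.x/2.x.
  implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)

  def main(args: Array[String]): Unit = {
    println(parallelSum1[IO].unsafeRunSync())         // prints 3
    println(parallelSum2[IO, IO.Par].unsafeRunSync()) // prints 3
  }
}
```

As far as I know, later Cats releases turned the second type constructor into a type member, so that a plain `F[_]: Parallel` context bound works there; check the release notes for the version you are on before relying on that.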

Your guidance is appreciated :)

Amitay

asked Jan 13 '19 13:01 by amitayh



1 Answer

I want to combine multiple IO values that should run independently in parallel.

The way I view it, to figure out "when do I use which?", we need to return to the old parallel vs. concurrent discussion, which basically boils down to (quoting the accepted answer):

Concurrency is when two or more tasks can start, run, and complete in overlapping time periods. It doesn't necessarily mean they'll ever both be running at the same instant. For example, multitasking on a single-core machine.

Parallelism is when tasks literally run at the same time, e.g., on a multicore processor.

A common example of concurrency is IO-like operations, such as making a call over the wire or talking to disk.

The question is: which one do you want when you say you want to execute "in parallel", the former or the latter?

If we're referring to the former, then using Concurrent[F] both conveys the intention by the signature and provides the proper execution semantics. If it's the latter, and we, for example, want to process a collection of elements in parallel, then going with Parallel[F, G] would be the better solution.

It is often confusing to think about these semantics with regard to IO, because it has instances for both Parallel and Concurrent, and we mostly use it to opaquely define side-effecting operations.

As a side note, Parallel takes two unary type constructors because M (in Parallel[M[_], F[_]]) is always a Monad, and we need a way to prove that the Monad also has a corresponding Applicative[F] for parallel execution, since a Monad always implies sequential execution semantics.
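
The side note above can be sketched by using the Parallel instance by hand instead of through parMapN. This assumes Cats 2.0.x or earlier together with cats-effect 1.x/2.0.x (the two-parameter Parallel[M, F]); parSum and ParallelBySteps are hypothetical names introduced only for this illustration.

```scala
import cats.Parallel
import cats.effect.{ContextShift, IO}
import scala.concurrent.ExecutionContext

object ParallelBySteps {
  // Hypothetical helper: roughly what a parallel combination amounts to.
  def parSum[M[_], F[_]](ma: M[Int], mb: M[Int])(implicit P: Parallel[M, F]): M[Int] = {
    // 1. Move from the sequential Monad M to its parallel counterpart F.
    val fa: F[Int] = P.parallel(ma)
    val fb: F[Int] = P.parallel(mb)
    // 2. Combine independently with Applicative[F]; no flatMap is involved.
    val summed: F[Int] = P.applicative.map2(fa, fb)(_ + _)
    // 3. Move the combined result back into M.
    P.sequential(summed)
  }

  // Required by IO's Parallel instance in cats-effect 1.x/2.x.
  implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)

  def main(args: Array[String]): Unit =
    println(parSum[IO, IO.Par](IO(1), IO(2)).unsafeRunSync()) // prints 3
}
```

For IO, the parallel counterpart is IO.Par, which has an Applicative but deliberately no Monad, so combining values in it cannot express sequential dependencies.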

answered Oct 27 '22 06:10 by Yuval Itzchakov