Paul Chiusano and Rúnar Bjarnason have written a fantastic book, *Functional Programming in Scala*. In it they mention a concept that gets little attention in the Scala community: transducers.
In the Clojure community, transducers get rather more press.
My question is: **What are the similarities and differences between Scala transducers (from the book *Functional Programming in Scala*) and Clojure transducers?**
Assumptions:
I'm aware that:
- the term transducer is common parlance borrowed from electrical engineering
- there is a pre-existing concept in computer science called a finite state transducer
- there is precedent in biology and psychology for adopting the word transduction
- other technical books, such as SICP, have already adopted the word transducers
The stream transducers from the book *Functional Programming in Scala* (FPiS) and Clojure's transducers are quite similar. Both generalise the idea of a "machine" (a step function) that processes an input stream into an output stream. FPiS's transducers are called Processes. Rich Hickey also uses the term process in his introductory talk on transducers in Clojure.
The design of FPiS's transducers is based on Mealy machines. A Mealy machine has:
a transition function T : (S, I) -> S
an output function G : (S, I) -> O
These two functions can be fused together to form a single step function:
step : (S, I) -> (S, O)
Here it's easy to see that the step function takes the current state of the machine and the next input item, and produces the next state of the machine together with an output item.
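For a concrete feel, here is a hypothetical sketch (not from the book) of such a fused step function, a running sum, driven by an ordinary fold:

object MealyStepExample {
  // Fused step function (S, I) => (S, O): the state is the running total,
  // and the output at each step is that same total.
  def step(state: Int, input: Int): (Int, Int) = {
    val next = state + input
    (next, next)
  }

  def main(args: Array[String]): Unit = {
    // Thread the state through the inputs by hand, collecting the outputs.
    val (_, outputs) = List(1, 2, 3, 4).foldLeft((0, Vector.empty[Int])) {
      case ((s, acc), i) =>
        val (s2, o) = step(s, i)
        (s2, acc :+ o)
    }
    println(outputs) // Vector(1, 3, 6, 10)
  }
}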
One of the combinators from FPiS uses such a step function:
object Process {
  ...
  def loop[S, I, O](z: S)(f: (I, S) => (O, S)): Process[I, O]
  ...
}
This loop function is essentially the seeded left reduction that Hickey talks about in his slides.
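For example, a running sum can be expressed with loop, much as the book itself does (a sketch assuming the book's loop combinator is in scope):

// A running sum as a seeded left reduction via loop: the seed 0.0 is the
// initial state; each step emits the new total and carries it forward.
def sum: Process[Double, Double] =
  loop(0.0)((d: Double, acc: Double) => (d + acc, d + acc))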
Both can be used in many different contexts (such as lists, streams, channels, etc.).
In FPiS, a transducer is represented by the type:
trait Process[I, O]
All it knows about are its input elements and its output elements.
In Clojure, it's a similar story. Hickey calls this "fully decoupled".
Both types of transducers can be composed.
FPiS uses a "pipe" operator:
map(labelHeavy) |> filter(_.nonFood)
Clojure uses comp:
(comp
(filtering non-food?)
(mapping label-heavy))
In Clojure:
reducer: (whatever, input) -> whatever
transducer: reducer -> reducer
In FPiS:
// The main type is
trait Process[I, O]
// Many combinators have the type
Process[I, O] ⇒ Process[I, O]
However, FPiS's representation isn't just a function under the hood. It's an algebraic data type: a Process trait with three variants, Await, Emit, and Halt.
case class Await[I, O](recv: Option[I] => Process[I, O]) extends Process[I, O]
case class Emit[I, O](head: O, tail: Process[I, O]) extends Process[I, O]
case class Halt[I, O]() extends Process[I, O]
Both support early termination. Clojure does it using a special value called reduced, which can be tested for via the reduced? predicate.
FPiS uses a more statically typed approach: a Process can be in one of three states, Await, Emit, or Halt. When a step returns a Process in the Halt state, the driver knows to stop.
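A minimal driver makes the Halt case concrete (a sketch, not the book's actual runner, written against the three variants above):

// Run a Process against a List, collecting outputs. Halt stops processing,
// Emit produces a value, and Await consumes input (receiving None once the
// input is exhausted).
def drive[I, O](p: Process[I, O], in: List[I]): List[O] = p match {
  case Halt()           => Nil
  case Emit(head, tail) => head :: drive(tail, in)
  case Await(recv) => in match {
    case h :: t => drive(recv(Some(h)), t)
    case Nil    => drive(recv(None), Nil)
  }
}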
Here they are again similar in some respects. Both kinds of transducers are demand-driven and don't generate intermediate collections. However, I'd imagine that FPiS's transducers are not as efficient when pipelined/composed, as the internal representation is more than "just a stack of function calls", as Hickey puts it. I'm only guessing about the efficiency/performance here, though.
Look into fs2 (previously scalaz-stream) for a perhaps more performant library based on the design of the transducers in FPiS.
Here's an example of filter in both implementations.
Clojure, from Hickey's talk slides:
(defn filter
  ([pred]
   (fn [rf]
     (fn
       ([] (rf))
       ([result] (rf result))
       ([result input]
        (if (pred input)
          (rf result input)
          result)))))
  ([pred coll]
   (sequence (filter pred) coll)))
In FPiS, here's one way to implement it:
def filter[I](f: I ⇒ Boolean): Process[I, I] =
await(i ⇒ if (f(i)) emit(i, filter(f))
else filter(f))
As you can see, filter is built up here from other combinators such as await and emit.
There are a number of places where you have to be careful when implementing Clojure transducers. This seems to be a design trade-off favouring efficiency. However, this downside would seem to affect mostly library producers rather than end users/consumers. For example, if a transducer receives a reduced value from a nested step call, it must never call that step function again with input.
The transducer design from FPiS favours correctness and ease of use. The pipe composition and flatMap operations ensure that completion actions occur promptly and that errors are handled appropriately. These concerns are not a burden on implementors of transducers. That said, I imagine that the library may not be as efficient as the Clojure one.
Both Clojure and FPiS transducers:
- are decoupled from their input and output contexts
- compose
- support early termination
- are demand-driven and avoid intermediate collections
They differ somewhat in their underlying representation. Clojure-style transducers seem to favour efficiency, whereas FPiS transducers favour correctness and compositionality.
I am not particularly familiar with Scala's concept of transducers or how ubiquitous that terminology is, but from the snippet of text you've posted above (and my knowledge of transducers), here's what I can say:
It would seem from the definition above that any function or callable with a signature roughly along the lines of
Stream[A] -> Stream[B]
would count as a Scala transducer. So, for instance, a mapping function that worked on streams would count as a transducer in this case.
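For instance (a sketch using Scala's pre-2.13 lazy Stream type; the names here are mine):

// A mapping function over lazy streams has the shape Stream[A] => Stream[B],
// so it would be a "transducer" in this loose sense.
def mapStream[A, B](f: A => B): Stream[A] => Stream[B] =
  s => s.map(f)

val doubled = mapStream[Int, Int](_ * 2)(Stream(1, 2, 3))
// doubled.toList == List(2, 4, 6)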
That's it; pretty simple really.
A Clojure transducer is a function that transforms one reducing function into another. A reducing function is one that can be used with reduce. That is, if Clojure had signatures, it would have a signature
(x, a) -> x
In English: given some starting collection x, and "the next thing" a in the collection being reduced, our reducing function returns "the next iteration of the collection being built".
So, if this is the signature of a reducing function, a transducer has signature
((x, a) -> x) -> ((x, b) -> x)
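To make that signature concrete in Scala terms, here is a hypothetical encoding (ignoring the extra arities a real Clojure reducing function carries):

// A transducer turns a reducing function over As into one over Bs; it must
// be polymorphic in the accumulator type X.
trait Transducer[A, B] {
  def apply[X](rf: (X, A) => X): (X, B) => X
}

// (map f) as a transducer: feed f(b) to the inner reducing function.
def mapT[A, B](f: B => A): Transducer[A, B] = new Transducer[A, B] {
  def apply[X](rf: (X, A) => X): (X, B) => X =
    (x, b) => rf(x, f(b))
}

// (filter pred) as a transducer: only pass matching inputs along.
def filterT[A](pred: A => Boolean): Transducer[A, A] = new Transducer[A, A] {
  def apply[X](rf: (X, A) => X): (X, A) => X =
    (x, a) => if (pred(a)) rf(x, a) else x
}

// Used with an ordinary foldLeft:
// List(1, 2, 3, 4).foldLeft(0)(filterT[Int](_ % 2 == 0)((acc: Int, a: Int) => acc + a))
// evaluates to 6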
The reason transducers were added to Clojure is that with the addition of core.async channels, Rich Hickey and friends found themselves reimplementing all of the standard collection functions (map, filter, take, etc.) to work with channels. RH wondered whether there wasn't a better way, and went to work thinking about how to decomplect the logic of these various collection-processing functions from the mechanics of the collection types at hand. Explaining exactly how transducers do this is, I think, outside the scope of this question, so I'll get back to the point. But if you're interested, there is a lot of literature on this that's easy to find and work through.
Clearly, these are very different concepts, but here's how I see them relating:
While Scala transducers are collection-processing functions for Streams (as opposed to other Scala collections), Clojure's transducers are really a mechanism for unifying the implementation of collection-processing functions across different collection types. So one way of putting it would be that if Scala had Clojure's notion of transducers, Scala's notion of transducers could be implemented in terms of Clojure's, since the latter are more abstract/general processing functions reusable across multiple collection types.
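Under the hypothetical Transducer encoding sketched above, that claim looks roughly like this: any Clojure-style transducer induces a collection-to-collection function (shown for List; the same shape would work for Stream):

trait Transducer[A, B] {
  def apply[X](rf: (X, A) => X): (X, B) => X
}

// A transducer over Bs-into-As yields a List[B] => List[A] by folding with
// a list-building reducing function.
def toListFn[A, B](t: Transducer[A, B]): List[B] => List[A] =
  bs => bs.foldLeft(List.empty[A])(t((acc: List[A], a: A) => acc :+ a))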