Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is a polymorphic lambda?

The concept of lambdas (anonymous functions) is very clear to me. And I'm aware of polymorphism in terms of classes, with runtime/dynamic dispatch used to call the appropriate method based on the instance's most derived type. But how exactly can a lambda be polymorphic? I'm yet another Java programmer trying to learn more about functional programming.

like image 325
sdanzig Avatar asked Feb 14 '23 14:02

sdanzig


2 Answers

You will observe that I don't talk about lambdas much in the following answer. Remember that in functional languages, any function is simply a lambda bound to a name, so what I say about functions translates to lambdas.

Polymorphism

Note that polymorphism doesn't really require the kind of "dispatch" that OO languages implement through derived classes overriding virtual methods. That's just one particular kind of polymorphism, subtyping.

Polymorphism itself simply means a function allows not just for one particular type of argument, but is able to act accordingly for any of the allowed types. The simplest example: you don't care for the type at all, but simply hand on whatever is passed in. Or, to make it not quite so trivial, wrap it in a single-element container. You could implement such a function in, say, C++:

template<typename T> std::vector<T> wrap1elem( T val ) {
  return std::vector(val);
}

but you couldn't implement it as a lambda, because C++ (time of writing: C++11) doesn't support polymorphic lambdas.

Untyped values

...At least not in this way, that is. C++ templates implement polymorphism in rather an unusual way: the compiler actually generates a monomorphic function for every type that anybody passes to the function, in all the code it encounters. This is necessary because of C++' value semantics: when a value is passed in, the compiler needs to know the exact type (its size in memory, possible child-nodes etc.) in order to make a copy of it.

In most newer languages, almost everything is just a reference to some value, and when you call a function it doesn't get a copy of the argument objects but just a reference to the already-existing ones. Older languages require you to explicitly mark arguments as reference / pointer types.

A big advantage of reference semantics is that polymorphism becomes much easier: pointers always have the same size, so the same machine code can deal with references to any type at all. That makes, very uglily1, a polymorphic container-wrapper possible even in C:

typedef struct{
  void** contents;
  int size;
} vector;

vector wrap1elem_by_voidptr(void* ptr) {
  vector v;
  v.contents = malloc(sizeof(&ptr));
  v.contents[0] = ptr;
  v.size = 1;
  return v;
}
#define wrap1elem(val) wrap1elem_by_voidptr(&(val))

Here, void* is just a pointer to any unknown type. The obvious problem thus arising: vector doesn't know what type(s) of elements it "contains"! So you can't really do anything useful with those objects. Except if you do know what type it is!

int sum_contents_int(vector v) {
  int acc = 0, i;
  for(i=0; i<v.size; ++i) {
    acc += * (int*) (v.contents[i]);
  }
  return acc;
}

obviously, this is extremely laborious. What if the type is double? What if we want the product, not the sum? Of course, we could write each case by hand. Not a nice solution.

What would we better is if we had a generic function that takes the instruction what to do as an extra argument! C has function pointers:

int accum_contents_int(vector v, void* (*combine)(int*, int)) {
  int acc = 0, i;
  for(i=0; i<v.size; ++i) {
    combine(&acc, * (int*) (v.contents[i]));
  }
  return acc;
}

That could then be used like

void multon(int* acc, int x) {
  acc *= x;
}
int main() {
  int a = 3, b = 5;
  vector v = wrap2elems(a, b);
  printf("%i\n", accum_contents_int(v, multon));
}

Apart from still being cumbersome, all the above C code has one huge problem: it's completely unchecked if the container elements actually have the right type! The casts from *void will happily fire on any type, but in doubt the result will be complete garbage2.

Classes & Inheritance

That problem is one of the main issues which OO languages solve by trying to bundle all operations you might perform right together with the data, in the object, as methods. While compiling your class, the types are monomorphic so the compiler can check the operations make sense. When you try to use the values, it's enough if the compiler knows how to find the method. In particular, if you make a derived class, the compiler knows "aha, it's ok to call that method from the base class even on a derived object".

Unfortunately, that would mean all you achieve by polymorphism is equivalent to compositing data and simply calling the (monomorphic) methods on a single field. To actually get different behaviour (but controlledly!) for different types, OO languages need virtual methods. What this amounts to is basically that the class has extra fields with pointers to the method implementations, much like the pointer to the combine function I used in the C example – with the difference that you can only implement an overriding method by adding a derived class, for which the compiler again knows the type of all the data fields etc. and you're safe and all.

Sophisticated type systems, checked parametric polymorphism

While inheritance-based polymorphism obviously works, I can't help saying it's just crazy stupid3 sure a bit limiting. If you want to use just one particular operation that happens to be not implemented as a class method, you need to make an entire derived class. Even if you just want to vary an operation in some way, you need to derive and override a slightly different version of the method.

Let's revisit our C code. On the face of it, we notice it should be perfectly possible to make it type-safe, without any method-bundling nonsense. We just need to make sure no type information is lost – not during compile-time, at least. Imagine (Read ∀T as "for all types T")

∀T: {
  typedef struct{
    T* contents;
    int size;
  } vector<T>;
}

∀T: {
  vector<T> wrap1elem(T* elem) {
    vector v;
    v.contents = malloc(sizeof(T*));
    v.contents[0] = &elem;
    v.size = 1;
    return v;
  }
}

∀T: {
  void accum_contents(vector<T> v, void* (*combine)(T*, const T*), T* acc) {
    int i;
    for(i=0; i<v.size; ++i) {
      combine(&acc, (*T) (v[i]));
    }
  }
}

Observe how, even though the signatures look a lot like the C++ template thing on top of this post (which, as I said, really is just auto-generated monomorphic code), the implementation actually is pretty much just plain C. There are no T values in there, just pointers to them. No need to compile multiple versions of the code: at runtime, the type information isn't needed, we just handle generic pointers. At compile time, we do know the types and can use the function head to make sure they match. I.e., if you wrote

void evil_sumon (int* acc, double* x) { acc += *x; }

and tried to do

vector<float> v; char acc;
accum_contents(v, evil_sumon, acc);

the compiler would complain because the types don't match: in the declaration of accum_contents it says the type may vary, but all occurences of T do need to resolve to the same type.

And that is exactly how parametric polymorphism works in languages of the ML family as well as Haskell: the functions really don't know anything about the polymorphic data they're dealing with. But they are given the specialised operators which have this knowledge, as arguments.

In a language like Java (prior to lambdas), parametric polymorphism doesn't gain you much: since the compiler makes it deliberately hard to define "just a simple helper function" in favour of having only class methods, you can simply go the derive-from-class way right away. But in functional languages, defining small helper functions is the easiest thing imaginable: lambdas!

And so you can do incredible terse code in Haskell:

Prelude> foldr (+) 0 [1,4,6]
11
Prelude> foldr (\x y -> x+y+1) 0 [1,4,6]
14
Prelude> let f start = foldr (\_ (xl,xr) -> (xr, xl)) start
Prelude> :t f
f :: (t, t) -> [a] -> (t, t)
Prelude> f ("left", "right") [1]
("right","left")
Prelude> f ("left", "right") [1, 2]
("left","right")

Note how in the lambda I defined as a helper for f, I didn't have any clue about the type of xl and xr, I merely wanted to swap a tuple of these elements which requires the types to be the same. So that would be a polymorphic lambda, with the type

\_ (xl, xr) -> (xr, xl)   ::  ∀ a t.  a -> (t,t) -> (t,t)

1Apart from the weird explicit malloc stuff, type safety etc.: code like that is extremely hard to work with in languages without garbage collector, because somebody always needs to clean up memory once it's not needed anymore, but if you didn't watch out properly whether somebody still holds a reference to the data and might in fact need it still. That's nothing you have to worry about in Java, Lisp, Haskell...

2There is a completely different approach to this: the one dynamic languages choose. In those languages, every operation needs to make sure it works with any type (or, if that's not possible, raise a well-defined error). Then you can arbitrarily compose polymorphic operations, which is on one hand "nicely trouble-free" (not as trouble-free as with a really clever type system like Haskell's, though) but OTOH incurs quite a heavy overhead, since even primitive operations need type-decisions and safeguards around them.

3I'm of course being unfair here. The OO paradigm has more to it than just type-safe polymorphism, it enables many things e.g. old ML with it's Hindler-Milner type system couldn't do (ad-hoc polymorphism: Haskell has type classes for that, SML has modules), and even some things that are pretty hard in Haskell (mainly, storing values of different types in a variable-size container). But the more you get accustomed to functional programming, the less need you will feel for such stuff.

like image 165
leftaroundabout Avatar answered Feb 18 '23 17:02

leftaroundabout


In C++ polymorphic (or generic) lambda starting from C++14 is a lambda that can take any type as an argument. Basically it's a lambda that has auto parameter type: auto lambda = [](auto){};

like image 35
dividedbyzero Avatar answered Feb 18 '23 18:02

dividedbyzero