So I was answering a question about lazy evaluation (my answer there is overkill for that case, but the idea seems interesting), and it got me thinking about how lazy evaluation might be done in C++. I came up with one approach, but I'm not sure of all its pitfalls. Are there other ways of achieving lazy evaluation in C++? How might this be done? What are the pitfalls of this design and others?
Here is my idea:
#include <iostream>
#include <functional>
#include <memory>
#include <string>

#define LAZY(E) lazy<decltype((E))>{[&](){ return E; }}

template<class T>
class lazy {
private:
    typedef std::function<std::shared_ptr<T>()> thunk_type;
    mutable std::shared_ptr<thunk_type> thunk_ptr;
public:
    lazy(const std::function<T()>& x)
        : thunk_ptr(std::make_shared<thunk_type>([x](){
              return std::make_shared<T>(x());
          })) {}
    const T& operator()() const {
        std::shared_ptr<T> val = (*thunk_ptr)();
        *thunk_ptr = [val](){ return val; };
        return *val;
    }
    T& operator()() {
        std::shared_ptr<T> val = (*thunk_ptr)();
        *thunk_ptr = [val](){ return val; };
        return *val;
    }
};

void log(const lazy<std::string>& msg) {
    std::cout << msg() << std::endl;
}

int main() {
    std::string hello = "hello";
    std::string world = "world";
    auto x = LAZY((std::cout << "I was evaluated!\n", hello + ", " + world + "!"));
    log(x);
    log(x);
    log(x);
    log(x);
    return 0;
}
There are some things I was concerned about in my design. What are your thoughts?
Yes, in C++ the short-circuiting && and || operators are available. Here's a question answered in the C FAQ on the subject.
In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed (non-strict evaluation) and which also avoids repeated evaluations (sharing).
Lazy evaluation is a form of non-strict evaluation in which arguments are not evaluated until required. A computation that does not terminate under strict evaluation can terminate under lazy evaluation.
In computer programming, lazy initialization is the tactic of delaying the creation of an object, the calculation of a value, or some other expensive process until the first time it is needed. It is a kind of lazy evaluation that refers specifically to the instantiation of objects or other resources.
Yes, what you have is lazy. Basically, just pass a function which computes the argument instead of the argument. After evaluation, the object is replaced by the computed value. That is basically it, and yes, implemented like that, with reference-counted pointers, it is pretty expensive.
Memoization is an old term which often means tabling a function's results. Hardly any modern language does that (maybe Prolog); it's extremely expensive. Full laziness (never computing one thing twice) is achieved in the process of lambda lifting, which is the elimination of free variables (by turning them into arguments). In fully-lazy lambda lifting, it is the maximal free expressions that are lifted (say x is free, so an occurrence of sqrt x is replaced by a new argument, sqrtx). There is also the so-called optimal reduction.
I don't think there are other fundamental ways of doing it. Why is it so much faster in a lazy functional language such as Haskell? Well, basically: no reference-counted pointers; strictness analysis (strict is the opposite of lazy), which lets the compiler know beforehand that some things are better evaluated strictly; unboxing of values which are evaluated strictly and are of known machine type; not to mention other typical functional-language optimizations. But in essence, if you look at the implementation of a graph-reduction machine and at how its stack evolves, you see that you are basically passing functions on the stack instead of arguments, and that is all there is to it.
Now, in those machines, the node which computes the argument is overwritten with its value. So you are missing an optimization, maybe, but one which would not be possible in a type-safe context.
Suppose all your "nodes" were subclasses of a master superclass, called "node", which had only a virtual function that computed the value... then that node could be "overwritten" by another one that simply returns the already-computed value. That trick, done with function pointers, is why they say the STG machine of Haskell is "tagless" (the Spineless Tagless G-machine): they don't tag the data elements; instead they use a function pointer which either computes or returns the value.
I don't think it can be made anywhere near as efficient in C++ as it is in Haskell, unless we start thinking about implementing C++ in a totally different way (which could and should be done). We are used to complex prologues and epilogues, complicated calling conventions, and so on. Function calls are too bureaucratic in C/C++.
Now, the book to read when you feel lazy is definitely "The Implementation of Functional Programming Languages" by Simon Peyton Jones. However, the modern implementation is described in the freely available article "Implementing functional languages on stock hardware: The Spineless Tagless G-machine", which is very nice for reading about the implementation optimizations, but the book is the one to read to understand the fundamentals.