Non-functional way:

arr = [1, 2, 3]

becomes

arr = [1, 5, 3]

Here we change the same array in place. This is discouraged in functional programming. I know that since computers are becoming faster every day and memory keeps getting cheaper, functional programming seems more feasible now, promising better readability and cleaner code.
Functional way:

arr = [1, 2, 3]

isn't changed; instead we create

arr2 = [1, 5, 3]

I see a general trend here: we use more memory and time just to change one value. We doubled our memory, and the time complexity changed from O(1) to O(n). This might be costly for bigger algorithms. Where is this compensated? Or, since we can afford costlier computations (like when quantum computing becomes mainstream), do we just trade speed off for readability?
Functional data structures don't necessarily take up a lot more space or require more processing time. The important aspect here is that purely functional data structures are immutable, but that doesn't mean you always make a complete copy of something. In fact, the immutability is precisely the key to working efficiently.
I'll use a simple list as an example. Suppose we have the following list:

(1, 2, 3)

The head of the list is the element 1. The tail of the list is (2, 3). Suppose this list is entirely immutable.
Now, we want to add an element at the start of that list. Our new list must look like this:

(0, 1, 2, 3)

You can't change the existing list; it is immutable. So we have to make a new one, right? However, note how the tail of our new list is (1, 2, 3). That's identical to the old list, so you can just re-use it. The new list is simply the element 0 with a pointer to the start of the old list as its tail. Here's the new list, with the shared old list in brackets:

0 -> [1 -> 2 -> 3 -> nil]
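To make the sharing concrete, here's a minimal sketch in JavaScript (node, oldList and newList are illustrative names of mine, not from any library):

const node = (head, tail) => Object.freeze({ head, tail }) // immutable cell

const oldList = node(1, node(2, node(3, null))) // (1, 2, 3)
const newList = node(0, oldList)                // (0, 1, 2, 3)

// no copying happened: the tail of the new list IS the old list
console.log(newList.tail === oldList) // true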
If our lists were mutable, this would not be safe. If you changed something in the old list (for example, replacing the element 2 with a different one), the change would show up in the new list as well. That's exactly where the danger of mutability lies: concurrent access to data structures needs to be synchronized to avoid unpredictable results, and changes can have unintended side-effects. But because that can't happen with immutable data structures, it's safe to re-use part of another structure in a new one. Sometimes you do want changes in one thing to be reflected in another; for example, when you remove an entry from the key set of a Map in Java, you want the mapping itself to be removed too. But in other situations mutability leads to trouble (the infamous Calendar class in Java).
So how can this work, if you can't change the data structure itself? How do you make a new list? Remember that if we're working purely functionally, we move away from the classical data structures with changeable pointers, and instead evaluate functions.
In functional languages, lists are made with the cons function. cons makes a "cell" of two elements. If you want to make a list with only one element, the second one is nil. So a list with only the element 3 is:

(cons 3 nil)

If the above is a function and you ask for its head, you get 3. Ask for the tail and you get nil. Now, the tail can itself be a function, like another cons.
Our first list is then expressed as:

(cons 1 (cons 2 (cons 3 nil)))

Ask for the head of the above function and you get 1. Ask for the tail and you get (cons 2 (cons 3 nil)).

If we want to prepend 0, we just make a new function that evaluates to a cons with 0 as head and the above as tail:

(cons 0 (cons 1 (cons 2 (cons 3 nil))))
Since the functions we make are immutable, our lists are immutable. Adding an element is just a matter of making a new function that calls the old one in the right place. Traversing a list in the imperative, object-oriented way means following pointers to get from one element to another. Traversing a list in the functional way means evaluating functions.
I like to think of data structures like this: a data structure basically stores the result of running some algorithm in memory. It "caches" the result of a computation, so we don't have to do the computation every time. Purely functional data structures model the computation itself via functions.
This in fact means that it can be quite memory efficient because a lot of data copying can be avoided. And with an increasing focus on parallelization in processing, immutable data structures can be very useful.
EDIT
Given the additional questions in the comments, I'll add a bit to the above to the best of my abilities.
What about my example? Is it something like cons(1 fn) and that function can be cons(2 fn2) where fn2 is cons(3 nil) and in some other case cons(5 fn2)?
The cons function is best compared to a singly-linked list. As you might imagine, if you're given a list composed of cons cells, what you have is the head, so random access to some index isn't possible. In your array you can just call arr[1] and get the second item (since it's 0-indexed), in constant time. If you state something like val list = (cons 1 (cons 2 (cons 3 nil))), you can't get the second item without traversing, because list is now actually a function you evaluate. So access takes linear time, and access to the last element takes longer than access to the head. Also, since it's equivalent to a singly-linked list, traversal can only go in one direction. So the behavior and performance are more like those of a singly-linked list than of, say, an array list or array.
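To illustrate, here's a sketch of indexed access over such a list in JavaScript, using a function-based cons encoding like the one in the answer below (nth is a hypothetical helper of mine, and I use null as the end marker for simplicity):

const cons = (x,y) => f => f(x,y)
const head = f => f((x,y) => x)
const tail = f => f((x,y) => y)

// nth walks n tails before taking a head: linear time, not constant
const nth = (n, xs) => n === 0 ? head(xs) : nth(n - 1, tail(xs))

const list = cons(1, cons(2, cons(3, null)))
console.log(nth(1, list)) // 2, found by traversal rather than arr[1]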
Purely functional data structures don't necessarily provide better performance for operations such as indexed access. A "classic" data structure may have O(1) for some operation where a functional one has O(log n) for the same thing. That's a trade-off; functional data structures aren't a silver bullet, just like object orientation wasn't. You use them where they make sense. If you're always going to traverse a whole list (or part of it) and want safe parallel access, a structure composed of cons cells works perfectly fine. In functional programming, you'd often traverse a structure with recursive calls where in imperative programming you'd use a for loop.
There are of course many other functional data structures, some of which come much closer to modeling an array that allows random access and updates. But they're typically a lot more complex than the simple example above. There are advantages, of course: parallel computation can be trivially easy thanks to immutability, and memoization lets us cache the results of function calls based on their inputs, since a purely functional approach always yields the same result for the same input.
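As a small illustration of that last point, here's a minimal memoization sketch in JavaScript (memoize and square are my own illustrative names):

// caching results by argument is only safe because a pure function
// always returns the same output for the same input
const memoize = fn => {
  const cache = new Map()
  return x => {
    if (!cache.has(x)) cache.set(x, fn(x))
    return cache.get(x)
  }
}

const square = x => x * x // stand-in for something expensive
const memoSquare = memoize(square)

console.log(memoSquare(4)) // computed: 16
console.log(memoSquare(4)) // served from the cache: 16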
What are we actually storing underneath? If we need to traverse a list, we need a mechanism to point to the next elements, right? Or, if I think about it a bit, is that an irrelevant question, since whenever a list is required it should probably be reconstructed every time?
We store data structures containing functions. What is a cons? A simple structure consisting of two elements: a head and a tail. It's just pointers underneath. In an object-oriented language like Java, you could model it as a class Cons with two final fields, head and tail, assigned on construction (immutable), with corresponding methods to fetch them. So this, in a LISP variant:

(cons 1 (cons 2 nil))

would be equivalent to

new Cons(1, new Cons(2, null))

in Java.
The big difference in functional languages is that functions are first-class values. They can be passed around and assigned to variables just like object references, and you can compose them. I could just as easily do this in a functional language:

val list = (cons 1 (max 2 3))

If I ask for list.head I get 1; if I ask for list.tail I get (max 2 3), and evaluating that just gives me 3. You compose functions. Think of it as modeling behavior instead of data. Which brings us to:
Could you elaborate "Purely functional data structures model the computation itself via functions."?
Calling list.tail on the list above returns something that can be evaluated and then returns a value. In other words, it returns a function. If I call list.tail in that example, it returns (max 2 3), clearly a function. Evaluating it yields 3, as that's the larger of the two arguments. In this example:

(cons 1 (cons 2 nil))

calling tail evaluates to a new cons (the (cons 2 nil) one), which in turn can be used.
Suppose we want the sum of all the elements in our list. In Java, before the introduction of lambdas, if you had an array int[] array = new int[] {1, 2, 3} you'd do something like:

int sum = 0;
for (int i = 0; i < array.length; ++i) {
    sum += array[i];
}
In a functional language it would be something like this (simplified pseudo-code):

(define sum (arg)
  (eq arg nil
    (0)
    (+ arg.head (sum arg.tail))
  )
)
This uses prefix notation, as we've used with our cons so far, so a + b is written as (+ a b). define lets us define a function, with as arguments the name (sum), a list of arguments for the function ((arg)), and then the actual function body (the rest).

The function body consists of an eq function, which we'll define as comparing its first two arguments (arg and nil): if they're equal, it evaluates to its next argument ((0) in this case); otherwise, to the argument after that (the sum). So think of it as (eq arg1 arg2 true false), with true and false being whatever you want (a value, a function...).
The recursion then comes in with (+ arg.head (sum arg.tail)): we're stating that we take the addition of the head of the argument and a recursive call to the sum function itself on the tail. Suppose we do this:

val list = (cons 1 (cons 2 (cons 3 nil)))
(sum list)

Mentally step through what that last line does to see how it evaluates to the sum of all the elements in list.
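If you'd like to actually run it, here's a rough translation into JavaScript, using a function-based cons encoding like the one in the answer below (sum here is my translation of the pseudo-code, with null as the end-of-list marker):

const cons = (x,y) => f => f(x,y)
const head = f => f((x,y) => x)
const tail = f => f((x,y) => y)

// direct translation of the pseudo-code: an empty list sums to 0,
// otherwise add the head to the sum of the tail
const sum = xs => xs === null ? 0 : head(xs) + sum(tail(xs))

const list = cons(1, cons(2, cons(3, null)))
console.log(sum(list)) // 6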
Note, now, how sum is a function. In the Java example we had some data structure and then iterated over it, accessing its elements, to create our sum. In the functional example, the evaluation is the computation. A useful aspect of this is that sum, as a function, can be passed around and evaluated only when it's actually needed. That is lazy evaluation.
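A quick sketch of that idea in JavaScript (thunk-style laziness; sumOf is an illustrative name of mine):

// sumOf builds the computation without running it: it returns a thunk
const sumOf = array => () => array.reduce((a, b) => a + b, 0)

const pendingSum = sumOf([1, 2, 3]) // nothing computed yet
// ...pass pendingSum around like any other value...
console.log(pendingSum()) // 6, computed only now, on demand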
Here's another example of how data structures and algorithms are actually the same thing in a different form. Take a set. A set can contain only one instance of each element, for some definition of equality of elements. For something like integers it's simple: if they have the same value (like 1 == 1) they're equal. For objects, however, we typically have some equality check (like equals() in Java). So how can you know whether a set already contains an element? You go over each element in the set and check if it is equal to the one you're looking for.

A hash set, however, computes some hash function for each element and places elements with the same hash in a corresponding bucket. For a good hash function there will rarely be more than one element per bucket. If you now provide some element and want to check whether it's in the set, the actions are:

1. Compute the element's hash.
2. Find the bucket corresponding to that hash.
3. Compare the element against whatever few elements are in that bucket.

The requirement is that two equal elements must have the same hash.

So now you can check if something is in the set in constant time. The reason is that our data structure itself has stored some computed information: the hashes. By storing each element in a bucket corresponding to its hash, we have put some computation result into the data structure itself. This saves time later when we want to check whether the set contains an element. In that way, data structures are actually computations frozen in memory. Instead of doing the entire computation every time, we've done some work up-front and re-use those results.
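As a rough sketch of that lookup in JavaScript (a toy only; real hash sets handle collisions, resizing and object hashing far more carefully):

const BUCKETS = 16
const hash = s => [...s].reduce((h, c) => (h * 31 + c.charCodeAt(0)) % BUCKETS, 0)

// building the set does the up-front work: placing items in buckets
const makeSet = items => {
  const buckets = Array.from({ length: BUCKETS }, () => [])
  for (const item of items) buckets[hash(item)].push(item)
  return buckets
}

// membership check: hash, pick one bucket, compare its few entries
const contains = (set, item) => set[hash(item)].includes(item)

const fruit = makeSet(['apple', 'pear', 'plum'])
console.log(contains(fruit, 'pear'))  // true
console.log(contains(fruit, 'grape')) // false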
When you think of data structures and algorithms as being analogous in this way, it becomes clearer how functions can model the same thing.
Make sure to check out the classic book "Structure and Interpretation of Computer Programs" (often abbreviated as SICP). It'll give you a lot more insight. You can read it for free here: https://mitpress.mit.edu/sicp/full-text/book/book.html
This is a really broad question with a lot of room for opinionated answers, but G_H provides a really nice breakdown of some of the differences.
Could you elaborate "Purely functional data structures model the computation itself via functions."?
This is one of my favourite topics, so I'm happy to share an example in JavaScript, because it will let you run the code right here in the browser and see the answer for yourself.
Below you will see a linked list implemented using functions. I use a couple of Numbers for example data, and I use a String so that I can log something to the console for you to see, but other than that it's just functions: no fancy objects, no arrays, no other custom stuff.
const cons = (x,y) => f => f(x,y)
const head = f => f((x,y) => x)
const tail = f => f((x,y) => y)
const nil = () => {}
const isEmpty = x => x === nil

const comp = f => g => x => f(g(x))

const reduce = f => y => xs =>
  isEmpty(xs) ? y : reduce (f) (f(y, head(xs))) (tail(xs))

const reverse = xs =>
  reduce ((acc,x) => cons(x,acc)) (nil) (xs)

const map = f =>
  comp (reverse) (reduce ((acc,x) => cons(f(x), acc)) (nil))

// this function is required so we can visualise the data
// it effectively converts a linked list of functions to readable strings
const list2str = xs =>
  isEmpty(xs) ? 'nil' : `(${head(xs)} . ${list2str(tail(xs))})`

// example input data
const xs = cons(1, cons(2, cons(3, cons(4, nil))))

// example derived data
const ys = map (x => x * x) (xs)

console.log(list2str(xs))
// (1 . (2 . (3 . (4 . nil))))

console.log(list2str(ys))
// (1 . (4 . (9 . (16 . nil))))
Of course this isn't of practical use in real-world JavaScript, but that's beside the point. It's just showing you how functions alone can be used to represent complex data structures.

Here's another example: implementing rational numbers using nothing but functions and numbers. Again, we only use strings so we can convert the functional structure to a visual representation in the console. This exact scenario is examined thoroughly in the SICP book that G_H mentions.

We even implement our higher-order data type rat using cons. This shows how functional data structures can easily be made up of (composed of) other functional data structures.
const cons = (x,y) => f => f(x,y)
const head = f => f((x,y) => x)
const tail = f => f((x,y) => y)

const mod = y => x =>
  y > x ? x : mod (y) (x - y)

const gcd = (x,y) =>
  y === 0 ? x : gcd(y, mod (y) (x))

const rat = (n,d) =>
  (g => cons(n/g, d/g)) (gcd(n,d))

const numer = head
const denom = tail

const ratAdd = (x,y) =>
  rat(numer(x) * denom(y) + numer(y) * denom(x),
      denom(x) * denom(y))

const rat2str = r => `${numer(r)}/${denom(r)}`

// example complex data
let x = rat(1,2)
let y = rat(1,4)

console.log(rat2str(x))           // 1/2
console.log(rat2str(y))           // 1/4
console.log(rat2str(ratAdd(x,y))) // 3/4