
Haskell - why would I use infinite data structures?

In Haskell, it is possible to define infinite lists like so:

[1..]

I found many articles which describe how to implement infinite lists, and I understand how this works. However, I can't think of any reason to use the concept of infinite data structures.

Can someone give me an example of a problem which can be solved more easily (or perhaps only) with an infinite list in Haskell?

asked Jun 12 '20 by Moritz Schmidt

People also ask

Can Haskell have infinite list?

As you all know, lists in Haskell can be infinite. This capability is a natural consequence of Haskell's lazy evaluation. An infinite list in Haskell is no problem since Haskell will lazily generate only as much of the list as is needed, when it is needed.
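
For instance (an illustrative GHCi session):

> take 5 [1..]
[1,2,3,4,5]

Only the first five cells of the infinite list are ever constructed.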

How do you make an infinite list in Haskell?

In Haskell, you only need to allocate space for the items in the list that actually get evaluated. The two most basic functions for creating an infinite list are "repeat" and "cycle". The first repeats a single element infinitely, while the second cycles through a specific series of elements.
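
For example (an illustrative GHCi session):

> take 5 (repeat 'x')
"xxxxx"
> take 7 (cycle [1,2,3])
[1,2,3,1,2,3,1]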

Can lists be infinite?

Infinite lists, whose elements are evaluated upon demand, can be implemented using functions as data. The tail of a lazy list is a function that, if called, produces another lazy list. A lazy list can be infinitely long, and any finite number of its elements can be evaluated.
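
Haskell's laziness makes this encoding implicit, but it can be spelled out directly; a minimal sketch (LazyList, ones, and takeL are illustrative names):

data LazyList a = Nil | Cons a (() -> LazyList a)   -- the tail is a function

ones :: LazyList Int
ones = Cons 1 (\() -> ones)   -- infinitely long; tails are produced on demand

takeL :: Int -> LazyList a -> [a]
takeL n _ | n <= 0   = []
takeL _ Nil          = []
takeL n (Cons x tl)  = x : takeL (n - 1) (tl ())   -- force one more tail

-- takeL 3 ones == [1,1,1]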


2 Answers

The basic advantage of lists in Haskell is that they’re a control structure that looks like a data structure. You can write code that operates incrementally on streams of data, but it looks like simple operations on lists. This is in contrast to other languages that require the use of an explicitly incremental structure, like iterators (Python’s itertools), coroutines (C# IEnumerable), or ranges (D).

For example, a sort function can be written in such a way that it sorts as few elements as possible before it starts to produce results. While sorting the entire list takes O(n log n) / linearithmic time in the length of the list, minimum xs = head (sort xs) only takes O(n) / linear time, because head will only examine the first constructor of the list, like x : _, and leave the tail as an unevaluated thunk that represents the remainder of the sorting operation.
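
A minimal sketch of such a lazily consumable sort (quicksort-style; the names sortL and minimumL are illustrative, not the library implementation):

sortL :: Ord a => [a] -> [a]
sortL []     = []
sortL (p:xs) = sortL [x | x <- xs, x < p] ++ p : sortL [x | x <- xs, x >= p]

-- head (sortL xs) forces only the chain of left partitions: O(n) on
-- average, while forcing the entire result would cost O(n log n).
minimumL :: Ord a => [a] -> a
minimumL = head . sortL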

This means that performance is compositional: for example, if you have a long chain of operations on a stream of data, like sum . map (* 2) . filter (< 5), it looks like it would first filter all the elements, then map a function over them, then take the sum, producing a full intermediate list at every step. But what happens is that each element is only processed one at a time: given [1, 2, 6], this basically proceeds as follows, with all the steps happening incrementally:

  • total = 0
  • 1 < 5 is true
  • 1 * 2 == 2
  • total = 0 + 2 = 2
  • 2 < 5 is true
  • 2 * 2 == 4
  • total = 2 + 4 = 6
  • 6 < 5 is false
  • result = 6

This is exactly how you would write a fast loop in an imperative language (pseudocode):

total = 0;
for x in xs {
  if (x < 5) {
    total = total + x * 2;
  }
}

Moreover, because of laziness, this code has constant memory usage while processing the list. And there is nothing special inside map or filter that makes this happen: they can be entirely independent.
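
A small demonstration of that claim, using an explicitly strict fold so the accumulator itself cannot build up thunks (a sketch; the list size is arbitrary):

import Data.List (foldl')

total :: Int
total = foldl' (+) 0 (map (* 2) (filter (< 5) [1 .. 10000000]))
-- each element flows through filter, map, and the fold one at a
-- time; no intermediate list is ever materialized in full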

For another example, the standard library function and computes the logical AND of a list, e.g. and [a, b, c] == a && b && c, and it’s implemented simply as a fold: and = foldr (&&) True. The moment it reaches a False element in the input, it stops evaluating, simply because && is lazy in its right argument. Laziness gives you composition!
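
Because of that short-circuiting, and can even produce an answer from an infinite list:

> and (map (< 100) [1..])
False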

For a great paper on all this, read the famous Why Functional Programming Matters by John Hughes, which goes over the advantages of lazy functional programming (in Miranda, a forebear of Haskell) far better than I could.

answered Sep 28 '22 by Jon Purdy


  • Annotating a list with its indices temporarily uses an infinite list of indices:

    zip [0..] ['a','b','c','d'] = [(0,'a'), (1,'b'), (2,'c'), (3,'d')]
    
  • Memoizing functions while maintaining purity (in this case the transformation yields an exponential speedup, because the memo table is used recursively):

    fib = (memo !!)
        where
        memo = map fib' [0..]  -- cache of *all* fibonacci numbers (evaluated on demand) 
        fib' 0 = 0
        fib' 1 = 1
        fib' n = fib (n-1) + fib (n-2)
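
    For instance (an illustrative session), each value is computed at most once and then read back from the shared list:

    > map fib [0..10]
    [0,1,1,2,3,5,8,13,21,34,55]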
    
  • Purely mocking programs with side-effects (free monads)

    data IO a = Return a
              | GetChar (Char -> IO a)
              | PutChar Char (IO a)
    

    Potentially non-terminating programs are represented with infinite IO structures; e.g. forever (putChar 'y') = PutChar 'y' (PutChar 'y' (PutChar 'y' ...))
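
    Such a program can be interpreted purely; a minimal sketch (run is an illustrative name, and in a real module the IO type above would have to avoid clashing with Prelude's IO, e.g. via import Prelude hiding (IO)):

    run :: IO a -> String -> (Maybe a, String)
    run (Return x)    _      = (Just x, "")
    run (GetChar k)   (c:cs) = run (k c) cs
    run (GetChar _)   []     = (Nothing, "")        -- ran out of input
    run (PutChar c m) input  = let (r, out) = run m input
                               in  (r, c : out)     -- output streams lazily

    Laziness means the output streams even for the infinite program above: with ys = PutChar 'y' ys, take 3 (snd (run ys "")) yields "yyy".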

  • Tries: if you define a type approximately like the following:

    data Trie a = Trie a (Trie a) (Trie a)
    

    it can represent an infinite collection of values of type a, indexed by the natural numbers. Note that there is no base case for the recursion, so every Trie is infinite. But the element at index n can be accessed in O(log n) time (a sketch of this indexing is given at the end of this example). This means you can do something like the following (using some functions from the inttrie library):

    findIndices :: [Integer] -> Trie [Integer]
    findIndices = foldr (\(i,x) -> modify x (i:)) (pure []) . zip [0..]
    

    this builds an efficient "reverse lookup table": given any value in the list, it tells you at which indices that value occurs, and it both caches results and streams information as soon as it's available:

    -- N.B. findIndices [0, 0,1, 0,1,2, 0,1,2,3, 0,1,2,3,4...]
    > table = findIndices (concat [ [0..n] | n <- [0..] ])
    > table `apply` 0
    [0,1,3,6,10,15,21,28,36,45,55,66,78,91,...
    

    all from a one-line infinite fold.
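
    A sketch of the logarithmic indexing mentioned above, which walks the tree following the binary representation of the index (fromFunc and index are illustrative names, not the inttrie API):

    fromFunc :: (Integer -> a) -> Trie a
    fromFunc f = Trie (f 0)
                      (fromFunc (\k -> f (2*k + 1)))   -- odd indices
                      (fromFunc (\k -> f (2*k + 2)))   -- positive even indices

    index :: Trie a -> Integer -> a
    index (Trie x _ _) 0 = x
    index (Trie _ l r) n
      | odd n     = index l ((n - 1) `div` 2)
      | otherwise = index r ((n - 2) `div` 2)

    -- index (fromFunc f) n == f n, reached in O(log n) steps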

I'm sure there are more examples; there are so many cool things you can do.

answered Sep 28 '22 by luqui