Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multiple lookup structures for same data: Memory duplication?

Tags:

haskell

Suppose I have data on a bunch of people, and I want to be able to look them up in different ways. Maybe there's some kind of data structure (like a binary tree) that facilitates lookup by name. And maybe there's another (like a list) that's by order of creation. And perhaps many more.

In many languages, you would have each person allocated exactly once on the heap. Each data structure would contain pointers to that memory. Thus, you're not allocating a new set of people every time you add a new way to look them up.

How about in Haskell? Is there any way to avoid memory duplication when different data structures need to index the same data?

like image 950
rlkw1024 Avatar asked May 15 '13 21:05

rlkw1024


1 Answers

I feel sure there's a deeper, more knowledgeable answer to this question, but for the time being...

Since in a pure functional programming language data is immutable, there's no need to do anything other than copy the pointer instead of copying its target.

As a quick and very dirty example, I fired up the ghci interpreter:

Prelude> let x = replicate 10000 'm' in all (==x) $ replicate 10000 x
True
(1.61 secs, 0 bytes)

I admit that these stats are unreliable, but what it's not doing is allocating memory for all 10000 copies of a list 10000 characters long.

Summary:

The way to avoid memory duplication is to
(a) use haskell
(b) avoid pointlessly reconstructing your data.

How can I pointlessly reconstruct my data?

A very simple and pointless example:

 pointlessly_reconstruct_list :: [a] -> [a]
 pointlessly_reconstruct_list [] = []
 pointlessly_reconstruct_list (x:xs) = x:xs

This kind of thing causes a duplicate of the list structure.

Have you got any examples that are a little less pointless but still simple?

Interestingly, if you do xs ++ ys you essentially reconstruct xs in order to place ys at the end of it (replacing []), so the list structure of xs is nearly copied wholesale. However, there's no need to replicate the actual data, and there certainly only needs to be one copy of ys.

like image 59
AndrewC Avatar answered Nov 15 '22 05:11

AndrewC