To Use or Not to Use Data.Map

Question

I'm currently working on a Haskell API. The latter provides some functions that currently take a list of lists as input, i.e. [(String,[(String, Double)])].

For visualization purposes, here's a sample of the list of lists mentioned above:

[
    ("A",   [
                ("I1", 1),
                ("I2", 2),
            ]
    ),
    ("B",   [
                ("I1", 3),
            ]
    )
]

I've defined some private helper functions. One helper function will search for specific entries in this list (Data.List.find = O(n)); another one will perform intersections; and another function will transform the list presented above to the following one:

[
    ("I1",  [
                ("A", 1),
                ("B", 3),
            ]
    ),
    ("I2",  [
                ("A", 2),
            ]
    )
]

The function that performs the transformation uses Data.Map, since it offers some functions that simplify that process a lot, like Data.Map.unionWith and Data.Map.insertWith. Well, since the transformation function had to call Data.Map.fromList and Data.Map.toList, I thought it would be nice to have a map of maps instead of a list of lists from the beginning. And so I changed my sample input to match the map of maps requirement.

Again, for visualization purposes, here's the list from above as a map of maps:

Map.fromList [
    ("A",   Map.fromList [
                ("I1", 1),
                ("I2", 2),
            ]
    ),
    ("B",   Map.fromList [
                ("I1", 3),
            ]
    )
]

Thanks to this step my code lost a few lines, and thanks to Data.Map.lookup, finding a desired now only takes O(log n) time.

Nonetheless, I'm currently asking myself if this really is a good solution? Is a map of maps the way to go? Or should the transformation function work with Data.Map.fromList and Data.Map.toList, and let the rest work with list of lists? Or better yet, is there a data structure that is more suitable for this kind of work?

I'm really looking forward to your replies.

rampion · Accepted Answer

Initialization of the map-of-maps still only takes O(n).

Consider the list-of-lists first.

Let's say the outer list is [ a₁, a₂, ..., a_p ], and each inner item is a_j = ( l_j, [ b₀, b₁, ..., b_{q_j} ]). Then construction of the list-of-lists takes O(n = ∑_j=1^p q_j).

Initializing an inner map takes m_j. = O(q_j). Initializing the map-of-maps takes O(∑_j=1^p m_j) = O(n).

max · Answer

This smells like graphs and edges. One slightly different approach, which may or may not work is to rework your problem so instead of [(String,[(String,Double)])] you simply operate on 2-tuples of strings. Then you have [((String, String), Double)] and the resulting map is of type Data.Map.Map (String, String) Double.

Alternatively, if the space of string keys is limited, and can thus be mapped efficiently into machine ints, look into using an IntMap. Same semantics as a map except that the keys MUST be machine ints (Int32 or Int64). Will have much better performance.

To Use or Not to Use Data.Map

Tags:

data-structures

haskell

Giuseppe Accaputo

2 Answers

rampion

max

Recent Activity

Donate For Us

To Use or Not to Use Data.Map

Tags:

data-structures

haskell

Giuseppe Accaputo

2 Answers

rampion

max

Related questions

Recent Activity

Donate For Us