Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

To Use or Not to Use Data.Map

I'm currently working on a Haskell API. The latter provides some functions that currently take a list of lists as input, i.e. [(String,[(String, Double)])].

For visualization purposes, here's a sample of the list of lists mentioned above:

[
    ("A",   [
                ("I1", 1),
                ("I2", 2),
            ]
    ),
    ("B",   [
                ("I1", 3),
            ]
    )
]

I've defined some private helper functions. One helper function will search for specific entries in this list (Data.List.find = O(n)); another one will perform intersections; and another function will transform the list presented above to the following one:

[
    ("I1",  [
                ("A", 1),
                ("B", 3),
            ]
    ),
    ("I2",  [
                ("A", 2),
            ]
    )
]

The function that performs the transformation uses Data.Map, since it offers some functions that simplify that process a lot, like Data.Map.unionWith and Data.Map.insertWith. Well, since the transformation function had to call Data.Map.fromList and Data.Map.toList, I thought it would be nice to have a map of maps instead of a list of lists from the beginning. And so I changed my sample input to match the map of maps requirement.

Again, for visualization purposes, here's the list from above as a map of maps:

Map.fromList [
    ("A",   Map.fromList [
                ("I1", 1),
                ("I2", 2),
            ]
    ),
    ("B",   Map.fromList [
                ("I1", 3),
            ]
    )
]

Thanks to this step my code lost a few lines, and thanks to Data.Map.lookup, finding a desired now only takes O(log n) time.

Nonetheless, I'm currently asking myself if this really is a good solution? Is a map of maps the way to go? Or should the transformation function work with Data.Map.fromList and Data.Map.toList, and let the rest work with list of lists? Or better yet, is there a data structure that is more suitable for this kind of work?

I'm really looking forward to your replies.

like image 322
Giuseppe Accaputo Avatar asked Dec 28 '22 03:12

Giuseppe Accaputo


2 Answers

Initialization of the map-of-maps still only takes O(n).

Consider the list-of-lists first.

Let's say the outer list is [ a1, a2, ..., ap ], and each inner item is aj = ( lj, [ b0, b1, ..., bqj ]). Then construction of the list-of-lists takes O(n = ∑j=1p qj).

Initializing an inner map takes mj. = O(qj). Initializing the map-of-maps takes O(∑j=1p mj) = O(n).

like image 126
rampion Avatar answered Dec 30 '22 16:12

rampion


This smells like graphs and edges. One slightly different approach, which may or may not work is to rework your problem so instead of [(String,[(String,Double)])] you simply operate on 2-tuples of strings. Then you have [((String, String), Double)] and the resulting map is of type Data.Map.Map (String, String) Double.

Alternatively, if the space of string keys is limited, and can thus be mapped efficiently into machine ints, look into using an IntMap. Same semantics as a map except that the keys MUST be machine ints (Int32 or Int64). Will have much better performance.

like image 32
max Avatar answered Dec 30 '22 18:12

max