Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How is Tie::IxHash implemented in Perl?

Tags:

perl

I've recently come upon a situation in Perl where the use of an order-preserving hash would make my code more readable and easier to use. After a bit of searching, I found out about the Tie::IxHash CPAN module, which does exactly what I want. Before I throw caution to the wind and just start using it though, I'd like to get a better idea of how it works and what kind of performance I can expect from it.

From what I know, ordered associative arrays are usually implemented as tries, which I've never actually used before, but do know that their performance falls in line with my expectations (I expect to do a lot of reading and writing, and will need to always remember the order keys were originally inserted). My problem is I can't figure out if this is how Tie::IxHash was made, or what sort of performance I should expect from it, or whether there's some better/cleaner option for me (I'd really rather not keep a separate array and hash to accomplish what I need as this produces ugly code and space inefficiency). I'm also just curious for curiosity's sake. If it wasn't implemented as a trie, how was it implemented? I know I can wade at the source code, but I'm hoping someone else has already done this, and I'm guessing that I'm not the only person who'll be interested in the answer.

So... Ideas? Suggestions? Advice?

like image 666
Eli Avatar asked Dec 27 '22 23:12

Eli


1 Answers

A Tie::IxHash object is implemented in a direct fashion, using the regular Perl building blocks that one would expect. Specifically, such an object is a blessed array reference holding 4 elements.

  • [0] A hash reference to store the keys of the user's hash. This is used any time the module needs to check for the existence of a key.

  • [1] An array reference to store the keys of the user's hash, in order.

  • [2] A parallel array reference to store the values, also in order.

  • [3] An integer to keep track of the current position within the two parallel arrays. This is needed for iteration.

Regarding performance, a good benchmark is usually worth more than speculation. My guess is that the biggest performance hit will come with deletion, because the arrays holding the ordered keys and values will require adjustment.

like image 95
FMc Avatar answered Jan 12 '23 20:01

FMc