Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can i count the number of collisions in a hash table?

My insert function already handles collisions correctly but i want to be able to count the number of collisions in each different hashing way(chaining,linear probing, and quadratic probing). How would i go about doing this?

This my code so far:

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <iterator>
#include <ctime>
#include "Chaining.h"
#include "QuadraticProbing.h"
#include "LinearProbing.h"

using namespace std;

int main()
{
    int collision_count = 0;
    float diff = 0.0;
    clock_t tStart, tStop;
    string ITEM_NOT_FOUND = "ITEM_NOT_FOUND";
    std::vector<std::string> DataArray;
    std::copy(std::istream_iterator<std::string>(std::ifstream("OHenry.txt")),istream_iterator<string>(),back_inserter(DataArray));
    std::vector<std::string> QueryArray;
    std::copy(std::istream_iterator<std::string>(std::ifstream("queries.txt")),istream_iterator<string>(),back_inserter(QueryArray));

    cout << "Testing chaining probing...\n";
    HashTable_chaining ChainingHT( ITEM_NOT_FOUND, 101 );
    int i = 0;
    double average_c = 0.0;
    double total_c = 0.0;
    double temp_c = 0.0;
    while(i != DataArray.size())
    {
        tStart = clock();
        ChainingHT.insert(DataArray[i]);
        tStop = clock();
        total_c = tStop - tStart;
        temp_c = total_c + temp_c;
        average_c = total_c / DataArray.size(); 
        if(DataArray[i] != DataArray[NULL])
        {
            collision_count++;
        }
        i++;
    }   

    cout<<"The number of collisions using Chaining is: "<<collision_count<<endl;
    cout<<"The average time per insertion using Chaining is: "<<average_c<<endl;
    system("pause");
    // Quadratic Probing 
    cout << "Testing quadratic probing...\n";
    HashTable_chaining QuadraticProbingHT( ITEM_NOT_FOUND, 101 );   
    int j = 0;
    int collision_count_quadratic = 0;
    double average_qp = 0;
    while(j != DataArray.size())
    {
        clock_t tStartq = clock();
        QuadraticProbingHT.insert(DataArray[j]);
        if(DataArray[j] != DataArray[NULL])
        {
            collision_count_quadratic++;
        }
        j++;
        average_qp = (((double)(clock() - tStartq)/CLOCKS_PER_SEC) + average_qp) / DataArray.size();

    }   
    cout<<"The number of collisions using Quadratic is: "<<collision_count<<endl;
    cout<<"The average time per insertion using Quadratic is: "<<average_qp<<endl;
like image 831
user977154 Avatar asked Dec 05 '11 02:12

user977154


1 Answers

It is possible to have the hashtable itself report the number of colissions it has seen without exposing its internal implementation at all.

For hashtables that use probing (of any kind), the number of colissions is equal to the number of elements positioned at an index not consistent with their hash code (that is because the position they would normally have been stored in was already occupied).

For hashtables that use chaining, the number of colissions is equal to the number of items in the hashtable, minus the number of occupied buckets (in other words, count all inserted items except the first in each bucket). This is also quite intuitive.

So what I would do in your shoes is give each hashtable a count_colissions() method that calculates the number of colissions in O(n) time using the corresponding method and returns it.

like image 148
Jon Avatar answered Oct 31 '22 15:10

Jon