Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Questions about implementing my own HashMap in Java

Tags:

java

hashmap

I am working on an assignment where I have to implement my own HashMap. In the assignment text it is being described as an Array of Lists, and whenever you want to add an element the place it ends up in the Array is determined by its hashCode. In my case it is positions from a spreadsheet, so I have just taken columnNumber + rowNumber and then converted that to a String and then to an int, as the hashCode, and then I insert it that place in the Array. It is of course inserted in the form of a Node(key, value), where the key is the position of the cell and the value is the value of the cell.

But I must say I do not understand why we need an Array of Lists, because if we then end up with a list with more than one element, will it not increase the look up time quite considerably? So should it not rather be an Array of Nodes?

Also I have found this implementation of a HashMap in Java:

public class HashEntry {
      private int key;
      private int value;

      HashEntry(int key, int value) {
            this.key = key;
            this.value = value;
      }     

      public int getKey() {
            return key;
      }

      public int getValue() {
            return value;
      }
}

public class HashMap {
  private final static int TABLE_SIZE = 128;

  HashEntry[] table;

  HashMap() {
        table = new HashEntry[TABLE_SIZE];
        for (int i = 0; i < TABLE_SIZE; i++)
              table[i] = null;
  }

  public int get(int key) {
        int hash = (key % TABLE_SIZE);
        while (table[hash] != null && table[hash].getKey() != key)
              hash = (hash + 1) % TABLE_SIZE;
        if (table[hash] == null)
              return -1;
        else
              return table[hash].getValue();
  }

  public void put(int key, int value) {
        int hash = (key % TABLE_SIZE);
        while (table[hash] != null && table[hash].getKey() != key)
              hash = (hash + 1) % TABLE_SIZE;
        table[hash] = new HashEntry(key, value);
  }
}

So is it correct that the put method, looks first at the table[hash], and if that is not empty and if what is in there has not got the key, being inputted in the method put, then it moves on to table[(hash + 1) % TABLE_SIZE]. But if it is the same key it simply overwrites the value. So is that correctly understood? And is it because the get and put method use the same method of looking up the place in the Array, that given the same key they would end up at the same place in the Array?

I know these questions might be a bit basic, but I have spend quite some time trying to get this sorted out, why any help would be much appreciated!

Edit

So now I have tried implementing the HashMap myself via a Node class, which just constructs a node with a key and a corresponding value, it has also got a getHashCode method, where I just concatenate the two values on each other.

I have also constructed a SinglyLinkedList (part of a previous assignment), which I use as the bucket.

And my Hash function is simply hashCode % hashMap.length.

Here is my own implementation, so what do you think of it?

package spreadsheet; 

public class HashTableMap {

  private SinglyLinkedListMap[] hashArray;
  private int size;


  public HashTableMap() {
    hashArray = new SinglyLinkedListMap[64];
    size = 0;  
  }


  public void insert(final Position key, final Expression value) {

      Node node = new Node(key, value); 
      int hashNumber = node.getHashCode() % hashArray.length;       
      SinglyLinkedListMap bucket = new SinglyLinkedListMap();
      bucket.insert(key, value);
      if(hashArray[hashNumber] == null) {
        hashArray[hashNumber] = bucket;
        size++; 
      }
      if(hashArray[hashNumber] != null) {
        SinglyLinkedListMap bucket2 = hashArray[hashNumber];
        bucket2.insert(key, value);
        hashArray[hashNumber] = bucket2;
        size++; 
      }
      if (hashArray.length == size) {
          SinglyLinkedListMap[] newhashArray = new SinglyLinkedListMap[size * 2];
      for (int i = 0; i < size; i++) {
          newhashArray[i] = hashArray[i];
      }
      hashArray = newhashArray;
    }
  } 

  public Expression lookUp(final Position key) {
      Node node = new Node(key, null); 
      int hashNumber = node.getHashCode() % hashArray.length;
      SinglyLinkedListMap foundBucket = hashArray[hashNumber];
      return foundBucket.lookUp(key); 
  }
 }

The look up time should be around O(1), so I would like to know if that is the case? And if not how can I improve it, in that regard?

like image 374
Per John Avatar asked Jan 28 '13 18:01

Per John


2 Answers

You have to have some plan to deal with hash collisions, in which two distinct keys fall in the same bucket, the same element of your array.

One of the simplest solutions is to keep a list of entries for each bucket.

If you have a good hashing algorithm, and make sure the number of buckets is bigger than the number of elements, you should end up with most buckets having zero or one items, so the list search should not take long. If the lists are getting too long it is time to rehash with more buckets to spread the data out.

like image 177
Patricia Shanahan Avatar answered Sep 20 '22 14:09

Patricia Shanahan


It really depends on how good your hashcode method is. Lets say you tried to make it as bad as possible: You made hashcode return 1 every time. If that were the case, you'd have an array of lists, but only 1 element of the array would have any data in it. That element would just grow to have a huge list in it.

If you did that, you'd have a really inefficient hashmap. But, if your hashcode were a little better, it'd distribute the objects into many different array elements and as a result it'd be much more efficient.

The most ideal case (which often isn't achievable) is to have a hashcode method that returns a unique number no matter what object you put into it. If you could do that, you wouldn't ever need an array of lists. You could just use an array. But since your hashcode isn't "perfect" it's possible for two different objects to have the same hashcode. You need to be able to handle that scenario by putting them in a list at the same array element.

But, if your hashcode method was "pretty good" and rarely had collisions, you rarely would have more than 1 element in the list.

like image 37
Daniel Kaplan Avatar answered Sep 20 '22 14:09

Daniel Kaplan