Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hash : How does it work internally?

This might sound as an very vague question upfront but it is not. I have gone through Hash Function description on wiki but it is not very helpful to understand.

I am looking simple answers for rather complex topics like Hashing. Here are my questions:

  1. What do we mean by hashing? How does it work internally?
  2. What algorithm does it follow ?
  3. What is the difference between HashMap, HashTable and HashList ?
  4. What do we mean by 'Constant Time Complexity' and why does different implementation of the hash gives constant time operation ?
  5. Lastly, why in most interview questions Hash and LinkedList are asked, is there any specific logic for it from testing interviewee's knowledge?

I know my question list is big but I would really appreciate if I can get some clear answers to these questions as I really want to understand the topic.

like image 738
Rachel Avatar asked Dec 15 '10 18:12

Rachel


People also ask

How hash function works internally?

Hashing means generating a (hopefully) unique number that represents a value. Different types of values ( Integer , String , etc) use different algorithms to compute a hashcode. HashMap and HashTable are maps; they are a collection of unqiue keys, each of which is associated with a value.

How does the hashing process work?

Hashing is the process of transforming any given key or a string of characters into another value. This is usually represented by a shorter, fixed-length value or key that represents and makes it easier to find or employ the original string. The most popular use for hashing is the implementation of hash tables.

What are internal hashing techniques?

For internal files, hashing is typically implemented as a hash table through the use of an array of records. For internal files, hashing is typically implemented as a hash table through the use of an array of records.

What is hash and digest?

The output of a hash function (e.g., hash(data) = digest). Also known as a message digest, digest or harsh value. The number of cryptographic has functions a processor can calculate in a given time, usually denominated as hashes per second.


1 Answers

  1. Here is a good explanation about hashing. For example you want to store the string "Rachel" you apply a hash function to that string to get a memory location. myHashFunction(key: "Rachel" value: "Rachel") --> 10. The function may return 10 for the input "Rachel" so assuming you have an array of size 100 you store "Rachel" at index 10. If you want to retrieve that element you just call GetmyHashFunction("Rachel") and it will return 10. Note that for this example the key is "Rachel" and the value is "Rachel" but you could use another value for that key for example birth date or an object. Your hash function may return the same memory location for two different inputs, in this case you will have a collision you if you are implementing your own hash table you have to take care of this maybe using a linked list or other techniques.

  2. Here are some common hash functions used. A good hash function satisfies that: each key is equally likely to hash to any of the n memory slots independently of where any other key has hashed to. One of the methods is called the division method. We map a key k into one of n slots by taking the remainder of k divided by n. h(k) = k mod n. For example if your array size is n = 100 and your key is an integer k = 15 then h(k) = 10.

  3. Hashtable is synchronised and Hashmap is not. Hashmap allows null values as key but Hashtable does not.

  4. The purpose of a hash table is to have O(c) constant time complexity in adding and getting the elements. In a linked list of size N if you want to get the last element you have to traverse all the list until you get it so the complexity is O(N). With a hash table if you want to retrieve an element you just pass the key and the hash function will return you the desired element. If the hash function is well implemented it will be in constant time O(c) This means you dont have to traverse all the elements stored in the hash table. You will get the element "instantly".

  5. Of couse a programer/developer computer scientist needs to know about data structures and complexity =)

like image 187
Enrique Avatar answered Sep 26 '22 22:09

Enrique