
Best Hash function for detecting data changes?

We have a pricing dataset in which either the contained values or the number of records can change. The number of added or removed records is small compared to the number of value changes. The dataset usually has between 50 and 500 items, each with 8 properties.

We currently use AJAX to return a JSON structure that represents the dataset, and we update a webpage from this structure with the new values, removing or adding items where necessary.

We make the request with two hash values, one for the values and another for the records. These are MD5 hashes that are returned with the JSON structure and sent back with the following request. If either hash has changed, we know we need a new JSON structure; otherwise only the hashes are returned, which saves bandwidth and eliminates unnecessary client-side processing.
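To make the scheme above concrete, here is a minimal server-side sketch in Python. It is an illustration only, not our actual code: the `id` field, the function names `dataset_hashes` and `respond`, and the response keys are all hypothetical, and it assumes each item is a JSON-serialisable dict.

```python
import hashlib
import json


def dataset_hashes(items):
    """Return (records_hash, values_hash) for a list of item dicts.

    Assumes every item carries a unique "id" key (a hypothetical field name).
    """
    # Sort so the hashes do not depend on the order the items arrive in.
    ordered = sorted(items, key=lambda item: item["id"])

    # Hash of the record identities only: changes when items are added or removed.
    ids = json.dumps([item["id"] for item in ordered])
    records_hash = hashlib.md5(ids.encode("utf-8")).hexdigest()

    # Hash of the full content: changes when any value changes
    # (and also when records are added or removed).
    body = json.dumps(ordered, sort_keys=True)
    values_hash = hashlib.md5(body.encode("utf-8")).hexdigest()

    return records_hash, values_hash


def respond(items, client_records_hash, client_values_hash):
    """Build the response: always the hashes, the items only if something changed."""
    records_hash, values_hash = dataset_hashes(items)
    response = {"records_hash": records_hash, "values_hash": values_hash}
    if (records_hash, values_hash) != (client_records_hash, client_values_hash):
        response["items"] = items
    return response
```

With this split, the client could compare the returned `records_hash` with the one it sent to decide whether it needs to add or remove DOM nodes, or only refresh values in place.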

As MD5 is normally used with encryption, is it the best choice of hashing algorithm for just detecting data changes?

In what alternative ways could we detect a change to the values and update them, as well as detect added or removed items and manipulate the page DOM accordingly?

Dave Anderson asked Apr 16 '09 14:04




2 Answers

MD5 is a reasonable algorithm for detecting changes to a set of data. However, if you're not concerned with the cryptographic properties and are very concerned with the performance of the algorithm, you could go with a simpler checksum-style algorithm that isn't designed to be cryptographically secure. Although weaknesses in MD5 have been discovered in recent years, it was still designed to be cryptographically secure, and hence does more work than may be required for your scenario.

However, if you're happy with the computational performance of MD5, I'd just stick with it.
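One way to check whether MD5 is fast enough for a dataset of the size described in the question is simply to time it. The following is a rough sketch, assuming a synthetic dataset of 500 items with 8 properties serialised as JSON; the names `make_dataset` and `md5_seconds_per_call` are made up for this example, and the numbers will vary by machine.

```python
import hashlib
import json
import timeit


def make_dataset(n_items=500, n_props=8):
    """Build a synthetic dataset: n_items dicts with n_props numeric properties each."""
    return [
        {f"p{j}": i * n_props + j for j in range(n_props)}
        for i in range(n_items)
    ]


# Serialise once, so that only the hashing itself is timed.
payload = json.dumps(make_dataset(), sort_keys=True).encode("utf-8")


def md5_seconds_per_call(data, repeats=1000):
    """Return the average time in seconds of one MD5 pass over data."""
    total = timeit.timeit(lambda: hashlib.md5(data).hexdigest(), number=repeats)
    return total / repeats


if __name__ == "__main__":
    micros = md5_seconds_per_call(payload) * 1e6
    print(f"{len(payload)} bytes hashed in about {micros:.1f} microseconds per call")
```

If the measured time is small compared with the cost of building and sending the JSON, there is little to gain from switching algorithms.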

Jonathan Rupp answered Oct 07 '22 01:10


MD5 is just fine. If its performance turns out to be too low, you can try a fast checksum algorithm such as Adler-32.
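As a rough illustration of what the swap might look like in Python, the standard library's `zlib.adler32` can stand in for `hashlib.md5`. The helper names below are hypothetical. Keep in mind that Adler-32 yields only 32 bits against MD5's 128, so the chance of a change going undetected is higher.

```python
import hashlib
import zlib


def md5_hex(data: bytes) -> str:
    """128-bit MD5 digest as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


def adler32_hex(data: bytes) -> str:
    """32-bit Adler-32 checksum as 8 hexadecimal characters."""
    # Mask to an unsigned 32-bit value so the formatting is consistent.
    return format(zlib.adler32(data) & 0xFFFFFFFF, "08x")


def has_changed(previous_checksum: str, data: bytes) -> bool:
    """Report whether data no longer matches the checksum the client sent."""
    return adler32_hex(data) != previous_checksum
```

The server would return `adler32_hex(payload)` with each response and call `has_changed` with the value the client sends back on the next request.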

vartec answered Oct 06 '22 23:10