We have a pricing dataset that changes the contained values or the number of records. The number of added or removed records is small compared to the changes in values. The dataset usually has between 50 and 500 items with 8 properties.
We currently use AJAX to return a JSON structure that represents the dataset and update a webpage using this structure with the new values and where necessary removing or adding items.
We make the request with two hash values, one for the values and another for the records. These are MD5 hashes returned with the JSON structure to be sent with a following request. If there is a change to the hashes we know we need a new JSON structure otherwise the hashes are just returned to save bandwidth and eliminate unnecessary client-side processing.
As MD5 is normally used with encryption is the best choice of hashing algorithm for just detecting data changes?
What alternative ways can we detect a change to the values and update as well as detecting added or removed items and manipulating the page DOM accordingly?
Probably the one most commonly used is SHA-256, which the National Institute of Standards and Technology (NIST) recommends using instead of MD5 or SHA-1. The SHA-256 algorithm returns hash value of 256-bits, or 64 hexadecimal digits.
Open Hashing (Separate Chaining): It is the most commonly used collision hashing technique implemented using Lined List. When any two or more elements collide at the same location, these elements are chained into a single-linked list called a chain.
Common attacks like brute force attacks can take years or even decades to crack the hash digest, so SHA-2 is considered the most secure hash algorithm.
SHA-256 is a patented cryptographic hash function that outputs a value that is 256 bits long. What is hashing? In encryption, data is transformed into a secure format that is unreadable unless the recipient has a key. In its encrypted form, the data may be of unlimited size, often just as long as when unencrypted.
MD5 is a reasonable algorithm to detect changes to a set of data. However, if you're not concerned with the cryptographic properties, and are very concerned with the performance of the algorithm, you could go with a simpler checksum-style algorithm that isn't designed to be cryptographically secure. (though weaknesses in MD5 have been discovered in recent years, it's still designed to be cryptographically secure, and hence does more work than may be required for your scenario).
However, if you're happy with the computational performance of MD5, I'd just stick with it.
MD5 is just fine. Should it have too low performance, you can try fast checksum algorithm, such as for example Adler-32.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With