
Calculate a checksum for a string

Tags:

c#

I have a string of arbitrary length (let's say 5 to 2000 characters) for which I would like to calculate a checksum.

Requirements

  • The same checksum must be returned each time a calculation is done for a string
  • The checksum must be unique (no collisions)
  • I cannot store previous IDs to check for collisions

Which algorithm should I use?

Update:

  • Is there an approach that is reasonably unique, i.e. one where the likelihood of a collision is very small?
  • The checksum should be alphanumeric
  • The strings are unicode
  • The strings are actually texts that should be translated and the checksum is stored with each translation (so a translated text can be matched back to the original text).
  • The length of the checksum is not important for me (the shorter, the better)

Update2

Let's say that I have the following string: "Welcome to this website. Navigate using the flashy but useless menu above".

The string is used in a view in a similar way to gettext on Linux, i.e. the user just writes (in a Razor view)

@T("Welcome to this website. Navigate using the flashy but useless menu above") 

Now I need a way to identify that string so that I can fetch it from a data source (there are several implementations of the data source). Using the entire string as a key seems inefficient, so I'm looking for a way to generate a key from it.

jgauffin asked Mar 23 '12 10:03



2 Answers

That's not possible.

If you can't store previous values, it's not possible to create a unique checksum that is smaller than the information in the string.

Update:

The term "reasonably unique" doesn't make sense; either it's unique or it's not.

To get a reasonably low risk of hash collisions, you can use a reasonably large hash code.
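As a rough illustration of how the hash size drives the collision risk, the usual birthday-bound approximation can be applied. This is a sketch that assumes the hash behaves like a uniformly random function; the figure of one million strings is an illustrative assumption, not something stated in the question.

```latex
% n = number of distinct strings hashed, b = hash size in bits
p_{\text{collision}} \approx 1 - e^{-n(n-1)/2^{b+1}} \approx \frac{n^2}{2^{b+1}}

% Example: a 128-bit hash (b = 128) and n = 10^6 strings
p_{\text{collision}} \approx \frac{(10^{6})^{2}}{2^{129}} \approx 1.5 \times 10^{-27}
```

Under these assumptions, each extra bit of hash halves the collision probability, which is why a shorter checksum always trades away some safety margin.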

The MD5 algorithm for example produces a 16 byte hash code. Convert the string to a byte array using some encoding that preserves all characters, for example UTF-8, calculate the hash code using the MD5 class, then convert the hash code byte array into a string using the BitConverter class:

    using System;
    using System.Security.Cryptography;
    using System.Text;

    string theString = "asdf";
    string hash;

    using (MD5 md5 = MD5.Create())
    {
        // UTF-8 bytes -> 16-byte MD5 hash -> "91-2E-..." -> hex without dashes
        hash = BitConverter.ToString(
            md5.ComputeHash(Encoding.UTF8.GetBytes(theString))
        ).Replace("-", String.Empty);
    }

    Console.WriteLine(hash);

Output:

912EC803B2CE49E4A541068D495AB570 
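For the translation-lookup use case described in the question, the same steps could be wrapped in a small helper. The following is only a sketch: the class name `TextKey` and the method `Compute` are hypothetical, and the Unicode normalization step is an added assumption (it makes composed and decomposed forms of the same text produce the same key), not part of the answer above.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; the names TextKey and Compute are illustrative only.
public static class TextKey
{
    // Returns a 32-character upper-case hex key (MD5) for the given text.
    public static string Compute(string text)
    {
        // Assumption: normalize to NFC so that visually identical Unicode
        // strings map to the same byte sequence before hashing.
        string normalized = text.Normalize(NormalizationForm.FormC);
        byte[] bytes = Encoding.UTF8.GetBytes(normalized);

        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(bytes);
            // BitConverter yields "91-2E-C8-..."; strip the dashes.
            return BitConverter.ToString(hash).Replace("-", string.Empty);
        }
    }
}
```

A call such as `TextKey.Compute("Welcome to this website. Navigate using the flashy but useless menu above")` would then give the key to store alongside each translation. The result contains only the characters 0-9 and A-F, so it meets the alphanumeric requirement.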
Guffa answered Oct 10 '22 03:10


You can use cryptographic hash functions for this. Most of them are available in .NET.

For example:

    var sha1 = System.Security.Cryptography.SHA1.Create();
    byte[] buf = System.Text.Encoding.UTF8.GetBytes("test");
    byte[] hash = sha1.ComputeHash(buf, 0, buf.Length);
    //var hashstr = Convert.ToBase64String(hash);
    var hashstr = System.BitConverter.ToString(hash).Replace("-", "");
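The two string encodings that appear in the snippet above (hex and the commented-out Base64) trade length against character set. A minimal sketch for comparing them follows; the class name `HashEncodings` and its method names are hypothetical and not part of the answer.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for comparing hash string encodings.
public static class HashEncodings
{
    // SHA-1 over the UTF-8 bytes of the text (20-byte result).
    public static byte[] Sha1Of(string text)
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            return sha1.ComputeHash(Encoding.UTF8.GetBytes(text));
        }
    }

    // Hex: two characters per byte, only 0-9 and A-F.
    public static string ToHex(byte[] hash)
    {
        return BitConverter.ToString(hash).Replace("-", "");
    }

    // Base64: shorter, but may contain '+', '/' and '=' padding.
    public static string ToBase64(byte[] hash)
    {
        return Convert.ToBase64String(hash);
    }
}
```

For a 20-byte SHA-1 hash, the hex form is 40 characters and the Base64 form is 28. Since Base64 output can include `+`, `/` and `=`, hex is the safer choice when the checksum has to be strictly alphanumeric.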
L.B answered Oct 10 '22 03:10