Creating unique hash code (string) in SQL Server from a combination of two or more columns (of different data types)

Tags: sql, sql-server

I would like to create a unique string column (32 characters in length) from a combination of columns with different data types in SQL Server 2005.

Nagesh asked Mar 01 '11 04:03



2 Answers

I found the solution elsewhere on Stack Overflow:

SELECT SUBSTRING(master.dbo.fn_varbintohexstr(HashBytes('MD5', 'HelloWorld')), 3, 32)

The answer thread is here
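
To connect this back to the original question (several columns of different data types), here is a minimal sketch rather than a definitive implementation. The column names `CustomerId`, `OrderDate`, and `Amount` and the sample rows are hypothetical. Each column is converted to `nvarchar` and joined with a `|` delimiter before hashing, so that values such as ('ab', 'c') and ('a', 'bc') do not produce the same input string. The `+` operator is used because `CONCAT` is not available in SQL Server 2005.

```sql
-- Sketch only: table, column names, and sample rows are hypothetical.
SELECT
    t.CustomerId,
    t.OrderDate,
    t.Amount,
    SUBSTRING(
        master.dbo.fn_varbintohexstr(
            HASHBYTES('MD5',
                -- Convert every column to nvarchar; ISNULL keeps one NULL
                -- column from turning the whole concatenation into NULL.
                ISNULL(CONVERT(nvarchar(20), t.CustomerId), N'') + N'|' +
                ISNULL(CONVERT(nvarchar(30), t.OrderDate, 126), N'') + N'|' +
                ISNULL(CONVERT(nvarchar(40), t.Amount), N'')
            )
        ), 3, 32) AS RowHash  -- skip the leading '0x', keep 32 hex characters
FROM (
    SELECT 1 AS CustomerId,
           CAST('20110301' AS datetime) AS OrderDate,
           CAST(19.99 AS decimal(10, 2)) AS Amount
    UNION ALL
    SELECT 2,
           CAST('20110301' AS datetime),
           CAST(19.99 AS decimal(10, 2))
) AS t;
```

Note that with this approach a NULL and an empty string hash to the same value, and the delimiter only helps if it cannot appear inside the data. Also keep in mind that `HASHBYTES` accepts at most 8000 bytes of input in this version of SQL Server.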

Nagesh answered Oct 13 '22 10:10


With HASHBYTES you can create SHA1 hashes, which are 20 bytes long, and MD5 hashes, which are 16 bytes long. There are various combination algorithms that can produce output of arbitrary length by repeated hash operations, like the PRF of TLS (see RFC 2246).
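
As a quick check of those sizes, the following sketch reports the byte length of each hash and the character length once it is hex-encoded. It reuses the `master.dbo.fn_varbintohexstr` call from the other answer; the column aliases are made up for illustration.

```sql
SELECT
    DATALENGTH(HASHBYTES('MD5',  'HelloWorld')) AS Md5Bytes,    -- 16
    DATALENGTH(HASHBYTES('SHA1', 'HelloWorld')) AS Sha1Bytes,   -- 20
    -- Each byte becomes two hex characters once the '0x' prefix is dropped.
    LEN(SUBSTRING(master.dbo.fn_varbintohexstr(
            HASHBYTES('MD5',  'HelloWorld')), 3, 32)) AS Md5HexChars,   -- 32
    LEN(SUBSTRING(master.dbo.fn_varbintohexstr(
            HASHBYTES('SHA1', 'HelloWorld')), 3, 40)) AS Sha1HexChars;  -- 40
```

So a 32-character hexadecimal string corresponds to a 16-byte (128-bit) MD5 hash, not to 32 bytes of hash output.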

This should be enough to get you started. You need to define what '32 characters' means, since hash functions produce bytes, not characters. You also need to internalize that no algorithm can possibly produce fixed-length hashes without collisions (that is, guaranteed 'unique'). At 32 bytes of output (assuming that by 'characters' you mean bytes), the theoretical collision probability reaches 50% at about 4×10^38 hashed elements (see the birthday problem), but that assumes a perfectly uniform distribution for your 32-byte hash function, which you're not going to achieve.
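
For reference, the birthday-problem figure can be reproduced with the standard approximation for the number of elements at which the collision probability reaches 50%, assuming an ideal hash that is uniform over N possible outputs. The second line is the 32-byte case discussed above; the third is the same calculation for a 16-byte hash such as MD5.

```latex
n_{50\%} \approx \sqrt{2 \ln 2 \cdot N} \approx 1.1774\,\sqrt{N}

N = 2^{256}\ (\text{32 bytes}): \quad n_{50\%} \approx 1.1774 \cdot 2^{128} \approx 4.0 \times 10^{38}

N = 2^{128}\ (\text{16 bytes}): \quad n_{50\%} \approx 1.1774 \cdot 2^{64} \approx 2.2 \times 10^{19}
```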

Remus Rusanu answered Oct 13 '22 10:10