Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I create a column that is a 0-9 hash of text in another column?

Our application has the following table definition:

CREATE TABLE [dbo].[Phrase] (
    [PhraseId]    UNIQUEIDENTIFIER DEFAULT (newid()) NOT NULL,
    [English]     NVARCHAR (250)   NOT NULL,
    [EnglishHash] AS (CONVERT([bigint],hashbytes('md5',[English])%(5)+(5))) PERSISTED,
    PRIMARY KEY CLUSTERED ([PhraseId] ASC)
);

The intention was for the EnglishHash column to be a value of either 0,1,2,3,4,5,6,7,8, or 9

However it's only giving values: 1,2,3,4,5,6,7,8, or 9

Can anyone help to explain how I can modify this so it gives values 0-9 inclusive?

Note that I tried out the suggestion by Sandip. This gives me a distribution but over 11,000 records the distribution is not what I expected:

0   593
9   652
3   1324
6   1253
7   1293
1   1932
4   1325
5   1282
2   1295
8   635
like image 853
Alan2 Avatar asked Dec 23 '22 18:12

Alan2


1 Answers

Your results only allow you to have 9 distinct values because you are taking the modulus of 5. Look at the below results. if you were to continue taking the modulus, you'll see it cannot be any integer greater than 4 and no less than -4 (this is before adding the last +5). It's just going to start looping. Instead why don't you take modulus of 10 of the absolute value of the bigint value of the binary hash.

SELECT 0%5
    , 1%5
    , 2%5
    , 3%5
    , 4%5
    , 5%5
    , 6%5

SELECT 0%5
    , -1%5
    , -2%5
    , -3%5
    , -4%5
    , -5%5
    , -6%5

Try using this instead

ABS(CONVERT(bigint, HASHBYTES('md5',[English])))%10

Here's a nice example using the system error messages for some random text.

SELECT ABS(CONVERT(bigint, HASHBYTES('md5',[text])))%10 AS 'Result'
    , COUNT(*) AS 'Distribution'
from sys.messages
GROUP BY ABS(CONVERT(bigint, HASHBYTES('md5',[text])))%10
ORDER BY ABS(CONVERT(bigint, HASHBYTES('md5',[text])))%10

Results:

Result  Distribution
0   25326
1   25218
2   25115
3   25322
4   25167
5   25322
6   25278
7   25119
8   25139
9   25158
like image 102
EMUEVIL Avatar answered Dec 27 '22 06:12

EMUEVIL