Unique string for each record in the database table

In my ASP.NET MVC 5 project I use Entity Framework Code First to work with an MS SQL database. Suppose this is the table:

public class Ticket
{
    [Key]
    public int Id { get; set; }

    [Required]
    public string ReferenceCode { get; set; }

    //Rest of the table
}

In this table, whenever I add a new record I want the ReferenceCode column to hold a unique, random alphanumeric string (containing only letters and digits) of a specific length. This will allow users to refer to a specific ticket, for instance.

These are some examples with a length of 10 characters: TK254X26W1, W2S564Z111, 1STT135PA5...

Right now, I'm able to generate random strings with the given length. However, I'm not sure how to guarantee their uniqueness. I do it like this:

db.Tickets.Add(new Ticket()
{
   ReferenceCode = GenerateRandomCode(10),
   //...
});

To be exact, I want the GenerateRandomCode function or maybe another method to be able to make sure the generated string has not been used for another record.

I could use a loop to check each generated code against the existing ones, but I don't think that's a good idea, especially once the table has thousands of records.

Alireza Noori asked Mar 07 '17 00:03



2 Answers

You can use a Guid to generate unique keys (though they are not random enough where security matters).

Pulling from this SO question:

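Guid g = Guid.NewGuid();
// 16 bytes encode to 24 Base64 characters, the last two being '=' padding.
string GuidString = Convert.ToBase64String(g.ToByteArray());
GuidString = GuidString.Replace("=", "");
GuidString = GuidString.Replace("+", "");
GuidString = GuidString.Replace("/", "");  // '/' is also in the Base64 alphabet and is not alphanumeric
GuidString = GuidString.ToUpper();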

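This generates a key that suits your ReferenceCode property but is longer (up to 22 characters). Strictly speaking, stripping characters and upper-casing already throw away some of the Guid's bits, and truncating the result to X characters would no longer guarantee its uniqueness at all.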

OZVV5TPP4U6XJTHACORZEQ

pijemcolu answered Oct 02 '22 22:10


Mind an off-the-beaten-path solution? You've got two needs, that I can see:

  • Randomness. You can't have a "deterministic" function, because if someone can guess the algorithm, they could figure out everyone else's ticket numbers.

  • Uniqueness. You can't have any duplicate ticket numbers - which makes Random a bit difficult (you'll have to account for collisions and retry.)
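For comparison, the collide-and-retry approach mentioned in the second bullet could look roughly like the sketch below. It is only an illustration: `RetryingCodeGenerator`, `NextUniqueCode` and the `maxAttempts` parameter are made-up names, `GenerateRandomCode` is a stand-in for the asker's own function, and an in-memory set plays the role that the database (ideally a unique index on ReferenceCode) would play in a real application.

```csharp
using System;
using System.Collections.Generic;

public static class RetryingCodeGenerator
{
    private const string Alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static readonly Random Rng = new Random();

    // Hypothetical stand-in for the asker's GenerateRandomCode(length).
    public static string GenerateRandomCode(int length)
    {
        var chars = new char[length];
        for (int i = 0; i < length; i++)
            chars[i] = Alphabet[Rng.Next(Alphabet.Length)];
        return new string(chars);
    }

    // Draws codes until one is not yet in 'used', giving up after maxAttempts.
    // 'used' is an assumption standing in for the existing ReferenceCode values.
    public static string NextUniqueCode(ISet<string> used, int length, int maxAttempts = 10)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            string code = GenerateRandomCode(length);
            if (used.Add(code))   // Add returns false if the code is already present
                return code;
        }
        throw new InvalidOperationException("Could not find an unused code.");
    }
}
```

With 36^10 possible codes, a collision is rare, so the loop almost always finishes on the first attempt; the retry only matters as a safety net.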

But there's no reason you can't do both - you've got plenty of space with 36^10 possible codes. You could dedicate 4 characters to uniqueness and 6 characters to randomness. Here's some sample code:

using System;
using System.Linq;
using System.Windows.Forms;

public partial class Form1 : Form
{
  // Note: System.Random is neither thread-safe nor cryptographically secure.
  private static Random random = new Random();
  private static int largeCoprimeNumber = 502277;     // shares no factor with 36^4 (odd, not divisible by 3)
  private static int largestPossibleValue = 1679616;  // 36 ^ 4

  private static char[] Base36Alphabet = new char[] { '0','1','2','3','4','5','6','7','8','9',
    'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z' };

  // Builds a 10-character ticket: 4 characters derived from id, then 6 random ones.
  public static string GetTicket(int id)
  {
    // Multiply as long: id * largeCoprimeNumber overflows int once id exceeds 4,275.
    int adjustedID = (int)((long)id * largeCoprimeNumber % largestPossibleValue);
    string ticket = IntToString(adjustedID);
    while (ticket.Length < 4) ticket = "0" + ticket;
    return ticket + new string(Enumerable.Repeat(Base36Alphabet, 6)
      .Select(s => s[random.Next(s.Length)]).ToArray());
  }

  // Converts a non-negative int to its base-36 representation.
  private static string IntToString(int value)
  {
    string result = string.Empty;
    int targetBase = Base36Alphabet.Length;

    do
    {
      result = Base36Alphabet[value % targetBase] + result;
      value = value / targetBase;
    }
    while (value > 0);

    return result;
  }
}

Quick rundown of what the code's doing. You pass in your int id, which it then scrambles in such a way that the result looks random but is guaranteed never to repeat for the first 1.68 million entries (36^4 = 1,679,616).

It then takes this scrambled int value and turns it into a 4-character code; this is the "uniqueness part" - you're guaranteed a different 4-character code at the beginning of each of the first 1.68 million IDs (the magic of coprime numbers.)
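The coprime claim can be checked directly. The sketch below is an illustration rather than part of the original answer (`CoprimeCheck`, `Scramble` and `CountDistinctOutputs` are made-up names): it confirms that 502277 and 36^4 share no common factor and that multiplying by 502277 modulo 36^4 sends every id in 0..1,679,615 to a different value. It widens to long before multiplying, since id * 502277 no longer fits in an int once id exceeds 4,275.

```csharp
using System;

public static class CoprimeCheck
{
    private const int Multiplier = 502277;   // the answer's largeCoprimeNumber
    private const int Modulus = 1679616;     // 36^4

    // Euclid's algorithm.
    public static int Gcd(int a, int b)
    {
        while (b != 0) { int t = a % b; a = b; b = t; }
        return a;
    }

    // Same mapping as in GetTicket, computed in 64-bit to avoid overflow.
    public static int Scramble(int id)
    {
        return (int)((long)id * Multiplier % Modulus);
    }

    // Counts how many distinct values Scramble produces for ids 0..Modulus-1.
    public static int CountDistinctOutputs()
    {
        var seen = new bool[Modulus];
        int distinct = 0;
        for (int id = 0; id < Modulus; id++)
        {
            int value = Scramble(id);
            if (!seen[value]) { seen[value] = true; distinct++; }
        }
        return distinct;
    }
}
```

Because the greatest common divisor is 1, the mapping is a permutation of 0..36^4 - 1, so `CountDistinctOutputs()` equals the modulus. From id 1,679,616 onward the prefixes start repeating.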

That leaves 6 more characters to play with. Just fill them in with random characters - that makes the whole 10-character code awfully difficult to guess.

This solves both of your problems. It's guaranteed to be unique for the first 1.68 million records. And it's not really "guessable" by the client, since even if they guessed the algorithm, they'd have about 2.18 billion (36^6) different possibilities for any given ID they wanted to crack.
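How hard the suffix is to guess depends on the random source. System.Random is not designed for security purposes, so if guessability matters, one option is to draw the 6 characters from a cryptographic generator instead. The following is a minimal sketch under that assumption; `SecureSuffix` and `Generate` are made-up names, not part of the original answer.

```csharp
using System;
using System.Security.Cryptography;

public static class SecureSuffix
{
    private const string Alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Number of possible 6-character suffixes: 36^6 = 2,176,782,336.
    public const long SuffixCombinations = 36L * 36 * 36 * 36 * 36 * 36;

    // Returns 'length' characters drawn uniformly from the base-36 alphabet.
    public static string Generate(int length)
    {
        var result = new char[length];
        var buffer = new byte[1];
        using (var rng = RandomNumberGenerator.Create())
        {
            int filled = 0;
            while (filled < length)
            {
                rng.GetBytes(buffer);
                // 252 = 7 * 36. Bytes 252..255 are rejected so that
                // every character is equally likely (no modulo bias).
                if (buffer[0] >= 252) continue;
                result[filled++] = Alphabet[buffer[0] % Alphabet.Length];
            }
        }
        return new string(result);
    }
}
```

In GetTicket, the `Enumerable.Repeat(...)` expression could then be swapped for `SecureSuffix.Generate(6)`.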

Kevin answered Oct 03 '22 00:10