Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I create an unique random sequence of characters in C#?

Tags:

string

c#

random

I'm implementing a URL shortening feature in my application in order to provide my users shorter alternative URLs that can be used in Twitter. The point is to be independent from the shortening services that offer this same service and include it as a feature of my web app.

What's the best way to create an unique random sequence of characters of about 6 chars? I plan to use that as an index for the items in my database that will have the alternative URLs.

Edited:

This feature will be used in a job board website, where every new job ad will get a custom URL with the title plus the shorter one to be used in Twitter. That said, the total number of unique 6 char combinations will be more than enough for a long time.

like image 373
holiveira Avatar asked Aug 14 '09 00:08

holiveira


People also ask

How to generate random string in C?

The rand() function generates the random alphabets in a string and srand() function is used to seed the rand() function. Initially, we have a program that sets the array of alphabets size as “ch_Max,” which is of int char data type.


5 Answers

Do you really need 'random', or would 'unique' be sufficient?

Unique is extremely simple - just insert the URL into a database, and convert the sequential id for that record to a base-n number which is represented by your chosen characterset.

For example, if you want to only use [A-Z] in your sequence, you convert the id of the record to a base 26 number, where A=1, B=2,... Z=26. The algothithm is a recursive div26/mod26, where the quotient is the required character and the remainder is used to calculate the next character.

Then when retrieving URL, you perform the inverse function, which is to convert the base-26 number back to decimal. Perform SELECT URL WHERE ID = decimal, and you're done!

EDIT:

private string alphabet = "abcdefghijklmnopqrstuvwxyz"; 
   // or whatever you want.  Include more characters 
   // for more combinations and shorter URLs

public string Encode(int databaseId)
{
    string encodedValue = String.Empty;

    while (databaseId > encodingBase)
    {
        int remainder;
        encodedValue += alphabet[Math.DivRem(databaseId, alphabet.Length, 
            out remainder)-1].ToString();
        databaseId = remainder;
    }
    return encodedValue;
}

public int Decode(string code)
{
    int returnValue;

    for (int thisPosition = 0; thisPosition < code.Length; thisPosition++)
    {
        char thisCharacter = code[thisPosition];

        returnValue += alphabet.IndexOf(thisCharacter) * 
            Math.Pow(alphabet.Length, code.Length - thisPosition - 1);
    }
    return returnValue;
}
like image 116
Kirk Broadhurst Avatar answered Oct 12 '22 22:10

Kirk Broadhurst


The simplest way to make unique sequences is to do this sequentially, ie: aaaaaa aaaaab aaaaac ... These aren't necessarily the prettiest, but will guarantee uniqueness for the first 12230590463 sequences (provided you used a-z and A-Z as unique characters). If you need more URLs than that, you'd need to add a seventh char.

They aren't random sequences, though. If you make random ones, just pick a random char of the 48, 6 times. You'll need to check your existing DB for "used" sequences, though, as you'll be more likely to get collisions.

like image 45
Reed Copsey Avatar answered Oct 12 '22 22:10

Reed Copsey


I would use an autonumber system, and create an algorithm to generate the keys. ie 1 = a, 2 = b, 27 = aa etc.

You can use the database autonumber to guarantee that your URL is unique, and you can calculate the URL possibly in a sproc in the DB or in your business layer?

Additionally you can now index on the incrementing number which is cheap and DB's are optimised for these to be used and hashed as primary/foreign keys as opposed to a variable length random string.

like image 27
Spence Avatar answered Oct 12 '22 22:10

Spence


The usefulness of a random generator is limited to preventing users from plugging random URLs in to find things they shouldn't have a link to. If this is not your goal then sequential IDs should work just fine. If you just don't want to give users the impression that they are using "infant" technology (when they see that their job ad is #000001), why not start the sequence at some arbitrary value?

like image 24
Tullo_x86 Avatar answered Oct 12 '22 23:10

Tullo_x86


When you state "total number of unique 6 char combinations will be more than enough for a long time" for your random generation have you factored the birthday paradox into your calculations? This is generally the bane of any attempt to create random IDs within a range that is only 1 order of magnitude or less then the expected range that will be needed.

To create truly random IDs, you would need to create a loop that generates a new random value, checks to see if that value has already been used, and then repeats the loop if needed. The birthday paradox means that you quickly get to the point where many of the values generated are already in use (despite only a fraction of the total range being consumed), which causes the program to get slower and slower over time until it is taking thousands of attempts (and database lookups) to generate each ID.

I would suggest you go with the idea of encoding sequential IDs. To avoid the problem of users being able to simply increment/decrement the value in the URL to "explore", you can use a combination bit shifting and an alternate ordered list of letters (instead of 1=a, 2=b use 1=t, 2=j, etc).

like image 31
David Avatar answered Oct 13 '22 00:10

David