i want to uniquely shorten strings-file ids to use in urls like the ones on bit.ly etc. I can use ids from a db but i want urls to be random like.
what would be the best solution?
site will be a mobile site so i want to it to as short as possible
C programming language is a machine-independent programming language that is mainly used to create many types of applications and operating systems such as Windows, and other complicated programs such as the Oracle database, Git, Python interpreter, and games and is considered a programming foundation in the process of ...
In the real sense it has no meaning or full form. It was developed by Dennis Ritchie and Ken Thompson at AT&T bell Lab. First, they used to call it as B language then later they made some improvement into it and renamed it as C and its superscript as C++ which was invented by Dr.
C is a general-purpose language that most programmers learn before moving on to more complex languages. From Unix and Windows to Tic Tac Toe and Photoshop, several of the most commonly used applications today have been built on C. It is easy to learn because: A simple syntax with only 32 keywords.
What is C? C is a general-purpose programming language created by Dennis Ritchie at the Bell Laboratories in 1972. It is a very popular language, despite being old. C is strongly associated with UNIX, as it was developed to write the UNIX operating system.
You can't "uniquely shorten" arbitrary strings. Pigeonhole principle and all.
What you want to do (and, AFAIK what url-shortening services do) is keep a database of everything submitted, and the short string used. Then you can look it up in the database.
You can generate the short strings by simply incrementing a number and Base64 encoding it for each time.
There are two methods to implementing a mapping service like the one you describe.
Clients submit globally unique ids
As far as I know, 1. should only be attempted with Guid
s, unless you devise a similar means to cram sufficiently distinct information into a short byte stream. Either way, if you have a stream of bytes that represent a globally unique identifier, you may do something like this
// source is either a Guid, or some other globally unique byte stream
byte[] bytes = Guid.NewGuid ().ToByteArray ();
string base64String = Convert.ToBase64String (bytes).Trim ("=");
to obtain a user-readable string of alphanumerics that appears random, but avoids collisions inherent in other random schemes. A Guid
contains 16 bytes, or 128 bits, which translates to approximately 19 characters for a full Base64 encoding.
The advantage to this approach is that clients may generate their own tiny Uris without a central authority. The downside is hefty length if you roll with Guid
, or implementing your own globally unique byte stream which - let's face it - is error prone.
If you do go this route, consider Google'ing globally unique byte streams or the such. Oh, and STAY AWAY FROM RANDOM BYTES, otherwise you will have to build collision resolution ON TOP OF your tiny Uri generator.
Server generates globally unique ids
Again, the primary advantage to the above is that Client's may generate their Uris a priori. Particularly handy if you are about to submit a long running request you wish to check up on. This may not be particularly relevant to your situation, and may provide only limited value.
So, that aside, a server-centric approach, in which a single authority generates and doles out ids may be more appealing. If this is the route you choose, then the only question is how long would you like your Uri?
Presuming a desired length of 5 characters, and let's say you go with a Base64 encoding, each id may represent up to 5 characters by 7 bits per character equals 35 bits or 2^35 [34 359 738 368] distinct values. That's a fairly large domain. *
Then it becomes a question of returning a value for a given submission. There are probably a great many many ways to do this, but I would go with something like this,
Enhancements or optimizations may include
Conclusion
Bottom line is, you want to guarantee uniqueness - so collisions are a big no-no.
*=34 359 738 368 is the size of the raw domain, this is all ids of 0 length to 5 length. If you are interested in restricting all ids to a minimum and maximum of 5 length, then your domain looks like all ids of length 0 to 5 (2^35) less all ids of length 0 to 4 (2^28) is 2^35 - 2^28 = 34 091 302 912, which is still quite large :)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With