My ASP.NET page has following query string parameter:
…?IDs=1000000012,1000000021,1000000013,1000000022&...
Here IDs
parameter will always have numbers separated by something, in this case ,
. Currently there are 4 numbers but normally they would be in between 3
and 7
.
Now, I am looking for method to convert each big number from above into smallest possible value; specifically compressing value of IDs
query string parameter. Both, compressing each number algorithm or compressing whole value of IDs
query string parameter are welcome.
IDs
query string parameter.IDs
and then retrieving its value from some data source is out of scope.Is there an algorithm to compress such big numbers to small values or to compress value of the IDs
query string parameter all together?
Well, yes, a big number can be expressed mathematically, and potentially save some space doing so. So if you convert each character to its ASCII value, then every character is expanded from a single byte to 1, 2, or 3 bytes. That is, 'A' becomes '65'. 'z' becomes '122'.
"String Compression Algorithm” or “Run Length Encoding” happens when you compress a string, and the consecutive duplicates of each string are replaced with the character, followed by the consecutive, repeated character count. For example: After string compression, the string “aaaabbcddddd” would return “a4b2c1d5”.
Algorithm for string compression in pythonPick the first character from the input string ( str ). Append it to the compressed string. Count the number of subsequent occurrences of the character (in str) and append the count to the compressed string if it is more than 1 only.
string compression in java can be performed using a ZLIB compression library. It offers some distinct features to effectively compress string data in java. Although the compression rate could vary based on the factors such as the amount of compression required, length of data and repetitions in string data.
You basically need so much room for your numbers because you are using base 10 to represent them. An improvement would be to use base 16 (hex). So for example, you could represent 255 (3 digits) as ff (2 digits).
You can take that concept further by using a much larger number base... the set of all characters that are valid query string parameters:
A-Z, a-z, 0-9, '.', '-', '~', '_', '+'
That gives you a base of 67 characters to work with (see Wikipedia on QueryString).
Have a look at this SO post for approaches to converting base 10 to arbitrary number bases.
EDIT:
In the linked SO post, look at this part:
string xx = IntToString(42,
new char[] { '0','1','2','3','4','5','6','7','8','9',
'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z',
'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x'});
That's almost what you need. Just expand it by adding the few characters it is missing:
yz.-~_+
That post is missing a method to go back to base 10. I'm not going to write it :-) but the procedure is like this:
Define a counter I'll call TOTAL.
Look at the right most character and find it's position in the array.
TOTAL = (the position of the character in the array)
Example: Input is BA1. TOTAL is now 1 (since "1" is in position 1 in the array)
Now look at the next character left of the first one and find it's position in the array. TOTAL += 47 * (the position of the character in the array) Example: Input is BA1. TOTAL is now (47 * 11) + 1 = 518
Now look at the next character left of the previous one and find it's position in the array. TOTAL += 47 * 47 * (the position of the character in the array) Example: Input is BA1. Total is now (47 * 47 * 10) + (47 * 11) + 1 = 243508
And so on.
I suggest you write a unit test that converts a bunch of base 10 numbers into base 47 and then back again to make sure your conversion code works properly.
Note how you represented a 6 digit base 10 number in just 3 digits of base 47 :-)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With