I have strings that look like this:
000101456890
348324000433
888000033380
They are strings that are all the same length and they contain only numbers.
I would like to find a way to encode and then compress (reduce the length of) the strings. The compression algorithm only needs to produce ASCII characters, as the results will be used in web page links.
So for example:
www.stackoverflow.com/000101456890 goes to www.stackoverflow.com/aJks
Is there some way I could do this, some method that would do the job of compressing quickly?
Thanks,
There is no encoding that "reduces size." Encodings are just mappings between bit patterns and the characters they represent. That said, ASCII is a 7-bit character set (encoding) that is usually stored in 8 bits of space. If you limit the range of characters that you accept, you can also exclude the control characters.
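One way to read "limit the ranges that you accept" is to keep only the 62 alphanumeric characters and write the number in base 62. The sketch below is an illustration, not something from the answer above: the class name `Base62` and the ordering of its alphabet are assumptions. Since 62^7 is about 3.5 trillion, any 12-digit decimal value fits in at most 7 characters.

```csharp
using System;
using System.Text;

public static class Base62
{
    // Hypothetical alphabet ordering: digits, then upper case, then lower case.
    private const string Alphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Writes a non-negative number using base-62 digits, most significant first.
    public static string Encode(long value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        if (value == 0) return "0";
        var sb = new StringBuilder();
        while (value > 0)
        {
            sb.Insert(0, Alphabet[(int)(value % 62)]);
            value /= 62;
        }
        return sb.ToString();
    }

    // Reverses Encode; throws on characters outside the alphabet or on overflow.
    public static long Decode(string text)
    {
        long value = 0;
        foreach (char c in text)
        {
            int digit = Alphabet.IndexOf(c);
            if (digit < 0) throw new FormatException("Not a base-62 character: " + c);
            value = checked(value * 62 + digit);
        }
        return value;
    }
}
```

To get the original string back, decode and then left-pad with zeros to 12 characters, for example `Base62.Decode(s).ToString().PadLeft(12, '0')`.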
Start by taking the first character of the given string and appending it to the compressed string. Next, count the number of consecutive occurrences of that character and append the count to the compressed string. Repeat this process for each run of characters until the end of the string is reached. This technique is known as run-length encoding.
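The steps above can be sketched in C# as follows. This is a minimal, encode-only illustration; the class name `RunLength` is an assumption. Note that when the input itself consists of digits, the output is ambiguous to decode once a run reaches 10 or more, so a real scheme would need a separator or a fixed-width count.

```csharp
using System;
using System.Text;

public static class RunLength
{
    // Encodes each run of identical characters as the character followed by the run length.
    public static string Encode(string input)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < input.Length)
        {
            char c = input[i];
            int run = 1;
            // Extend the run while the next character matches.
            while (i + run < input.Length && input[i + run] == c)
            {
                run++;
            }
            sb.Append(c).Append(run);
            i += run;
        }
        return sb.ToString();
    }
}
```

For example, `RunLength.Encode("aaabbc")` gives `a3b2c1`. For the sample `000101456890`, which has 10 runs, the result is 20 characters long, so this approach only helps when the strings contain long runs.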
String compression in Java can be performed using the ZLIB compression library, which offers several features for compressing string data effectively. The compression ratio varies with factors such as the compression level chosen, the length of the data, and the amount of repetition in the string.
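Since the other code in this thread is C#, here is a sketch of the same idea on .NET rather than Java. It assumes .NET 6 or later, where `System.IO.Compression.ZLibStream` is available; the class name `ZlibText` is hypothetical. Keep in mind that the zlib wrapper alone adds 6 bytes (a 2-byte header and a 4-byte checksum), so for 12-character inputs with little repetition the result is typically longer than the original, and it is binary, so it would still need a text encoding before going into a URL.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ZlibText
{
    // Compresses an ASCII string into zlib-format bytes.
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.ASCII.GetBytes(text);
        using var output = new MemoryStream();
        // Disposing the ZLibStream flushes the remaining compressed data.
        using (var zlib = new ZLibStream(output, CompressionLevel.Optimal))
        {
            zlib.Write(raw, 0, raw.Length);
        }
        // ToArray still works after the MemoryStream has been closed.
        return output.ToArray();
    }

    // Reverses Compress.
    public static string Decompress(byte[] data)
    {
        using var input = new MemoryStream(data);
        using var zlib = new ZLibStream(input, CompressionMode.Decompress);
        using var result = new MemoryStream();
        zlib.CopyTo(result);
        return Encoding.ASCII.GetString(result.ToArray());
    }
}
```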
To do it simply, you could consider each as a long (plenty of room there), and hex-encode; that gives you:
60c1bfa
5119ba72b1
cec0ed3264
Base-64 would be shorter, but you'd need to treat the value as big-endian (note that most .NET is little-endian) and ignore leading zero bytes. That gives you:
Bgwb+g==
URm6crE=
zsDtMmQ=
For example:
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        long x = 000101456890L, y = 348324000433L, z = 888000033380L;

        // Hex-encode each value.
        Console.WriteLine(Convert.ToString(x, 16));
        Console.WriteLine(Convert.ToString(y, 16));
        Console.WriteLine(Convert.ToString(z, 16));

        // Base-64 encode each value.
        Console.WriteLine(Pack(x));
        Console.WriteLine(Pack(y));
        Console.WriteLine(Pack(z));

        // Decode the hex strings and restore the leading zeros.
        Console.WriteLine(Convert.ToInt64("60c1bfa", 16).ToString().PadLeft(12, '0'));
        Console.WriteLine(Convert.ToInt64("5119ba72b1", 16).ToString().PadLeft(12, '0'));
        Console.WriteLine(Convert.ToInt64("cec0ed3264", 16).ToString().PadLeft(12, '0'));

        // Decode the base-64 strings and restore the leading zeros.
        Console.WriteLine(Unpack("Bgwb+g==").ToString().PadLeft(12, '0'));
        Console.WriteLine(Unpack("URm6crE=").ToString().PadLeft(12, '0'));
        Console.WriteLine(Unpack("zsDtMmQ=").ToString().PadLeft(12, '0'));
    }

    static string Pack(long value)
    {
        ulong a = (ulong)value; // make shift easy
        List<byte> bytes = new List<byte>(8);
        // Collect bytes least-significant first; stopping at zero drops leading zero bytes.
        while (a != 0)
        {
            bytes.Add((byte)a);
            a >>= 8;
        }
        bytes.Reverse(); // switch to big-endian order
        var chunk = bytes.ToArray();
        return Convert.ToBase64String(chunk);
    }

    static long Unpack(string value)
    {
        var chunk = Convert.FromBase64String(value);
        ulong a = 0;
        // Rebuild the number from big-endian bytes.
        for (int i = 0; i < chunk.Length; i++)
        {
            a <<= 8;
            a |= chunk[i];
        }
        return (long)a;
    }
}
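One caveat for the web-link use case: standard base-64 output can contain `+`, `/` and `=`, which have special meaning in URLs. A common workaround is the URL-safe alphabet from RFC 4648, which uses `-` and `_` instead and allows the padding to be dropped. The helper below is a sketch under that assumption; the class and method names are hypothetical and it is meant to wrap the output of `Pack` and the input of `Unpack`.

```csharp
using System;

public static class UrlSafeBase64
{
    // Converts standard base-64 text to the URL-safe alphabet and drops '=' padding.
    public static string ToUrlSafe(string base64)
    {
        return base64.TrimEnd('=').Replace('+', '-').Replace('/', '_');
    }

    // Restores standard base-64 text, including padding, so that
    // Convert.FromBase64String can parse it again.
    public static string FromUrlSafe(string urlSafe)
    {
        string s = urlSafe.Replace('-', '+').Replace('_', '/');
        int remainder = s.Length % 4;
        if (remainder != 0)
        {
            s = s.PadRight(s.Length + (4 - remainder), '=');
        }
        return s;
    }
}
```

With this, `Bgwb+g==` becomes `Bgwb-g` in the link, and `UrlSafeBase64.FromUrlSafe("Bgwb-g")` turns it back into `Bgwb+g==` before unpacking.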