lately i've faced a problem which gets me so confused , the problem is : i want to compress a sequence so no information is lost , for example :
a,a,a,b --> a,b
a,b,a,a,c --> a,b,a,a,c (it can't be compressed to a,b,a,c because in this way we lose a,a)
Is there any algorithm to do such a thing ? what is name of this problem ? is it compression ? or anything else ? I would really appreciate any help Thanks in advance
Every algorithm which is able to transform data in such a way that is takes up less memory is called compression. May it be lossless or lossy.
e.g. (compressed form for "example given" :-) )
The following is imho the simplest form, called run length encoding, in short RLE:
a,a,a,b,c -> 3a,1b,1c
As you can see all subsequent characters which are identical are compressed into one.
You can also search for subsequent patterns which is much more difficult:
a,b,a,b,a,c --> 2(a,b),1(a),1(c)
There are lots of literature and web sources about compression algorithms, you should use them to get a deeper view.
Another good algorithm is Lempel–Ziv–Welch
I found marvellous this simple Javascript LZW function, from the magicians at 140 bytes of javascript :
function compress(a){
for (
var b = a + "Ā", // Append first "illegal" character (charCode === 256).
c = [], // dictionary
d = 0, // dictionary size
e = d, // iterator
f = c, // w
g = c, // result
h; // c
h = b.charAt(e++);
)
c[h] = h.charCodeAt(), // Fill in the dictionary ...
f = 1 + c[a = f + h] ? a : (g[d++] = c[f], c[a] = d + 255, h);
return g
}
console.log(compress("aaabbcccccb"))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With