I am implementing the LZW algorithm. I have implemented it for strings and text files successfully and am currently modifying my code to work with binary files, such as images or executables (since I cannot read these files as strings).
I have replaced the String type in my code with the ArrayList<Byte> type. My code now compresses and decompresses binary files correctly, however it is at least 10x slower! This is unacceptable in a compression application where speed is a critical element.
Have I made the correct substitution of ArrayList<Byte> for String. Is there a faster alternative with similar functionality? Note that the LZW algorithm requires array resizing hence standard arrays[] are not suitable.
Regards.
Using a List<Byte> will box every byte into a separate object instance.
In bulk, that is one of the worst possible things you can do for performance.
By contrast, an array or string can occupy a solid block of memory.
Instead, you should use ByteArrayOutputStream, or use byte[] directly and resize as needed (you can make a wrapper class for that)
You are boxing bytes within an ArrayList, which uses much more memory than simple Strings. What this means is each byte is wrapped in a whole object, and referred to by a reference. Note that such a reference is itself 4 to 8 times larger than the original byte!
You would be much better off using primitive byte [] arrays, or perhaps a primitive collections library (which properly abstracts primitive arrays as collections) such as this or this.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With