I am implementing the LZW algorithm. I have implemented it for strings and text files successfully and am currently modifying my code to work with binary files, such as images or executables (since I cannot read these files as strings).
I have replaced the String
type in my code with the ArrayList<Byte>
type. My code now compresses and decompresses binary files correctly, however it is at least 10x slower! This is unacceptable in a compression application where speed is a critical element.
Have I made the correct substitution of ArrayList<Byte>
for String
. Is there a faster alternative with similar functionality? Note that the LZW algorithm requires array resizing hence standard arrays[]
are not suitable.
Regards.
Using a List<Byte>
will box every byte into a separate object instance.
In bulk, that is one of the worst possible things you can do for performance.
By contrast, an array or string can occupy a solid block of memory.
Instead, you should use ByteArrayOutputStream
, or use byte[]
directly and resize as needed (you can make a wrapper class for that)
You are boxing byte
s within an ArrayList
, which uses much more memory than simple String
s. What this means is each byte
is wrapped in a whole object, and referred to by a reference. Note that such a reference is itself 4 to 8 times larger than the original byte!
You would be much better off using primitive byte []
arrays, or perhaps a primitive collections library (which properly abstracts primitive arrays as collections) such as this or this.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With