Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ArrayList<Byte> vs String in Java

I am implementing the LZW algorithm. I have implemented it for strings and text files successfully and am currently modifying my code to work with binary files, such as images or executables (since I cannot read these files as strings).

I have replaced the String type in my code with the ArrayList<Byte> type. My code now compresses and decompresses binary files correctly, however it is at least 10x slower! This is unacceptable in a compression application where speed is a critical element.

Have I made the correct substitution of ArrayList<Byte> for String. Is there a faster alternative with similar functionality? Note that the LZW algorithm requires array resizing hence standard arrays[] are not suitable.

Regards.

like image 745
Danny Rancher Avatar asked Mar 21 '23 21:03

Danny Rancher


2 Answers

Using a List<Byte> will box every byte into a separate object instance.
In bulk, that is one of the worst possible things you can do for performance.

By contrast, an array or string can occupy a solid block of memory.

Instead, you should use ByteArrayOutputStream, or use byte[] directly and resize as needed (you can make a wrapper class for that)

like image 79
SLaks Avatar answered Apr 02 '23 02:04

SLaks


You are boxing bytes within an ArrayList, which uses much more memory than simple Strings. What this means is each byte is wrapped in a whole object, and referred to by a reference. Note that such a reference is itself 4 to 8 times larger than the original byte!

You would be much better off using primitive byte [] arrays, or perhaps a primitive collections library (which properly abstracts primitive arrays as collections) such as this or this.

like image 45
phs Avatar answered Apr 02 '23 02:04

phs