
UTF-8 String class for Java

I need to hold a lot of String objects in memory (hundreds of MB), and I want to store them as UTF-8, since in most cases that takes about half the memory the default implementation uses.
For a 12-character string, the default String class requires 60 bytes (see http://blog.griddynamics.com/2010/01/java-tricks-reducing-memory-consumption.html).
Most of my strings are 10-20 characters long.
Is there an open-source library that offers a wrapper for such strings?
I know how to convert a String to a UTF-8 byte array, but I'm looking for a wrapper class that provides all the needed utility functions (hashCode, equals, toString, fromString, etc.).

Avner Levy asked Jan 09 '13 14:01



1 Answer

Apache Avro has a Utf8 wrapper class (org.apache.avro.util.Utf8) that implements CharSequence, but I don't know the memory consumption of such objects.

Hadoop has the Text class (org.apache.hadoop.io.Text), which stores its contents as UTF-8 and has much the kind of interface you want.
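
If pulling in Avro or Hadoop is too heavy, the kind of wrapper described in the question can be sketched by hand. The class below is only a minimal illustration, not the Avro or Hadoop implementation: the name Utf8String and the byteLength method are made up for this example, and it assumes the strings are well-formed (an unpaired surrogate would be replaced during encoding and would not round-trip).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical wrapper (not a library class): keeps only the UTF-8 bytes of a string.
final class Utf8String {
    private final byte[] bytes;

    private Utf8String(byte[] bytes) {
        this.bytes = bytes;
    }

    // Encodes the given String as UTF-8 and wraps the resulting bytes.
    static Utf8String fromString(String s) {
        return new Utf8String(s.getBytes(StandardCharsets.UTF_8));
    }

    // Number of UTF-8 bytes held, which may differ from the character count.
    int byteLength() {
        return bytes.length;
    }

    // Decodes back to a regular String; this allocates a new String on every call.
    @Override
    public String toString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Two wrappers are equal when their encoded bytes are identical.
    @Override
    public boolean equals(Object o) {
        return o instanceof Utf8String && Arrays.equals(bytes, ((Utf8String) o).bytes);
    }

    // Hash over the encoded bytes, consistent with equals.
    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }
}
```

Each instance still pays for its own object header plus the byte array's header, so for 10-20 character strings it is worth measuring the actual saving before committing to this approach.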

Grooveek answered Oct 05 '22 03:10
