Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using char[] array in HashMap in java

Tags:

java

hashmap

I am using a HashMap with a char array as the key. But when I put the key, value into the HashMap and print it the value printed as the key is some strange value. If I use a string instead, it is working fine. Please let me know how to make this work for char[] values.

            while((ch = br.read()) != -1) {
            s = new char[10];
            s[i++] = (char)ch;

            while( (i < 10) && ((ch = br.read()) != -1)) {
                s[i++] = (char)ch;                  
            }

            //System.out.println(s);
            hmap.put(s, 1); 
            //System.out.println(hmap);
            i = 0;              
        }

Key: BZOUA1578L Value: 1

The contents in the hashmap are {[C@31ff23=1} instead of {BZOUA1578L, 1}

like image 438
vamsi360 Avatar asked Nov 30 '22 16:11

vamsi360


2 Answers

I have a huge GBs of data to be read and if I use String, I am running out of the heap size.

You will have to think of a different way then.

  • You could use consider using a TreeMap<char[], V> with a custom Comparator ... but this is a stop-gap measure.

  • You could implement a custom hash table-based Map class that used char[] as the key, but did the array hashing and equality tests without using the key object's equals(Object) and hashcode() methods. This also is a stop-gap measure.

  • You could get a bigger machine ... another stop-gap measure.

  • You could redesign the algorithm so that it doesn't have to put all of the data into a big in-memory hash table in the address space of one Java program.

What you are currently doing doesn't scale. Even if you manage to implement a map using char[] keys instead of String keys, you probably only reduce the space used to hold the keys by half. The best this is going to give you is the ability to handle roughly twice the number of keys you can currently handle. Then you hit the wall again.

In the long term, the last option is the best if you want to keep scaling up.


Incidentally @Sean Patrick Floyd's suggestion of interning the key strings will probably only make matters worse. First, you won't ever get Strings that are equal but not == as keys in the same map. So interning doesn't save anything. Second, interning is performed using a private hash table, and the JVM needs to allocate space to represent this table.

The only scenario where interning is potentially worthwhile is if the strings that you are using to to do map lookups are likely to survive multiple GC cycles. Only then could interning save space.


Finally, there's one scenario which may cause your key strings to use far more memory than you realize. Consider this:

BufferedReader br = ...
Map<String, Value> map = new HashMap<String, Value>();

String line;
while ((line = br.readLine()) != null) {
    ...
    String key = line.substring(...);
    map.put(key, ...);
}

The problem is that the substring method shares the same backing char[] as the original String. If key is long-lived (which it probably will be), this means that the original large backing array will also be long lived, even though we're only ever going to refer to a slice of that array it via the key object.

The solution is to write this:

    String key = new String(line.substring(...));

which forces the copying of the characters into a new (smaller) character array.

UPDATE - Changes to the implementation of java.lang.String in Java 7 addressed this problem. The substring methods now make a copy of the relevant slice of the backing array.

like image 84
Stephen C Avatar answered Dec 10 '22 03:12

Stephen C


Printing is the least of your poblems, really. The real problem is that char[] does not have meaningful implementation of the equals() and hashCode() methods that HashMap relies on. Therefore the map will not recognize different arrays with the same content as the same keys, which in all probablity defeats your purpose of using a hash map in the first place.

So you need to wrap your arrays in something that does have equals() and hashCode(). May I suggest wrapping them in java.lang.String?

like image 30
hmakholm left over Monica Avatar answered Dec 10 '22 01:12

hmakholm left over Monica