
Why is my JDBC call consuming 4 times more memory than the actual size of the data?

Tags:

java

jdbc

I wrote a small Java program that loads data from a DB2 database using a simple JDBC call. I run a SELECT query through a Java Statement to fetch the data, and I close the statement and connection objects properly. I use a 64-bit JVM both to compile and to run the program.

The query returns 52 million records, each row having 24 columns, and it takes around 4 minutes to load the complete data on Unix (in a multiprocessor environment). I load the data into a HashMap: Map<String, Map<String, GridTradeStatus>>. The GridTradeStatus bean is a simple getter/setter bean with 24 properties.

The memory required by the program is alarmingly high. The Java heap size goes up to 5.8 - 6GB to load the complete data, while the heap actually in use stays between 4.7 and 4.9GB. I know we should not load this much data into memory, but my business requirements demand it.

The question: when I dump the whole table into a flat file, it comes out at roughly 1.2GB. I want to know why my Java program consumes 4 times more memory than the actual size of the data.

Prashant Mishra asked Nov 13 '22 05:11



1 Answer

There is nothing surprising here (to me at least).

a.) Strings in Java can consume double the space of most common text formats: before Java 9 a String's characters are always held as UTF-16 in the heap, two bytes per character (since Java 9, compact strings store Latin-1-only text with one byte per character). Also, a String as an object carries quite some overhead (the String object itself, the reference to the array it contains, the cached hashCode etc.). For small strings the String object easily costs as much memory as the data it contains.
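To make point a.) concrete, here is a rough back-of-the-envelope sketch. The byte counts are assumptions typical of a 64-bit HotSpot JVM with compressed references (about 24 bytes for the String object, a 16-byte array header, 8-byte alignment); other JVMs and settings will differ, and the class and method names are made up for illustration.

```java
// Rough estimate of the heap cost of one String.
// ASSUMPTIONS (not measured): 64-bit HotSpot with compressed references,
// ~24 bytes for the String object, 16-byte array header, 8-byte alignment.
final class StringFootprint {
    static final int STRING_OBJECT_BYTES = 24; // header + array reference + cached hash, padded
    static final int ARRAY_HEADER_BYTES = 16;  // header + length field of the backing array

    // Objects are assumed to be padded to a multiple of 8 bytes.
    static long align8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // bytesPerChar: 2 for UTF-16 storage, 1 for Latin-1 compact strings (Java 9+).
    static long estimateBytes(int length, int bytesPerChar) {
        long backingArray = align8(ARRAY_HEADER_BYTES + (long) length * bytesPerChar);
        return STRING_OBJECT_BYTES + backingArray;
    }

    public static void main(String[] args) {
        for (int length : new int[] {4, 8, 16, 64}) {
            System.out.printf("length %d: ~%d bytes as UTF-16, ~%d bytes as Latin-1%n",
                    length, estimateBytes(length, 2), estimateBytes(length, 1));
        }
    }
}
```

Under these assumptions a 6-character value that takes 6 bytes in a flat file comes out at roughly 56 bytes on the heap with UTF-16 storage, so the fixed overhead dominates for short strings.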

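b.) You put the data into a HashMap, and HashMap is not exactly memory efficient. First, it uses a default load factor of 0.75, so the bucket array always has more slots than the map has entries, and a map with many entries has a correspondingly big bucket array. Second, each entry in the map is an object itself, which costs at least two references (key and value) plus object overhead; in practice it also holds a cached hash and a reference to the next entry in the bucket.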
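The HashMap overhead from point b.) can be sketched the same way. The sizes are again assumptions for a 64-bit HotSpot JVM with compressed references (about 32 bytes per entry node, 4 bytes per bucket slot), and the estimate covers only the map's own structures, not the keys or values it points to. The class and method names are hypothetical.

```java
// Rough estimate of the memory used by a HashMap's internal structures.
// ASSUMPTIONS (not measured): 64-bit HotSpot with compressed references,
// ~32 bytes per entry node, 4 bytes per bucket slot, 16-byte array header.
final class HashMapFootprint {
    static final int NODE_BYTES = 32;         // header + hash + key, value and next references, padded
    static final int REF_BYTES = 4;           // one bucket slot
    static final int ARRAY_HEADER_BYTES = 16; // header of the bucket array

    // Capacity a default HashMap reaches when filled entry by entry:
    // it starts at 16 and doubles whenever size exceeds capacity * 0.75.
    static long tableCapacity(long entries) {
        long capacity = 16;
        while (entries > capacity * 3 / 4) {
            capacity *= 2;
        }
        return capacity;
    }

    // Bucket array plus one node per entry; keys and values are not included.
    static long estimateBytes(long entries) {
        if (entries == 0) {
            return 0; // the bucket array is only allocated on first insert
        }
        return ARRAY_HEADER_BYTES + tableCapacity(entries) * REF_BYTES + entries * NODE_BYTES;
    }

    public static void main(String[] args) {
        long entries = 52_000_000L; // row count from the question, treated as one flat map
        System.out.printf("capacity: %d slots, map overhead: ~%d bytes%n",
                tableCapacity(entries), estimateBytes(entries));
    }
}
```

If all 52 million rows were entries of a single flat map, this estimate comes to about 2.2GB for the map structures alone. The nested Map<String, Map<String, GridTradeStatus>> in the question will distribute that differently, so treat the figure as an order-of-magnitude guide.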

In conclusion you pretty much have to expect the memory requirements to increase quite a bit. A factor of 4 is reasonable if your average data String is relatively short.

Durandal answered Nov 15 '22 10:11
