 

Embedded SQL Firebird batch update OutOfMemoryError with Java SE

I'm having a big problem with the embedded Firebird database engine and Java SE. I'm currently developing a filtering tool that lets users filter out data. I have implemented two filtering options, and the user can choose one or both:

  1. Filter out against a blacklist (the blacklist is controlled by the user).
  2. Filter out against a master list that records every record ever uploaded and filtered out.

The data the user uploads is plain text, comma- or token-separated, like this:

(SET OF COLUMNS)| RECORD TO FILTER |
0-MANY COLUMNS  |       ABC2       |
0-MANY COLUMNS  |       ABC5       |

When I upload it to the DB, I add a flag column for every filter:

(SET OF COLUMNS) | RECORD TO FILTER | FLAG FOR FILTER A | FLAG FOR FILTER B  |
0-MANY COLUMNS   |       ABC2       |                   |                    |
0-MANY COLUMNS   |       ABC5       |                   |                    | 

As for the second filter: on the first run of the software the program starts with an empty main table, which it then fills with all the records from the very first upload. After a few text uploads by the user, the main table will contain unique records like the following:

 Record |      Date criteria for filtering      |
 ABC1   | 08/11/2012:1,07/11/2012:3,06/11/2012:5|
 ABC2   | 05/11/2012:1,04/11/2012:0,03/11/2012:0|
 ABC3   | 12/11/2012:3,11/11/2012:0,10/11/2012:0|
 ABC4   | 12/11/2012:1,11/11/2012:0,10/11/2012:0|
 ABC5   | 12/11/2012:3,11/11/2012:0,10/11/2012:3|
 ABC9   | 11/11/2012:3,10/11/2012:1,09/11/2012:0|

When the data is processed, the software updates both the main table and the user table. For example, with this upload:

(SET OF COLUMNS)| RECORD TO FILTER | FLAG FOR FILTER A | FLAG FOR FILTER B  |
0-MANY COLUMNS  |       ABC4       |                   |                    |
0-MANY COLUMNS  |       ABC9       |                   |                    |

the main table updates as follows:

 Record |      Date criteria for filtering      |
 ABC1   | 08/11/2012:1,07/11/2012:3,06/11/2012:5|
 ABC2   | 05/11/2012:1,04/11/2012:0,03/11/2012:0|
 ABC3   | 12/11/2012:3,11/11/2012:0,10/11/2012:0|
 ABC4   | 12/11/2012:1,11/11/2012:0,10/11/2012:0| ->12/11/2012:2,11/11/2012:0,10/11/2012:0
 ABC5   | 12/11/2012:3,11/11/2012:0,10/11/2012:3|
 ABC9   | 11/11/2012:3,10/11/2012:1,09/11/2012:0| ->12/11/2012:1,11/11/2012:3,10/11/2012:1

If the counters for the last three days add up to more than four, the record is flagged for filter B in the user table. Notice that each date has an integer counter next to it:

(SET OF COLUMNS)| RECORD TO FILTER | FLAG FOR FILTER A | FLAG FOR FILTER B  |
 0-MANY COLUMNS |       ABC4       |                   |                    | 
 0-MANY COLUMNS |       ABC9       |                   |          X         | 
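The filter-B criterion described above can be sketched in code like this (an illustrative snippet, not part of the actual program; `FilterBCheck` and `shouldFlagFilterB` are hypothetical names, and the date string is assumed to always follow the comma-separated `dd/MM/yyyy:count` layout shown above):

```java
import java.util.Arrays;

public class FilterBCheck {
    static final int MAXIMO_LIMITE_POR_SEMANA = 4; // threshold from the question

    // Sums the counters of the three most recent entries in a
    // "dd/MM/yyyy:count,dd/MM/yyyy:count,..." string and reports
    // whether filter B should be flagged.
    public static boolean shouldFlagFilterB(String fechas) {
        int total = Arrays.stream(fechas.split(","))
                .limit(3) // the last three days come first in the string
                .mapToInt(e -> Integer.parseInt(e.substring(e.indexOf(':') + 1)))
                .sum();
        return total > MAXIMO_LIMITE_POR_SEMANA;
    }

    public static void main(String[] args) {
        // ABC9 after the update: 1 + 3 + 1 = 5 > 4, so it is flagged
        System.out.println(shouldFlagFilterB("12/11/2012:1,11/11/2012:3,10/11/2012:1"));
        // ABC2: 1 + 0 + 0 = 1, not flagged
        System.out.println(shouldFlagFilterB("05/11/2012:1,04/11/2012:0,03/11/2012:0"));
    }
}
```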

Both updates happen in a single transaction. The problem is that when the user uploads more than 800,000 records, my program throws the following exception inside the while loop. I use StringBuilder's parsing and append methods for maximum performance on the mutable dates string.

java.lang.OutOfMemoryError: Java heap space

Here is my code; I use five days:

FactoriaDeDatos factoryInstace = FactoriaDeDatos.getInstance();
Connection cnx = factoryInstace.getConnection();
cnx.setAutoCommit(false);
PreparedStatement pstmt = null;
PreparedStatement pstmtDet = null;
PreparedStatement pstmtDet1 = null; // prepared elsewhere in the original code
ResultSet rs = null;
pstmt = cnx.prepareStatement("SELECT CM.MAIL,CM.FECHAS FROM TCOMERCIALMAIL CM INNER JOIN TEMPMAIL TMP ON CM.MAIL=TMP."+colEmail);
rs=pstmt.executeQuery();
pstmtDet = cnx.prepareStatement("ALTER INDEX IDX_MAIL INACTIVE");
pstmtDet.executeUpdate();
pstmtDet = cnx.prepareStatement("SET STATISTICS INDEX IDX_MAIL");
pstmtDet.executeUpdate();
pstmtDet = cnx.prepareStatement("ALTER INDEX IDX_FECHAS INACTIVE");
pstmtDet.executeUpdate();
pstmtDet = cnx.prepareStatement("SET STATISTICS INDEX IDX_FECHAS");
pstmtDet.executeUpdate();
sql_com_local_tranx=0;
int trxNum=0;
int ix=0;
int ixE1=0;
int ixAc=0;
StringBuilder sb;
StringTokenizer st;
String fechas;
int pos1,pos2,pos3,pos4,pos5,pos6,pos7,pos8,pos9;
StringBuilder s1,s2,sSQL,s4,s5,s6,s7,s8,s9,s10;
long startLoop = System.nanoTime();

long time2 ;
boolean ejecutoMax=false;
//int paginador_sql=1000;
//int trx_ejecutada=0;
sb=new StringBuilder();
s1=new StringBuilder();
s2=new StringBuilder();
sSQL=new StringBuilder();
s4=new StringBuilder();
s6=new StringBuilder();
s8=new StringBuilder();
s10=new StringBuilder();
while(rs.next()){
   // From here
   actConteoDia=0;
   sb.setLength(0);
   sb.append(rs.getString(2));
   pos1= sb.indexOf(":",0);  
   pos2= sb.indexOf(",",pos1+1);  
   pos3= sb.indexOf(":",pos2+1);  
   pos4= sb.indexOf(",",pos3+1);  
   pos5= sb.indexOf(":",pos4+1);
   pos6= sb.indexOf(",",pos5+1);
   pos7= sb.indexOf(":",pos6+1);
   pos8= sb.indexOf(",",pos7+1);
   pos9= sb.indexOf(":",pos8+1);
   s1.setLength(0);
   s1.append(sb.substring(0, pos1));
   s2.setLength(0);
   s2.append(sb.substring(pos1+1, pos2));
   s4.setLength(0);
   s4.append(sb.substring(pos3+1, pos4));
   s6.setLength(0);
   s6.append(sb.substring(pos5+1, pos6));
   s8.setLength(0);
   s8.append(sb.substring(pos7+1, pos8));
   s10.setLength(0);
   s10.append(sb.substring(pos9+1));
   actConteoDia=Integer.parseInt(s2.toString());
   actConteoDia++;
   sb.setLength(0);
   //sb.append(s1).a
   if(actConteoDia>MAXIMO_LIMITE_POR_SEMANA){
      actConteoDia=MAXIMO_LIMITE_POR_SEMANA+1;
   }
   sb.append(s1).append(":").append(actConteoDia).append(",").append(rs.getString(2).substring(pos2+1, rs.getString(2).length()));
   // Every date record takes approx. 8.3 ms to process

   sSQL.setLength(0);
   sSQL.append("UPDATE TCOMERCIALMAIL SET FECHAS='").append(sb.toString()).append("' WHERE MAIL='").append(rs.getString(1)).append("'");

   pstmtDet1.addBatch(sSQL.toString());
   //actConteoDia=0;
   //actConteoDia+=Integer.parseInt(s2.toString());
   actConteoDia+=Integer.parseInt(s4.toString());
   actConteoDia+=Integer.parseInt(s6.toString());
   actConteoDia+=Integer.parseInt(s8.toString());
   actConteoDia+=Integer.parseInt(s10.toString());
   if(actConteoDia>MAXIMO_LIMITE_POR_SEMANA){
      sSQL.setLength(0);
      sSQL.append("UPDATE TEMPMAIL SET DIASLIMITE='S' WHERE ").append(colEmail).append("='").append(rs.getString(1)).append("'");
      pstmtDet.addBatch(sSQL.toString());
   }

   sql_com_local_tranx++;

   if(sql_com_local_tranx%2000==0 || sql_com_local_tranx%7000==0  ){
      brDias.setString("PROCESANDO "+sql_com_local_tranx);
      pstmtDet1.executeBatch();
      pstmtDet.executeBatch();

   }
   if(sql_com_local_tranx%100000==0){
       System.gc();
       System.runFinalization();
   }
}

pstmtDet1.executeBatch();
pstmtDet.executeBatch();
cnx.commit();

I've run telemetry tests to trace where the problem lies. I think the big while loop is the problem, but I don't know exactly where. I'm adding some images of the telemetry tests; I need help interpreting them properly.

GC activity grows inversely with the time the JVM keeps objects alive:

http://imageshack.us/photo/my-images/849/66780403.png

The heap grows from 50 MB to 250 MB; once the used heap reaches 250 MB, the OutOfMemoryError is thrown:

50 MB
http://imageshack.us/photo/my-images/94/52169259.png

REACHING 250 MB
http://imageshack.us/photo/my-images/706/91313357.png

OUT OF MEMORY
http://imageshack.us/photo/my-images/825/79083069.png

The final set of generated objects, ordered by LiveBytes:

http://imageshack.us/photo/my-images/546/95529690.png

Any help, suggestion, or answer will be vastly appreciated.

asked Apr 25 '26 by jRam90

2 Answers

The problem is that you are using a PreparedStatement as if it were a Statement, by calling addBatch(String). The javadoc of this method says:

Note: This method cannot be called on a PreparedStatement or CallableStatement.

This note was added in JDBC 4.0; before that, the javadoc said the method was optional. The fact that Jaybird allows you to call this method on a PreparedStatement is therefore a bug: I created issue JDBC-288 in the Jaybird tracker.

Now to the cause of the OutOfMemoryError: when you call addBatch(String) on Jaybird's PreparedStatement implementation (FBPreparedStatement), the statement is added to a list internal to the Statement implementation (FBStatement). In FBStatement, calling executeBatch() executes all statements in that list and then clears it. In FBPreparedStatement, however, executeBatch() is overridden to execute the originally prepared statement with batch parameters (in your example it does nothing, as you never actually add a PreparedStatement-style batch). It never executes the statements you added with addBatch(String), but it also never clears the list in FBStatement, and that is most likely the cause of your OutOfMemoryError.

Based on this, the solution is either to create a Statement using cnx.createStatement() and use that to execute your queries, or to investigate whether you could benefit from one or more PreparedStatement objects with parameterized queries. It looks like you should be able to use two separate PreparedStatements, though I am not 100% sure; the added benefits would be protection against SQL injection and a minor performance improvement.
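The parameterized variant might look roughly like this (a sketch only, not a tested implementation: the table and column names come from the question, `recompute` and `overWeeklyLimit` are hypothetical helpers mirroring the counter logic of the question's loop, and the batch interval of 2000 is arbitrary):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class BatchUpdateSketch {
    static final int MAXIMO_LIMITE_POR_SEMANA = 4; // assumed value

    // Increment today's counter (the first entry) in the FECHAS string,
    // capped at limit + 1, mirroring the question's loop.
    public static String recompute(String fechas) {
        int colon = fechas.indexOf(':');
        int comma = fechas.indexOf(',', colon + 1);
        int n = Integer.parseInt(fechas.substring(colon + 1, comma)) + 1;
        if (n > MAXIMO_LIMITE_POR_SEMANA) {
            n = MAXIMO_LIMITE_POR_SEMANA + 1;
        }
        return fechas.substring(0, colon + 1) + n + fechas.substring(comma);
    }

    // Sum of all day counters, as the question's loop does across five days.
    public static boolean overWeeklyLimit(String fechas) {
        int total = 0;
        for (String e : fechas.split(",")) {
            total += Integer.parseInt(e.substring(e.indexOf(':') + 1));
        }
        return total > MAXIMO_LIMITE_POR_SEMANA;
    }

    public static void updateTables(Connection cnx, String colEmail) throws SQLException {
        try (PreparedStatement sel = cnx.prepareStatement(
                 "SELECT CM.MAIL, CM.FECHAS FROM TCOMERCIALMAIL CM "
                 + "INNER JOIN TEMPMAIL TMP ON CM.MAIL = TMP." + colEmail);
             PreparedStatement updMain = cnx.prepareStatement(
                 "UPDATE TCOMERCIALMAIL SET FECHAS = ? WHERE MAIL = ?");
             PreparedStatement updUser = cnx.prepareStatement(
                 "UPDATE TEMPMAIL SET DIASLIMITE = 'S' WHERE " + colEmail + " = ?");
             ResultSet rs = sel.executeQuery()) {
            int pending = 0;
            while (rs.next()) {
                String mail = rs.getString(1);
                String fechas = recompute(rs.getString(2));
                updMain.setString(1, fechas);
                updMain.setString(2, mail);
                updMain.addBatch();             // no-arg addBatch: the supported form
                if (overWeeklyLimit(fechas)) {
                    updUser.setString(1, mail);
                    updUser.addBatch();
                }
                if (++pending % 2000 == 0) {
                    updMain.executeBatch();      // executes AND clears the batch
                    updUser.executeBatch();
                }
            }
            updMain.executeBatch();
            updUser.executeBatch();
        }
        cnx.commit();
    }
}
```

Because the parameter values are queued with the no-argument addBatch(), executeBatch() both runs the updates and clears the internal batch list, so memory stays bounded.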

Addendum

This issue has been fixed since Jaybird 2.2.2.

Full disclosure: I am the developer of the Jaybird / Firebird JDBC driver.

answered May 01 '26 by Mark Rotteveel


Do not execute batch statements while iterating through the result set. Store the SQL you want to execute in a collection, and start executing it once you have finished processing the result set. Also: does everything have to happen inside the same transaction?
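This two-phase approach could be sketched as follows (illustrative only; `buildUpdate` is a hypothetical helper, and the string concatenation into SQL is kept merely to mirror the question, where a parameterized query would be safer):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class CollectThenExecute {
    // Hypothetical helper: build one UPDATE for the main table.
    public static String buildUpdate(String mail, String nuevasFechas) {
        return "UPDATE TCOMERCIALMAIL SET FECHAS='" + nuevasFechas
                + "' WHERE MAIL='" + mail + "'";
    }

    public static void process(Connection cnx, String selectSql) throws SQLException {
        List<String> updates = new ArrayList<>();
        // Phase 1: read the result set and only collect the SQL strings.
        try (Statement sel = cnx.createStatement();
             ResultSet rs = sel.executeQuery(selectSql)) {
            while (rs.next()) {
                // Real code would recompute the FECHAS value here first.
                updates.add(buildUpdate(rs.getString(1), rs.getString(2)));
            }
        }
        // Phase 2: execute the collected SQL after the result set is closed.
        try (Statement upd = cnx.createStatement()) {
            for (String sql : updates) {
                upd.addBatch(sql); // addBatch(String) is valid on a plain Statement
            }
            upd.executeBatch();
        }
        cnx.commit();
    }
}
```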

answered May 01 '26 by BevynQ


