I'm having a big problem with the embedded Firebird database engine and Java SE. I'm currently developing a filtering tool that lets users filter out data. I have built two filtering options, and the user can choose one or both:
The data the user uploads is plain text, comma- or token-separated, like this:
(SET OF COLUMNS)| RECORD TO FILTER |
0-MANY COLUMNS | ABC2 |
0-MANY COLUMNS | ABC5 |
When I upload it to the DB, I add a flag column for every filter:
(SET OF COLUMNS) | RECORD TO FILTER | FLAG FOR FILTER A | FLAG FOR FILTER B |
0-MANY COLUMNS | ABC2 | | |
0-MANY COLUMNS | ABC5 | | |
For the second filter, the program starts with an empty main table on the first run of the software and fills it with all the records from the very first upload. After a few text uploads by the user, the main table holds unique records like the following (a parsing sketch follows the table):
Record | Date criteria for filtering |
ABC1 | 08/11/2012:1,07/11/2012:3,06/11/2012:5|
ABC2 | 05/11/2012:1,04/11/2012:0,03/11/2012:0|
ABC3 | 12/11/2012:3,11/11/2012:0,10/11/2012:0|
ABC4 | 12/11/2012:1,11/11/2012:0,10/11/2012:0|
ABC5 | 12/11/2012:3,11/11/2012:0,10/11/2012:3|
ABC9 | 11/11/2012:3,10/11/2012:1,09/11/2012:0|
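Each criteria string is a comma-separated list of date:count pairs. To make that concrete, it could be parsed like this (parseCriteria is just an illustrative helper, not code from my program):

import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper only -- parses "08/11/2012:1,07/11/2012:3,06/11/2012:5"
// into ordered date -> count pairs (the dates themselves contain no ':').
static Map<String, Integer> parseCriteria(String fechas) {
    Map<String, Integer> countsByDate = new LinkedHashMap<>();
    for (String pair : fechas.split(",")) {
        int colon = pair.indexOf(':');
        countsByDate.put(pair.substring(0, colon),
                Integer.parseInt(pair.substring(colon + 1)));
    }
    return countsByDate;
}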
When new data is processed, the software updates both the main table and the user table. For example, given this upload:
(SET OF COLUMNS)| RECORD TO FILTER | FLAG FOR FILTER A | FLAG FOR FILTER B |
0-MANY COLUMNS | ABC4 | | |
0-MANY COLUMNS | ABC9 | | |
So the main table updates like this (a sketch of the merge rule follows the table):
Record | Date criteria for filtering |
ABC1 | 08/11/2012:1,07/11/2012:3,06/11/2012:5|
ABC2 | 05/11/2012:1,04/11/2012:0,03/11/2012:0|
ABC3 | 12/11/2012:3,11/11/2012:0,10/11/2012:0|
ABC4 | 12/11/2012:1,11/11/2012:0,10/11/2012:0| ->12/11/2012:2,11/11/2012:0,10/11/2012:0
ABC5 | 12/11/2012:3,11/11/2012:0,10/11/2012:3|
ABC9 | 11/11/2012:3,10/11/2012:1,09/11/2012:0| ->12/11/2012:1,11/11/2012:3,10/11/2012:1
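In sketch form, the merge rule is: if the newest stored date matches the upload date, its count is incremented; otherwise the upload date is prepended with count 1 and the oldest entry drops off. (mergeCriteria and the window parameter are illustrative only, not my real implementation, which works directly on the raw string.)

// Illustrative sketch of the merge rule above.
static String mergeCriteria(String fechas, String today, int window) {
    String[] pairs = fechas.split(",");
    StringBuilder out = new StringBuilder();
    if (pairs[0].startsWith(today + ":")) {
        // Same day: increment the newest count, keep the rest unchanged.
        int count = Integer.parseInt(pairs[0].substring(today.length() + 1));
        out.append(today).append(':').append(count + 1);
        for (int i = 1; i < pairs.length; i++) {
            out.append(',').append(pairs[i]);
        }
    } else {
        // New day: prepend today with count 1 and drop the oldest entry.
        out.append(today).append(":1");
        for (int i = 0; i < Math.min(pairs.length, window - 1); i++) {
            out.append(',').append(pairs[i]);
        }
    }
    return out.toString();
}
// mergeCriteria("12/11/2012:1,11/11/2012:0,10/11/2012:0", "12/11/2012", 3)
//     -> "12/11/2012:2,11/11/2012:0,10/11/2012:0"
// mergeCriteria("11/11/2012:3,10/11/2012:1,09/11/2012:0", "12/11/2012", 3)
//     -> "12/11/2012:1,11/11/2012:3,10/11/2012:1"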
Notice that each date has an integer (an event count) next to it. If the counts for the last three days add up to more than four, the user table gets flagged for filter B (a sketch of this rule follows the table):
(SET OF COLUMNS)| RECORD TO FILTER | FLAG FOR FILTER A | FLAG FOR FILTER B |
0-MANY COLUMNS | ABC4 | | |
0-MANY COLUMNS | ABC9 | | X |
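In sketch form, the filter B rule sums the counts of the most recent days (reusing the hypothetical parseCriteria from above; the days and threshold parameters are illustrative, while my real code hardcodes five days):

import java.util.Map;

static boolean exceedsLimit(Map<String, Integer> countsByDate, int days, int threshold) {
    int total = 0, seen = 0;
    for (int count : countsByDate.values()) {
        if (seen++ == days) break;
        total += count;
    }
    return total > threshold;
}
// exceedsLimit(parseCriteria("12/11/2012:1,11/11/2012:3,10/11/2012:1"), 3, 4)
//     -> true (1 + 3 + 1 = 5 > 4), so ABC9 gets the X in FLAG FOR FILTER B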
Both updates happen in a single transaction. The problem is that when the user uploads more than 800,000 records, my program throws the following exception inside the while loop. I use StringBuilder's indexOf/substring parsing and append methods for maximum performance on the mutable days string.
java.lang.OutOfMemoryError: Java heap space
Here is my code (I use five days):
FactoriaDeDatos factoryInstace = FactoriaDeDatos.getInstance();
Connection cnx = factoryInstace.getConnection();
cnx.setAutoCommit(false);
PreparedStatement pstmt = null;
ResultSet rs = null;
// Note: pstmtDet, pstmtDet1, brDias, actConteoDia, sql_com_local_tranx and
// MAXIMO_LIMITE_POR_SEMANA are declared elsewhere; this is the relevant excerpt.
pstmt = cnx.prepareStatement("SELECT CM.MAIL,CM.FECHAS FROM TCOMERCIALMAIL CM INNER JOIN TEMPMAIL TMP ON CM.MAIL=TMP." + colEmail);
rs = pstmt.executeQuery();
pstmtDet = cnx.prepareStatement("ALTER INDEX IDX_MAIL INACTIVE");
pstmtDet.executeUpdate();
pstmtDet = cnx.prepareStatement("SET STATISTICS INDEX IDX_FECHAS");
pstmtDet.executeUpdate();
pstmtDet = cnx.prepareStatement("ALTER INDEX IDX_FECHAS INACTIVE");
pstmtDet.executeUpdate();
pstmtDet = cnx.prepareStatement("SET STATISTICS INDEX IDX_FECHAS");
pstmtDet.executeUpdate();
sql_com_local_tranx = 0;
int trxNum = 0;
int ix = 0;
int ixE1 = 0;
int ixAc = 0;
StringBuilder sb;
StringTokenizer st;
String fechas;
int pos1, pos2, pos3, pos4, pos5, pos6, pos7, pos8, pos9;
StringBuilder s1, s2, sSQL, s4, s5, s6, s7, s8, s9, s10;
long startLoop = System.nanoTime();
long time2;
boolean ejecutoMax = false;
//int paginador_sql=1000;
//int trx_ejecutada=0;
sb = new StringBuilder();
s1 = new StringBuilder();
s2 = new StringBuilder();
sSQL = new StringBuilder();
s4 = new StringBuilder();
s6 = new StringBuilder();
s8 = new StringBuilder();
s10 = new StringBuilder();
while (rs.next()) {
    // From here: parse the five "date:count" pairs out of the FECHAS string
    actConteoDia = 0;
    sb.setLength(0);
    sb.append(rs.getString(2));
    pos1 = sb.indexOf(":", 0);
    pos2 = sb.indexOf(",", pos1 + 1);
    pos3 = sb.indexOf(":", pos2 + 1);
    pos4 = sb.indexOf(",", pos3 + 1);
    pos5 = sb.indexOf(":", pos4 + 1);
    pos6 = sb.indexOf(",", pos5 + 1);
    pos7 = sb.indexOf(":", pos6 + 1);
    pos8 = sb.indexOf(",", pos7 + 1);
    pos9 = sb.indexOf(":", pos8 + 1);
    s1.setLength(0);
    s1.append(sb.substring(0, pos1));
    s2.setLength(0);
    s2.append(sb.substring(pos1 + 1, pos2));
    s4.setLength(0);
    s4.append(sb.substring(pos3 + 1, pos4));
    s6.setLength(0);
    s6.append(sb.substring(pos5 + 1, pos6));
    s8.setLength(0);
    s8.append(sb.substring(pos7 + 1, pos8));
    s10.setLength(0);
    s10.append(sb.substring(pos9 + 1));
    actConteoDia = Integer.parseInt(s2.toString());
    actConteoDia++;
    sb.setLength(0);
    //sb.append(s1).a
    if (actConteoDia > MAXIMO_LIMITE_POR_SEMANA) {
        actConteoDia = MAXIMO_LIMITE_POR_SEMANA + 1;
    }
    sb.append(s1).append(":").append(actConteoDia).append(",")
      .append(rs.getString(2).substring(pos2 + 1, rs.getString(2).length()));
    // Each date record takes approx. 8.3 ms to process
    sSQL.setLength(0);
    sSQL.append("UPDATE TCOMERCIALMAIL SET FECHAS='").append(sb.toString())
        .append("' WHERE MAIL='").append(rs.getString(1)).append("'");
    pstmtDet1.addBatch(sSQL.toString());
    //actConteoDia=0;
    //actConteoDia+=Integer.parseInt(s2.toString());
    actConteoDia += Integer.parseInt(s4.toString());
    actConteoDia += Integer.parseInt(s6.toString());
    actConteoDia += Integer.parseInt(s8.toString());
    actConteoDia += Integer.parseInt(s10.toString());
    if (actConteoDia > MAXIMO_LIMITE_POR_SEMANA) {
        sSQL.setLength(0);
        sSQL.append("UPDATE TEMPMAIL SET DIASLIMITE='S' WHERE ").append(colEmail)
            .append("='").append(rs.getString(1)).append("'");
        pstmtDet.addBatch(sSQL.toString());
    }
    sql_com_local_tranx++;
    if (sql_com_local_tranx % 2000 == 0 || sql_com_local_tranx % 7000 == 0) {
        brDias.setString("PROCESANDO " + sql_com_local_tranx);
        pstmtDet1.executeBatch();
        pstmtDet.executeBatch();
    }
    if (sql_com_local_tranx % 100000 == 0) {
        System.gc();
        System.runFinalization();
    }
}
pstmtDet1.executeBatch();
pstmtDet.executeBatch();
cnx.commit();
I've run telemetry tests to trace where the problem lies. The big while loop is the problem, I think, but I don't know exactly where. I'm adding some images of the telemetry tests; please help me interpret them properly.
GC activity grows inversely to the time the JVM keeps objects alive:
http://imageshack.us/photo/my-images/849/66780403.png
The heap grows from 50 MB to 250 MB; when the used heap reaches the full 250 MB, the OutOfMemoryError is thrown:
50 MB
http://imageshack.us/photo/my-images/94/52169259.png
REACHING 250 MB
http://imageshack.us/photo/my-images/706/91313357.png
OUT OF MEMORY
http://imageshack.us/photo/my-images/825/79083069.png
The final set of generated objects, ordered by live bytes:
http://imageshack.us/photo/my-images/546/95529690.png
Any help, suggestions, or answers will be greatly appreciated.
The problem is that you are using a PreparedStatement as if it were a Statement, since you are calling addBatch(String). The javadoc of this method says:
Note: This method cannot be called on a PreparedStatement or CallableStatement.
This note was added in JDBC 4.0; before that, the javadoc said the method was optional. The fact that Jaybird allows you to call this method on a PreparedStatement is therefore a bug: I created issue JDBC-288 in the Jaybird tracker.
Now to the cause of the OutOfMemoryError: when you use addBatch(String) on the PreparedStatement implementation of Jaybird (FBPreparedStatement), the statement is added to a list internal to the Statement implementation (FBStatement). In FBStatement, calling executeBatch() executes all statements in this list and then clears it. In FBPreparedStatement, however, executeBatch() is overridden to execute the originally prepared statement with batch parameters (in your example it won't do anything, as you never actually add a PreparedStatement-style batch). It never executes the statements you added with addBatch(String), but it also never clears the list in FBStatement, and that is most likely the cause of your OutOfMemoryError.
Based on this, the solution should be to create a Statement using cnx.createStatement and use that to execute your queries, or to investigate whether you could benefit from using one or more PreparedStatement objects with parameterized queries. It looks like you should be able to use two separate PreparedStatements (sketched below), but I am not 100% sure; the added benefit would be protection against SQL injection and a minor performance improvement.
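For illustration, here is a minimal sketch of that two-PreparedStatement approach; buildUpdatedFechas and isOverLimit are placeholders for the string handling in your loop, not Jaybird API:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

static void process(Connection cnx, String colEmail) throws SQLException {
    cnx.setAutoCommit(false);
    try (PreparedStatement sel = cnx.prepareStatement(
                 "SELECT CM.MAIL, CM.FECHAS FROM TCOMERCIALMAIL CM "
                 + "INNER JOIN TEMPMAIL TMP ON CM.MAIL = TMP." + colEmail);
         PreparedStatement updDates = cnx.prepareStatement(
                 "UPDATE TCOMERCIALMAIL SET FECHAS = ? WHERE MAIL = ?");
         PreparedStatement updFlag = cnx.prepareStatement(
                 "UPDATE TEMPMAIL SET DIASLIMITE = 'S' WHERE " + colEmail + " = ?");
         ResultSet rs = sel.executeQuery()) {
        int n = 0;
        while (rs.next()) {
            String mail = rs.getString(1);
            String fechas = buildUpdatedFechas(rs.getString(2)); // your parsing logic
            updDates.setString(1, fechas);
            updDates.setString(2, mail);
            updDates.addBatch();                 // parameterized batch, not addBatch(String)
            if (isOverLimit(fechas)) {           // your threshold check
                updFlag.setString(1, mail);
                updFlag.addBatch();
            }
            if (++n % 2000 == 0) {
                updDates.executeBatch();         // executed batches are cleared here
                updFlag.executeBatch();
            }
        }
        updDates.executeBatch();
        updFlag.executeBatch();
        cnx.commit();
    }
}

// Placeholder stubs standing in for the parsing shown in the question:
static String buildUpdatedFechas(String fechas) { return fechas; }
static boolean isOverLimit(String fechas) { return false; }

With parameters, the driver also takes care of quoting, so the manual string concatenation (and its injection risk) disappears.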
This issue has been fixed since Jaybird 2.2.2.
Full disclosure: I am the developer of the Jaybird / Firebird JDBC driver.
Don't execute batch statements while iterating through the result set. Store the SQL you want to execute in a collection, and once you have finished processing the result set, start executing the new SQL (see the sketch below). Does everything have to happen inside the same transaction?
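For example, a rough sketch of that approach, assuming cnx and rs as in the question (note that the list itself will also hold 800,000+ SQL strings in memory):

import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

List<String> pending = new ArrayList<>();
while (rs.next()) {
    String newFechas = rs.getString(2); // updated date string, built as in the question
    pending.add("UPDATE TCOMERCIALMAIL SET FECHAS='" + newFechas
            + "' WHERE MAIL='" + rs.getString(1) + "'");
}
rs.close();
try (Statement stmt = cnx.createStatement()) {
    for (String sql : pending) {
        stmt.addBatch(sql);      // addBatch(String) is valid on a plain Statement
    }
    stmt.executeBatch();         // Statement clears its batch after executing it
}
cnx.commit();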