I have a user table (Oracle 11g) with more than 1 million rows, all of which currently store passwords in plain text. I am trying to hash them with SHA-512 (hash plus salt). To start with, below is my Java code that reads every record from the user table, hashes the password, and updates the row.
int batchSize = 5000;
int count = 0;
PreparedStatement ps = null;
PreparedStatement psSel = null;
ResultSet rs = null;
try {
    // con is an open java.sql.Connection; commits are issued manually below
    con.setAutoCommit(false);
    ps = con.prepareStatement("update user set password=? where ID=?");
    psSel = con.prepareStatement("select ID, password from user");
    psSel.setFetchSize(1000); // pull 1000 rows per database round trip
    rs = psSel.executeQuery();
    String hashPassword = null;
    while (rs.next()) {
        long id = rs.getLong(1);
        String pwd = rs.getString(2);
        hashPassword = <<CALL TO PASSWORD HASHING UTIL>>;
        ps.setString(1, hashPassword);
        ps.setLong(2, id);
        ps.addBatch();
        // Every 5000 records, execute the batch and commit
        if (++count % batchSize == 0) {
            ps.executeBatch();
            con.commit();
        }
    }
    // Flush the final, partial batch
    ps.executeBatch();
    con.commit();
} catch (SQLException e) {
    e.printStackTrace();
}
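For context, here is a minimal sketch of what the elided <<CALL TO PASSWORD HASHING UTIL>> might look like: salted SHA-512 via java.security.MessageDigest, with the salt stored next to the digest as hex so it can be re-derived at login time. The class name, method name, and storage format are my assumptions for illustration, not the actual utility:

import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

// Hypothetical stand-in for the elided hashing utility.
public class PasswordHashUtil {

    private static final SecureRandom RANDOM = new SecureRandom();

    public static String hashAndSalt(String password)
            throws NoSuchAlgorithmException, UnsupportedEncodingException {
        byte[] salt = new byte[16];
        RANDOM.nextBytes(salt);                   // fresh random salt per user
        MessageDigest md = MessageDigest.getInstance("SHA-512");
        md.update(salt);                          // salt first, then the password
        byte[] digest = md.digest(password.getBytes("UTF-8"));
        return toHex(salt) + ":" + toHex(digest); // store the salt with the hash
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}

With a 16-byte salt this produces a 161-character string (32 hex characters of salt, a separator, and 128 of digest), so the password column would need to be at least that wide.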
Updating 100,000 records with the above method takes close to 8 minutes, which feels far too slow.
Database used: Oracle 11g
Java Version: 1.6
Environment: Windows 7
Am I missing something? Can you advise or recommend the best way to process bulk updates like this?
I took a second look at the temp table USER I had created earlier and saw that there was no primary key constraint on the ID column. I added a PK constraint on ID and re-ran my utility. It now took just 36 seconds to process 100,000 rows.
To be doubly sure, I also created another temp table, USER_TMP2, without the PK constraint and ran the utility against it: it took 8 minutes, as before, for the same 100,000 rows.
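For reference, the fix itself is a one-time DDL change: the PK brings a unique index with it, which is what the "where ID=?" updates need. A sketch via JDBC, assuming the column is named ID as in the code above and picking an arbitrary constraint name:

// One-time fix: add a primary key (and its backing unique index) on ID.
// Table and constraint names are assumptions based on the question.
Statement st = con.createStatement();
st.execute("ALTER TABLE user ADD CONSTRAINT user_pk PRIMARY KEY (ID)");
st.close();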
Moral of the story: without an index on ID, every single UPDATE in the batch has to do a full table scan, so 100,000 updates mean 100,000 scans of a million-row table. When investigating poor performance, the first thing to do is check the indexing of the tables involved, either by simple inspection or by looking at the execution plans of the queries, to make sure you are not doing a lot of unnecessary full table scans.
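In Oracle, one quick way to run that check from the same JDBC connection is EXPLAIN PLAN plus DBMS_XPLAN. A sketch (the literal bind values are just for illustration):

// Explain the single-row update: without an index on ID the plan shows
// TABLE ACCESS FULL; with the PK in place it shows INDEX UNIQUE SCAN.
Statement st = con.createStatement();
st.execute("EXPLAIN PLAN FOR update user set password = 'x' where ID = 1");
ResultSet plan = st.executeQuery(
        "SELECT plan_table_output FROM TABLE(DBMS_XPLAN.DISPLAY())");
while (plan.next()) {
    System.out.println(plan.getString(1));
}
plan.close();
st.close();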