Poor SELECT and UPDATE performance on one million rows in Oracle via JDBC

I have a user table in an Oracle 11g database with more than 1 million rows, all of which store their passwords in plain text. I am trying to replace these with salted SHA-512 hashes. Below is the core of my Java code, which reads every record from the user table, hashes the password, and writes it back.

  • I am using prepared statements for both the SELECT and UPDATE queries
  • I have set the SELECT statement's fetch size to 1000 (setFetchSize(1000))
  • I have set auto-commit to false
  • I am using the batch API (addBatch()/executeBatch()) for the bulk update
int count = 0;
final int batchSize = 5000;  // flush and commit every 5,000 rows
try {
    ps = con.prepareStatement("update user set password=? where ID=?");
    psSel = con.prepareStatement("select ID, password from user");
    psSel.setFetchSize(1000);  // fetch 1,000 rows per database round trip
    rs = psSel.executeQuery();
    String hashPassword = null;
    while (rs.next()) {
        long id = rs.getLong(1);
        String pwd = rs.getString(2);
        hashPassword = <<CALL TO PASSWORD HASHING UTIL>>;
        ps.setString(1, hashPassword);
        ps.setLong(2, id);
        ps.addBatch();

        // Every 5,000 records, execute the batch and commit
        if (++count % batchSize == 0) {
            ps.executeBatch();
            con.commit();
        }
    }
    ps.executeBatch();  // flush the final partial batch
    con.commit();
} catch (SQLException e) {
    e.printStackTrace();
}
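The hashing utility itself is not shown above. For reference, a minimal sketch of a salted SHA-512 helper built on the standard JDK MessageDigest and SecureRandom classes might look like this (the class and method names here are my own, not from the original code):

import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

public class PasswordHashUtil {

    private static final SecureRandom RANDOM = new SecureRandom();

    // Generate a random 16-byte salt, to be stored alongside the hash.
    public static byte[] newSalt() {
        byte[] salt = new byte[16];
        RANDOM.nextBytes(salt);
        return salt;
    }

    // Salted SHA-512 of the password, returned as a lowercase hex string.
    public static String hashPassword(String password, byte[] salt)
            throws NoSuchAlgorithmException, UnsupportedEncodingException {
        MessageDigest md = MessageDigest.getInstance("SHA-512");
        md.update(salt);                                   // mix in the salt first
        byte[] digest = md.digest(password.getBytes("UTF-8"));
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b));          // two hex digits per byte
        }
        return hex.toString();
    }
}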

To update 100,000 records, the above method takes close to 8 minutes, which I feel is far too slow.

Database used: Oracle 11g

Java Version: 1.6

Environment: Windows 7

I am not sure what I am missing. Can you advise or recommend a better way to process bulk loads like this?

UPDATE

I took a second look at the temp table USER I had created earlier and saw that there was no primary-key constraint on the ID column. I added a PK constraint on ID and re-ran my utility; it now took just 36 seconds to process 100,000 rows.

To be doubly sure, I also created another temp table, USER_TMP2, without the PK constraint and ran the utility against it; as before, it took 8 minutes for 100,000 rows.
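For anyone reproducing this, the fix was a one-off DDL statement adding the primary key. A sketch of running it through the same JDBC connection (the constraint name user_pk is my own choice, not from the original post):

// Adds a primary-key constraint (and its backing unique index) on ID,
// so the "update user ... where ID=?" can do an index lookup
// instead of a full table scan for each batched row.
Statement ddl = con.createStatement();
ddl.execute("ALTER TABLE user ADD CONSTRAINT user_pk PRIMARY KEY (ID)");
ddl.close();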

asked Nov 20 '22 by Sachin


1 Answer

As the OP discovered in the update above, the temp table USER had no primary-key constraint on the ID column. After a PK constraint was added to ID, the utility took just 36 seconds to process 100,000 rows, while a second temp table (USER_TMP2) without the constraint still took 8 minutes.

Moral of the story: when investigating poor performance, the first thing to do is check the indexing of the tables involved, either by simple inspection or by looking at the execution plans of the queries, to make sure you are not doing a lot of unnecessary full-table scans.
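One way to run that check from JDBC against Oracle is a sketch like the following, using Oracle's standard EXPLAIN PLAN statement and DBMS_XPLAN.DISPLAY function:

Statement stmt = con.createStatement();
// Ask Oracle to compute (but not execute) the plan for the UPDATE...
stmt.execute("EXPLAIN PLAN FOR update user set password = 'x' where ID = 1");
// ...then print the formatted plan. Without an index on ID this shows
// TABLE ACCESS FULL on USER, i.e. a full scan for every batched update.
ResultSet plan = stmt.executeQuery("select * from table(DBMS_XPLAN.DISPLAY)");
while (plan.next()) {
    System.out.println(plan.getString(1));
}
plan.close();
stmt.close();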

answered Dec 09 '22 by Gord Thompson