Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Updating multiple rows of single table

Tags:

java

mysql

I need to update every row of a table having more then 60k rows. Currently I'm doing it like this:

public void updateRank(Map<Integer, Double> map) {
    Iterator<Map.Entry<Integer, Double>> it = map.entrySet().iterator();
    while (it.hasNext()) {
        Map.Entry<Integer, Double> pairs = (Map.Entry<Integer, Double>) it.next();
        String query = "update profile set rank = " + pairs.getValue()
                + " where profileId = " + pairs.getKey();
        DBUtil.update(query);
        it.remove();
    }
}

This method alone took around 20+ mins to complete, hitting the database for each row(60k) is what i think the reason here.(though i'm using dbcp for connecton pooling, with 50 maximum active connections)

It'd be great if i'd be able to update rows with single database hit. Is that Possible ? How ?

Or any other way to improve timing here ?

like image 670
Mohammad Adil Avatar asked Dec 03 '13 15:12

Mohammad Adil


People also ask

Can we update multiple rows in a single update statement?

Column values on multiple rows can be updated in a single UPDATE statement if the condition specified in WHERE clause matches multiple rows. In this case, the SET clause will be applied to all the matched rows.

How can I update multiple rows in a single column in SQL?

First, specify the table name that you want to change data in the UPDATE clause. Second, assign a new value for the column that you want to update. In case you want to update data in multiple columns, each column = value pair is separated by a comma (,). Third, specify which rows you want to update in the WHERE clause.

How do I update multiple rows of columns?

We can update multiple columns by specifying multiple columns after the SET command in the UPDATE statement. The UPDATE statement is always followed by the SET command, it specifies the column where the update is required.


3 Answers

If every row should get a different value that cannot be derived from the existing data in the database, there is not much you can do to optimize the overall complexity. So do not expect too much wonders.

That said, you should start using prepared statements and batching:

public void updateRank(Map<Integer,Double> map){
    Iterator<Entry<Integer, Double>> it = map.entrySet().iterator();
    String query = "";
    int i = 0;

    Connection connection = getConnection(); // get the DB connection from somewhere
    PreparedStatement stmt = connection.prepareStatement("update profile set rank = ? where profileId = ?");

    while (it.hasNext()) {
        Map.Entry<Integer,Double> pairs = (Map.Entry<Integer,Double>)it.next();
        stmt.setInt(1, pairs.getValue());
        stmt.setDouble(2, pairs.getKey());
        stmt.addBatch(); // this will just collect the data values
        it.remove();
    }       
    stmt.executeBatch(); // this will actually execute the updates all in one
}

What this does:

  1. the prepared statement causes the SQL parser to only parse the SQL once
  2. the batching minimizes the client-server-roundtrips so that not one for every update
  3. the communication between client and server is minimized because the SQL is only transmitted once and the data is collected and sent as a packet (or at least fewer packets)

In addition:

  • Please check if the database column profileId is using an index so that looking up the respective row is fast enough
  • You could check if your connection is set to auto-commit. If so try to disable auto-commit and explicitly commit the transaction after all rows are updated. This way the single update operations could be faster as well.
like image 85
Stefan Winkler Avatar answered Oct 17 '22 10:10

Stefan Winkler


Right now you execute each query independently which causes a huge connection overhead (even when using connection pooling). Instead use a batch mechanism to execute several queries together.

Using JDBC (which DBCP apparently is using) and prepared statements, you can do this very easily by using addBatch() and executeBatch(). I recently had to do this my own and a batch size of around 1000 queries was the fastest. Though this may be entirely different in your case.

References

  • http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#addBatch%28java.lang.String%29
  • http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#executeBatch%28%29
like image 3
str Avatar answered Oct 17 '22 12:10

str


You could concatenate your queries (separate them by a ;) and send only batches of 100 queries for example:

public void updateRank(Map<Integer, Double> map) {
    Iterator<Map.Entry<Integer, Double>> it = map.entrySet().iterator();
    String queries = "";
    int i = 0;
    while (it.hasNext()) {
        Map.Entry<Integer, Double> pairs =
                (Map.Entry<Integer, Double>) it.next();
        queries += "update profile set rank = " + pairs.getValue()
                + " where profileId = " + pairs.getKey() + ";";
        it.remove();
        if (i++ % 100 == 99) {
            DBUtil.update(queries);
            queries = "";
        }
    }
}
like image 2
kgautron Avatar answered Oct 17 '22 12:10

kgautron