Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

EclipseLink batch insert very very slow

I'm trying to optimize my JPA implementation, using EclipseLink. I've added batch operations to it. But it is still taking A LOT of time to do 50 000 inserts. It takes more than 10 times the amount of time it takes to do the exact same insert using raw SQL with JDBC.

To make sure batch operations were in fact working I used Wireshark to check my packets and it is not using batch inserts.

Here's one of the insert packets:

enter image description here

It is not doing:

INSERT INTO ENTITYCLASSTEST (LASTNAME, NAME) VALUES ('sfirosijfhgdoi 0', 'dsufius0'), ('sfirosijfhgdoi 0', 'dsufius0'), ('sfirosijfhgdoi 0', 'dsufius0'), ('sfirosijfhgdoi 0', 'dsufius0')... and so on

I was expecting it to do as above but it is inserting one line per packet and not multiple lines per packet.

Here's is my Entity Class:

@Entity
public class EntityClassTest implements Serializable {
    private static final long serialVersionUID = 1L;

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;
    private String lastname;

    public EntityClassTest() {
    }

    public EntityClassTest(Long id, String name, String lastname) {
        this.id = id;
        this.name = name;
        this.lastname = lastname;
    }

    public EntityClassTest(String name, String lastname) {
        this.name = name;
        this.lastname = lastname;
    }

    public Long getId() {
        return id;
    }

    public String getName() {
        return name;
    }

    public String getLastName() {
        return lastname;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public void setName(String name) {
        this.name = name;
    }

    public void setLastName(String lastname) {
        this.lastname = lastname;
    }

    @Override
    public int hashCode() {
        int hash = 0;
        hash += (id != null ? id.hashCode() : 0);
        return hash;
    }

    @Override
    public boolean equals(Object object) {
        // TODO: Warning - this method won't work in the case the id fields are not set
        if (!(object instanceof EntityClassTest)) {
            return false;
        }
        EntityClassTest other = (EntityClassTest) object;
        if ((this.id == null && other.id != null) || (this.id != null && !this.id.equals(other.id))) {
            return false;
        }
        return true;
    }

    @Override
    public String toString() {
        return "database.EntityClassTest [id=" + id + " ]";
    }

}

And here is my persist method that receives a List and persists all the objects inside.

public void insertListToTable(final String persistenceUnit, final List list) throws SQLException {
        final EntityManagerFactory entityManagerFactory = Persistence.createEntityManagerFactory(persistenceUnit);
        final EntityManager entityManager = entityManagerFactory.createEntityManager();
        final EntityTransaction transaction = entityManager.getTransaction();

        try {            
            final int listSize = list.size();
            transaction.begin();

            for (int i = 0; i<listSize; i++) { //Object object : list) {
                final Object object = list.get(i);
                entityManager.persist(object);

                if ( i % 500 == 0 ) { //500, same as the JDBC batch size defined in the persistence.xml
                    //flush a batch of inserts and release memory:
                    entityManager.flush();
                    entityManager.clear();
                }
            }
            transaction.commit();
        }
        catch(Exception e) {
            if (transaction != null) {
                transaction.rollback();
            }
            throw new SQLException(e.getMessage());
        } 
        finally {
            entityManager.close();
        }
    }

And my persistence.xml, where I set 500 as the batch value, file is:

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.1" xmlns="http://xmlns.jcp.org/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd">
<persistence-unit name="ExternalServer" transaction-type="RESOURCE_LOCAL">
    <provider>org.eclipse.persistence.jpa.PersistenceProvider</provider>
    <!-- List of Entity classes -->
    <class>model.EntityClassTest</class>
    <properties>
      <property name="javax.persistence.jdbc.url" value="jdbc:mysql://myServer:3306/testdb?zeroDateTimeBehavior=convertToNull"/>
      <property name="javax.persistence.jdbc.user" value="testdbuser"/>
      <property name="javax.persistence.jdbc.driver" value="com.mysql.jdbc.Driver"/>
      <property name="javax.persistence.jdbc.password" value="myPassword"/>
      <property name="javax.persistence.schema-generation.database.action" value="create"/>

      <!-- Weaving -->
      <property name="eclipselink.weaving" value="static"/>
      <!-- SQL dialect / Database type -->
      <property name="hibernate.dialect" value="org.hibernate.dialect.MySQLDialect"/>
      <property name="eclipselink.target-database" value="MySQL"/>
      <!-- Tell the JPA provider to, by default, create the table if it does not exist. -->
      <property name="javax.persistence.schema-generation.database.action" value="create"/>
      <!-- No logging (For development change the value to "FINE") -->
      <property name="eclipselink.logging.level" value="OFF"/>
      <!-- Enable batch writing -->
      <property name="eclipselink.jdbc.batch-writing" value="JDBC"/>
      <!-- Batch size -->
      <property name="eclipselink.jdbc.batch-writing.size" value="500"/>

    </properties>
  </persistence-unit>
</persistence>

So my question is, why is it not batch inserting? I believe I've EclipseLink well configured to do so from what I've been reading around on EclipseLink website and here SO as well.

//////////////////////////// EDIT //////////////////////////

As suggested by Chris's answer, I changed in my EntityClassTest this value @GeneratedValue(strategy = GenerationType.IDENTITY) to @GeneratedValue(strategy = GenerationType.SEQUENCE) and re-run the test and the packets are being sent as before (like the image I posted above). So it didn't fix my problem I'm afraid.

//////////////////////////// EDIT 2 ////////////////////////

I've changed the logging level in the persistence.xml file to FINEST as shown next.

<property name="eclipselink.logging.level" value="FINEST"/>

And here is the log generated. I placed it in a pastebin because it is quite long.

http://pastebin.com/rKihCKMW

It seems to be calling Execute query InsertObjectQuery quite a lot of times.

//////////////////////////// EDIT 3 ////////////////////////

Here's the version for each component I'm using.

+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| innodb_version          | 5.6.12                       |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| version                 | 5.6.12-log                   |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Win64                        |
+-------------------------+------------------------------+

Netbeans 8.0

EclipseLink (JPA 2.1)

mysql-connector-java-5.1.24.jar

//////////////////////////// EDIT 4 ////////////////////////

Following CuriousMind's answer I've edited my EntityClassTest id annotation to:

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO, generator="id-seq-gen")
    @SequenceGenerator( name="id-seq-gen", sequenceName="ID_SEQ_GEN", allocationSize=500 )
    private Long id;

But it didn't solve my problem, I'm still getting one single insert per packet (as described by the image above) and on the EclipseLink log I'm getting:

[EL Fine]: sql: 2014-10-19 06:44:02.608--ClientSession(824177287)--Connection(1674390738)--Thread(Thread[main,5,main])--SELECT LAST_INSERT_ID()
[EL Finest]: sequencing: 2014-10-19 06:44:02.608--UnitOfWork(1985011414)--Thread(Thread[main,5,main])--assign sequence to the object (1.251 -> database.EntityClassTest [id=null ])
[EL Finest]: query: 2014-10-19 06:44:02.608--UnitOfWork(1985011414)--Thread(Thread[main,5,main])--Execute query InsertObjectQuery(database.EntityClassTest [id=null ])
[EL Finest]: query: 2014-10-19 06:44:02.608--ClientSession(824177287)--Thread(Thread[main,5,main])--Execute query ValueReadQuery(name="ID_SEQ_GEN" sql="SELECT LAST_INSERT_ID()")
[EL Fine]: sql: 2014-10-19 06:44:02.608--ClientSession(824177287)--Connection(1674390738)--Thread(Thread[main,5,main])--INSERT INTO ENTITYCLASSTEST (LASTNAME, NAME) VALUES (?, ?)
    bind => [sfirosijfhgdoi 2068, dsufius1034]
[EL Fine]: sql: 2014-10-19 06:44:02.608--ClientSession(824177287)--Connection(1674390738)--Thread(Thread[main,5,main])--SELECT LAST_INSERT_ID()
[EL Finest]: sequencing: 2014-10-19 06:44:02.608--UnitOfWork(1985011414)--Thread(Thread[main,5,main])--assign sequence to the object (1.252 -> database.EntityClassTest [id=null ])
[EL Finest]: query: 2014-10-19 06:44:02.608--UnitOfWork(1985011414)--Thread(Thread[main,5,main])--Execute query InsertObjectQuery(database.EntityClassTest [id=null ])
[EL Finest]: query: 2014-10-19 06:44:02.608--ClientSession(824177287)--Thread(Thread[main,5,main])--Execute query ValueReadQuery(name="ID_SEQ_GEN" sql="SELECT LAST_INSERT_ID()")
[EL Fine]: sql: 2014-10-19 06:44:02.608--ClientSession(824177287)--Connection(1674390738)--Thread(Thread[main,5,main])--INSERT INTO ENTITYCLASSTEST (LASTNAME, NAME) VALUES (?, ?)
    bind => [sfirosijfhgdoi 2244, dsufius1122]

And so on...

like image 550
dazito Avatar asked Oct 17 '14 02:10

dazito


1 Answers

You are using GenerationType.IDENTITY for sequencing, which requires retrieving the IDs from each insert statement one by one. Try a sequencing scheme that allows preallocation in batches of 500 and you will see improvements: http://en.wikibooks.org/wiki/Java_Persistence/Identity_and_Sequencing#Identity_sequencing

like image 59
Chris Avatar answered Oct 02 '22 11:10

Chris