Hibernate faster EntityManagerFactory creation

In my desktop application, new databases get opened quite often. I use Hibernate/JPA as an ORM. The problem is that creating the EntityManagerFactory is quite slow, taking about 5-6 seconds on a fast machine. I know that the EntityManagerFactory is supposed to be heavyweight, but this is just too slow for a desktop application where the user expects a new database to open quickly.

  1. Can I turn off some EntityManagerFactory features to get an instance faster? Or is it possible to create some of the EntityManagerFactory lazily to speed up creation?

  2. Can I somehow create the EntityManagerFactory object before knowing the database url? I would be happy to turn off all validation for this to be possible.

  3. By doing so, can I pool EntityManagerFactorys for later use?

  4. Any other idea how to create the EntityManagerFactory faster?

Update with more Information and JProfiler profiling

The desktop application can open saved files. Our document file format consists of one SQLite database plus some binary data, packed in a ZIP file. When a document is opened, the ZIP is extracted and the database is opened with Hibernate. The databases all share the same schema but obviously hold different data.

It seems that the first time I open a file it takes significantly longer than the following times. I profiled the first and second run with JProfiler and compared the results.

1st Run:

create EMF: 4385ms
    build EMF: 3090ms
    EJB3Configuration configure: 900ms
    EJB3Configuration <clinit>: 380ms

Screenshot: calltree1.png

2nd Run:

create EMF: 1275ms
    build EMF: 970ms
    EJB3Configuration configure: 305ms
    EJB3Configuration <clinit>: not visible, probably 0ms

Screenshot: compare_calltree.png

In the Call tree comparison you can see that some methods are significantly faster (DatabaseManager. as starting point):

create EMF: -3120ms
    Hibernate create EMF: -3110ms
        EJB3Configuration configure: -595ms
        EJB3Configuration <clinit>: -380ms
        build EMF: -2120ms
            buildSessionFactory: -1945ms
                secondPassCompile: -425ms
                buildSettings: -346ms
                SessionFactoryImpl.<init>: -1040ms

The Hot spot comparison now has the interesting results:

Screenshot: compare_hotspot.png

ClassLoader.loadClass: -1686ms
XMLSchemaFactory.newSchema: -184ms
ClassFile.<init>: -109ms

I am not sure if it is the loading of Hibernate classes or my Entity classes.
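One rough way to tell the two apart is to time the loading of a representative class from each group in a fresh JVM. The sketch below is only an illustration: ClassLoadTimer and timeLoads are hypothetical names, and the class names in main are JDK placeholders so that it runs anywhere. Substitute a Hibernate class and one of your entity classes. It times only the named class plus whatever its static initializer pulls in, so it will underestimate the full cost.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClassLoadTimer {

    /**
     * Loads and initializes each named class and returns the elapsed
     * milliseconds per name. Classes that cannot be found are recorded as -1.
     */
    public static Map<String, Long> timeLoads(String... classNames) {
        Map<String, Long> timings = new LinkedHashMap<>();
        ClassLoader loader = ClassLoadTimer.class.getClassLoader();
        for (String name : classNames) {
            long start = System.nanoTime();
            try {
                // initialize=true also runs static initializers (<clinit>)
                Class.forName(name, true, loader);
                timings.put(name, (System.nanoTime() - start) / 1_000_000);
            } catch (ClassNotFoundException e) {
                timings.put(name, -1L);
            }
        }
        return timings;
    }

    public static void main(String[] args) {
        // Placeholder names: replace with e.g. a Hibernate class and an entity class.
        Map<String, Long> timings = timeLoads(
                "java.util.concurrent.ConcurrentSkipListMap",
                "javax.xml.validation.SchemaFactory");
        timings.forEach((name, ms) -> System.out.println(name + ": " + ms + " ms"));
    }
}
```

For a complete picture, running the application with the HotSpot option -verbose:class prints every class as it is loaded, which shows directly whether Hibernate or application classes dominate.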

A first improvement would be to create an EMF as soon as the application starts just to initialize all necessary classes (I have an empty db file as a prototype already shipped with my Application). @sharakan thank you for your answer, maybe a DeferredConnectionProvider would already be a solution for this problem.

I will try the DeferredConnectionProvider next! But we might be able to speed it up even further. Do you have any more suggestions?

user643011 asked Feb 23 '13 02:02


2 Answers

You should be able to do this by implementing your own ConnectionProvider as a decorator around a real ConnectionProvider.

The key observation here is that the ConnectionProvider isn't used until an EntityManager is created (see the comment in supportsAggressiveRelease() for a caveat). So you can create a DeferredConnectionProvider class and use it to construct the EntityManagerFactory, then wait for user input and do the deferred initialization before actually creating any EntityManager instances. I've written this as a wrapper around ConnectionProviderImpl, but you should be able to use any other implementation of ConnectionProvider as the base.

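import java.sql.Connection;
import java.sql.SQLException;
import java.util.Properties;

import org.hibernate.HibernateException;
import org.hibernate.cfg.Environment;
// Hibernate 3.x package (assumed from the EJB3Configuration frames in the
// question); the interface moved to a different package in Hibernate 4.
import org.hibernate.connection.ConnectionProvider;

// ConnectionProviderImpl stands for whichever real ConnectionProvider you wrap.
public class DeferredConnectionProvider implements ConnectionProvider {

    // Properties handed over by Hibernate while the EntityManagerFactory is built.
    private Properties configuredProps;
    // Stays null until finalConfiguration() has been called.
    private ConnectionProviderImpl realConnectionProvider;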

    @Override
    public void configure(Properties props) throws HibernateException {
        configuredProps = props;
    }

    public void finalConfiguration(String jdbcUrl, String userName, String password) {
        configuredProps.setProperty(Environment.URL, jdbcUrl);
        configuredProps.setProperty(Environment.USER, userName);
        configuredProps.setProperty(Environment.PASS, password);

        realConnectionProvider = new ConnectionProviderImpl();
        realConnectionProvider.configure(configuredProps);
    }

    private void assertConfigured() {
        if (realConnectionProvider == null) {
            throw new IllegalStateException("Not configured yet!");
        }
    }        

    @Override
    public Connection getConnection() throws SQLException {
        assertConfigured();

        return realConnectionProvider.getConnection();
    }

    @Override
    public void closeConnection(Connection conn) throws SQLException {
        assertConfigured();

        realConnectionProvider.closeConnection(conn);
    }

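    @Override
    public void close() throws HibernateException {
        // The factory may be closed before finalConfiguration() was ever
        // called (e.g. the user cancels); there is nothing to release then.
        if (realConnectionProvider != null) {
            realConnectionProvider.close();
        }
    }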

    @Override
    public boolean supportsAggressiveRelease() {
        // This gets called during EntityManagerFactory construction, but it's 
        // just a flag so you should be able to either do this, or return
        // true/false depending on the actual provider.
        return new ConnectionProviderImpl().supportsAggressiveRelease();
    }
}

A rough example of how to use it:

    // Get an EntityManagerFactory with the following property set:
    //     properties.put(Environment.CONNECTION_PROVIDER, DeferredConnectionProvider.class.getName());
    HibernateEntityManagerFactory factory = (HibernateEntityManagerFactory) entityManagerFactory;

    // ...do user input of connection info...

    SessionFactoryImpl sessionFactory = (SessionFactoryImpl) factory.getSessionFactory();
    DeferredConnectionProvider connectionProvider = (DeferredConnectionProvider) sessionFactory.getSettings()
                    .getConnectionProvider();

    connectionProvider.finalConfiguration(jdbcUrl, userName, password);

You could put the initial set up of the EntityManagerFactory on a separate thread or something, so that the user never has to wait for it. Then the only thing they'll wait for, after specifying the connection info, is the setting up of the connection pool, which should be fairly quick compared to parsing the object model.
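The background setup could be sketched as follows. This is a minimal illustration using only the JDK: Prewarmed is a hypothetical helper, and the supplier in main is a stand-in for the real, slow call that builds the EntityManagerFactory with DeferredConnectionProvider configured.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Starts an expensive construction in the background and hands out the result on demand. */
public class Prewarmed<T> {

    private final CompletableFuture<T> future;

    /** Kicks off the factory immediately on a background thread. */
    public Prewarmed(Supplier<T> expensiveFactory) {
        this.future = CompletableFuture.supplyAsync(expensiveFactory);
    }

    /** Returns true once the background construction has finished (successfully or not). */
    public boolean isReady() {
        return future.isDone();
    }

    /** Blocks until the value is available; failures surface as CompletionException. */
    public T get() {
        return future.join();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the slow EntityManagerFactory build.
        Prewarmed<String> factory = new Prewarmed<>(() -> {
            try {
                Thread.sleep(500); // simulate a slow build
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "factory";
        });

        // ... show the UI, let the user pick a file ...

        System.out.println("ready before use: " + factory.isReady());
        System.out.println("got: " + factory.get());
    }
}
```

Created at application start, such a holder means get() only blocks if the user manages to open a file before the build has finished. supplyAsync runs on the common ForkJoinPool here; a dedicated executor may be a better fit for long blocking work.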

sharakan answered Oct 04 '22 05:10


Can I turn off some EntityManagerFactory features to get an instance faster?

Don't believe so. EMFs don't really have too many features, other than initializing a JDBC connection/pool.

Or is it possible to create some of the EntityManagerFactory lazily to speed up creation?

Rather than creating the EMF lazily, when the user will notice the performance hit, I suggest you should head in the opposite direction - create the EMF proactively before the user actually needs it. Create it once, up-front, possibly in a separate thread during application initialisation (or at least as soon as you know about your database). Reuse it throughout the existence of your application/database.

Can I somehow create the EntityManagerFactory object before knowing the database url?

No - it creates a JDBC connection.

I think a better question is: why does your application dynamically discover database connection URLs? Are you saying your databases are created or made available on the fly and there's no way to anticipate the connection parameters in advance? That really is to be avoided.

By doing so, can I pool EntityManagerFactorys for later use?

No, you can't pool EMFs. It's the connections that you can pool.

Any other idea how to create the EntityManagerFactory faster?

I agree - 6 seconds is too slow for initialisation of EMFs.

I suspect it's more to do with your selected database technology than JPA/JDBC/JVM. My guess is that maybe your database is initialising itself as you connect. Are you using Access? What DB are you using?

Are you connecting to a database remotely located? Over a WAN? Is network speed/latency good?

Are the client PCs limited in performance?

EDIT: Added after comments

Implementing your own ConnectionProvider as a decorator around a real ConnectionProvider will not speed up the user's experience at all. The database instance still needs to be initialised, the EMF & EM created and the JDBC connection still needs to be subsequently established.

Options:

  1. Share a common preloaded DB instance: seems not possible for your business scenario (although JSE technology supports this and also supports client-server design).
  2. Change to a DB with a faster startup: Derby (a.k.a. Java DB) ships with the JDK (versions 6 through 8; later JDKs need it as a separate dependency) and has a startup time of about 1.5 seconds (cold) and 0.7 seconds (warm - data pre-loaded).
  3. In many (most?) scenarios, the fastest solution would be to load data directly into in-memory Java objects using JAXB with StAX. Subsequently, use the in-memory cached data (particularly using smart structures like maps, hashing and ArrayLists). Just as JPA can map POJO classes to database tables and columns, JAXB can map POJO classes to an XML schema and work with XML document instances. If you have very complex queries using SQL set-based logic with multiple joins and strong use of DB indexes, this would be less desirable.

(2) would probably give the best improvement for limited effort.

Additionally:

  - Try to unzip the data files during deployment rather than during app usage.
  - Initialize the EMF in a startup thread that runs in parallel to the UI startup.
  - Try to start the DB initializing as one of the very first steps of the app (that means connecting to the actual instance using JDBC).
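Option 3 above, loading data straight into in-memory Java objects, might look roughly like this. The sketch uses StAX only and leaves out the JAXB binding step, because JAXB is no longer part of the JDK since Java 11. The XML layout (item elements with an id attribute) and the class names are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxLoader {

    /** Plain in-memory record; the fields mirror the hypothetical XML layout. */
    public static final class Item {
        public final int id;
        public final String name;

        Item(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Streams through the XML once and collects every item element into a list. */
    public static List<Item> load(String xml) {
        List<Item> items = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && "item".equals(reader.getLocalName())) {
                        int id = Integer.parseInt(reader.getAttributeValue(null, "id"));
                        // getElementText() consumes the text up to the matching end tag
                        String name = reader.getElementText();
                        items.add(new Item(id, name));
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return items;
    }

    public static void main(String[] args) {
        String xml = "<items><item id=\"1\">first</item><item id=\"2\">second</item></items>";
        for (Item item : load(xml)) {
            System.out.println(item.id + " -> " + item.name);
        }
    }
}
```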

Glen Best answered Oct 04 '22 06:10