MySQL/Hibernate - How do I debug a MySQL pooled connection that keeps dropping?

For months, my web application ran smoothly, but for the past week or two, it keeps dropping its connection to MySQL server. I'm not a DBA guy and have no idea how to debug this.

Here is what I know:

  1. The connection seems to drop every few hours. Sometimes during the day, but always during the night.
  2. My lab has a MySQL server that hosts databases for multiple applications.
  3. Currently, we have 46 connections to the MySQL server.
  4. To my knowledge, no other application is experiencing this issue.
  5. My application is using the same stack, configuration, and even code for connecting to the DB as another application—this other application supports around 200 users per day and has been running smoothly since 2013.
  6. Both applications use Hibernate ORM; this is the only configuration that I know of:

    <!-- TomcatJDBCConnectionProvider class is common to both applications -->
    <property name="hibernate.connection.provider_class">org.hibernate.connection.TomcatJDBCConnectionProvider</property>
    <property name="hibernate.dialect">org.hibernate.dialect.MySQLDialect</property>
    <property name="hibernate.connection.driver_class">com.mysql.jdbc.Driver</property>
    <property name="hibernate.connection.pool_size">5</property>
    <property name="hibernate.current_session_context_class">thread</property>
    <property name="hibernate.tomcatJdbcPool.validationQuery">SELECT 1</property>
    <property name="hibernate.tomcatJdbcPool.testOnBorrow">true</property>
    <property name="hibernate.enable_lazy_load_no_trans">true</property>
    
  7. The issue started around the same time that someone began using the application's RESTful API to download our data. This user (actually a collaborator) has a small script that iterates over every row in a specific table and requests all the metadata.

  8. The issue also started around the same time that my lab started offering a Coursera Massive Open Online Course. I don't know what the numbers are, but the actual usage on the site must have jumped.

I'm aware that this is a broad question, but I'm really at a loss as to how to go about debugging this. Any suggestions are appreciated.

EDIT:

Digging around the other application's ServletContextListener, I found this bit of code that my contextDestroyed function does not have:

// TODO: Find memory leak that requires server to be restarted after hot deploying several (3?) times.
Set<Thread> threadSet = Thread.getAllStackTraces().keySet();
for (Thread t : threadSet) {
    if (t.getName().contains("Abandoned connection cleanup thread")) {
        synchronized (t) {
            System.out.println("Forcibly stopping thread to avoid memory leak: " + t.getName());
            t.stop(); // don't complain, it works
        }
    }
}

It appears to iterate over all live threads, find the one whose name contains "Abandoned connection cleanup thread", and forcibly stop it. Is it probable that this is related to my issue?

EDIT 21/9/2015:

My application went down this weekend. Here is the stack trace from the error log from yesterday (when I believe it went down):

20-Sep-2015 14:22:18.160 SEVERE [http-apr-8080-exec-35] org.apache.catalina.core.StandardWrapperValve.invoke Servlet.service() for servlet [edu.mssm.pharm.maayanlab.Harmonizome.api.GeneMetadataApi] in context with path [/Harmonizome] threw exception
 org.hibernate.exception.GenericJDBCException: Could not open connection
    at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:54)
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:125)
    at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:110)
    at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:304)
    at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.getConnection(LogicalConnectionImpl.java:169)
    at org.hibernate.engine.transaction.internal.jdbc.JdbcTransaction.doBegin(JdbcTransaction.java:67)
    at org.hibernate.engine.transaction.spi.AbstractTransactionImpl.begin(AbstractTransactionImpl.java:160)
    at org.hibernate.internal.SessionImpl.beginTransaction(SessionImpl.java:1395)
    at org.hibernate.collection.internal.AbstractPersistentCollection.withTemporarySessionIfNeeded(AbstractPersistentCollection.java:224)
    at org.hibernate.collection.internal.AbstractPersistentCollection.initialize(AbstractPersistentCollection.java:545)
    at org.hibernate.collection.internal.AbstractPersistentCollection.read(AbstractPersistentCollection.java:124)
    at org.hibernate.collection.internal.PersistentSet.iterator(PersistentSet.java:180)
    at edu.mssm.pharm.maayanlab.Harmonizome.json.serdes.GeneMetadataSerializer.serialize(GeneMetadataSerializer.java:54)
    at edu.mssm.pharm.maayanlab.Harmonizome.json.serdes.GeneMetadataSerializer.serialize(GeneMetadataSerializer.java:23)
    at com.google.gson.TreeTypeAdapter.write(TreeTypeAdapter.java:70)
    at com.google.gson.Gson.toJson(Gson.java:600)
    at com.google.gson.Gson.toJson(Gson.java:579)
    at com.google.gson.Gson.toJson(Gson.java:534)
    at edu.mssm.pharm.maayanlab.Harmonizome.api.GeneMetadataApi.doGet(GeneMetadataApi.java:65)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:622)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:291)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:142)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
    at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:616)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:518)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1091)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:673)
    at org.apache.tomcat.util.net.AprEndpoint$SocketWithOptionsProcessor.run(AprEndpoint.java:2440)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [http-apr-8080-exec-35] Timeout: Pool empty. Unable to fetch a connection in 30 seconds, none available[size:5; busy:5; idle:0; lastwait:30000].
    at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:672)
    at org.apache.tomcat.jdbc.pool.ConnectionPool.getConnection(ConnectionPool.java:186)
    at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:127)
    at org.hibernate.connection.TomcatJDBCConnectionProvider.getConnection(TomcatJDBCConnectionProvider.java:208)
    at org.hibernate.internal.AbstractSessionImpl$NonContextualJdbcConnectionAccess.obtainConnection(AbstractSessionImpl.java:292)
    at org.hibernate.engine.jdbc.internal.LogicalConnectionImpl.obtainConnection(LogicalConnectionImpl.java:297)

Here are my connection variables from MySQL:

mysql>  SHOW VARIABLES LIKE '%connect%';
+-----------------------------------------------+-----------------+
| Variable_name                                 | Value           |
+-----------------------------------------------+-----------------+
| character_set_connection                      | utf8            |
| collation_connection                          | utf8_general_ci |
| connect_timeout                               | 5               |
| default_master_connection                     |                 |
| extra_max_connections                         | 1               |
| init_connect                                  |                 |
| max_connect_errors                            | 100             |
| max_connections                               | 100             |
| max_user_connections                          | 0               |
| performance_schema_session_connect_attrs_size | 512             |
+-----------------------------------------------+-----------------+

mysql>  SHOW VARIABLES LIKE '%timeout%';
+-----------------------------+----------+
| Variable_name               | Value    |
+-----------------------------+----------+
| connect_timeout             | 5        |
| deadlock_timeout_long       | 50000000 |
| deadlock_timeout_short      | 10000    |
| delayed_insert_timeout      | 300      |
| innodb_flush_log_at_timeout | 1        |
| innodb_lock_wait_timeout    | 50       |
| innodb_rollback_on_timeout  | OFF      |
| interactive_timeout         | 28800    |
| lock_wait_timeout           | 31536000 |
| net_read_timeout            | 30       |
| net_write_timeout           | 60       |
| slave_net_timeout           | 3600     |
| thread_pool_idle_timeout    | 60       |
| wait_timeout                | 28800    |
+-----------------------------+----------+

EDIT 22/9/2015:

Would a SEVERE Tomcat error cause the issue? I am seeing an error, unrelated to the database, about parsing a date:

22-Sep-2015 10:09:53.481 SEVERE [http-apr-8080-exec-26] org.apache.catalina.core.StandardWrapperValve.invoke Servlet.service() for servlet [edu.mssm.pharm.maayanlab.Harmonizome.page.DatasetPage] in context with path [/Harmonizome] threw exception [javax.servlet.ServletException: javax.servlet.jsp.JspException: In <parseDate>, a parse locale can not be established] with root cause
 javax.servlet.jsp.JspException: In <parseDate>, a parse locale can not be established
    at org.apache.taglibs.standard.tag.common.fmt.ParseDateSupport.doEndTag(ParseDateSupport.java:147)

Attaching JConsole output of heap memory usage:

[screenshot: JConsole heap memory usage]

JConsole output for thread usage; it started around 24-25 and jumped up to 34 once I started using the site. Even after closing the browser window, it remained there:

[screenshot: JConsole thread usage]

EDIT 23/9/2015:

One thing I changed right before the issue began was how I deal with Hibernate transactions. Previously, I had enable_lazy_load_no_trans disabled (which is the default) and was using the "open session in view" pattern. It seemed like people didn't like that pattern, so I enabled enable_lazy_load_no_trans instead. Thus, I have code like this:

List<MyObjects> myObjects = null;
try {
    HibernateUtil.beginTransaction();
    myObjects = // fetch my objects from the DB
    HibernateUtil.commitTransaction();
} catch (HibernateException he) {
    HibernateUtil.rollbackTransaction();
} finally {
    HibernateUtil.close();
}

// render myObjects in JSP/JSTL
// this JSP may lazily load related objects

In retrospect, this seems... problematic. I have no idea when Hibernate "lets go" of the objects.
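To make that concern concrete, here is a toy model in plain Java (no Hibernate involved) of what the stack trace above suggests: with enable_lazy_load_no_trans on, each uninitialized collection touched after the session is closed appears to open a temporary session and borrow its own connection (the withTemporarySessionIfNeeded frame). CountingPool, Gene, and getSynonyms are hypothetical names used only for illustration, and the pool here just counts borrows.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyLoadBorrowDemo {

    /** Toy pool that only counts how many times a connection is borrowed. */
    static final class CountingPool {
        private int borrows;
        void borrowAndReturn() { borrows++; }
        int borrows() { return borrows; }
    }

    /** Toy entity with one lazy collection that is loaded on first access. */
    static final class Gene {
        private final CountingPool pool;
        private List<String> synonyms; // null until first access

        Gene(CountingPool pool) { this.pool = pool; }

        List<String> getSynonyms() {
            if (synonyms == null) {
                // Assumption: models the temporary session opened outside a transaction.
                pool.borrowAndReturn();
                synonyms = new ArrayList<>();
            }
            return synonyms;
        }
    }

    /** Total borrows needed to fetch and then render entityCount entities. */
    static int borrowsToRender(int entityCount) {
        CountingPool pool = new CountingPool();
        pool.borrowAndReturn(); // the explicit transaction that fetches the list

        List<Gene> genes = new ArrayList<>();
        for (int i = 0; i < entityCount; i++) {
            genes.add(new Gene(pool));
        }
        for (Gene g : genes) {
            g.getSynonyms(); // first access: one extra borrow
            g.getSynonyms(); // already initialized: no borrow
        }
        return pool.borrows();
    }

    public static void main(String[] args) {
        System.out.println(borrowsToRender(100)); // prints 101
    }
}
```

If the real application behaves like this model, a script that walks every row of a table would multiply the number of borrows per request, which fits the timing described in point 7.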

asked Sep 18 '15 at 14:09 by jds


3 Answers

From the stack-trace you provided, I can draw a single conclusion: you are simply running out of connections.

This can be caused by long running transactions, possibly due to slow queries or improper application transaction boundaries.
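As a rough illustration of that failure mode, here is a self-contained sketch that models a pool of 5 connections with a semaphore. ToyPool and extraBorrowSucceeds are made-up names, not part of Tomcat JDBC; the only thing taken from the question is the pool size of 5 and the "busy:5; idle:0" state from the exception message.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolExhaustionDemo {

    /** Toy stand-in for a connection pool: one permit per connection. */
    static final class ToyPool {
        private final Semaphore permits;

        ToyPool(int size) {
            this.permits = new Semaphore(size, true);
        }

        /** Returns true if a connection was obtained within maxWaitMillis. */
        boolean borrow(long maxWaitMillis) {
            try {
                return permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }

        void giveBack() {
            permits.release();
        }
    }

    /** Holds `held` connections without returning them, then tries one more. */
    static boolean extraBorrowSucceeds(int poolSize, int held, long maxWaitMillis) {
        ToyPool pool = new ToyPool(poolSize);
        for (int i = 0; i < held; i++) {
            pool.borrow(maxWaitMillis); // simulates a transaction that never finishes
        }
        return pool.borrow(maxWaitMillis);
    }

    public static void main(String[] args) {
        System.out.println(extraBorrowSucceeds(5, 5, 100)); // false: all 5 busy, request times out
        System.out.println(extraBorrowSucceeds(5, 4, 100)); // true: one connection still idle
    }
}
```

The point of the sketch is that the pool itself is healthy in both cases; the sixth caller fails only because nothing has been given back within the wait time.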

I suggest you start using FlexyPool, which supports Tomcat DBCP, to get a better understanding of both connection and transaction usage. FlexyPool provides many histograms you might be interested in, such as connection acquisition time and lease time.

And, just to be on the safe side, check the MySQL driver version too and see whether you're running an outdated library.

answered Oct 18 '22 at 21:10 by Vlad Mihalcea


Hibernate errors are a bit abstract, and it can be tricky to find the bug from the stack trace alone. I think this may be a problem in your application: maybe you're not closing Hibernate connections properly in some cases, or your application has a memory leak.

Have you tried to monitor the application with jconsole from the JDK?

You can add these Java arguments to your Tomcat configuration (I'm assuming you're using Tomcat) to enable JConsole access:

-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=8086
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false

Then connect JConsole to the remote process, for example:

localhost:8086 

and watch the threads as you go through the operations that make the application stop.
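If attaching JConsole is inconvenient, a similar snapshot can be taken from inside the JVM with standard-library calls only. This is just a sketch: ThreadSnapshot and its method names are hypothetical, and the "http-apr-8080-exec" prefix is taken from the thread names in the question's stack trace.

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadSnapshot {

    /** Counts live threads by state (RUNNABLE, WAITING, TIMED_WAITING, ...). */
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    /** Counts live threads whose name contains the given fragment. */
    static int countByName(String fragment) {
        int n = 0;
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().contains(fragment)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println("Threads by state: " + countByState());
        System.out.println("HTTP worker threads: " + countByName("http-apr-8080-exec"));
    }
}
```

Logging these counts periodically (for example from a servlet or a scheduled task) would show whether worker threads pile up in a waiting state shortly before the pool reports itself empty.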

Edit

If you're not using Tomcat and you're running your application in a Windows environment, you can monitor your application's threads with a tool such as Process Explorer.

answered Oct 18 '22 at 21:10 by pedromendessk


It seems your connection pool cannot return a free connection to Hibernate within the timeout. This happens because your application has very long transactions or transaction deadlocks. You can try the following options to fix the problem.

  1. Change your connection pool size in the following line:

    <property name="hibernate.connection.pool_size">5</property>

Make the pool size about 10 and test. Keep an eye on the number of connections to your database. If it exceeds the MySQL connection limit, raise max_connections on the MySQL server and keep testing.

  2. Use another connection pool. I recommend Apache Commons DBCP2. The Maven dependency for DBCP2 is as follows:

    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-dbcp2</artifactId>
        <version>2.1</version>
    </dependency>

Add DBCP2 to your POM, then configure your application to use it.

If this solves the problem, your application simply had long transactions. If it only reduces how often the error occurs, or the error keeps happening, your application very likely has transaction deadlocks, and you will have to identify the problematic code.

There are other alternatives, such as raising the wait timeout. But that is bad for your application's performance, and it does nothing for transaction deadlocks. Finally, remember to pay attention to transaction management and database structure in future development for better database performance.

answered Oct 18 '22 at 20:10 by Channa Jayamuni