Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does Microsoft SQL Server 2012 query take minutes over JDBC 4.0 but second(s) in Management Studio?

I am dealing with what is apparently a performance issue while retrieving a relatively large ResultSet from a remote Microsoft SQL Server 2012 to a Java client that uses Microsoft JDBC Driver 4.0.

When I run the corresponding query on the remote server's Microsoft SQL Server Management Studio, it returns approx. 220k rows almost instantaneously. When I issue the same query from the client, it stalls. The same test has worked fine also on the client with an earlier version of the database where only approx. 400 rows qualified.

I tried to tackle this by appending ;responseBuffering=adaptive" to the URL passed to DriverManager.getConnection(). After the connection is established, I see this property (among several others) in the result from connection.getMetaData().getURL(), but[ connection.getClientInfo(responseBuffering) returns null, and what is more the client is still stalling.

What could be going wrong here and how can I instruct the a Microsoft SQL Server (not just suggest to it -- programmatically in Java) that it must return rows in smaller chunks rather than all at once or improve JDBC query times by some other measures.

Two further observations that seem somewhat strange and that perhaps point to a different root cause entirely:

  • When the client stalls it still shows only relatively light CPU load, unlike what I would expect from heavy garbage collection
  • "responseBuffering=adaptive" should be the normal default by now

UPDATE I've checked and found that switching from PreparedStatement to Statementdoes not improve things in my case (it apparently can help in other cases).

UPDATE Here is my current query:

select 
    PARENT.IDENTIFIER    as PARENT_IDENTIFIER,
    PARENT.CLASS         as PARENT_CLASS,
    CHILD.TYPE           as CHILD_TYPE,
    CHILD.IDENTIFIER     as CHILD_IDENTIFIER,
    PROPERTY.IDENTIFIER  as PROPERTY_IDENTIFIER,
    PROPERTY.DESCRIPTION as PROPERTY_DESCRIPTION,
    PROPERTY.TYPE        as PROPERTY_TYPE,
    PROPERTY.PP          as PROPERTY_PP,
    PROPERTY.STATUS      as PROPERTY_STATUS,
    PROPERTY.TARGET      as PROPERTY_TARGET -- a date
from
    OBJECTS as CHILD
    left outer join RELATIONS              on RELATIONS.CHILD = CHILD.IDENTIFIER
    left outer join OBJECTS    as PARENT   on RELATIONS.PARENT = PARENT.IDENTIFIER
    inner join      PROPERTIES as PROPERTY on PROPERTY.OBJECT = CHILD.IDENTIFIER
where
    PROPERTY.TARGET is not null
order by
    case when PARENT.IDENTIFIER is null then 1 else 0 end,
    PARENT.IDENTIFIER,
    CHILD.IDENTIFIER,
    PROPERTY.TARGET,
    PROPERTY.IDENTIFIER
like image 688
Drux Avatar asked Oct 22 '14 06:10

Drux


People also ask

Why is JDBC slow?

How Connection Pools Enhance Performance. Establishing a JDBC connection with a DBMS can be very slow. If your application requires database connections that are repeatedly opened and closed, this can become a significant performance issue. WebLogic connection pools offer an efficient solution to this problem.


2 Answers

The adaptive buffering is a good answer. I would also recommend checking the connections' SET options via SQL Server Profiler.

When you start a trace, make sure ExistingConnections is selected. Compare a SPID from a JDBC connection and a SSMS connection. ARITHABORT comes to mind as one that I have seen cause a difference in performance between SSMS and JDBC driver. Microsoft briefly mentions it here: http://msdn.microsoft.com/en-us/library/ms190306.aspx. Stack Exchange information here: https://dba.stackexchange.com/questions/9840/why-would-set-arithabort-on-dramatically-speed-up-a-query

On Oracle, I have seen huge impacts by playing with the setFetchSize method on the Statement / PreparedStatement object. Apparently, the SQL Server driver does not support that method. However, there is an internal method in the driver for it. See Set a default row prefetch in SQL Server using JDBC driver for details.

Also, what are you doing in your while (rs.next()) loop? Try doing nothing other than reading a column, like rs.getInt(1). See what happens. If it flies, that suggests the bottleneck is in your former processing of the result set. If it is still slow, then the problem must be in the driver or database.

You could use SQL Server Profiler to compare the executions as they come in via JDBC and as you run it via SSMS. Compare the CPU, reads, writes and duration. If they are different, then the execution plan is probably different, which points me back to the first thing I mentioned: the SET options.

like image 130
Brandon Avatar answered Oct 07 '22 19:10

Brandon


I'm simply going to toss out this suggestion, and leave it for you to test.

The JDBC driver may well be FETCHING all of the rows before it returns, whereas the other system is simply returning the open cursor.

I have seen this behavior on other databases with JDBC, but had not direct experience with SQL Server.

In the examples where I have seen it, setting the auto commit to false for the connection prevents it from loading the entire result set. There are other settings to have it load only portions, etc.

But that could well be the underlying issue you are facing.

like image 29
Will Hartung Avatar answered Oct 07 '22 19:10

Will Hartung