
Hibernate SQL IN clause drives CPU usage to 100%

In my Java application I am using SQL Server and Hibernate 3 with EJB. When I execute a select query with an IN clause, the DB server's CPU usage reaches 100%. But when I run the same query in SQL Server Management Studio, it runs without any CPU spike. The application server and the DB server are two different machines. My table has the following schema:

CREATE TABLE student_table (
       Student_Id BIGINT NOT NULL IDENTITY
     , Class_Id BIGINT NOT NULL
     , Student_First_Name VARCHAR(100) NOT NULL
     , Student_Last_Name VARCHAR(100)
     , Roll_No VARCHAR(100) NOT NULL
     , PRIMARY KEY (Student_Id)
     , CONSTRAINT UK_StudentUnique_1 UNIQUE  (Class_Id, Roll_No)
);

The table contains around 1000k (one million) records. My query is:

select Student_Id from student_table where Roll_No in ('A101','A102','A103',.....'A250');

The IN clause contains 250 values. When I run the above query in SQL Server Management Studio, the result is retrieved within 1 second and without any CPU spike. But when I run the same query through Hibernate, the CPU stays at 100% for around 60 seconds, and the result takes around 60 seconds to come back. The Hibernate query is:

// rollNoLists is an ArrayList<String> containing 250 roll numbers
Criteria studentCriteria = session.createCriteria(StudentTO.class);
studentCriteria.add(Restrictions.in("rollNo", rollNoLists));
// Project only the studentId property so a single column is selected
studentCriteria.setProjection(Projections.projectionList().add(Projections.property("studentId")));
List<Long> studentIds = new ArrayList<Long>();
// Cast to the List interface; list() is not guaranteed to return an ArrayList
@SuppressWarnings("unchecked")
List<Long> results = (List<Long>) studentCriteria.list();
if (results != null && !results.isEmpty()) {
   studentIds.addAll(results);
}
return studentIds;

What is the problem, and why does this happen? When the same query runs through Management Studio, the result is retrieved within 1 second and without any spike. Is there a solution?

Edit1: My Hibernate-generated query is:

select this_.Student_Id as y0_ from student_table this_ where this_.Roll_No in

Edit2: The execution plan was captured after indexing Roll_No:

CREATE INDEX i_student_roll_no ON student_table (Roll_No) 

My execution plan: (image not preserved)
Jaya Ananthram asked Apr 23 '15 05:04



1 Answer

The query you run from the console is easily cacheable, which is why the response is instantaneous. If you look at the query, you'll see that all parameters are embedded in the query text, so the query planner can detect there's no variation, and every execution goes to the same plan and to the same cached result.

The query that you run with Hibernate, even if it were a native query, uses a PreparedStatement, and its parameters are bound at query execution time. To quote one of the best authors on indexing:

What has that to do with bind parameters?

The shared execution plan caches of DB2, Oracle and SQL Server use a hash value of the literal SQL string as key to the cache. Cached plans are not found if the SQL contains literal values that vary with each execution.

Place holders (bind parameters) unify the statement so that the SQL string is identical when executed with different values—thus, increasing the cache-hit rate.
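The difference between the two statement shapes can be sketched in plain Java. This is only an illustration under assumptions: the class and method names (InClauseSketch, placeholderSql, literalSql, findStudentIds) are hypothetical, and the JDBC method assumes an already-open Connection to the database from the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, for illustration only.
class InClauseSketch {

    // Statement text with one "?" per value: it depends only on the number of values.
    static String placeholderSql(int valueCount) {
        if (valueCount < 1) {
            throw new IllegalArgumentException("need at least one value");
        }
        String marks = String.join(", ", Collections.nCopies(valueCount, "?"));
        return "select Student_Id from student_table where Roll_No in (" + marks + ")";
    }

    // Statement text with the values inlined as literals, as typed into a console.
    static String literalSql(List<String> rollNos) {
        List<String> quoted = new ArrayList<>();
        for (String rollNo : rollNos) {
            // Double any single quote so the literal stays well-formed.
            quoted.add("'" + rollNo.replace("'", "''") + "'");
        }
        return "select Student_Id from student_table where Roll_No in ("
                + String.join(", ", quoted) + ")";
    }

    // Binds each value at execution time; assumes an open connection (not created here).
    static List<Long> findStudentIds(Connection con, List<String> rollNos) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(placeholderSql(rollNos.size()))) {
            for (int i = 0; i < rollNos.size(); i++) {
                ps.setString(i + 1, rollNos.get(i)); // JDBC parameters are 1-based
            }
            List<Long> ids = new ArrayList<>();
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    ids.add(rs.getLong(1));
                }
            }
            return ids;
        }
    }
}
```

Two calls to placeholderSql with the same count return the same string whatever values are bound later, while literalSql returns a different string for every distinct set of roll numbers.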

To solve it, add a composite index on the (Roll_No, Student_Id) columns so that the query can be answered with an index-only scan.
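As a rough sketch, the composite index could be created once through JDBC as shown here. The index name i_student_roll_no_student_id and the class CoveringIndexSketch are made up for this example, and the method assumes an open Connection whose user is allowed to run DDL.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class, for illustration only.
class CoveringIndexSketch {

    // Index name is an assumption; the column order (Roll_No, Student_Id) follows the suggestion above.
    static final String CREATE_COVERING_INDEX =
            "CREATE INDEX i_student_roll_no_student_id ON student_table (Roll_No, Student_Id)";

    // Executes the DDL once against an already-open connection (not created here).
    static void createCoveringIndex(Connection con) throws SQLException {
        try (Statement st = con.createStatement()) {
            st.executeUpdate(CREATE_COVERING_INDEX);
        }
    }
}
```

The same CREATE INDEX statement can also be run directly in Management Studio; the Java wrapper is only there to keep the example in the application's language.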

SQL Server creates a clustered index for the primary key by default, and a table can have only one clustered index, so you might want to turn this table into a heap table instead and focus on index-only scans.

Vlad Mihalcea answered Nov 11 '22 07:11