Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

mybatis nested select vs. nested results

I'm a junior user of mybatis, I wonder the difference of nested select and nested results whether it's just simply like the difference between sub-query vs. join, especially in performance. Or it will do some optimization?

I used mybatis 3.4.7 version and oracle DB.

here is an example for reference:

private List<Post> posts;

    <resultMap id="blogResult" type="Blog">
        <collection property="posts" javaType="ArrayList" column="id" 
        ofType="Post" select="selectPostsForBlog"/>
    </resultMap>
    <select id="selectBlog" resultMap="blogResult">
        SELECT * FROM BLOG WHERE ID = #{id}
    </select>
    <select id="selectPostsForBlog" resultType="Post">
        SELECT * FROM POST WHERE BLOG_ID = #{id}
    </select>  

or

    <select id="selectBlog" resultMap="blogResult">
        select
        B.id as blog_id,
        B.title as blog_title,
        B.author_id as blog_author_id,
        P.id as post_id,
        P.subject as post_subject,
      P.body as post_body,
      from Blog B
      left outer join Post P on B.id = P.blog_id
      where B.id = #{id}
    </select>

    <resultMap id="blogResult" type="Blog">
        <id property="id" column="blog_id" />
        <result property="title" column="blog_title"/>
        <collection property="posts" ofType="Post">
        <id property="id" column="post_id"/>
        <result property="subject" column="post_subject"/>
        <result property="body" column="post_body"/>
      </collection>
    </resultMap>

if there is still N+1 problem in nested select like sub-query?

do you have any advice or experience of which one performs better in a certain environment or condition? thanks a lot :).

like image 345
Wang Kenneth Avatar asked Apr 13 '26 15:04

Wang Kenneth


1 Answers

First of all a slight terminology note. Subquery in SQL is a part of the query that is a query by itself, for example:

SELECT ProductName
  FROM Product 
WHERE Id IN (SELECT ProductId 
            FROM OrderItem
           WHERE Quantity > 100)

In this case the following piece of the query is the subquery:

SELECT ProductId 
 FROM OrderItem
WHERE Quantity > 100

So you are using term "subquery" here incorrectly. In mybatis documentation the term nested select is used.

There are two ways to fetch associated entities/collections in mybatis. Here's relevant part of the documentation:

Nested Select: By executing another mapped SQL statement that returns the complex type desired. Nested Results: By using nested result mappings to deal with repeating subsets of joined results.

When nested select is used mybatis executes the main query first (in your case selectBlog) and then for every record it executes another select (hence the name nested select) to fetch associated Post entities.

When Nested results are used only one query is executed but it already has the association data joined. So mybatis maps the result to the object structure.

In your example single Blog entity is returned so when nested select is used two queries are executed, but in general case (if you would get the list of Blogs) you would hit N+1 problem.

Now let's deal with performance. All the following assumes that the queries are tuned (as in there are no missing indices), you are using connection pool, the database is collocated, basically speaking your system is tuned in all other regards.

Speaking of the performance there is no single correct answer and you milage may differ. You always need to test your particular workflows in your setup. Given so many factors affect performance like data distribution (think of max/min/arg posts each blog have), the size of the record in DB (think of number and size of the data fields in blog and post), the DB parameters (like disk type and speed, amount of memory available for dataset caching etc) there may no be a single answer only some generic observations that follow.

But we can understand the performance difference if we look at the cases on the ends of the performance spectrum. Like to see cases when nested select significantly outperforms join and vice versa.

For collection fetching join should be better in most cases because network latency to do N+1 request counts.

One case when nested select may be better is for one-to-many association when the record in the main table reference some other table and the cardinality of the other table is not large and the size of the record in the other table is large.

For example, let's consider Blog has a category property that references categories table and it may have one of these values Science, Fashion, News. And let's imagine the list of blogs is selected by some filter like keywords in the blog title. If the result contains let's say 500 items then most of the associated categories would be duplicates.

If we select them with join every record in the result set would contain Category data fields (and as a reminder most of them are duplicates and we have a lot of data in Category record).

If we select them using nested select we would do the query for fetch the category by category id for every record and here mybatis session cache comes to play. For the duration of the SqlSession every time mybatis executes the query it stores its result in the session cache so it does not execute repeating requests to the database but takes them from the cache. It means that after mybatis has retrieved some category by id for the first record it would reuse it for all other records in the recordset it processes.

In the above example we would do up to 4 requests to the database but the reduced amount of the data passed over the network may overweight the need to do 4 requests.

like image 119
Roman Konoval Avatar answered Apr 16 '26 01:04

Roman Konoval



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!