Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Doctrine2: Limiting with Left Joins / Pagination - Best Practice

i have a big query (in my query builder) and a lot of left joins. So i get Articles with their comments and tags and so on. Let's say i have the following dql:

$dql = 'SELECT blogpost, comment, tags 
FROM BlogPost blogpost 
LEFT JOIN blogpost.comments comments
LEFT JOIN blogpost.tags tags';

Now let's say my database has more than 100 blogposts but i only want the first 10, but with all the comments of those 10 and all their tags, if they exist. If i use setMaxResults it limits the Rows. So i might get the first two Posts, but the last one of those is missing some of it's comments or tags. So the followin doesn't work.

$result = $em->createQuery($dql)->setMaxResults(15)->getResult();

Using the barely documented Pagination Solution that ships with doctrine2.2 doesn't really work for me either since it is so slow, i could as well load all the data.

I tried the Solutions in the Stackoverflow Article, but even that Article is still missing a Best Practise and the presented Solution is deadly slow.

Isn't there a best practise on how to do this? Is nobody using Doctrine2.2 in Production mode?

like image 942
Andresch Serj Avatar asked May 22 '12 12:05

Andresch Serj


1 Answers

Getting the proper results with a query like this is problematic. There is a tutorial on the Doctrine website explaining this problem.

Pagination

The tutorial is more about pagination rather than getting the top 5 results, but the overall idea is that you need to do a "SELECT DISTINCT a.id FROM articles a ... LIMIT 5" instead of a normal SELECT. It's a little more complicated than this, but the last 2 points in that tutorial should put you on the right track.

Update:

The problem here is not Doctrine, or any other ORM. The problem lies squarely on the database being able to return the results you're asking for. This is just how joins work.

If you do an EXPLAIN on the query, it will give you a more in depth answer of what is happening. It would be a good idea to add the results of that to your initial question.

Building on what is discussed in the Pagination article, it would appear that you need at least 2 queries to get your desired results. Adding DISTINCT to a query has the potential to dramatically slow down your query, but its only really needed if you have joins in it. You could write another query that just retrieves the first 10 posts ordered by created date, without the joins. Once you have the IDs of those 10 posts, do another query with your joins, and a WHERE blogpost.id IN (...) ORDER BY blogpost.created. This method should be much more efficient.

SELECT 
    bp 
FROM 
    Blogpost bp 
ORDER BY 
    bp.created DESC
LIMIT 10

Since all you care about in the first query are the IDs, you could set Doctrine to use Scalar Hydration.

SELECT 
    bg 
FROM 
    Blogpost bp 
LEFT JOIN 
    bp.comments c 
LEFT JOIN 
    bp.tags t 
WHERE 
    bp.id IN (...) 
ORDER BY 
    bp.created DESC

You could also probably do it in one query using a correlated subquery. The myth that subqueries are always bad is NOT true. Sometimes they are faster than joins. You will need to experiment to find out what the best solution is for you.

like image 127
Bart Wegrzyn Avatar answered Sep 23 '22 21:09

Bart Wegrzyn