Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Order by within group by in Doctrine 2

I'm using Symfony 2 PR12 with Doctrine 2 and MySQL. I have a database storing articles and views of those articles:

// ...
class Article {

    /**
     * @orm:Column(type="bigint")
     * @orm:Id
     * @orm:GeneratedValue
     * @var int
     */
    protected $id;

    /**
     * @orm:OneToMany(targetEntity="ArticleView",mappedBy="article")
     * @var ArrayCollection
     */
    protected $views;

    // ...
}

// ...
class ArticleView {

    /**
     * @orm:Column(type="bigint")
     * @orm:Id
     * @orm:GeneratedValue
     * @var int
     */
    protected $id;

    /**
     * @orm:Column(type="bigint",name="DateRead",nullable=true)
     * @var int
     */
    protected $viewDate;

    /**
     * @orm:ManyToOne(targetEntity="Article",inversedBy="views")
     * @var Article
     */
    protected $article;

    // ...
}

I want to get, for example, the 20 most-recently-viewed articles. My first thought would be something like:

$qb = <instance of Doctrine\ORM\QueryBuilder>;
$qb->select('a')
   ->from('Article', 'a')
   ->join('a.views', 'v')
   ->orderBy('v.viewDate', 'DESC')
   ->groupBy('a.id')
   ->setMaxResults(20)
;

However, when there's more than one view associated with an article, the order-by/group-by combination gives unpredictable results for the ordering.

This is expected behavior for MySQL, since grouping is handled before ordering, and there are working raw-query solutions to this problem at http://www.artfulsoftware.com/infotree/mysqlquerytree.php (Aggregates -> Within-group aggregates). But I can't figure out how to translate any of these solutions into DQL, since as far as I can tell there's no way to select from subqueries or perform self-exclusion joins.

Any ideas on how to solve the problem with reasonable performance?

like image 673
Kevin Montag Avatar asked Apr 23 '11 21:04

Kevin Montag


1 Answers

I ended up solving it with a correlated subquery:

$qb
    ->select('a')
    ->from('Article', 'a')
    ->join('a.views', 'v')
    ->orderBy('v.viewDate', 'DESC')
    ->setMaxResults(20)

    // Only select the most recent article view for each individual article
    ->where('v.viewDate = (SELECT MAX(v2.viewDate) FROM ArticleView v2 WHERE v2.article = a)')

That way the sort ignores ArticleView's other than the most recent for any given article. Though my guess is that this performs fairly poorly relative to the other raw SQL solutions - any answers with better performance would still be greatly appreciated :).

like image 122
Kevin Montag Avatar answered Oct 21 '22 07:10

Kevin Montag