I have a Model, Articles, which hasMany Abstracts. I want to load the 10 latest Articles, and for each Article, the Abstract with the highest number of points. My function looks like this:
    public function getArticles($category, $viewName) {
        $subArticles = $this->Articles->findByCategory($category)->contain([
            'Abstracts' => function ($q) {
                return $q
                    ->select(['body', 'points', 'article_id'])
                    ->where(['Abstracts.approved' => true])
                    ->limit(10)
                    ->order(['Abstracts.points' => 'DESC']);
            }
        ])
        ->limit(10)
        ->order(['Articles.created' => 'DESC']);
        $this->set($viewName . 'Articles', $subArticles);
    }
The result that I get is not what I intend though. Looking through the SQL, first CakePHP is getting the articles.id of everything in the category (fine). Then, CakePHP goes into the Abstracts table, using those 10 articles.id's it just found, and asks for the 10 Abstracts with the highest votes (that belong to those Articles).
The problem is that I want 1 Abstract for each Article, not the 10 Abstracts belonging to any Article in that category. How can I fix this? Thanks!
EDIT
ndm suggested that this was a duplicate of "Using limit() on contained model", so I attempted the solution there. Namely, I added this to my Model:
    $this->hasOne('TopAbstract', [
        'className' => 'Abstracts',
        'foreignKey' => 'abstract_id',
        'strategy' => 'select',
        'sort' => ['TopAbstract.points' => 'DESC'],
        'conditions' => function ($e, $query) {
            $query->limit(1);
            return $e;
        }
    ]);
And then I try to find the Articles byCategory, with contain(['TopAbstract']), only this kills my SQL. It dies a horrible death:
Error: SQLSTATE[HY000]: General error: 1 near ")": syntax error
Debug doesn't even show the query that killed it, so I'm not sure how to debug this one?
EDIT
Talking to myself a bit, but the error is definitely in the 'conditions' part of the hasOne. I take that out, and it works fine. Can't find an example of how this is supposed to look on the interwebs.. anyone have any idea?
What you are looking for is a solution to the greatest-n-per-group problem. You didn't mention any specific RDBMS, but nonetheless see also http://dev.mysql.com/doc/refman/5.6/en/example-maximum-column-group-row.html
For those who are a little bit adventurous, I've developed some custom associations that transparently integrate into the ORM layer and allow for basic limit per group for hasMany and belongsToMany relations: https://github.com/icings/partitionable.
Using them, the solution for the question would be to set up an association like this:
    $this
        ->partitionableHasMany('TopAbstracts')
        ->setClassName('Abstracts')
        ->setLimit(1)
        ->setSort([
            'Abstracts.points' => 'DESC',
            'Abstracts.id' => 'ASC',
        ]);
TopAbstracts could then be contained just like any other association.
So let's give this a try. Here are three options that can be applied at the association level (defining the conditions could also be moved into custom finders), though you might not consider them that "straightforward".
    $this->hasOne('TopAbstracts', [
        'className' => 'Abstracts',
        'strategy' => 'select',
        'conditions' => function (\Cake\Database\Expression\QueryExpression $exp, \Cake\ORM\Query $query) {
            $query->innerJoin(
                [
                    'AbstractsFilter' => $query
                        ->connection()
                        ->newQuery()
                        ->select(['article_id', 'points' => $query->func()->max('points')])
                        ->from('abstracts')
                        ->group('article_id')
                ],
                [
                    'TopAbstracts.article_id = AbstractsFilter.article_id',
                    'TopAbstracts.points = AbstractsFilter.points'
                ]
            );
            return [];
        }
    ]);
This first option selects the top abstracts via a join on a grouped subquery that computes the maximum points per article. The query will look something like
    SELECT
        TopAbstracts.id AS `TopAbstracts__id`, ...
    FROM
        abstracts TopAbstracts
    INNER JOIN (
        SELECT
            article_id, (MAX(points)) AS `points`
        FROM
            abstracts
        GROUP BY
            article_id
    )
    AbstractsFilter ON (
        TopAbstracts.article_id = AbstractsFilter.article_id
        AND
        TopAbstracts.points = AbstractsFilter.points
    )
    WHERE
        TopAbstracts.article_id in (1,2,3,4,5,6,7,8, ...)
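To try the shape of this query without a CakePHP app, here is a minimal sketch using PDO with an in-memory SQLite database. The table layout and sample rows are assumptions invented for illustration, and the SQL is hand-written to mirror the query above rather than generated by the ORM. It also shows one thing to watch for: when two abstracts of the same article tie on the maximum points, both rows match the join.

```php
<?php
// Sketch only: in-memory SQLite via PDO; table layout and rows are invented for illustration.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE abstracts (id INTEGER PRIMARY KEY, article_id INTEGER, points INTEGER)');
$pdo->exec('INSERT INTO abstracts (id, article_id, points) VALUES
    (1, 1, 5), (2, 1, 9), (3, 2, 4), (4, 2, 4), (5, 3, 7)');

// Join every abstract against the per-article maximum of points.
$sql = 'SELECT TopAbstracts.id, TopAbstracts.article_id, TopAbstracts.points
    FROM abstracts TopAbstracts
    INNER JOIN (
        SELECT article_id, MAX(points) AS points
        FROM abstracts
        GROUP BY article_id
    ) AbstractsFilter ON (
        TopAbstracts.article_id = AbstractsFilter.article_id
        AND TopAbstracts.points = AbstractsFilter.points
    )
    WHERE TopAbstracts.article_id IN (1, 2, 3)
    ORDER BY TopAbstracts.id';

$joinRows = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);

// Article 2 shows up twice because abstracts 3 and 4 share the maximum.
foreach ($joinRows as $row) {
    echo "article {$row['article_id']}: abstract {$row['id']} ({$row['points']} points)\n";
}
```

If exactly one row per article is required, the data or the join condition needs some tiebreaker, for example on the primary key.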
    $this->hasOne('TopAbstracts', [
        'className' => 'Abstracts',
        'strategy' => 'select',
        'conditions' => function (\Cake\Database\Expression\QueryExpression $exp, \Cake\ORM\Query $query) {
            $query->leftJoin(
                ['AbstractsFilter' => 'abstracts'],
                [
                    'TopAbstracts.article_id = AbstractsFilter.article_id',
                    'TopAbstracts.points < AbstractsFilter.points'
                ]
            );
            return $exp->add(['AbstractsFilter.id IS NULL']);
        }
    ]);
The second option uses a left self-join and keeps only those rows for which no other abstract of the same article has more points (that is, no row b exists with a.points < b.points). The query will look something like
    SELECT
        TopAbstracts.id AS `TopAbstracts__id`, ...
    FROM
        abstracts TopAbstracts
    LEFT JOIN
        abstracts AbstractsFilter ON (
            TopAbstracts.article_id = AbstractsFilter.article_id
            AND
            TopAbstracts.points < AbstractsFilter.points
        )
    WHERE
        (AbstractsFilter.id IS NULL AND TopAbstracts.article_id in (1,2,3,4,5,6,7,8, ...))
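The self-join filter can be checked in isolation with a small PDO/SQLite sketch. The table and rows are made up for illustration, and topAbstractIds() is a hypothetical helper, not part of CakePHP. The second call uses a variation that is not in the answer above: it adds the primary key as a tiebreaker to the join condition, so that abstracts with equal points collapse to a single row.

```php
<?php
// Sketch only: in-memory SQLite via PDO; table layout and rows are invented for illustration.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE abstracts (id INTEGER PRIMARY KEY, article_id INTEGER, points INTEGER)');
$pdo->exec('INSERT INTO abstracts (id, article_id, points) VALUES
    (1, 1, 5), (2, 1, 9), (3, 2, 4), (4, 2, 4), (5, 3, 7)');

// Hypothetical helper: runs the left self-join with the given extra join
// condition and returns the ids of the rows that have no "better" sibling.
function topAbstractIds(PDO $pdo, string $joinCondition): array
{
    $sql = "SELECT TopAbstracts.id
        FROM abstracts TopAbstracts
        LEFT JOIN abstracts AbstractsFilter ON (
            TopAbstracts.article_id = AbstractsFilter.article_id
            AND ($joinCondition)
        )
        WHERE AbstractsFilter.id IS NULL
        ORDER BY TopAbstracts.id";

    return array_map('intval', $pdo->query($sql)->fetchAll(PDO::FETCH_COLUMN));
}

// Condition as used in the association above: ties on points all survive.
$plainIds = topAbstractIds($pdo, 'TopAbstracts.points < AbstractsFilter.points');

// Assumed variation: on equal points, the row with the lower id wins.
$tieBrokenIds = topAbstractIds(
    $pdo,
    'TopAbstracts.points < AbstractsFilter.points
     OR (TopAbstracts.points = AbstractsFilter.points AND TopAbstracts.id > AbstractsFilter.id)'
);

echo 'plain: ' . implode(', ', $plainIds) . "\n";
echo 'tie-broken: ' . implode(', ', $tieBrokenIds) . "\n";
```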
    $this->hasOne('TopAbstracts', [
        'className' => 'Abstracts',
        'foreignKey' => false,
        'conditions' => function (\Cake\Database\Expression\QueryExpression $exp, \Cake\ORM\Query $query) {
            $subquery = $query
                ->connection()
                ->newQuery()
                ->select(['SubTopAbstracts.id'])
                ->from(['SubTopAbstracts' => 'abstracts'])
                ->where(['Articles.id = SubTopAbstracts.article_id'])
                ->order(['SubTopAbstracts.points' => 'DESC'])
                ->limit(1);
            return $exp->add(['TopAbstracts.id' => $subquery]);
        }
    ]);
The third option uses a correlated subquery with a rather specific select and simple ordering and limiting to pick the top abstract. Note that the foreignKey option is set to false in order to avoid an additional Articles.id = TopAbstracts.article_id condition being compiled into the join conditions.
The query will look something like
    SELECT
        Articles.id AS `Articles__id`, ... ,
        TopAbstracts.id AS `TopAbstracts__id`, ...
    FROM
        articles Articles
    LEFT JOIN
        abstracts TopAbstracts ON (
            TopAbstracts.id = (
                SELECT
                    SubTopAbstracts.id
                FROM
                    abstracts SubTopAbstracts
                WHERE
                    Articles.id = SubTopAbstracts.article_id
                ORDER BY
                    SubTopAbstracts.points DESC
                LIMIT
                    1
            )
        )
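As a rough way to see what the correlated subquery returns, the following sketch runs a hand-written equivalent against an in-memory SQLite database via PDO. The two tables and their rows are assumptions for illustration only. Article 4 deliberately has no abstracts, to show that the LEFT JOIN still returns the article, with an empty abstract.

```php
<?php
// Sketch only: in-memory SQLite via PDO; tables and rows are invented for illustration.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE articles (id INTEGER PRIMARY KEY)');
$pdo->exec('CREATE TABLE abstracts (id INTEGER PRIMARY KEY, article_id INTEGER, points INTEGER)');
$pdo->exec('INSERT INTO articles (id) VALUES (1), (2), (3), (4)');
$pdo->exec('INSERT INTO abstracts (id, article_id, points) VALUES
    (1, 1, 5), (2, 1, 9), (3, 2, 4), (4, 2, 6), (5, 3, 7)');

// For each article, join the single abstract whose id the subquery picks.
$sql = 'SELECT Articles.id AS article_id, TopAbstracts.id AS abstract_id
    FROM articles Articles
    LEFT JOIN abstracts TopAbstracts ON (
        TopAbstracts.id = (
            SELECT SubTopAbstracts.id
            FROM abstracts SubTopAbstracts
            WHERE Articles.id = SubTopAbstracts.article_id
            ORDER BY SubTopAbstracts.points DESC
            LIMIT 1
        )
    )
    ORDER BY Articles.id';

$rows = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);

// Map of article id => id of its top abstract (null when there is none).
$topByArticle = array_column($rows, 'abstract_id', 'article_id');

foreach ($topByArticle as $articleId => $abstractId) {
    echo "article $articleId: " . ($abstractId === null ? 'no abstract' : "abstract $abstractId") . "\n";
}
```

Because of the LIMIT 1, this variant returns at most one abstract per article. With equal points, which row is picked is left to the database unless a second sort column is added.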
All three options will query and inject the records without any hackery; they are just not very "straightforward".
For the sake of completeness, it is of course always possible to manually load the associated records and format the results appropriately, for example using result formatters; see for example "CakePHP Entity contain without foreign key".
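The manual route boils down to grouping the loaded abstracts by article and keeping the best one. Here is a minimal plain-PHP sketch of that step, working on arrays instead of entities. Both helper functions and the top_abstract key are hypothetical names chosen for illustration, not CakePHP APIs; in a real app the same logic could live inside a result formatter.

```php
<?php
// Hypothetical helper (not a CakePHP API): keep only the highest-scoring
// abstract per article_id. On equal points the first one seen is kept.
function pickTopAbstracts(array $abstracts): array
{
    $top = [];
    foreach ($abstracts as $abstract) {
        $articleId = $abstract['article_id'];
        if (!isset($top[$articleId]) || $abstract['points'] > $top[$articleId]['points']) {
            $top[$articleId] = $abstract;
        }
    }

    return $top;
}

// Hypothetical helper: attach the picked abstract (or null) to each article row.
function attachTopAbstracts(array $articles, array $abstracts): array
{
    $top = pickTopAbstracts($abstracts);
    foreach ($articles as &$article) {
        $article['top_abstract'] = $top[$article['id']] ?? null;
    }
    unset($article); // break the reference left over from the loop

    return $articles;
}

// Made-up sample data.
$articles = [
    ['id' => 1, 'title' => 'First'],
    ['id' => 2, 'title' => 'Second'],
];
$abstracts = [
    ['id' => 10, 'article_id' => 1, 'points' => 3],
    ['id' => 11, 'article_id' => 1, 'points' => 8],
];

$withTop = attachTopAbstracts($articles, $abstracts);
```

This still loads every abstract of the selected articles, so it trades query complexity for memory and transfer.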
Just for reference, one of the weird solutions I stumbled over initially. This one really shouldn't be used!
This will select all associated abstracts, and then the ORM will iterate over them and for each article pick the first one with a matching article_id value. So in theory, when ordered descending on points, the ORM should pick the one with the most points.
While I would have expected this to work out of the box, it seems that the ORM iterates over the results in reversed order, which will cause the wrong rows to be picked. In order to get this working, the query needs to use the opposite order of what would normally be used, i.e. ASC instead of DESC.
    $this->hasOne('TopAbstracts', [
        'className' => 'Abstracts',
        'foreignKey' => 'article_id',
        'strategy' => 'select',
        'conditions' => function (\Cake\Database\Expression\QueryExpression $exp, \Cake\ORM\Query $query) {
            $query->order(['TopAbstracts.points' => 'ASC']);
            return [];
        }
    ]);
Also, the function needs to return an empty array instead of the expression as shown in the linked answer, because returning the expression causes invalid SQL to be compiled. Both of these behaviours, the reversed-order iteration and the invalid SQL, might be bugs.
While this will work, it always selects all associated abstracts, not only the top ones, which might be considered rather inefficient. The two queries will look something like
    SELECT
        Articles.id AS `Articles__id`, ...
    FROM
        articles Articles

    SELECT
        TopAbstracts.id AS `TopAbstracts__id`, ...
    FROM
        abstracts TopAbstracts
    WHERE
        TopAbstracts.article_id in (1,2,3,4,5,6,7,8, ...)
    ORDER BY
        TopAbstracts.points ASC
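One plausible reading of the "reversed order" observation is that the rows are not iterated backwards at all, but are indexed by foreign key so that a later row with the same key overwrites an earlier one. That is an assumption about the loader's behaviour, not something shown above. The plain-PHP sketch below only illustrates this last-write-wins effect; indexByArticleId() is a hypothetical stand-in, not ORM code.

```php
<?php
// Assumption for illustration only: hasOne results end up in a map keyed by
// foreign key, and a later row with the same key replaces the earlier one.
function indexByArticleId(array $rows): array
{
    $map = [];
    foreach ($rows as $row) {
        $map[$row['article_id']] = $row; // last row per article wins
    }

    return $map;
}

// Made-up sample rows.
$rows = [
    ['id' => 1, 'article_id' => 1, 'points' => 5],
    ['id' => 2, 'article_id' => 1, 'points' => 9],
    ['id' => 3, 'article_id' => 2, 'points' => 4],
];

// Same rows, once sorted by points descending and once ascending.
$desc = $rows;
usort($desc, function ($a, $b) {
    return $b['points'] <=> $a['points'];
});
$asc = $rows;
usort($asc, function ($a, $b) {
    return $a['points'] <=> $b['points'];
});

$fromDesc = indexByArticleId($desc);
$fromAsc = indexByArticleId($asc);

echo 'DESC keeps abstract ' . $fromDesc[1]['id'] . " for article 1\n";
echo 'ASC keeps abstract ' . $fromAsc[1]['id'] . " for article 1\n";
```

Under that assumption, ascending order leaves the highest-scoring abstract in the map, which would match the behaviour described above.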