This example aggregation throws an IllegalArgumentException:
Invalid reference 'role'!
We run into this problem every time we reference a field that was renamed in a preceding projection stage.
final Aggregation aggregation = newAggregation(
    // We only want to keep "company" and "employee.role", renamed to "role"
    project("company")
        .and("employee.role").as("role"),
    // Group by the **renamed** "role"
    group("role").count().as("count"), // this fails because "role" is an invalid reference
    limit(2)
);
return aggregation;
The documents we are working with look like this:
{
    // some fields
    company : {
        // some fields
    },
    employee : {
        role : {
            // some fields
        }
    }
}
Thoughts:
Here Oliver said:
It's important to understand that you define aggregations in terms of type properties, not document field names.
Is that the reason why we get the exception? If so, how can we use the nice aggregation API Spring Data offers?
Update:
This is the stack trace I get with version 1.5.0.M1:
java.lang.IllegalArgumentException: Invalid reference 'role'!
at org.springframework.data.mongodb.core.aggregation.ExposedFieldsAggregationOperationContext.getReference(ExposedFieldsAggregationOperationContext.java:78)
at org.springframework.data.mongodb.core.aggregation.ExposedFieldsAggregationOperationContext.getReference(ExposedFieldsAggregationOperationContext.java:62)
at org.springframework.data.mongodb.core.aggregation.GroupOperation.toDBObject(GroupOperation.java:292)
at org.springframework.data.mongodb.core.aggregation.Aggregation.toDbObject(Aggregation.java:247)
at com.xxx.report.adapter.AggrigateByTopic.aggrigateBy(AggrigateByTopic.java:38)
at com.xxx.report.adapter.AggrigateByTopicTest.shouldAggrigate(AggrigateByTopicTest.java:38)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:74)
at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:83)
at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:72)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:232)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:89)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:71)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:175)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
The $project stage takes a document that can specify the inclusion of fields, the suppression of the _id field, the addition of new fields, and the resetting of the values of existing fields. Alternatively, you may specify the exclusion of fields.
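As a rough sketch of those options with the Spring Data builders (the field names simply mirror the document structure from the question):

import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;

import org.springframework.data.mongodb.core.aggregation.ProjectionOperation;

// include "company", add "role" as an alias for "employee.role",
// and suppress the _id field that is included by default
ProjectionOperation projection = project("company")      // inclusion
        .and("employee.role").as("role")                  // new (renamed) field
        .andExclude("_id");                               // suppression of _id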
Use the _id field in the $group pipeline stage to set the group key. See below for usage examples. In the $group stage output, the _id field is set to the group key for that document. The output documents can also contain additional fields that are set using accumulator expressions.
We can group by a single field as well as by multiple fields from the collection; the $group operator in MongoDB groups documents by those fields and returns a new document per group as the result. Accumulator operators such as $avg, $sum, $max, $min, $push, $last, $first and $addToSet can be used together with $group.
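For instance, a minimal sketch with the Spring Data builders (the "salary" field is invented purely for illustration; the other names come from the question's document):

import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;

import org.springframework.data.mongodb.core.aggregation.GroupOperation;

// group by a single field and count the documents in each group
GroupOperation byRole = group("employee.role").count().as("count");

// group by multiple fields and apply several accumulator expressions
GroupOperation byCompanyAndRole = group("company", "employee.role")
        .sum("salary").as("totalSalary")
        .avg("salary").as("avgSalary")
        .max("salary").as("maxSalary")
        .addToSet("employee.role").as("roles");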
The $$ROOT variable contains the source documents for the group. If you'd like to just pass them through unmodified, you can do this by $pushing $$ROOT into the output from the group.
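For example, a hand-built stage doing exactly that, sketched with the plain BasicDBObject API rather than the builders:

import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;

// $push the complete source document ($$ROOT) into an array for each group
DBObject groupStage = new BasicDBObject("$group",
        new BasicDBObject("_id", "$employee.role")
                .append("documents", new BasicDBObject("$push", "$$ROOT")));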
It is true that the implementation "does not like" the type of field aliasing that you are doing here, but in the strictest interpretation, what you are doing does not make much sense.
Your statement should be something like:
final Aggregation aggregation = newAggregation(
    group("employee.role").count().as("count"),
    sort(Sort.Direction.DESC, "count"),
    limit(2)
);
System.out.println(aggregation);
Which produces the pipeline as:
{
    "aggregate" : "__collection__",
    "pipeline" : [
        { "$group" : {
            "_id" : "$employee.role",
            "count" : { "$sum" : 1 }
        }},
        { "$sort" : { "count" : -1 } },
        { "$limit" : 2 }
    ]
}
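For completeness, a sketch of how that aggregation could then be executed; the "employees" collection name is just a placeholder and mongoTemplate is assumed to be an injected MongoTemplate:

import org.springframework.data.mongodb.core.aggregation.AggregationResults;

import com.mongodb.DBObject;

// run the pipeline against the (assumed) "employees" collection and keep the
// results untyped as DBObject
AggregationResults<DBObject> results =
        mongoTemplate.aggregate(aggregation, "employees", DBObject.class);
for (DBObject row : results) {
    System.out.println(row); // e.g. { "_id" : <role sub-document>, "count" : 2 }
}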
The point being that your $project usage here isn't really doing anything other than selecting one field that you do not use later, and creating an alias for another field that you don't really use anyway, as it just becomes the _id field for your grouping. Also note the use of $sort, as it doesn't really make much sense to $limit unless you have things in an expected order, and $group does not do that by itself.
As for the "properties" concept, which I am not really a fan of, you might consider the following code:
final Aggregation aggregation = newAggregation(
    group("country", "employee.role").count().as("count"),
    group("employee.role", "count").count().as("totalCount"),
    sort(Sort.Direction.DESC, "totalCount"),
    limit(2)
);
System.out.println(aggregation);
Then the pipeline that is constructed would look like this:
{
    "aggregate" : "__collection__",
    "pipeline" : [
        { "$group" : {
            "_id" : {
                "country" : "$country",
                "role" : "$employee.role"
            },
            "count" : { "$sum" : 1 }
        }},
        { "$group" : {
            "_id" : {
                "role" : "$_id.employee.role",
                "count" : "$count"
            },
            "totalCount" : { "$sum" : 1 }
        }},
        { "$sort" : { "totalCount" : -1 } },
        { "$limit" : 2 }
    ]
}
So while that will run through to the output dump as shown without an exception, there is still a problem in the pipeline produced. While the first $group statement correctly creates an alias for the sub-document field, and all is fine at this point, it is the second $group stage that introduces a problem.
The builder methods are just "not happy" unless you refer to that field by the full "employee.role" notation, as a property of the original document. And though it does work out that this field will now be part of the _id field from the previous stage, it completely forgets that the field was aliased.
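If you wrote that second stage by hand, it would presumably need to reference the alias through the previous stage's _id, something like this sketch (again with plain BasicDBObject types):

import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;

// reference the aliased field as "$_id.role" (the key produced by the first
// $group) instead of "$_id.employee.role"
DBObject groupBody = new BasicDBObject("_id",
        new BasicDBObject("role", "$_id.role").append("count", "$count"))
        .append("totalCount", new BasicDBObject("$sum", 1));
DBObject secondGroup = new BasicDBObject("$group", groupBody);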
For my two cents, that is the wrong behavior and a strong reason why I am not a big fan of the builders.
So you can use them, but I think the design is not entirely there yet and has some flaws. Again, for my money it seems safer and more flexible to just work with DBObject types to construct the pipeline and be done with it. At least you know you always get exactly what you mean.
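To make that concrete, here is a minimal sketch of building and running the suggested pipeline with DBObject types directly; the "employees" collection name is a made-up placeholder, and the List-based DBCollection.aggregate overload is assumed to be available in your driver version:

import java.util.Arrays;
import java.util.List;

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;

// each stage is spelled out by hand, so every field reference is exactly
// what was intended
DBObject group = new BasicDBObject("$group",
        new BasicDBObject("_id", "$employee.role")
                .append("count", new BasicDBObject("$sum", 1)));
DBObject sort  = new BasicDBObject("$sort", new BasicDBObject("count", -1));
DBObject limit = new BasicDBObject("$limit", 2);

List<DBObject> pipeline = Arrays.asList(group, sort, limit);

DBCollection collection = mongoTemplate.getCollection("employees");
for (DBObject row : collection.aggregate(pipeline).results()) {
    System.out.println(row);
}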