
Mongodb returns capitalized strings first when sorting

When I sort a collection by a string field (here, Title), the results do not come back in the order I expect. Please see below:

db.SomeCollection.find().limit(50).sort({ "Title" : -1 });

Actual Result order

  • "Title" : "geog.3 students' book"
  • "Title" : "geog.2 students' book"
  • "Title" : "geog.1 students' book"
  • "Title" : "Zoe and Swift"
  • "Title" : "Zip at the Theme Park"
  • "Title" : "Zip at the Supermarket"

Expected Result order

  • "Title" : "Zoe and Swift"
  • "Title" : "Zip at the Theme Park"
  • "Title" : "Zip at the Supermarket"
  • "Title" : "geog.3 students' book"
  • "Title" : "geog.2 students' book"
  • "Title" : "geog.1 students' book"

The same issue occurs when I sort by a Date field.

Any suggestions?

Bibin asked Nov 08 '13 09:11


3 Answers

Update: MongoDB 3.4 added collation support, which allows case-insensitive indexes and sorting.
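As a rough sketch of what the 3.4 update makes possible: a query can carry a collation, and a strength of 2 makes string comparison ignore case. The helper name `findTitlesIgnoringCase` and the `en` locale are assumptions made for illustration; the collection and field names come from the question.

```javascript
// Hypothetical helper: `collection` is a collection handle such as
// db.SomeCollection in the mongo shell (requires MongoDB 3.4 or later).
function findTitlesIgnoringCase(collection) {
  return collection
    .find()
    // Strength 2 compares base letters and accents but ignores case.
    // The "en" locale is an assumption; pick the one that matches your data.
    .collation({ locale: "en", strength: 2 })
    .sort({ Title: -1 })
    .limit(50);
}
```

In the shell this would be called as `findTitlesIgnoringCase(db.SomeCollection)`. For the server to use an index for this sort, the index has to be created with the same collation.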

This is a known issue. By default, MongoDB compares strings byte by byte, so its string sort is not a case-insensitive lexical one (JIRA: String lexicographical ordering). You should sort the results in your application code, or sort on a numeric field instead. Date fields should sort reliably, though. Can you give an example where sorting by date doesn't work?
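Sorting in application code could look roughly like the sketch below, which uses JavaScript's built-in `Intl.Collator`. The hard-coded `titles` array stands in for the documents returned by the query, and the `en` locale is an assumption.

```javascript
// Sample titles taken from the question, in arbitrary order.
const titles = [
  "geog.1 students' book",
  "Zip at the Supermarket",
  "geog.3 students' book",
  "Zoe and Swift",
  "geog.2 students' book",
  "Zip at the Theme Park",
];

// sensitivity "accent" ignores case but still distinguishes accents.
const ignoreCase = new Intl.Collator("en", { sensitivity: "accent" });

// Descending order: compare b against a. Copy first so `titles` is untouched.
const sortedDescending = [...titles].sort((a, b) => ignoreCase.compare(b, a));

console.log(sortedDescending);
```

This only reorders documents that have already been fetched. If the query also uses `limit(50)`, the server still chooses those 50 documents by its own ordering, so the whole result set would have to be fetched before sorting.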

Munim answered Oct 09 '22 21:10


What exactly surprises you?

It sorts based on the numerical representation of each character. If you look at a character code table (I know that MongoDB stores strings in UTF-8, so this is just for educational purposes), you will see that upper-case letters have lower numbers than lower-case letters. In ascending order they therefore come first.

By default, MongoDB does not sort strings by locale or case-insensitively.

In your case g has a higher number than Z, so it comes first (you are sorting in decreasing order). Likewise, 3 has a higher number than 2 and 1. So basically everything is correct.
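The numbers involved can be checked with a few lines of JavaScript. This is only an illustration: JavaScript compares strings by UTF-16 code unit rather than by UTF-8 byte, but for plain ASCII titles like these the two orderings agree. The `sample` array is a subset of the titles from the question.

```javascript
// Character codes of the leading letters discussed above.
const codeOfUpperZ = "Z".charCodeAt(0); // 90
const codeOfLowerG = "g".charCodeAt(0); // 103

// Plain < and > on strings compare by code unit, which mimics
// the binary ordering for these ASCII-only titles.
const byCodeUnitDescending = (a, b) => (a < b ? 1 : a > b ? -1 : 0);

const sample = [
  "Zoe and Swift",
  "geog.1 students' book",
  "geog.3 students' book",
  "Zip at the Theme Park",
];

// Lower-case "g" (103) outranks upper-case "Z" (90), so the "geog"
// titles come first when sorting in decreasing order.
const binaryOrder = [...sample].sort(byCodeUnitDescending);

console.log(codeOfUpperZ, codeOfLowerG, binaryOrder);
```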

Salvador Dali answered Oct 09 '22 21:10


With the aggregation framework you can get the expected order by sorting on a lowercased copy of the field:


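    db.collection.aggregate([
        // Keep Title and add a lowercased copy of it to sort on.
        { "$project": { "Title": 1, "output": { "$toLower": "$Title" } } },
        // Sort on the lowercased copy, descending.
        { "$sort": { "output": -1 } },
        // Return only the original Title.
        { "$project": { "Title": 1, "_id": 0 } }
    ])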


This gives the expected output:


    {
        "result" : [ 
            {
                "Title" : "Zoe and Swift"
            }, 
            {
                "Title" : "Zip at the Theme Park"
            }, 
            {
                "Title" : "Zip at the Supermarket"
            }, 
            {
                "Title" : "geog.3 students' book"
            }, 
            {
                "Title" : "geog.2 students' book"
            }, 
            {
                "Title" : "geog.1 students' book"
            }
        ],
        "ok" : 1
    }

Parag Vaidya answered Oct 09 '22 20:10