Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Mongo unique index case insensitive

@CompoundIndexes({
    @CompoundIndex(name = "fertilizer_idx",
        unique = true,
        def = "{'name': 1, 'formula': 1, 'type': 1}")
})
public class Fertilizer extends Element implements Serializable {
//class stuff
}

Is it possible to create the index case insensitive? Right now it is differentiating from NAME to NAMe. Saving a second field lowercase (or uppercase) is not a possibility for me.

Thanks, Pedro

like image 766
Pedro Dusso Avatar asked Nov 16 '15 13:11

Pedro Dusso


4 Answers

Prior of MongoDB version 3.4 we were unable to create index with case insensitive.

In version 3.4 has collation option that allows users to specify language-specific rules for string comparison, such as rules for lettercase and accent marks.

The collation option has the following syntax:

collation: {
   locale: <string>,
   caseLevel: <boolean>,
   caseFirst: <string>,
   strength: <int>,
   numericOrdering: <boolean>,
   alternate: <string>,
   maxVariable: <string>,
   backwards: <boolean>
}

where the locale field is mandatory; all other fields are optional.

To create index with case insensitive we need to use mandatory field locale and strength field for string comparison level. strength allows value rage 1 - 5. read more about collation

The strength attribute determines whether accents or case are taken into account when collating or matching text

Example:

if strength=1 then role = Role = rôle

if strength=2 then role = Role < rôle

if strength=3 then role < Role < rôle

Comparison level doc

So we need to use strength=2 to create index. like:

db.collectionName.createIndex(
  { name: 1, formula: 1, type: 1 },
  { 
    name: "fertilizer_idx",
    collation: {locale: "en", strength: 2},
    unique: true
  }
)

N.B: collation option is not available for text indexes.

like image 114
Shaishab Roy Avatar answered Sep 24 '22 04:09

Shaishab Roy


Spring Data Mongo2.2 provides 'Annotation-based Collation support through @Document and @Query.'

Ref. What's new in Spring Data Mongo2.2

@Document(collection = 'fertilizer', collation = "{'locale':'en', 'strength':2}")
public class Fertilizer extends Element implements Serializable {

    @Indexed(unique = true)
    private String name;
    //class stuff
}

When the application is started, it will create the indexes along with respective collation for every document.

Collation changes at DB level based on Java code

like image 39
Dhaval Simaria Avatar answered Sep 26 '22 04:09

Dhaval Simaria


Yes, it is now available in MongoDB 3.4 with the new collation feature.

you can create a case insensitive index like this:

db.collection.createIndex({
   name:1,
   formula:1,
   type:1
},
{
   collation:{
      locale:"en",
      strength:2
   }
});

where the strength attribute is the comparaison level

you can then get case insensitive match with this query:

db.collection.find({name: "name"}).collation({locale: "en", strength: 2});

see collation for details

if you upgraded to mongodb 3.4 from a previous version, you may need to set compatibility before creating the index like this

db.adminCommand( { setFeatureCompatibilityVersion: "3.4" } )
like image 38
felix Avatar answered Sep 26 '22 04:09

felix


db.collection.createIndex(
{ name: 1, formula: 1, type: 1 },
{ name: "fertilizer_idx", unique: true, collation:{ locale: "en", strength: 2 } }
)

Use collation as an option for db.collection.createIndex()

more info here: https://docs.mongodb.com/manual/reference/method/db.collection.createIndex/

here for locale/language information: https://docs.mongodb.com/manual/reference/collation-locales-defaults/#collation-languages-locales

strength: integer

Optional. The level of comparison to perform. Possible values are:

1: Primary level of comparison. Collation performs comparisons of the base characters only, ignoring other differences such as diacritics and case.

2: Secondary level of comparison. Collation performs comparisons up to secondary differences, such as diacritics. That is, collation performs comparisons of base characters (primary differences) and diacritics (secondary differences). Differences between base characters takes precedence over secondary differences.

3: Tertiary level of comparison. Collation performs comparisons up to tertiary differences, such as case and letter variants. That is, collation performs comparisons of base characters (primary differences), diacritics (secondary differences), and case and variants (tertiary differences). Differences between base characters takes precedence over secondary differences, which takes precedence over tertiary differences. This is the default level.

4: Quaternary Level. Limited for specific use case to consider punctuation when levels 1-3 ignore punctuation or for processing Japanese text.

5: Identical Level. Limited for specific use case of tie breaker.

Mongo 3.4 has collation, which allows users to specify language-specific rules for string comparison

Collation includes:

collation: {
   locale: <string>,
   caseLevel: <boolean>,
   caseFirst: <string>,
   strength: <int>,
   numericOrdering: <boolean>,
   alternate: <string>,
   maxVariable: <string>,
   backwards: <boolean>
}
like image 42
Joe Stogner Avatar answered Sep 25 '22 04:09

Joe Stogner