Given data that looks like this:
{'_id': 'foobar1',
'about': 'similarity in comparison',
'categories': ['one', 'two', 'three']}
{'_id': 'foobar2',
'about': 'perfect similarity in comparison',
'categories': ['one']}
{'_id': 'foobar3',
'about': 'partial similarity',
'categories': ['one', 'two']}
{'_id': 'foobar4',
'about': 'none',
'categories': ['one', 'two']}
I would like to compute the similarity between a single item and every other item in the collection, then return those items ordered from highest to lowest similarity. Similarity is based on the number of words in common; a function int similar(String one, String two) already exists for this.
For example, if I want the similarity list for the about field of foobar1, it would return:
[{'_id': 'foobar2'}, {'_id': 'foobar3'}, {'_id': 'foobar4'}]
I am doing this with Morphia, but if I can see how to do it with plain MongoDB, I can figure out the rest.
If you need to compute text similarity on the about field, one way to achieve this is to use a text index.
For example (in the mongo shell), if you create a text index on the about field:
db.collection.createIndex({about: 'text'})
you could execute a query such as (example taken from https://docs.mongodb.com/manual/reference/operator/query/text/#sort-by-text-search-score):
db.collection.find({$text: {$search: 'similarity in comparison'}}, {score: {$meta: 'textScore'}}).sort({score: {$meta: 'textScore'}})
With your example documents, the query should return something like:
{
"_id": "foobar1",
"about": "similarity in comparison",
"score": 1.5
}
{
"_id": "foobar2",
"about": "perfect similarity in comparison",
"score": 1.3333333333333333
}
{
"_id": "foobar3",
"about": "partial similarity",
"score": 0.75
}
which are sorted by decreasing similarity score. Please note that, unlike your example result, document foobar4 is not returned because none of the queried words are present in foobar4.
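If documents with no words in common must still appear in the result (as foobar4 does in the question's expected output), one option is to rank on the client side instead. The following is only a minimal sketch in plain JavaScript, not part of the $text approach above. The similar function here is a hypothetical stand-in for the question's int similar(String one, String two) and simply counts distinct shared words, and rankBySimilarity is an illustrative helper name.

```javascript
// Hypothetical stand-in for the question's similar(one, two):
// counts the distinct words the two strings have in common.
function similar(one, two) {
  const wordsOne = new Set(one.toLowerCase().split(/\s+/).filter(Boolean));
  const wordsTwo = new Set(two.toLowerCase().split(/\s+/).filter(Boolean));
  let shared = 0;
  for (const word of wordsOne) {
    if (wordsTwo.has(word)) shared += 1;
  }
  return shared;
}

// Illustrative helper: score every other document against the target
// and sort by decreasing score. Zero-score documents are kept.
function rankBySimilarity(docs, targetId, field) {
  const target = docs.find((doc) => doc._id === targetId);
  if (!target) throw new Error(`no document with _id ${targetId}`);
  return docs
    .filter((doc) => doc._id !== targetId)
    .map((doc) => ({ _id: doc._id, score: similar(target[field], doc[field]) }))
    .sort((a, b) => b.score - a.score);
}

// The example documents from the question (categories omitted).
const docs = [
  { _id: 'foobar1', about: 'similarity in comparison' },
  { _id: 'foobar2', about: 'perfect similarity in comparison' },
  { _id: 'foobar3', about: 'partial similarity' },
  { _id: 'foobar4', about: 'none' },
];

const ranked = rankBySimilarity(docs, 'foobar1', 'about');
console.log(ranked);
```

In the mongo shell, the docs array could be loaded with db.collection.find({}, {about: 1}).toArray(). Note that this reads the whole collection into memory, so it only suits small collections; the text index remains the better fit when unmatched documents can be left out.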
Text indexes are a special type of index in MongoDB, and thus come with specific rules on their usage. For more details, please see:
$text query operator