Situation: I have 1 million users and I want to search them by tags. Here's the schema: <pre class="prettyprint"><code>const Account = new Schema({ tags: { type: [String], default: ['cats'] } }) </code></pre> Here's the first query: <pre class="prettyprint"><code>const query = { tags: 'cats'} const fields = { _id: 0 } const options = { skip: 0, limit: 5 } Account.find(query, fields, options) </code></pre> After calling <code>find</code> method <code>Mongoose</code> will start searching and return the first 5 entries it matches. The performance in this case is optimal. Here's the second query: <pre class="prettyprint"><code>const query = { tags: 'cats'} const fields = { _id: 0 } const options = { skip: 500 000, limit: 5 } Account.find(query, fields, options) </code></pre> What going on in this case is of interest to me. Does <code>Mongoose</code> match 500 000 entries first and then returns the next 5 entries? Or does it somehow 'jump' to the 500 000 element without having to match them all beforehand? I'm trying to understand how efficient this proccess is and whether I can improve it somehow. Do you have any advice?

Yes, it is matching 500,000 entries first. From the MongoDB Definitive Guide 2nd Edition: <blockquote> $skip takes a number, n, and discards the first n documents from the result set. As with “normal” querying, it isn’t efficient for large skips, as it must find all of the matches that must be skipped and then discard them. </blockquote> You can examine your query in detail from the Mongo shell using explain(): https://docs.mongodb.com/manual/reference/method/cursor.explain/ Ex: <pre class="prettyprint"><code>// Create a dummy collection var tagTypes = ['cats', 'dogs', 'chickens'] for (i=0; i < 1000; i++) { let tag = tagTypes[Math.floor(Math.random()*tagTypes.length)] db.people.insert({"tags" : [tag]}) } </code></pre> then explain it <pre class="prettyprint"><code>db.people.find({"tags" : "cats"}).skip(100).limit(10).explain("executionStats”) </code></pre> totalDocsExamined is showing you that it is matching everything that it's skipping <pre class="prettyprint"><code>"executionStats" : { ... "totalDocsExamined" : 420 </code></pre> If you were to create an index on tags: <pre class="prettyprint"><code>db.people.ensureIndex({ "tags" : 1 }) </code></pre> Running it again you'd get: <pre class="prettyprint"><code>"executionStats" : { ... "totalDocsExamined" : 110 </code></pre>

How efficient is searching by tags?

Situation:

I have 1 million users and I want to search them by tags.

Here's the schema:

const Account = new Schema({
  tags: { type: [String], default: ['cats'] }
})

Here's the first query:

const query = { tags: 'cats'}
const fields = { _id: 0 }
const options = { skip: 0, limit: 5 }

Account.find(query, fields, options)

After calling find method Mongoose will start searching and return the first 5 entries it matches. The performance in this case is optimal.

Here's the second query:

const query = { tags: 'cats'}
const fields = { _id: 0 }
const options = { skip: 500 000, limit: 5 }

Account.find(query, fields, options)

What going on in this case is of interest to me.

Does Mongoose match 500 000 entries first and then returns the next 5 entries?

Or does it somehow 'jump' to the 500 000 element without having to match them all beforehand?

I'm trying to understand how efficient this proccess is and whether I can improve it somehow. Do you have any advice?

How do I store post tags in a database?

The easiest way: create a tag table, create an entity table, create a many-to-many table like { tag_id, entity_id }. For every tag/entity insert that pair into link table. The best way: do not store tag data in relational DB, use mongo, arangodb or other nosql db for this.

What are tags in a database?

In information systems, a tag is a keyword or term assigned to a piece of information (such as an Internet bookmark, multimedia, database record, or computer file). This kind of metadata helps describe an item and allows it to be found again by browsing or searching.

Interesting. My understanding is that Findhere is very similar in query functionalities to Select, wherein it will work inefficiently by first going through 500,000 values before skipping them.

Hence, what I suggested is implementing something known as a Multilevel index. Basically create a nested indexing system, which although taking up data space, is very efficient, especially for sequential search practices where you do not wish to implement a tree outright. Create a set of outer indices which are more general, then link them to inner indices respectively, before moving onto stored data blocks.

Implementing this should not be too difficult and should drastically improve the amount of traversing being done. Hope this helps.

Yes, it is matching 500,000 entries first.

From the MongoDB Definitive Guide 2nd Edition:

$skip takes a number, n, and discards the first n documents from the result set. As with “normal” querying, it isn’t efficient for large skips, as it must find all of the matches that must be skipped and then discard them.

You can examine your query in detail from the Mongo shell using explain():

https://docs.mongodb.com/manual/reference/method/cursor.explain/

Ex:

// Create a dummy collection
var tagTypes = ['cats', 'dogs', 'chickens']
    for (i=0; i < 1000; i++) {
        let tag = tagTypes[Math.floor(Math.random()*tagTypes.length)]
        db.people.insert({"tags" : [tag]})
    }

then explain it

db.people.find({"tags" : "cats"}).skip(100).limit(10).explain("executionStats”)

totalDocsExamined is showing you that it is matching everything that it's skipping

"executionStats" : {
...
    "totalDocsExamined" : 420

If you were to create an index on tags:

db.people.ensureIndex({ "tags" : 1 })

Running it again you'd get:

"executionStats" : {
...
    "totalDocsExamined" : 110

How efficient is searching by tags?

Tags:

mongoose

manidos

People also ask

2 Answers

Soumyajyoti Bhattacharya

drew

Recent Activity

Donate For Us

How efficient is searching by tags?

Tags:

mongoose

manidos

People also ask

2 Answers

Soumyajyoti Bhattacharya

drew

Related questions

Recent Activity

Donate For Us