Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to search for text or expression in multiple fields

db.movies.find({"original_title" : {$regex: input_data, $options:'i'}}, function (err, datares){
            if (err || datares == false) {
                db.movies.find({"release_date" : {$regex: input_data + ".*", $options:'i'}}, function (err, datares){
                    if(err || datares == false){
                        db.movies.find({"cast" : {$regex: input_data, $options:'i'}}, function (err, datares){
                            if(err || datares == false){
                                db.movies.find({"writers" : {$regex: input_data, $options:'i'}}, function (err, datares){
                                    if(err || datares == false){
                                        db.movies.find({"genres.name" : {$regex: input_data, $options:'i'}}, function (err, datares){
                                            if(err || datares == false){
                                                db.movies.find({"directors" : {$regex: input_data, $options:'i'}}, function (err, datares){
                                                    if(err || datares == false){
                                                        res.status(451);
                                                        res.json({
                                                            "status" : 451,
                                                            "error code": "dataNotFound",
                                                            "description" : "Invalid Data Entry."
                                                        });
                                                        return;
                                                    } else{
                                                        res.json(datares);
                                                        return;
                                                    }
                                                });
                                            } else {
                                                res.json(datares);
                                                return;
                                            }
                                        });
                                    } else {
                                                res.json(datares);
                                                return;
                                    }
                                });
                            } else {
                                res.json(datares);
                                return;
                            }
                        });
                    } else {
                        res.json(datares);
                        return;
                    }
                });
            } else {
                res.json(datares);
                return;
            }
        });

I am trying to implement a so called "all-in-one" search so that whenever a user types in any kind of movie related information, my application tries to return all relevant information. However I have noticed that this transaction might be expensive on the backend and sometimes the host is really slow.

  1. How do I smoothly close the db connection and where should I use it?

I read here that it is best not to close a mongodb connection in node.js >>Why is it recommended not to close a MongoDB connection anywhere in Node.js code?

  1. Is the a proper way to implement a all-in-one search kind of a thing by using nested find commands?
like image 371
summerNight Avatar asked Aug 06 '15 15:08

summerNight


1 Answers

Your current approach is full of problems and is not necessary to do this way. All you are trying to do is search for what a can gather is a plain string within a number of fields in the same collection. It may possibly be a regular expression construct but I'm basing two possibilities on a plain text search that is case insensitive.

Now I am not sure if you came to running one query dependant on the results of another because you didn't know another way or though it would be better. Trust me on this, that is not a better approach than anything listed here nor is it really required as will be shown:

Regex query all at once

The first basic option here is to continue your $regex search but just in a singular query with the $or operator:

db.movies.find(
    {
        "$or": [
            { "original_title" : { "$regex": input_data, "$options":"i"} },
            { "release_date" :   { "$regex": input_data, "$options":"i"} }, 
            { "cast" :           { "$regex": input_data, "$options":"i"} }, 
            { "writers" :        { "$regex": input_data, "$options":"i"} }, 
            { "genres.name" :    { "$regex": input_data, "$options":"i"} }, 
            { "directors" :      { "$regex": input_data, "$options":"i"} }
        ]
    },
    function(err,result) {
        if(err) {
           // respond error
        } else {
           // respond with data or empty
        }
    }
);

The $or condition here effectively works like "combining queries" as each argument is treated as a query in itself as far as document selection goes. Since it is one query than all the results are naturally together.

Full text Query, multiple fields

If you are not really using a "regular expression" built from regular expression operations i.e ^(\d+)\bword$, then you are probably better off using the "text search" capabilities of MongoDB. This approach is fine as long as you are not looking for things that would be generally excluded, but your data structure and subject actually suggests this is the best option for what you are likely doing here.

In order to be able to perform a text search, you first need to create a "text index", specifically here you want the index to span multiple fields in your document. Dropping into the shell for this is probably easiest:

db.movies.createIndex({
   "original_title": "text",
   "release_date":   "text",
   "cast" :          "text",
   "writers" :       "text",
   "genres.name" :   "text",
   "directors" :     "text"
})

There is also an option to assign a "weight" to fields within the index as you can read in the documentation. Assigning a weight give "priority" to the terms listed in the search for the field that match in. For example "directors" might be assigned more "weight" than "cast" and matches for "Quentin Tarantino" would therefore "rank higher" in the results where he was a director ( and also a cast member ) of the movie and not just a cast member ( as in most Robert Rodriguez films ).

But with this in place, performing the query itself is very simple:

db.movies.find(
    { "$text": { "$search": input_data }  },
    function(err,result) {
        if(err) {
           // respond error
        } else {
           // respond with data or empty
        }
    }
);

Almost too simple really, but that is all there is to it. The $text query operator knows to use the required index ( there can only be one text index per collection ) and it will just then look through all of the defined fields.

This is why I think this is the best fit for your use case here.

Parallel Queries

The final alternate I'll give here is you still want to demand that you need to run separate queries. I still deny that you do need to only query if the previous query does not return results, and I also re-assert that the above options should be considered "first", with preference to text search.

Writing dependant or chained asynchronous functions is a pain, and very messy. Therefore I suggest leaning a little help from another library dependency and using the node-async module here.

This provides an aync.map.() method, which is perfectly suited to "combining" results by running things in parallel:

var fields = [
    "original_title",
    "release_date",
    "cast",
    "writers",
    "genres.name",
    "directors"
];

async.map(
    fields,
    function(field,callback) {
        var search = {},
            cond = { "$regex": input_data, "$options": "i" };

        search[field] = cond;   // assigns the field to search

        db.movies.find(search,callback);
    },
    function(err,result) {
        if(err) {
           // respond error
        } else {
           // respond with data or empty
        }
    }
);

And again, that is it. The .map() operator takes each field and transposes that into the query which in turn returns it's results. Those results are then accessible after all queries are run in the final section, "combined" as if they were a single result set, just as the other alternates do here.

There is also a .mapSeries() variant that runs each query in series, or .mapLimit() if you are otherwise worried about using database connections and concurrent tasks, but for this small size this should not be a problem.

I really don't think that this option is necessary, however if the Case 1 regular expression statements still apply, this "may" possibly provide a little performance benefit due to running queries in parallel, but at the cost of increased memory and resource consumption in your application.

Anyhow, the round up here is "Don't do what you are doing", you don't need to and there are better ways to handle the task you want to achieve. And all of them are mean cleaner and easier to code.

like image 176
Blakes Seven Avatar answered Oct 13 '22 11:10

Blakes Seven