
MongoDB : Adding Days field to Date Type Field in DB and then comparing with the current date

The requirement is to count the number of Customer records having 'Expiration Date' greater than current date.

I have a collection of customers in MongoDB. In each Customer document there are two fields, 'Contract date' and 'TERM' (term in months).


There is no 'Expiration Date' field in the Mongo document that can be used directly, but it can be calculated per record as follows:

'Contract date' + 'TERM'(Term in months) = Expiration Date.

I need to calculate the expiration date per customer record at DB level and compare that date with the current date. How can this be achieved?

If there were an expirationDate field present in the DB, I could achieve it easily as follows:

        final BasicDBList fromList = new BasicDBList();
        fromList.add("$customer.expirationDate");
        fromList.add(fromDate);

        final BasicDBList cond1 = new BasicDBList();
        cond1.add(new BasicDBObject("$gt", fromList));
        cond1.add(1);
        cond1.add(0);

        DBObject count = new BasicDBObject("$sum", new BasicDBObject("$cond", cond1));

        groupFields.put("count", count );

        BasicDBObject group = new BasicDBObject("$group", groupFields);
        AggregationOutput output = template.getDb().getCollection("customer").aggregate(match, group); 

Any help or suggestions are appreciated.

Yogesh asked Dec 10 '25 18:12


1 Answer

This really is one of those cases where you have to make a stand and get something changed rather than try to "jump through hoops" to deal with the poor design decisions of someone else. The "olive branch" to extend here is that the initial design may well not have taken into consideration how the data was going to be used.

The query you are proposing without changing how the data is stored requires no small amount of effort. I'm leaving everything here in "shell" form with JSON notation or otherwise the raw JavaScript. JSON is easy to parse or translate to the methods available to construct BSON objects for the Java driver as with other languages.

So, moving on, let's look at all the cases here and how to solve them, along with the limitations and finally the benefits of making a change here.

Consider the following sample in our "expiring" collection:

{ "contractDate" : ISODate("2014-04-23T00:00:00Z"), "term" : 10 }
{ "contractDate" : ISODate("2014-04-23T00:00:00Z"), "term" : 7 }
{ "contractDate" : ISODate("2014-11-30T00:00:00Z"), "term" : 1 }

MongoDB has a $where operator which will run arbitrary JavaScript code (supplied as a string with the Java driver) on the server. The function must return true or false to indicate whether the query condition is met. Basically you evaluate "contractDate" + "term" against the current date, or against one supplied by a variant that allows you to "scope" a variable into the evaluated JavaScript:

db.expiring.count({
  "$where": function () {

    // Truncate the current time to midnight UTC
    var now = new Date(),
        today = new Date(
          now.valueOf() -
          ( now.valueOf() % ( 1000 * 60 * 60 * 24 ) )
        );

    // Zero-based month index after adding the term; may run past 11
    var adjustedMonth =
      this.contractDate.getUTCMonth() + this.term;

    // Carry whole years so terms longer than 12 months also work
    var year = this.contractDate.getUTCFullYear() +
      Math.floor( adjustedMonth / 12 );

    var month = adjustedMonth % 12;

    var day = this.contractDate.getUTCDate();

    // Build from numeric parts rather than parsing an unpadded date string
    var expiring = new Date( Date.UTC( year, month, day ) );
    return expiring > today;
  }
})

This is horrible because you are both brute forcing the condition to be evaluated against every document in the collection as well as forcing server side evaluation and execution of JavaScript code for every item in the collection. Since it computes the evaluation you cannot use an index to improve anything.

You may also compute the dates and compare via the aggregation framework. For the sake of a little readability (and to avoid doing my own head in) the example here is given in two stages, but it can be done in a single $group stage:

db.expiring.aggregate([
  { "$project": {
    "contractDate": 1,
    "term": 1,
    "expires": {
      "year": {
        "$cond": [
          { "$gt": [ 
            { "$add": [{ "$month": "$contractDate" }, "$term" ] },
            12
          ]},
          { "$add": [{ "$year": "$contractDate" }, 1 ] },
          { "$year": "$contractDate" }
        ]
      },
      "month": {
        "$cond": [
          { "$gt": [
            { "$add": [{ "$month": "$contractDate" }, "$term" ] },
            12
          ]},
          { "$subtract": [
            { "$add": [{ "$month": "$contractDate" }, "$term" ] },
            12
          ]},
          { "$add": [{ "$month": "$contractDate" }, "$term" ] }
        ]
      },
      "day": { "$dayOfMonth": "$contractDate" }
    }
  }},
  { "$group": {
    "_id": null,
    "count": {
      "$sum": {
        "$cond": [
          { "$or": [
             { "$gt": [ "$expires.year", thisYear ] },
             { "$and": [
               { "$eq": [ "$expires.year", thisYear ] },
               { "$gt": [ "$expires.month", thisMonth ] }
             ]},
             { "$and": [
               { "$eq": [ "$expires.year", thisYear ] },
               { "$eq": [ "$expires.month", thisMonth ] },
               { "$gt": [ "$expires.day", thisDay ] }
             ]}
          ]},
          1,
          0
        ]
      }
    }
  }}
])

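Of course you feed in external variables representing the current date when constructing the pipeline. Here they are broken up into thisYear, thisMonth and thisDay to match the pattern shown. Note that as written the rollover logic only carries into the following year, so it assumes the month of "contractDate" plus "term" never exceeds 24. You can also use a "date math" approach similar to the JavaScript code.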
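As a rough sketch, those three variables could be derived in the shell like this. The helper name `currentDateParts` is hypothetical, and the UTC getters are used on the assumption that the stored dates are UTC, which is how the aggregation date operators read them by default:

```javascript
// Hypothetical helper: split a Date into the parts the pipeline compares against.
function currentDateParts(now) {
  return {
    thisYear: now.getUTCFullYear(),
    thisMonth: now.getUTCMonth() + 1, // $month is 1-based, getUTCMonth() is 0-based
    thisDay: now.getUTCDate()         // matches $dayOfMonth
  };
}

var parts = currentDateParts(new Date());
var thisYear = parts.thisYear,
    thisMonth = parts.thisMonth,
    thisDay = parts.thisDay;
```

With these defined, the pipeline above can be pasted into the shell as it stands.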

Again, this is horrible. Even in a single pipeline stage this still needs to run through the whole collection. The native operators speed things a little more, but not that much more, and of course you still cannot use an index.

This is why you should change how the data is stored. Consider when the documents look like this instead:

{ 
    "contractDate" : ISODate("2014-04-23T00:00:00Z"), 
    "term" : 10,
    "expiry": ISODate("2015-02-23T00:00:00Z") 
}
{ 
    "contractDate" : ISODate("2014-04-23T00:00:00Z"), 
    "term" : 7,
    "expiry": ISODate("2014-11-23T00:00:00Z")
}
{ 
    "contractDate" : ISODate("2014-11-30T00:00:00Z"),
    "term" : 1,
    "expiry": ISODate("2014-12-30T00:00:00Z")
}

Now also consider that the new "expiry" field is indexed, so a really efficient way to get the count becomes very basic:

db.expiring.count({ "expiry": { "$gt": new Date("2014-12-30") } })

That's it! The only items touched were those greater than the index bounds specified, and you just get the count of those still active without needing to calculate or evaluate anything.
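The query above uses a fixed date. A minimal sketch of building the same filter from the current date, truncated to midnight UTC in the same way as the $where example, might look like this. The name `activeFilter` is made up for illustration, and the shell call is guarded so it only runs where `db` exists:

```javascript
// Hypothetical helper: build the "still active" filter for a given moment.
function activeFilter(now) {
  var msPerDay = 1000 * 60 * 60 * 24;
  // Drop the time-of-day part so the comparison is against midnight UTC
  var today = new Date(now.valueOf() - (now.valueOf() % msPerDay));
  return { "expiry": { "$gt": today } };
}

// Only runs inside the mongo shell, where `db` is defined
if (typeof db !== "undefined") {
  print(db.expiring.count(activeFilter(new Date())));
}
```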

So my belief here is that the code maintaining this data needs to be changed to keep this additional field in the document, and to keep it consistent with the "contractDate" and "term" fields upon any changes.

The operation is simple and should not be hard. We should be talking about a "very minor" change in the code maintaining this, plus a "one off" update to existing data to make it so. So the choice is between a "small change" and implementing a "major mess" just to report on something that is not there.
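To give an idea of the size of that change, here is a sketch of the "one off" update in shell JavaScript. The `addMonths` helper is my own illustration, not anything built in, and it assumes that a day which does not exist in the target month (for example the 31st) should be clamped to that month's last day; your business rules may differ. The shell part is guarded so that it only runs where `db` exists:

```javascript
// Hypothetical helper: add whole months to a date, working in UTC.
// Assumption: a day past the end of the target month is clamped to its last day.
function addMonths(date, months) {
  var total = date.getUTCMonth() + months;            // zero-based, may exceed 11
  var year = date.getUTCFullYear() + Math.floor(total / 12);
  var month = ((total % 12) + 12) % 12;               // keep within 0..11
  // Day 0 of the following month is the last day of the target month
  var lastDay = new Date(Date.UTC(year, month + 1, 0)).getUTCDate();
  var day = Math.min(date.getUTCDate(), lastDay);
  return new Date(Date.UTC(year, month, day,
    date.getUTCHours(), date.getUTCMinutes(),
    date.getUTCSeconds(), date.getUTCMilliseconds()));
}

// Only runs inside the mongo shell, where `db` is defined
if (typeof db !== "undefined") {
  // Backfill documents that do not have the field yet
  db.expiring.find({ "expiry": { "$exists": false } }).forEach(function (doc) {
    db.expiring.updateOne(
      { "_id": doc._id },
      { "$set": { "expiry": addMonths(doc.contractDate, doc.term) } }
    );
  });
  // Index the new field so the count can be answered from index bounds
  db.expiring.createIndex({ "expiry": 1 });
}
```

The same calculation would then go wherever the application writes "contractDate" or "term", so that "expiry" never drifts from them.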

I strongly suggest you show this to someone who can make the decision to make that change. It's going to save you time and everyone else as well. No one wants slow and long running queries. That costs money as well.

Neil Lunn answered Dec 13 '25 07:12


