MongoDB Database Structure and Best Practices Help

Tags: database, mongodb, database-design

I'm in the process of developing Route Tracking/Optimization software for my refuse collection company and would like some feedback on my current data structure/situation.

Here is a simplified version of my MongoDB structure:

Database: data

Collections:

“customers” - data collection containing all customer data.

[
    {
        "cust_id": "1001",
        "name": "Customer 1",
        "address": "123 Fake St",
        "city": "Boston"
    },
    {
        "cust_id": "1002",
        "name": "Customer 2",
        "address": "123 Real St",
        "city": "Boston"
    },
    {
        "cust_id": "1003",
        "name": "Customer 3",
        "address": "12 Elm St",
        "city": "Boston"
    },
    {
        "cust_id": "1004",
        "name": "Customer 4",
        "address": "16 Union St",
        "city": "Boston"
    },
    {
        "cust_id": "1005",
        "name": "Customer 5",
        "address": "13 Massachusetts Ave",
        "city": "Boston"
    }, { ... }, { ... }, ...
]

“trucks” - data collection containing all truck data.

[
    {
        "truckid": "21",
        "type": "Refuse",
        "year": "2011",
        "make": "Mack",
        "model": "TerraPro Cabover",
        "body": "Mcneilus Rear Loader XC",
        "capacity": "25 cubic yards"
    },
    {
        "truckid": "22",
        "type": "Refuse",
        "year": "2009",
        "make": "Mack",
        "model": "TerraPro Cabover",
        "body": "Mcneilus Rear Loader XC",
        "capacity": "25 cubic yards"
    },
    {
        "truckid": "12",
        "type": "Dump",
        "year": "2006",
        "make": "Chevrolet",
        "model": "C3500 HD",
        "body": "Rugby Hydraulic Dump",
        "capacity": "15 cubic yards"
    }
]

“drivers” - data collection containing all driver data.

[
    {
        "driverid": "1234",
        "name": "John Doe"
    },
    {
        "driverid": "4321",
        "name": "Jack Smith"
    },
    {
        "driverid": "3421",
        "name": "Don Johnson"
    }
]

“route-lists” - data collection containing all predetermined route lists.

[
    {
        "route_name": "monday_1",
        "day": "monday",
        "truck": "21",
        "stops": [
            {
                "cust_id": "1001"
            },
            {
                "cust_id": "1010"
            },
            {
                "cust_id": "1002"
            }
        ]
    },
    {
        "route_name": "friday_1",
        "day": "friday",
        "truck": "12",
        "stops": [
            {
                "cust_id": "1003"
            },
            {
                "cust_id": "1004"
            },
            {
                "cust_id": "1012"
            }
        ]
    }
]

“routes” - data collection containing data for all active and completed routes.

[
    {
        "routeid": "1",
        "route_name": "monday_1",
        "start_time": "04:31 AM",
        "status": "active",
        "stops": [
            {
                "customerid": "1001",
                "status": "complete",
                "start_time": "04:45 AM",
                "finish_time": "04:48 AM",
                "elapsed_time": "3"
            },
            {
                "customerid": "1010",
                "status": "complete",
                "start_time": "04:50 AM",
                "finish_time": "04:52 AM",
                "elapsed_time": "2"
            },
            {
                "customerid": "1002",
                "status": "incomplete",
                "start_time": "",
                "finish_time": "",
                "elapsed_time": ""
            },
            {
                "customerid": "1005",
                "status": "incomplete",
                "start_time": "",
                "finish_time": "",
                "elapsed_time": ""
            }
        ]
    }
]

Here is the process thus far:

Each day, drivers begin by Starting a New Route. Before starting a new route, the driver must first enter the following data:

driverid
date
truck

Once all data is entered correctly, the Start a New Route process begins:

Create a new document in the “routes” collection
Query the “route-lists” collection for a “day” + “truck” match and return its "stops"
Insert the “route-lists” data into the new “routes” document
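The steps above could be sketched roughly as follows. This is a minimal Python sketch that uses plain lists and dicts in place of the real collections. `find_route_list` and `build_route` are hypothetical helper names, and the `driverid` and `date` fields on the new route are assumptions, since the sample "routes" document does not show where they are stored. The date value is a made-up example. With a driver such as pymongo, the lookup would be a `find_one` with the same `day` and `truck` filter, and the built document would be passed to `insert_one`.

```python
def find_route_list(route_lists, day, truck):
    """Return the first route list matching day and truck, or None."""
    for route_list in route_lists:
        if route_list["day"] == day and route_list["truck"] == truck:
            return route_list
    return None


def build_route(route_list, route_id, driver_id, date, start_time):
    """Build a new "routes" document from a predetermined route list."""
    return {
        "routeid": route_id,
        "route_name": route_list["route_name"],
        "driverid": driver_id,  # assumed field, not in the original sample
        "date": date,           # assumed field, not in the original sample
        "start_time": start_time,
        "status": "active",
        # every stop starts out incomplete with empty timing fields
        "stops": [
            {
                "customerid": stop["cust_id"],
                "status": "incomplete",
                "start_time": "",
                "finish_time": "",
                "elapsed_time": "",
            }
            for stop in route_list["stops"]
        ],
    }


route_lists = [
    {
        "route_name": "monday_1",
        "day": "monday",
        "truck": "21",
        "stops": [{"cust_id": "1001"}, {"cust_id": "1010"}, {"cust_id": "1002"}],
    }
]

route_list = find_route_list(route_lists, "monday", "21")
new_route = build_route(route_list, "1", "1234", "2011-06-13", "04:31 AM")
```

If no route list matches the day and truck, `find_route_list` returns `None`, so the caller should check for that before building the route.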

As the driver proceeds through the daily stops/tasks, the matching document in the “routes” collection is updated accordingly.

On completion of all tasks, the driver will then be able to Complete the Route Process by simply changing the “status” field from “active” to “complete” in the "routes" collection.
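As a hedged illustration of the two updates described above (marking a stop as done, then closing the route), the sketch below builds the filter and update documents in Python. The helper names are hypothetical. The `$set` operator and the positional `$` operator on an array field are standard MongoDB update syntax, and each returned pair is meant to be passed to a single-document update call such as pymongo's `update_one(filter, update)`.

```python
def complete_stop_update(route_id, customer_id, start_time, finish_time, elapsed):
    """Filter/update pair that marks one stop of a route as complete."""
    query = {"routeid": route_id, "stops.customerid": customer_id}
    update = {
        "$set": {
            # "$" refers to the first array element matched by the filter
            "stops.$.status": "complete",
            "stops.$.start_time": start_time,
            "stops.$.finish_time": finish_time,
            "stops.$.elapsed_time": elapsed,
        }
    }
    return query, update


def complete_route_update(route_id):
    """Filter/update pair that flips a route from "active" to "complete"."""
    query = {"routeid": route_id, "status": "active"}
    update = {"$set": {"status": "complete"}}
    return query, update


stop_query, stop_update = complete_stop_update(
    "1", "1002", "04:55 AM", "04:58 AM", "3"
)
route_query, route_update = complete_route_update("1")
```

Including `"status": "active"` in the route filter is a design choice: the update then does nothing if the route was already completed.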

That about sums it up. Any feedback, opinions, comments, links, optimization tactics are greatly appreciated.

Thanks in advance for your time.

asked Jun 11 '11 19:06

j3ffz

1 Answer

Your database schema looks to me like a 'classic' relational database schema. MongoDB is a good fit for data denormalization. I guess that when you display routes, you load all the related customers, the driver, and the truck.

If you want to make your system really fast, you can embed everything in the routes collection.

So I suggest the following modifications to your schema:

customers - as-is
trucks - as-is
drivers - as-is

route-lists:

Embed the customer data inside stops instead of storing a reference. Also embed the truck. In this case the schema will be:

 {
     "route_name": "monday_1",
     "day": "monday",
     "truck": {
         "_id": 1,
         // all truck data goes here
     },
     "stops": [{
         "customer": {
             "_id": 1,
             // all customer data goes here
         }
     }, {
         "customer": {
             "_id": 2,
             // all customer data goes here
         }
     }]
 }
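To make the embedding step concrete, here is a minimal Python sketch that turns a reference-based route list into the embedded form. Plain dicts stand in for the collections, `embed_route_list` is a hypothetical helper, and the lookup tables are assumed to be keyed by the `truckid` and `cust_id` values from the question.

```python
import copy


def embed_route_list(route_list, trucks_by_id, customers_by_id):
    """Return a copy of route_list with truck and customers embedded."""
    return {
        "route_name": route_list["route_name"],
        "day": route_list["day"],
        # deep copies so later edits to the source records do not leak in
        "truck": copy.deepcopy(trucks_by_id[route_list["truck"]]),
        "stops": [
            {"customer": copy.deepcopy(customers_by_id[stop["cust_id"]])}
            for stop in route_list["stops"]
        ],
    }


trucks_by_id = {"12": {"truckid": "12", "type": "Dump", "make": "Chevrolet"}}
customers_by_id = {
    "1003": {"cust_id": "1003", "name": "Customer 3", "city": "Boston"},
    "1004": {"cust_id": "1004", "name": "Customer 4", "city": "Boston"},
}
route_list = {
    "route_name": "friday_1",
    "day": "friday",
    "truck": "12",
    "stops": [{"cust_id": "1003"}, {"cust_id": "1004"}],
}

embedded = embed_route_list(route_list, trucks_by_id, customers_by_id)
```

A stop that references an unknown customer raises a `KeyError` here, which surfaces dangling references at build time rather than at display time.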

routes:

When a driver starts a new route, copy the route from route-lists and additionally embed the driver information:

 {
     // copy all route-lists data (create a new id for the current route and keep a reference to the route list, so that you can sync the route with its route list)
     "_id": "1",
     "route_list_id": 1,
     "start_time": "04:31 AM",
     "status": "active",
     "driver": {
         // embed all driver data here
     },
     "stops": [{
         "customer": {
             // all customer data
         },
         "status": "complete",
         "start_time": "04:45 AM",
         "finish_time": "04:48 AM",
         "elapsed_time": "3"
     }]
 }

I guess you are asking yourself what to do if a driver, customer, or other denormalized data changes in the main collection. Yes, you need to update all the denormalized copies in the other collections. You may need to update a very large number of documents (possibly billions, depending on your system size), and that's okay. You can do it asynchronously if it takes a long time.
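One way the propagation could look is sketched below, under two assumptions: embedded customers carry the `_id` of the master record, as in the schema above, and the MongoDB server supports filtered positional updates (`$[<identifier>]` with `arrayFilters`, available since MongoDB 3.6). The helper names are hypothetical. `customer_sync_args` only builds the arguments; with pymongo they would be passed as `update_many(filter, update, array_filters=array_filters)`. The in-memory function shows the intended effect without a database.

```python
def customer_sync_args(customer_id, changed_fields):
    """Arguments for updating every embedded copy of one customer."""
    query = {"stops.customer._id": customer_id}
    update = {
        "$set": {
            # "$[s]" targets each array element selected by array_filters
            f"stops.$[s].customer.{field}": value
            for field, value in changed_fields.items()
        }
    }
    array_filters = [{"s.customer._id": customer_id}]
    return query, update, array_filters


def apply_in_memory(routes, customer_id, changed_fields):
    """In-memory equivalent of the update; returns the number of copies changed."""
    touched = 0
    for route in routes:
        for stop in route["stops"]:
            if stop["customer"]["_id"] == customer_id:
                stop["customer"].update(changed_fields)
                touched += 1
    return touched


routes = [
    {"_id": "1", "stops": [{"customer": {"_id": 1, "name": "Customer 1"}},
                           {"customer": {"_id": 2, "name": "Customer 2"}}]},
    {"_id": "2", "stops": [{"customer": {"_id": 1, "name": "Customer 1"}}]},
]
query, update, array_filters = customer_sync_args(1, {"name": "Bill"})
touched = apply_in_memory(routes, 1, {"name": "Bill"})
```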

What are the benefits of the above data structure?

Each document contains all the data you may need to display in your application. So, for instance, you don't need to load the related customers, driver, and truck when you display routes.
You can run complex queries directly against one collection. For example, with this schema you can build a single query that returns all routes containing a stop for a customer with name = "Bill" (in your current schema you would first need to load the customer by name, get the id, and then look up routes by customer id).
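For instance, with customers embedded in each stop, the "Bill" lookup reduces to one filter that uses dot notation on the array field. The sketch below is Python; the filter dict is what would be passed to a `find` call, and `routes_with_customer` is a hypothetical in-memory stand-in that mimics the match so the example runs without a database. The sample routes are made up for illustration.

```python
# Filter for a find() call: matches routes where any stop's embedded
# customer has the given name (dot notation reaches into the array).
bill_filter = {"stops.customer.name": "Bill"}


def routes_with_customer(routes, name):
    """In-memory stand-in for the query above."""
    return [
        route
        for route in routes
        if any(stop["customer"]["name"] == name for stop in route["stops"])
    ]


routes = [
    {"_id": "1", "stops": [{"customer": {"_id": 1, "name": "Bill"}},
                           {"customer": {"_id": 2, "name": "Ann"}}]},
    {"_id": "2", "stops": [{"customer": {"_id": 3, "name": "Carl"}}]},
]
matches = routes_with_customer(routes, "Bill")
```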

You are probably worried that your data can get out of sync in some cases, but to guard against this you just need to build a few unit tests that ensure you update your denormalized data correctly.

I hope the above helps you see the world from the non-relational side, from a document database point of view.
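A test of that kind might look like the following minimal Python sketch. It assumes the embedded copies keep the master record's `_id`, uses plain dicts in place of the collections, and `find_stale_copies` is a hypothetical helper.

```python
def find_stale_copies(customers_by_id, routes):
    """Return (route _id, customer _id) pairs whose embedded copy differs."""
    stale = []
    for route in routes:
        for stop in route["stops"]:
            embedded = stop["customer"]
            master = customers_by_id.get(embedded["_id"])
            # a missing master record also counts as stale
            if master != embedded:
                stale.append((route["_id"], embedded["_id"]))
    return stale


def test_embedded_customers_match_master():
    customers_by_id = {1: {"_id": 1, "name": "Bill"}}
    routes = [{"_id": "1", "stops": [{"customer": {"_id": 1, "name": "Bill"}}]}]
    assert find_stale_copies(customers_by_id, routes) == []


test_embedded_customers_match_master()
```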

answered Sep 30 '22 04:09

Andrew Orsich
