
How should I model my MongoDB collection for nested documents?

I'm managing a MongoDB database for a building products store. The most obvious collection is products, right? There are quite a few products, but each one belongs to one of 5-8 categories and then to one subcategory from a small set of subcategories.

For example:

-Electrical
  *Wires
    p1
    p2
    ..
  *Tools
    p5
    pn
    ..
  *Sockets
    p11
    p23
    ..
-Plumber
  *Pipes
    ..
  *Tools
    ..
  *PVC
    ..

I will use Angular on the client side of the web site to show the whole product catalog, and I'm thinking of using AJAX to query the subset of products I want.

So I wonder whether I should manage a single collection like:

{
    
    MainCategory1: {


        SubCategory1: {
        {},{},{},{},{},{},{}
        }
        SubCategory2: {
        {},{},{},{},{},{},{}
        }
        SubCategoryn: {
        {},{},{},{},{},{},{}
        }               
    },
    MainCategory2: {


        SubCategory1: {
        {},{},{},{},{},{},{}
        }
        SubCategory2: {
        {},{},{},{},{},{},{}
        }
        SubCategoryn: {
        {},{},{},{},{},{},{}
        }               
    },  
    MainCategoryn: {


        SubCategory1: {
        {},{},{},{},{},{},{}
        }
        SubCategory2: {
        {},{},{},{},{},{},{}
        }
        SubCategoryn: {
        {},{},{},{},{},{},{}
        }               
    }   
}

Or a separate collection for each category. The number of documents is unlikely to exceed 500. However, I care about a balance between:

  • quick DB answer,
  • easy server side DB querying, and
  • client-side Angular code for rendering results to html.

I'm using the mongodb Node.js module for now, not Mongoose.

What CRUD operations will I do?

  • Inserts of products. I'd also like a way to obtain autogenerated ids (maybe sequential) for each new record. However, as might seem natural, I wouldn't expose the _id to the user.

  • Querying the whole set of documents in a subcategory, maybe obtaining just a few attributes at first.

  • Querying all attributes, or a specific subset of attributes, of one particular document (product).

  • Modifying a product's attribute values.

diegoaguilar asked Feb 12 '23 22:02


1 Answer

I agree that the client side should get the result that is easiest to render. However, nesting categories and products into each other is still a bad idea. The trade-off is that once you want to change something, for example the name of a category, it becomes a disaster. And think about the possible use cases, for example:

  • list all categories
  • find all subcategories of a certain category
  • find all products in a certain category

You'll find it hard to do these things with your data structure.

I had the same situation in my current project, so here's what I do, for your reference.
First, categories should be in a separate collection. DON'T nest categories into each other, as it complicates finding all subcategories. The traditional way to find all subcategories is to maintain an idPath property. For example, say your categories are divided into 3 levels:

{
    _id: 100,
    name: "level1 category",
    parentId: 0,  // means it's the top category
    idPath: "0-100"
}
{
    _id: 101,
    name: "level2 category",
    parentId: 100,
    idPath: "0-100-101"
}
{
    _id: 102,
    name: "level3 category",
    parentId: 101,
    idPath: "0-100-101-102"
}

Note that with idPath, parentId is no longer necessary; it's only there to make the structure easier to understand.
When you need to find all subcategories of category 100, simply run this query:

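// Match on idPath (not _id): every descendant's idPath starts with "0-100-"
db.collection("category").find({idPath: /^0-100-/}).toArray().then(function(docs) {
    // docs holds all subcategories of category 100; do whatever you want with them
});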
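
To make the idPath convention concrete, here is a minimal sketch that runs on plain in-memory objects, with no database involved. The helper names buildIdPath, escapeRegExp and descendantPattern are made up for illustration, and the extra "unrelated category" document is an assumption added to show why the trailing "-" in the pattern matters.

```javascript
// Hypothetical helpers, not part of the mongodb driver.

// A child's idPath is the parent's idPath plus "-" plus the child's own _id.
function buildIdPath(parentIdPath, id) {
    return parentIdPath + "-" + id;
}

// Escape regex metacharacters so an idPath is always matched literally.
function escapeRegExp(text) {
    return String(text).replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Anchored prefix pattern: matches descendants of idPath, not the category itself.
function descendantPattern(idPath) {
    return new RegExp("^" + escapeRegExp(idPath) + "-");
}

var categories = [
    { _id: 100, name: "level1 category", idPath: buildIdPath("0", 100) },
    { _id: 101, name: "level2 category", idPath: buildIdPath("0-100", 101) },
    { _id: 102, name: "level3 category", idPath: buildIdPath("0-100-101", 102) },
    { _id: 1000, name: "unrelated category", idPath: buildIdPath("0", 1000) }
];

var pattern = descendantPattern("0-100");
var descendantIds = categories
    .filter(function (c) { return pattern.test(c.idPath); })
    .map(function (c) { return c._id; });

console.log(descendantIds); // [ 101, 102 ]
```

Category 1000 has the idPath "0-1000", which starts with "0-100" but not with "0-100-", so it is correctly left out.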

With categories stored in a separate collection, your products need to reference them by _id, just like in an RDBMS. For example:

{
    ... // other fields of product
    categories: [100, 101, 102, ...]
}

Now if you want to find all products in a certain category:

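// idPath is the idPath of the category you are interested in, e.g. "0-100".
// The pattern matches its descendants only; add the category's own _id to cateIds
// if products may reference it directly.
db.collection("category").find({idPath: new RegExp("^" + idPath + "-")}).toArray().then(function(categories) {
    var cateIds = _.pluck(categories, "_id"); // I'm using underscore to pluck category ids
    return db.collection("product").find({categories: { $in: cateIds }}).toArray();
}).then(function(products) {
    // products are here
});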

Fortunately, the category collection is usually very small, with only hundreds (or thousands) of records, and it doesn't change much. So you can always keep a live copy of the categories in memory, and it can be constructed as nested objects like:

[{
    id: 100,
    name: "level 1 category",
    ... // other fields
    subcategories: [{
        id: 101,
        ... // other fields
        subcategories: [...]
    }, {
        id: 103,
        ... // other fields
        subcategories: [...]
    },
    ...]
}, {
    // another top1 category
}, ...]
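
One possible way to build that nested copy from the flat category documents is sketched below. The function name buildCategoryTree is hypothetical, and the sketch assumes that every document still carries parentId and that a parentId of 0 marks a top category, as in the example documents earlier.

```javascript
// Hypothetical helper: turns the flat category documents into nested objects.
// Assumes every document has _id and parentId, and that parentId 0 marks a top category.
function buildCategoryTree(flatCategories) {
    var nodesById = {};
    var roots = [];

    // First pass: create one node per category.
    flatCategories.forEach(function (c) {
        nodesById[c._id] = { id: c._id, name: c.name, subcategories: [] };
    });

    // Second pass: attach each node to its parent, or to the top level
    // when no parent with that id exists (which is the case for parentId 0).
    flatCategories.forEach(function (c) {
        var parent = nodesById[c.parentId];
        if (parent) {
            parent.subcategories.push(nodesById[c._id]);
        } else {
            roots.push(nodesById[c._id]);
        }
    });

    return roots;
}

var tree = buildCategoryTree([
    { _id: 100, name: "level1 category", parentId: 0 },
    { _id: 101, name: "level2 category", parentId: 100 },
    { _id: 102, name: "level3 category", parentId: 101 },
    { _id: 103, name: "another level2 category", parentId: 100 }
]);

console.log(JSON.stringify(tree, null, 2));
```

Two passes keep the code independent of the order in which the documents come back from the database.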

You may want to refresh this copy periodically, for example every hour:

setInterval(function() {
    // refresh your in-memory copy of categories.
}, 3600000); // 3600000 ms = 1 hour

That's all I can think of right now. Hope it helps.

EDIT:

  • To provide an integer ID for each user, $inc and findAndModify are very useful. You can keep an idSeed collection:

    {
        _id: ...,
        seedValue: 1,
        forCollection: "user"
    }
    

    When you want to get a unique ID:

    db.collection("idSeed").findAndModify({forCollection: "user"}, {}, {$inc: {seedValue: 1}}, {}, function(err, doc) {
        var newId = doc.seedValue;
    });
    

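    findAndModify is an atomic operation provided by MongoDB: the find and the modify are applied to the document as a single step, so two concurrent callers cannot receive the same seedValue. Later versions of the Node.js driver deprecate findAndModify in favor of findOneAndUpdate, which does the same job.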

  • The 2nd question is already covered in my answer above.
  • Querying a subset of properties is described in the MongoDB manual, and the Node.js API is almost the same. Read the documentation for the projection parameter.
  • Updating a subset of properties is supported by the $set operator.
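
As a rough illustration of the last two points, the sketch below only builds the plain objects that a projection and a $set update consist of, so it runs without a database. The helper names, the price field and productId are assumptions made up for the example; how the objects are passed to the driver depends on the driver version.

```javascript
// Hypothetical helpers that only build plain objects; no database is needed to run this.

// Projection document: 1 means "include this field".
function buildProjection(fieldNames) {
    var projection = {};
    fieldNames.forEach(function (name) { projection[name] = 1; });
    return projection;
}

// Update document that changes only the given fields and leaves the rest untouched.
function buildSetUpdate(changes) {
    return { $set: changes };
}

var projection = buildProjection(["name", "price"]); // "price" is an assumed field
var update = buildSetUpdate({ price: 12.5 });

console.log(projection); // { name: 1, price: 1 }
console.log(update);

// With a connected driver these would be passed roughly as follows
// (assumption: driver 3.x or later, where find() takes a `projection` option,
// and productId is a placeholder for the product's _id):
//   db.collection("product").find({ categories: 101 }, { projection: projection })
//   db.collection("product").updateOne({ _id: productId }, update)
```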
yaoxing answered Feb 15 '23 10:02