Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

what is the best practice for converting mysql lookup tables to mongodb [closed]

I am in the process of converting a database from mysql to mongdb. My question is in regards to lookup tables. Say I have a recipe that can have multiple categories. In mysql, this is achieved with a many-to-many lookup between the recipes and categories tables. In mongodb, I could simply have a recipes collection with a field of categories that contains an array of categories. Text is actually stored in that field, not referenced, like so:

recipe: {
  _id: xxxxx,
  title: "a recipe title",
  categories: ['catone', 'cattwo']
}

So, is this best practice? I prefer this method as it seems easier to maintain than a separate collection and referencing it. But I see two drawbacks. One, if a category name changes, you have to update all records, instead of just updating a single lookup record as in mysql. Second, it would seem inefficient to get a list of available categories to use on say a list of checkboxes on an interface.

I know many others have had to deal with this issue, so I was just trying to find what is considered best practice.

like image 311
kidbrax Avatar asked Feb 12 '14 18:02

kidbrax


1 Answers

What you are saying is correct. You have to update (find and modify) the embedded categories whenever the name of a category changes. It's your call. You have to decide based on facts like

  • How many times in production the category names will change
  • How many categories you have

Based on your scenario, in production I think there won't be frequent change in the name of the category. It should only happen rarely, so I suggest you to embed it and have mechanism to update it, but instead of directly embedding the categories alone, I would embed with Id like below. Otherwise, it will be a performance problem when you want to update and you have to do a string search in order update. If you have id, you can have index on this as well.

So have it like this

recipe: {
    _id: xxxxx,
    title: "a recipe title",
    categories: [{ id: 1,'catone'}, {id:2,'cattwo'}]
}

Also, you have to see how are you going to handle when you have several millions or few billions of records to update. In some cases, you have to update several millions of records not just in one collection but in multiple collections. If you have such a situation, then you have to handle that using a separate update service or something like that. Because if you try to do synchronously it will take a lot of time and you don't want your user to wait this all to complete.

Also remember the embedded array can't be very long like having several hundreds/thousands of items. Mongo will still allow it, but you'll face problems. One more thing to consider is whether additional categories will be added to array after it's created. If so many additional items can added in update, if it's going to many then you many want to read about Padding Factor. Hope this helps.

like image 163
Anand Dayalan Avatar answered Nov 17 '22 13:11

Anand Dayalan