
Best database strategy for a client-based website (Ruby on Rails)

I've built a nice website system that caters to the needs of a small niche market. I've been selling these websites over the last year by deploying copies of the software using Capistrano to my web server.

It occurs to me that the only difference in these websites is the database, the CSS file, and a small set of images used for the individual client's graphic design.

Everything else is exactly the same, or should be... Now that I have about 20 of these sites deployed, it is getting to be a hassle to keep them all updated with the same code. And this problem will only get worse.

I am thinking that I should refactor this system so that I can use one set of deployed Ruby code, dynamically selecting the correct database, etc., based on the URL of the incoming request.

It seems that there are two ways of handling the database:

  • using multiple databases, one for each client
  • using one database, with a client_id field in each table, and an extra 'client' table

The multiple database approach would be the simplest for me at the moment, since I wouldn't have to refactor every model in my application to add the client_id field to all CRUD operations.

However, it would be a hassle to have to run 'rake db:migrate' for tens or hundreds of different databases, every time I want to migrate the database(s). Obviously this could be done by a script, but it doesn't smell very good.
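
The script I have in mind would be something along these lines -- just a rough sketch, assuming each row in the 'clients' table stores its database_name:

namespace :db do
  desc "Run pending migrations against every client database"
  task :migrate_clients => :environment do
    base_spec = ActiveRecord::Base.configurations[RAILS_ENV]
    clients = Client.all   # load the client list before we start switching connections

    clients.each do |client|
      ActiveRecord::Base.establish_connection(base_spec.merge("database" => client.database_name))
      ActiveRecord::Migrator.migrate("db/migrate/")
    end

    # reconnect to the master database when we're done
    ActiveRecord::Base.establish_connection(base_spec)
  end
end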

On the other hand, every client will have 20K-50K items in an 'items' table. I am worried about the speed of fulltext searches when the items table has half a million or a million items in it. Even with an index on the client_id field, I suspect that searches would be faster if the items were separated into different client databases.
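
For context, the kind of search I'm talking about looks roughly like this (the column names are just placeholders, and MySQL needs a FULLTEXT index on those columns):

# single-database version: fulltext match scoped to one client
Item.find(:all,
          :conditions => ["client_id = ? AND MATCH(title, description) AGAINST(? IN BOOLEAN MODE)",
                          @client.id, params[:q]])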

If anyone has an informed opinion on the best way to approach this problem, I would very much like to hear it. Thanks much in advance...

-- John

asked Dec 08 '08 by John



2 Answers

Thanks for the great comments. I have decided to go with the multiple database approach. This is the easiest path for me, since I don't have to rework the entire application.

What I'm going to do is add a before_filter in ApplicationController, so it applies to all controllers... something like this:

before_filter :client_db         # switch to client's db

Then, in application_controller.rb, I'll include something like this:

def client_db
  # Look up the client in the master database, then point ActiveRecord at
  # that client's own database for the rest of the request.
  # (Assumes the Client model keeps its own connection to the master DB --
  # establish_connection on ActiveRecord::Base repoints every other model.)
  @client = Client.find(params[:client_id])
  spec = Client.configurations[RAILS_ENV].clone
  spec["database"] = @client.database_name
  ActiveRecord::Base.establish_connection(spec)
end

Then, a URL like example.com?client_id=12345 will select the correct database.

Since I am using Apache as a proxy in front of Mongrel, Apache will add the correct client_id to all requests, based on the client's website URL. So the client_id won't actually be part of the URL that users see. It will only be passed between Apache and Mongrel. I'm not sure if I'm explaining this properly, but it works and keeps things clean and simple.
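
For reference, the relevant bit of each client's vhost looks roughly like this (hostname, port, and client_id are placeholders; mod_rewrite and mod_proxy have to be enabled):

<VirtualHost *:80>
  ServerName client-one.example.com

  RewriteEngine On
  # tack this vhost's client_id onto every request and proxy it to Mongrel;
  # QSA keeps whatever query string the visitor originally sent
  RewriteRule ^/(.*)$ http://127.0.0.1:8000/$1?client_id=12345 [P,QSA]
  ProxyPassReverse / http://127.0.0.1:8000/
</VirtualHost>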

If I decide I need to use a single database in the future, I can refactor all the code then. At the moment, this seems to be the simplest approach.

Anyone see any problem with this approach?

-- John

answered by John


There are advantages to using separate DBs (including those you already listed):

  • Fulltext searches will become slow (depending on your server's capabilities) when you have millions of large text blobs to search.
  • Separating the DBs keeps table indexing fast for each client. In particular, it might upset some of your early-adopter clients if you take on a new, large client: suddenly their applications slow down for (to them) no apparent reason. Again, if you stay under your hardware's capacity, this might not be an issue.
  • If you ever drop a client, it'd be marginally cleaner to just pack up their DB than to remove all of their associated rows by client_id. And equally clean to restore them if they change their minds later.
  • If any clients ask for additional functionality that they are willing to pay for, you can fork their DB structure without modifying anyone else's.
  • For the pessimists: less chance that a single mistake wipes out every client's data instead of just one client's. ;)

All that being said, the single DB solution is probably better given:

  • Your DB server's capabilities make the large single table a non-issue.
  • Your clients' databases are guaranteed to remain identical.
  • You aren't worried about being able to keep everyone's data compartmentalized for purposes of archiving/restoring or in case of disaster.
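
If you do go the single-DB route, the per-client scoping doesn't have to touch every CRUD call by hand; a rough sketch (model and column names assumed):

class Client < ActiveRecord::Base
  has_many :items
end

class Item < ActiveRecord::Base
  belongs_to :client
  # Rails 2.1+ named_scope, so queries read as Item.for_client(@client)
  named_scope :for_client, lambda { |client| { :conditions => { :client_id => client.id } } }
end

# in a controller, after a before_filter has loaded @client:
# @items = @client.items.find(:all, :limit => 50)
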
answered by Adam Bellaire