
What is the best database design for thousands of rows?

I'm about to start a Database Design that will simply manage users under companies.

  • Each company will have an admin area that can manage its users
  • Each company will have around 25,000 users
  • The client expects to have around 50 companies to start

My main question is:

Should I create tables based on companies, like

users_company_0001 users_company_0002 users_company_0003 ...

Each company will never use another company's users, and nothing will ever need to sum or count across the different users_company tables. If the overall picture were ever needed, a simple JOIN would do the trick (more expensive in time, but it would work), and in practice it will never be needed.

Or should I just create a single users table holding (50 x 25,000) 1,250,000 users (and growing)?

I'm leaning towards the first option, though I'm not sure how I would use Entity Framework on such a layout... I would probably need to go back to the '90s and generate my data logic layer by hand, as it would just be simple calls to stored procedures that take the company id.
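Roughly, I imagine each call looking something like the sketch below. This is only one possible reading of the first option: the procedure name and column names are placeholders, and the table name is assembled from the company id following the users_company_NNNN pattern above.

    -- Sketch of a per-company stored procedure (option 1).
    -- Procedure and column names are placeholders, not an existing design.
    CREATE PROCEDURE dbo.GetCompanyUsers
        @CompanyId INT
    AS
    BEGIN
        SET NOCOUNT ON;

        -- With one table per company, the table name has to be
        -- assembled dynamically from the company id.
        DECLARE @sql NVARCHAR(MAX) =
            N'SELECT UserId, UserName FROM dbo.users_company_'
            + RIGHT('0000' + CAST(@CompanyId AS NVARCHAR(10)), 4)
            + N';';

        EXEC sys.sp_executesql @sql;
    END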

What would you suggest?

The application will be ASP.NET (probably MVC; I'm still trying to figure that out, as all my experience is with WebForms, though I've watched Scott Hanselman's MVC videos - it seems easy - but I know it won't be that easy, as problems will come up and I'll need more time to fix them), plus Microsoft SQL Server.

asked Sep 28 '10 by balexandre

2 Answers

Even though you've described this as a 1-many relationship, I'd still design the DB as many-to-many to guard against a future change in requirements. Something like:

[schema diagram from the original answer showing the proposed many-to-many layout]
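As a rough sketch of that kind of layout (the original diagram isn't reproduced here, so all table and column names below are assumptions):

    -- Rough many-to-many sketch; illustrative names only.
    CREATE TABLE dbo.Companies (
        CompanyId INT IDENTITY(1,1) PRIMARY KEY,
        Name      NVARCHAR(200) NOT NULL
    );

    CREATE TABLE dbo.Users (
        UserId   INT IDENTITY(1,1) PRIMARY KEY,
        UserName NVARCHAR(100) NOT NULL,
        Email    NVARCHAR(256) NULL
    );

    -- Junction table: today each user maps to exactly one company,
    -- but this leaves room for users belonging to several companies later.
    CREATE TABLE dbo.CompanyUsers (
        CompanyId INT NOT NULL REFERENCES dbo.Companies(CompanyId),
        UserId    INT NOT NULL REFERENCES dbo.Users(UserId),
        PRIMARY KEY (CompanyId, UserId)
    );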

answered Oct 12 '22 by Joe Stefanelli


Having worked with a multi-terabyte SQL Server database, and with hundreds of multi-million-row tables over the course of my career, I can tell you with full assurance that SQL Server can handle your company and users tables without partitioning. Partitioning is always there if you ever need it, but your worry shouldn't be about your tables - pick the simplest schema that meets your needs.

If you want to do something to optimize performance, your bottleneck will almost assuredly be your disks. Don't buy large, slow disks. Get yourself a bunch of small, high-RPM disks, spread your data across them as much as possible, and don't put your logs on the same disks as your data. With databases, you're almost always better off achieving performance with good hardware, a good disk subsystem, and proper indexing. Don't compromise and overcomplicate your schema trying to anticipate performance - you'll regret it. I've seen really big databases where that sort of thing was necessary, but yours ain't it.
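If it helps, here is a minimal sketch of what "simplest schema plus proper indexing" could mean here. The table and column names are illustrative, not taken from the answer.

    -- Simplest-schema sketch: one Users table for all companies,
    -- scoped by a CompanyId column (names are illustrative).
    CREATE TABLE dbo.Companies (
        CompanyId INT IDENTITY(1,1) PRIMARY KEY,
        Name      NVARCHAR(200) NOT NULL
    );

    CREATE TABLE dbo.Users (
        UserId    INT IDENTITY(1,1) PRIMARY KEY,
        CompanyId INT NOT NULL REFERENCES dbo.Companies(CompanyId),
        UserName  NVARCHAR(100) NOT NULL
    );

    -- A nonclustered index on CompanyId keeps "all users of company X"
    -- lookups cheap, even with a few million rows in the table.
    CREATE NONCLUSTERED INDEX IX_Users_CompanyId
        ON dbo.Users (CompanyId);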

answered Oct 12 '22 by mattmc3