
Splitting up a large MySQL table into smaller ones - is it worth it?

Tags:

mysql

I have about 28 million records to import into a MySQL database. Each record contains personal information about a member in the US, and the records will be searchable by state.

My question is: is it more efficient to break the table up into smaller tables, as opposed to keeping everything in one big table? What I had in mind was to split it into 50 separate tables representing the 50 states, something like this: members_CA, members_AZ, members_TX, etc.

This way I could do a query like this:

'SELECT * FROM members_' . $_POST['state'] . ' WHERE members_name LIKE "John Doe" ';

This way I only have to deal with the data for one state at a time. Intuitively it makes a lot of sense, but I would be curious to hear other opinions.

Thanks in advance.

higgenkreuz asked Jul 14 '11 15:07


3 Answers

I posted as a comment initially but I'll post as an answer now.

Never, ever create X tables based on the differing values of an attribute. That's not how things are done.

If your table will have 28 million rows, think of partitioning to split it into smaller logical sets.

You can read about partitioning in the MySQL documentation.
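As a rough sketch of what partitioning by state could look like, assuming a single members table with hypothetical id, state and name columns (the question does not give the real schema), and MySQL 5.5 or later for LIST COLUMNS:

```sql
-- Hypothetical schema: column names and types are assumptions for illustration.
CREATE TABLE members (
    id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    state CHAR(2)      NOT NULL,
    name  VARCHAR(100) NOT NULL,
    -- MySQL requires every unique key, including the primary key,
    -- to contain the partitioning column.
    PRIMARY KEY (id, state)
)
PARTITION BY LIST COLUMNS (state) (
    PARTITION p_az VALUES IN ('AZ'),
    PARTITION p_ca VALUES IN ('CA'),
    PARTITION p_tx VALUES IN ('TX')
    -- Add one partition per remaining state; rows whose state is not
    -- listed in any partition are rejected on insert.
);

-- A query that filters on state only needs to read the matching partition.
EXPLAIN SELECT * FROM members WHERE state = 'CA' AND name = 'John Doe';
```

In MySQL 5.7 and later the partitions column of the EXPLAIN output lists the partitions the query will read; older versions need EXPLAIN PARTITIONS instead. The application still sees one table, so queries and schema changes do not have to be repeated per state.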

The other thing is choosing the right db design and choosing your indexes properly.

The third thing is to avoid the terrible idea of using $_POST directly in your query; you don't want someone to inject SQL and drop your database, your tables, or whatnot.
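One way to keep user input out of the SQL text is a prepared statement with placeholders. The sketch below uses MySQL's own PREPARE syntax and assumes a single members table with hypothetical state and name columns; the statement name find_member and the user variables are made up for illustration:

```sql
-- The query text is fixed; the ? placeholders are filled in at execution time,
-- so the supplied values are treated as data, never as SQL.
PREPARE find_member FROM
    'SELECT * FROM members WHERE state = ? AND name = ?';

-- In a real application these values would come from the submitted form.
SET @state = 'CA';
SET @name  = 'John Doe';

EXECUTE find_member USING @state, @name;

DEALLOCATE PREPARE find_member;
```

From PHP, the prepared statements offered by PDO or mysqli give the same placeholder mechanism. Note that placeholders can stand in for values only, not for table names, so the members_XX design cannot be protected this way and would need its own check of the state against a fixed list.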

The final thing is choosing appropriate hardware for the task. You don't want such an app running on a VPS with 500 MB or 1 GB of RAM.

Michael J.V. answered Sep 28 '22 23:09


Do not do that. Keep similar data in one table. You will have serious problems implementing logic and writing queries when a decision spans many states. Moreover, if you need to change the database definition, such as adding a column, you will have to perform the same operation on every one of the numerous (seemingly infinite) tables.

Use indexing to increase performance, but stick to a single table!

You can also increase the memory cache to improve performance. Follow this article to do so.
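Which cache is meant here is not stated. For InnoDB tables the main one is the buffer pool, so a minimal sketch, assuming InnoDB and a server with memory to spare, might look like this (the 4 GiB figure is only a placeholder):

```sql
-- Show the current InnoDB buffer pool size in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Raise it to 4 GiB at runtime. This needs MySQL 5.7.5 or later and
-- administrative privileges; on older versions, set
-- innodb_buffer_pool_size in the server configuration file and restart.
SET GLOBAL innodb_buffer_pool_size = 4 * 1024 * 1024 * 1024;
```

The right value depends on how much RAM the machine has and what else runs on it, so treat the number above as an assumption rather than a recommendation.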

Jayesh answered Sep 28 '22 22:09


If you create an index on the state column, a select on all members of one state will be as efficient as using separate tables. Splitting the table has a lot of disadvantages. If you add columns, you have to add them to 50 tables. If you want data from different states, you have to use UNION statements, which will be very ugly and inefficient. I strongly recommend sticking with one table.
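To make the comparison concrete, here is a small sketch, assuming a single members table with hypothetical id, state and name columns; the index name idx_state_name and the email column are made up for illustration:

```sql
-- Hypothetical single-table schema (skipped if the table already exists).
CREATE TABLE IF NOT EXISTS members (
    id    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    state CHAR(2)      NOT NULL,
    name  VARCHAR(100) NOT NULL,
    PRIMARY KEY (id, state)
);

-- Composite index: state first, then name, so a lookup within one state
-- can be answered from the index instead of scanning the whole table.
CREATE INDEX idx_state_name ON members (state, name);

-- One state.
SELECT * FROM members WHERE state = 'CA' AND name = 'John Doe';

-- Several states: still a single statement against a single table.
SELECT * FROM members
WHERE state IN ('AZ', 'CA', 'TX') AND name = 'John Doe';

-- With per-state tables, the same search would need one SELECT per table:
--   SELECT * FROM members_AZ WHERE members_name = 'John Doe'
--   UNION ALL
--   SELECT * FROM members_CA WHERE members_name = 'John Doe'
--   UNION ALL
--   SELECT * FROM members_TX WHERE members_name = 'John Doe';

-- A schema change is one statement here, rather than 50.
ALTER TABLE members ADD COLUMN email VARCHAR(255) NULL;
```

Running EXPLAIN on the SELECT statements is a quick way to check that idx_state_name is actually being used.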

phlogratos answered Sep 28 '22 23:09