 

Can I optimize my database by splitting one big table into many small ones?

Assume that I have one big table with three columns: "user_name", "user_property", "value_of_property". Let's also assume that I have a lot of users (say 100,000) and a lot of properties (say 10,000). Then the table is going to be huge (1 billion rows).

When I extract information from the table I always need information about a particular user. So I use, for example, where user_name='Albert Gates'. That means every time the MySQL server needs to scan 1 billion rows to find those that contain "Albert Gates" as user_name.

Would it not be wise to split the big table into many small ones corresponding to fixed users?

asked Nov 07 '10 by Roman

People also ask

How can Splitting a table improve performance?

Partitioning makes large tables or indexes more manageable, because partitioning enables you to manage and access subsets of data quickly and efficiently, while maintaining the integrity of a data collection.

Is the process of splitting a large table into smaller pieces?

Partitioning is the database process where very large tables are divided into multiple smaller parts. By splitting a large table into smaller, individual tables, queries that access only a fraction of the data can run faster because there is less data to scan.

When should you split a database table?

If the list of values is larger than 15 or 20, you should consider a separate table. If the list of values is shared or reusable, used at least three times in the same database, then you have a very strong case for a separate table.


2 Answers

No, I don't think that is a good idea. A better approach is to add an index on the user_name column, and perhaps another index on (user_name, user_property) for looking up a single property. Then the database does not need to scan all the rows; it just needs to find the appropriate entry in the index, which is stored in a B-tree, so a record can be located in a very small amount of time.
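For example, on a hypothetical table called user_properties (the question never names the table, so that name is an assumption), the indexes could be created like this:

-- hypothetical table name; adjust to your schema
create index idx_user on user_properties (user_name);
-- composite index for looking up a single property of a user
create index idx_user_prop on user_properties (user_name, user_property);

With those in place, a query filtered on user_name = 'Albert Gates' becomes an index range scan over that user's roughly 10,000 rows instead of a scan of all 1 billion.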

If your application is still slow even after indexing correctly, it can sometimes be a good idea to partition your largest tables.
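As a rough sketch of what that could look like in MySQL (again assuming the table is called user_properties, and noting that any primary or unique key would have to include the partitioning column):

-- hash-style partitioning on the lookup column; assumes user_properties
-- has no unique key that excludes user_name
alter table user_properties
  partition by key (user_name)
  partitions 32;

The number of partitions here is arbitrary; partitioning only helps once correct indexing alone is no longer enough.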

One other thing you could consider is normalizing your database so that the user_name is stored in a separate table and an integer foreign key is used in its place. This can reduce storage requirements and can increase performance. The same may apply to user_property.

answered Sep 22 '22 by Mark Byers

You should normalise your design as follows:

drop table if exists users;
create table users
(
user_id int unsigned not null auto_increment primary key,
username varbinary(32) unique not null
)
engine=innodb;

drop table if exists properties;
create table properties
(
property_id smallint unsigned not null auto_increment primary key,
name varchar(255) unique not null
)
engine=innodb;

drop table if exists user_property_values;
create table user_property_values
(
user_id int unsigned not null,
property_id smallint unsigned not null,
value varchar(255) not null,
primary key (user_id, property_id),
key (property_id)
)
engine=innodb;

insert into users (username) values ('f00'),('bar'),('alpha'),('beta');

insert into properties (name) values ('age'),('gender');

insert into user_property_values values 
(1,1,'30'),(1,2,'Male'),
(2,1,'24'),(2,2,'Female'),
(3,1,'18'),
(4,1,'26'),(4,2,'Male');
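With this design, fetching all the properties for one user is a simple three-way join driven by the unique index on username, for example:

select u.username, p.name as property, upv.value
from users u
inner join user_property_values upv on upv.user_id = u.user_id
inner join properties p on p.property_id = upv.property_id
where u.username = 'f00';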

From a performance perspective, the InnoDB clustered index works wonders in a similar example (cold run):

select count(*) from product
count(*)
========
1,000,000 (1M)

select count(*) from category
count(*)
========
250,000 (250K)

select count(*) from product_category
count(*)
========
125,431,192 (125M)

select
 c.*,
 p.*
from
 product_category pc
inner join category c on pc.cat_id = c.cat_id
inner join product p on pc.prod_id = p.prod_id
where
 pc.cat_id = 1001;
0:00:00.030: Query OK (0.03 secs)
answered Sep 19 '22 by Jon Black