I need to create a table in MySQL 5.5.
This table will hold information like:
Here's what I think:
create table statistics (
browser varchar(255) not null,
version float not null,
ip varchar(40) not null,
dateandtime datetime,
referrer varchar(255)
);
I read on mysql.com that I need indexes to make my queries fast, but my problem is: which indexes should I create to make this table fast to query?
I need to query all the fields, e.g.:
Thanks
The MySQL OPTIMIZE TABLE statement reorganizes the physical storage of a table's data and indexes, which reduces storage space and improves I/O efficiency when the table is accessed. To execute this statement, you need SELECT and INSERT privileges for the table.
Remove any unnecessary indexes on the table, paying particular attention to UNIQUE indexes as these disable change buffering. Don't use a UNIQUE index unless you need it; instead, employ a regular INDEX. Take a look at your slow query log every week or two. Pick the slowest three queries and optimize those.
In most setups, you need not run OPTIMIZE TABLE at all. Even if you do a lot of updates to variable-length rows, it is not likely that you need to do this more than once a week or month and only on certain tables. Based on this article on Table Optimization.
Optimizing the table straight away takes over 3 hours, while dropping all indexes besides the primary key, optimizing the table, and adding the indexes back takes about 10 minutes, which is close to a 20x speed difference and gives a more compact index in the end.
I would recommend this:
Use integers instead of chars/varchars. This way indexing is faster (except for the referrer). I can also recommend summary tables. They are not really normalized, but the queries will execute almost instantly, especially if you have a big organization with lots of traffic.
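As one illustration of swapping a varchar for an integer, an IPv4 address fits in an unsigned 32-bit integer column. The sketch below uses Python's standard `ipaddress` module; the helper names `ipv4_to_int` and `int_to_ipv4` are made up for this example. It assumes IPv4 only: an IPv6 address, which a varchar(40) column could hold, does not fit in 32 bits.

```python
import ipaddress


def ipv4_to_int(address):
    """Convert a dotted-quad IPv4 string to its unsigned 32-bit integer value."""
    return int(ipaddress.IPv4Address(address))


def int_to_ipv4(value):
    """Convert an unsigned 32-bit integer back to a dotted-quad IPv4 string."""
    return str(ipaddress.IPv4Address(value))


if __name__ == "__main__":
    stored = ipv4_to_int("192.168.0.1")
    print(stored, int_to_ipv4(stored))
```

MySQL's built-in INET_ATON() and INET_NTOA() functions perform the same conversion on the server side, so the conversion can also be done inside the query.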
So here are the tables:
create table statistics (
browser tinyint(3) UNSIGNED not null default 0,
version float(4,2) not null default 0,
ip INT(10) UNSIGNED not null default 0,
createdon datetime,
referrer varchar(5000),
key browserdate (browser, createdon),
key ipdate (ip, createdon)
-- etc.
);
browser: 0 = unknown, 1 = firefox, etc. This mapping can be done in your code (so you load the same code for inserting and selecting). I don't use enum here because if you need to alter the table and you have millions of records, this can be painful. A new browser is just a new number in the code, which is way faster to change.
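A minimal sketch of keeping the browser-to-number mapping in application code, shared by the insert and select paths. Only 0 = unknown and 1 = firefox come from the answer; the other codes and the names `BROWSER_CODES`, `browser_to_code`, and `code_to_browser` are hypothetical.

```python
# Hypothetical mapping: only 0 (unknown) and 1 (firefox) come from the answer,
# the remaining codes are illustrative.
BROWSER_CODES = {
    "unknown": 0,
    "firefox": 1,
    "ie": 2,
    "opera": 3,
    "chrome": 4,
    "safari": 5,
}

# Reverse lookup used when reading rows back from the table.
CODE_TO_BROWSER = {code: name for name, code in BROWSER_CODES.items()}


def browser_to_code(name):
    """Return the tinyint code for a browser name, or 0 if it is not recognised."""
    return BROWSER_CODES.get(name.strip().lower(), 0)


def code_to_browser(code):
    """Return the browser name for a stored code, or 'unknown' for unmapped codes."""
    return CODE_TO_BROWSER.get(code, "unknown")
```

Supporting a new browser then means adding one dictionary entry, with no ALTER TABLE on the large statistics table.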
This table can be used to re-summarize all the other tables if something happens. So you create an index on it for each summary table you maintain (for example, browser).
Now the summary table:
create table statistics_browser_2011_11 (
browser tinyint(3) UNSIGNED not null default 0,
version float(4,2) not null default 0,
number bigint(20) not null default 0,
createdon datetime,
unique key browserinfo (createdon, browser, version)
); -- browser stats for November 2011
This way, when you insert into this table (you take the date on which the user accessed the site and build a $string that matches the table name), you only have to use on duplicate key update number = number + 1. Retrieving the browser statistics is then super fast.
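The insert step could be sketched like this in Python. The function names are hypothetical, and the `%s` placeholders assume a driver that uses that parameter style. The sketch also assumes one counter row per day, so it truncates `createdon` to midnight; without some truncation, every distinct timestamp would create its own row instead of incrementing `number`.

```python
from datetime import datetime


def summary_table_name(when, prefix="statistics_browser"):
    """Build the per-month table name, e.g. statistics_browser_2011_11."""
    return "{}_{:04d}_{:02d}".format(prefix, when.year, when.month)


def day_bucket(when):
    """Truncate a timestamp to midnight (assumption: one counter row per day)."""
    return when.replace(hour=0, minute=0, second=0, microsecond=0)


def browser_upsert(when, browser_code, version):
    """Return (sql, params) that insert a counter row or increment an existing one."""
    # The table name is built only from integers and a fixed prefix,
    # so formatting it into the statement is safe here.
    sql = (
        "INSERT INTO {} (browser, version, number, createdon) "
        "VALUES (%s, %s, 1, %s) "
        "ON DUPLICATE KEY UPDATE number = number + 1"
    ).format(summary_table_name(when))
    return sql, (browser_code, version, day_bucket(when))
```

The returned pair can be passed to a database cursor's `execute(sql, params)` call.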
Now you will have to create a merge table, because if it is the second of the month and you want to query the last 7 days, you will need both the current month's table and last month's table. Here's more info: http://dev.mysql.com/doc/refman/5.1/en/merge-storage-engine.html
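To see which monthly tables a "last N days" query touches, and therefore which tables the merge table has to cover, here is a small sketch. The function name `tables_for_last_days` and the `statistics_browser` prefix are assumptions for illustration.

```python
from datetime import date, timedelta


def tables_for_last_days(today, days=7, prefix="statistics_browser"):
    """List the monthly summary tables covering the period from `days` ago to `today`."""
    start = today - timedelta(days=days)
    names = []
    year, month = start.year, start.month
    # Walk month by month from the start of the window to the current month.
    while (year, month) <= (today.year, today.month):
        names.append("{}_{:04d}_{:02d}".format(prefix, year, month))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return names


if __name__ == "__main__":
    # On the second of the month, the last 7 days span two tables.
    print(tables_for_last_days(date(2011, 12, 2)))
```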
Then you repeat the process for the other information: ip, referrer, etc.
In order to maintain these tables, you will have to create a cron job that creates the tables for the next month (a simple PHP script that gets the current year/month, creates the table for the next month if it does not exist, and then merges them).
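The date arithmetic for such a cron job is easy to get wrong at the December rollover, so here is a hedged sketch in Python rather than PHP. The function names and the `template` table, which serves as the structure to copy, are assumptions. Updating the merge table's list of underlying tables is left out.

```python
from datetime import date


def next_month(today):
    """Return (year, month) for the month after `today`, handling December."""
    if today.month == 12:
        return today.year + 1, 1
    return today.year, today.month + 1


def create_next_month_sql(today, prefix="statistics_browser",
                          template="statistics_browser_2011_11"):
    """Build a statement that creates next month's table if it does not exist yet.

    `template` is a hypothetical existing table whose structure is copied.
    """
    year, month = next_month(today)
    table = "{}_{:04d}_{:02d}".format(prefix, year, month)
    return "CREATE TABLE IF NOT EXISTS {} LIKE {}".format(table, template)


if __name__ == "__main__":
    print(create_next_month_sql(date.today()))
```

Because the statement uses IF NOT EXISTS, running the job more than once in the same month is harmless.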
This might be a bit of work, but this is how I do it at work (with similar data), with 12 terabytes of data and 5,000 employees who query the databases. My average load time is approximately 0.60 seconds per request.
I think your schema can be improved to:
create table statistics
(
browser enum('Firefox','IE','Opera','Chrome','Safari','Others') not null
default 'Others',
-- major browser family only
-- instead of using free-form of varchar
user_agent text,
-- to store the complete user agents
-- mainly for reference purpose only
version float not null,
ip varchar(40) not null,
dateandtime datetime not null,
referer varchar(2000)
-- 255 is not sufficient for referer
);
Index key:
(dateandtime, browser)
query 1
select browser, count(*) from statistics
where dateandtime between ? and ?
group by browser;
query 2
select count(*) from statistics
where dateandtime between ? and ?;
query 3
select referer from statistics
where dateandtime between ? and ?;