Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Use timestamp(or datetime) as part of primary key (or part of clustered index)

I use following query frequently:

SELECT * FROM table WHERE Timestamp > [SomeTime] AND Timestamp < [SomeOtherTime] and publish = 1 and type = 2 order by Timestamp

I would like to optimize this query, and I am thinking about put timestamp as part of primary key for clustered index, I think if timestamp is part of primary key , data inserted in table has write to disk sequentially by timestamp field.Also I think this improve my query a lot, but am not sure if this would help.

table has 3-4 million+ rows.
timestamp field never changed.
I use mysql 5.6.11

Anothet point is : if this is improve my query , it is better to use timestamp(4 byte in mysql 5.6) or datetime(5 byte in mysql 5.6)?

like image 922
Arash Mousavi Avatar asked May 31 '13 06:05

Arash Mousavi


2 Answers

Four million rows isn't huge.

A one-byte difference between the data types datetime and timestamp is the last thing you should consider in choosing between those two data types. Review their specs.

Making a timestamp part of your primary key is a bad, bad idea. Think about reviewing what primary key means in a SQL database.

Put an index on your timestamp column. Get an execution plan, and paste that into your question. Determine your median query performance, and paste that into your question, too.

Returning a single day's rows from an indexed, 4 million row table on my desktop computer takes 2ms. (It returns around 8000 rows.)

like image 93
Mike Sherrill 'Cat Recall' Avatar answered Oct 03 '22 15:10

Mike Sherrill 'Cat Recall'


1) If values of timestamp are unique you can make it primary key. If not, anyway create index on timestamp column as you frequently use it in "where".

2) using BETWEEN clause looks more natural here. I suggest you use TREE index (default index type) not HASH.

3) when timestamp column is indexed, you don't need call order by - it already sorted. (of course, if your index is TREE not HASH).

4) integer unix_timestamp is better than datetime both from memory usage side and performance side - comparing dates is more complex operation than comparing integer numbers.

Searching data on indexed field takes O(log(rows)) tree lookups. Comparison of integers is O(1) and comparison of dates is O(date_string_length). So, difference is (number of tree lookups) * (difference_comparison) = O(date_string_length)/O(1))* O(log(rows)) = O(date_string_length)* O(log(rows))

like image 31
Baurzhan Avatar answered Oct 03 '22 15:10

Baurzhan