I have a table that contains 1.5 million rows, has 39 columns, contains sales data of around 2 years, and grows every day. I had no problems with it until we moved it to a new server, we probably have less memory now.
Queries are currently taking a very long time. Someone suggested partitioning the large table that is causing most of the performance issues but I have a few questions.
Does the partition take a long time to perform? I worry that with the slow performance, something would happen midway through and I would lose the data.
Should I be partioning it to years or months? (we usually look at the numbers within the month, but sometimes we take weeks or years). And should I also partition the columns? (We have some columns that we rarely or never use, but we might want to use them later)
(I agree with Bill's answer; I will approach the Question in a different way.)
When is it time to partion my tables?
Probably never.
is it likely to improve its performance?
It is more likely to decrease performance a little.
I have a table that contains 1.5 million rows
Not big enough to bother with partitioning.
Queries are currently taking a very long time
Usually that is due to the lack of a good index, probably a 'composite' one. Secondly is the formulation of the query. Please show us a slow query, together with SHOW CREATE TABLE
.
data of around 2 years, and grows every day
Will you eventually purge "old" data? If so, the PARTITION BY RANGE(TO_DAYS(..))
is an excellent idea. However, it only helps during the purge. This is because DROP PARTITION
is a lot faster than DELETE...
.
we probably have less memory now.
If you are mostly looking at "recent" data, then the size of memory (cf innodb_buffer_pool_size
) may not matter. This is due to caching. However, it sounds like you are doing table scans, perhaps unnecessarily.
will I have to make changes to my current INSERT or SELECT
No. But you probably need to change what column(s) are in the PRIMARY KEY
and secondary key(s).
Does the partition take a long time to perform?
Slow - yes, because it will copy the entire table over. Note: that means extra disk space, and the partitioned table will take more disk.
something would happen midway through and I would lose the data.
Do not worry. The new table is created, then a very quick RENAME TABLE
swaps it into place.
Should I be partioning it to years or months?
Rule of thumb: aim for about 50 partitions. With "2 years and growing", a likely choice is "monthly".
we usually look at the numbers within the month, but sometimes we take weeks or years
Smells like a typical "Data Warehouse" dataset? Build and incrementally augment a "Summary table" with daily stats. With that table, you can quickly get weekly/monthly/yearly stats -- possibly 10 times as fast. Ditto for any date range. This also significantly helps with "low memory".
And should I also partition the columns? (We have some columns that we rarely or never use, but we might want to use them later)
You should 'never' use SELECT *
; instead, specify the columns you actually need. "Vertical partitioning" is the term for your suggestion. It is sometimes practical. But we need to see SHOW CREATE TABLE
with realistic column names to discuss further.
More on partitioning: http://mysql.rjweb.org/doc.php/partitionmaint
More on Summary tables: http://mysql.rjweb.org/doc.php/summarytables
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With