Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How will partitioning affect my current queries in MySQL? When is it time to partition my tables?

I have a table that contains 1.5 million rows, has 39 columns, contains sales data of around 2 years, and grows every day. I had no problems with it until we moved it to a new server, we probably have less memory now.

Queries are currently taking a very long time. Someone suggested partitioning the large table that is causing most of the performance issues but I have a few questions.

  1. Is it wise to partition the table I described and is it likely to improve its performance?
  2. If I do partition it, will I have to make changes to my current INSERT or SELECT statements or will they continue working the same way?
  3. Does the partition take a long time to perform? I worry that with the slow performance, something would happen midway through and I would lose the data.

  4. Should I be partioning it to years or months? (we usually look at the numbers within the month, but sometimes we take weeks or years). And should I also partition the columns? (We have some columns that we rarely or never use, but we might want to use them later)

like image 509
Andri Avatar asked Mar 04 '23 06:03

Andri


1 Answers

(I agree with Bill's answer; I will approach the Question in a different way.)

When is it time to partion my tables?

Probably never.

is it likely to improve its performance?

It is more likely to decrease performance a little.

I have a table that contains 1.5 million rows

Not big enough to bother with partitioning.

Queries are currently taking a very long time

Usually that is due to the lack of a good index, probably a 'composite' one. Secondly is the formulation of the query. Please show us a slow query, together with SHOW CREATE TABLE.

data of around 2 years, and grows every day

Will you eventually purge "old" data? If so, the PARTITION BY RANGE(TO_DAYS(..)) is an excellent idea. However, it only helps during the purge. This is because DROP PARTITION is a lot faster than DELETE....

we probably have less memory now.

If you are mostly looking at "recent" data, then the size of memory (cf innodb_buffer_pool_size) may not matter. This is due to caching. However, it sounds like you are doing table scans, perhaps unnecessarily.

will I have to make changes to my current INSERT or SELECT

No. But you probably need to change what column(s) are in the PRIMARY KEY and secondary key(s).

Does the partition take a long time to perform?

Slow - yes, because it will copy the entire table over. Note: that means extra disk space, and the partitioned table will take more disk.

something would happen midway through and I would lose the data.

Do not worry. The new table is created, then a very quick RENAME TABLE swaps it into place.

Should I be partioning it to years or months?

Rule of thumb: aim for about 50 partitions. With "2 years and growing", a likely choice is "monthly".

we usually look at the numbers within the month, but sometimes we take weeks or years

Smells like a typical "Data Warehouse" dataset? Build and incrementally augment a "Summary table" with daily stats. With that table, you can quickly get weekly/monthly/yearly stats -- possibly 10 times as fast. Ditto for any date range. This also significantly helps with "low memory".

And should I also partition the columns? (We have some columns that we rarely or never use, but we might want to use them later)

You should 'never' use SELECT *; instead, specify the columns you actually need. "Vertical partitioning" is the term for your suggestion. It is sometimes practical. But we need to see SHOW CREATE TABLE with realistic column names to discuss further.

More on partitioning: http://mysql.rjweb.org/doc.php/partitionmaint
More on Summary tables: http://mysql.rjweb.org/doc.php/summarytables

like image 59
Rick James Avatar answered Apr 05 '23 21:04

Rick James