
MySQL partitioning: multi-file vs. one-file performance?

When partitioning a large table, I can set the innodb_file_per_table option to TRUE or FALSE. TRUE will create many files (one per partition) and greatly increase my disk usage, but it lets me spread partitions across different volumes (which I do not plan to do). FALSE will keep the table as one big file. Assuming I keep all files on the same logical volume, can I expect any significant difference in query performance between the two options? More generally, are there any issues to consider when choosing between them besides disk usage and management?

Some stats:

  • total number of tables: 20 (only a few of which I am interested in partitioning - see my other question)
  • largest tables have 100M records.
  • total db size is about 60G.
Paul asked Mar 16 '12 18:03




1 Answer

As you've already stated, innodb_file_per_table decides whether a partitioned table is stored in one file (OFF, where it lives in the shared system tablespace) or in many files (ON, one .ibd file per partition).

Here are some pros and cons of each approach (not necessarily a complete list).

Single file per table                    Multiple files per (partitioned) table
--------------------------------------   --------------------------------------
+ System uses fewer file handles         - System uses more file handles
+ Only one fsync per second per table    - Possibly many more fsync calls (bottleneck)
  (less fs overhead (journal etc.))        (more fs overhead)
+ Single file uses less space overall    - Much larger disk space usage
- Single file fragments badly            + Less fragmentation
- OPTIMIZE TABLE (et al.) takes longer   + You can choose to optimize just one file
- One file = one filesystem              + You can put heavy-traffic files on a fast fs
                                           (e.g. on a solid-state disk)
- Impossible to reclaim disk space       + Possible to emergency-reclaim disk space
  in a hurry (TRUNCATE TABLE takes long)   fast (just delete a file)
- ALTER TABLE can use a large % of disk  + Rebuilding with ALTER TABLE will use less
  space for temp tables while rebuilding   temp disk space

In general, I would not recommend multiple files.
If, however, your workload leads to heavy fragmentation and OPTIMIZE TABLE takes too long, using multiple files will make sense.

Forget about reclaiming space
Some people make a lot of fuss about the fact that InnoDB table files always grow and never shrink, which leads to wasted space when rows are deleted.
They then come up with schemes to reclaim that space (truncate table x) so as not to run out of free disk space.
This works much faster with multiple files. However, all of it is nonsense: databases almost always grow and (almost) never shrink, so reclaiming space wastes lots of time (CPU and I/O) during which your table is fully locked (no reads and no writes allowed).
And all that only to find that your 90% full disk (50% after the reclaim) is 99% full again after next month's data additions.

However, when using ALTER TABLE, beware...
Consider the following scenario:
- The disk is 60% full.
- The database takes up 50%, other files take up 10%.
A rebuilding ALTER TABLE needs temporary space for a copy of the data, so if you have all tables in one file you can run out of disk space: only 40% of the disk is free, which is less than the 50% the database occupies.
If you have them in multiple files, you should not have problems (other than caffeine overdose from all that waiting).
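
The arithmetic behind this scenario can be sketched in a few lines of Python. This is only an illustration, not a measurement of what MySQL actually allocates: the function name rebuild_fits is hypothetical, the worst case assumed is that the temporary copy is as large as the data being rebuilt, and the "ten equal partition files" split is an invented example.

```python
def rebuild_fits(disk_used_pct: float, rebuilt_pct: float) -> bool:
    """Return True if a temporary copy of size rebuilt_pct fits in the free space.

    Both arguments are percentages of the total disk size. Assumption: the
    rebuild needs a full temporary copy of the data it touches.
    """
    free_pct = 100.0 - disk_used_pct
    return rebuilt_pct <= free_pct


# Scenario from the answer: disk 60% full, database takes up 50% of the disk.
DISK_USED = 60.0
DATABASE = 50.0

# One file: assume the copy may be as large as the whole database.
print(rebuild_fits(DISK_USED, DATABASE))       # False (50% needed, 40% free)

# Multiple files: assume ten equal partition files (hypothetical split),
# so a single rebuild touches a tenth of the database.
print(rebuild_fits(DISK_USED, DATABASE / 10))  # True (5% needed, 40% free)
```

The smaller the largest file that has to be rebuilt at once, the more headroom is left, which is the point of the comparison above.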

Johan answered Oct 27 '22 09:10
