In MySQL InnoDB, is there a performance advantage to partitioning a table compared to simply using an index?
Common considerations:
An index isn’t always the right tool. At a high level, keep in mind that indexes are most effective when they help the storage engine find rows without adding more work than they avoid. For very small tables, it is often more effective to simply read all the rows in the table. For medium to large tables, indexes can be very effective. For enormous tables, the overhead of indexing, as well as the work required to actually use the indexes, can start to add up. In such cases you might need to choose a technique that identifies groups of rows that are interesting to the query, instead of individual rows. You can use partitioning for this purpose. If you have lots of tables, it can also make sense to create a metadata table to store some characteristics of interest for your queries. For example, if you execute queries that perform aggregations over rows in a multitenant application whose data is partitioned into many tables, you can record which users of the system are actually stored in each table, thus letting you simply ignore tables that don’t have information about those users. These tactics are usually useful only at extremely large scales. In fact, this is a crude approximation of what Infobright does. At the scale of terabytes, locating individual rows doesn’t make sense; indexes are replaced by per-block metadata.
(Quoted from "Is an Index the Best Solution?" in High Performance MySQL; many thanks to that great book.)
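The metadata-table idea in the quote can be sketched roughly as follows. This is only an illustration in Python: the table names, the user IDs, and the `tables_to_scan` helper are all hypothetical, and in a real system the mapping would live in a database table rather than a dict.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical metadata: which user IDs actually have rows in each table.
# In practice this would be a small table maintained alongside the data.
TABLE_USERS: Dict[str, Set[int]] = {
    "events_shard_01": {1, 2, 3},
    "events_shard_02": {4, 5},
    "events_shard_03": {2, 6},
}


def tables_to_scan(
    user_ids: Iterable[int],
    table_users: Dict[str, Set[int]] = TABLE_USERS,
) -> List[str]:
    """Return only the tables that hold rows for at least one requested user."""
    wanted = set(user_ids)
    return sorted(name for name, users in table_users.items() if users & wanted)


# An aggregation over user 2 only needs to touch two of the three tables.
print(tables_to_scan([2]))
```

The point is that whole tables are ruled in or out as a group; no per-row index lookup is involved.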
99% of the cases I have looked at do not benefit from PARTITIONing as much as from INDEXing.
My Rules of Thumb for using partitioning are in http://mysql.rjweb.org/doc.php/partitionmaint . That page also lists the only 4 use cases where partitioning improves performance.
OK, I can't say "exactly" 99%, but it is very close to that. I do believe strongly in the "4" -- I have been searching for more use cases since partitioning was added to MySQL many years ago.
For Data Warehousing, the usual performance solution is to create and maintain "Summary tables". This works nicely for 'most' DW applications.
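A minimal sketch of the summary-table approach, using Python's built-in `sqlite3` as a stand-in for MySQL so the example is runnable on its own. The `sales` and `sales_daily` tables and both helper functions are made up for illustration, and a real warehouse would usually update the summary incrementally rather than rebuild it as done here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (sold_on TEXT, product_id INTEGER, amount REAL);
CREATE TABLE sales_daily (
    sold_on TEXT,
    product_id INTEGER,
    total_amount REAL,
    sale_count INTEGER,
    PRIMARY KEY (sold_on, product_id)
);
""")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2024-01-01", 1, 10.0),
        ("2024-01-01", 1, 5.0),
        ("2024-01-01", 2, 7.5),
        ("2024-01-02", 1, 20.0),
    ],
)


def refresh_summary(conn: sqlite3.Connection) -> None:
    """Rebuild the summary: one row per (day, product) instead of one per sale."""
    conn.execute("DELETE FROM sales_daily")
    conn.execute("""
        INSERT INTO sales_daily
        SELECT sold_on, product_id, SUM(amount), COUNT(*)
        FROM sales
        GROUP BY sold_on, product_id
    """)


def product_total(conn: sqlite3.Connection, product_id: int) -> float:
    """Answer a report query from the small summary table, not the fact table."""
    row = conn.execute(
        "SELECT SUM(total_amount) FROM sales_daily WHERE product_id = ?",
        (product_id,),
    ).fetchone()
    return row[0]


refresh_summary(conn)
print(product_total(conn, 1))
```

Reports then read the much smaller `sales_daily` table, which is where the speedup comes from.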
"Very large BTrees don't work"? Bull. A million-row index will have a BTree depth of about 3. A trillion rows -- about 6. Where's the "won't work"? A "point query" on a trillion-row table will touch twice as many nodes in the BTree, and fewer of them are likely to be cached. But it "will work".
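The depth figures can be checked with a back-of-the-envelope calculation. The sketch below assumes a fanout of about 100 entries per BTree node; that number is an assumption for illustration only, since the real fanout depends on key size and page size.

```python
def btree_depth(rows: int, fanout: int = 100) -> int:
    """Estimate BTree levels needed to address `rows` entries.

    `fanout` (entries per node) is an assumed rule-of-thumb value,
    not something measured from a real InnoDB index.
    """
    depth = 1
    capacity = fanout  # entries reachable with a single level
    while capacity < rows:
        capacity *= fanout
        depth += 1
    return depth


print(btree_depth(10**6))   # 3
print(btree_depth(10**12))  # 6
```

Under that assumption, going from a million rows to a trillion only doubles the number of levels a point query has to descend.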
Infobright, with its "columnar storage", has its niche. TokuDB, with its "fractal indexing", has its niche. Neither one can say "we are better than BTrees most of the time". (Both those engines get part of their speed by compression.)
Bottom Line: Use an index. Probably a "composite" index. (More indexing tips: http://mysql.rjweb.org/doc.php/index_cookbook_mysql )
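As a rough illustration of a composite index, here is a sketch that again uses Python's `sqlite3` as a stand-in for MySQL (the optimizer and the EXPLAIN output differ in MySQL). The `orders` table, the index name, and the `plan_for` helper are hypothetical. The index puts the equality column first and the range column second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, created_at TEXT, total REAL)"
)
# Composite index: equality column first, range column second.
conn.execute(
    "CREATE INDEX idx_customer_created ON orders (customer_id, created_at)"
)


def plan_for(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's query plan text (the last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


plan = plan_for(
    conn,
    "SELECT id, total FROM orders WHERE customer_id = ? AND created_at >= ?",
    (42, "2024-01-01"),
)
print(plan)  # the plan should mention idx_customer_created
```

One index serves both conditions here, so the engine can jump straight to the matching range instead of scanning the table.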