InnoDB row size changing exponentially while table is growing?

I have a huge InnoDB table with three columns (int, mediumint, int). The innodb_file_per_table setting is on, and the only index is a PRIMARY KEY on the first two columns.

The table schema is:

CREATE TABLE `big_table` (
  `user_id` int(10) unsigned NOT NULL,
  `another_id` mediumint(8) unsigned NOT NULL,
  `timestamp` int(10) unsigned NOT NULL,
  PRIMARY KEY (`user_id`,`another_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

The MySQL version is 5.6.16.

Currently I am multi-inserting over 150 rows per second. There are no deletions and no updates, and no significant rollbacks or other aborted transactions that would cause wasted space.
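
For reference, the inserts are plain multi-row INSERT statements along these lines (the values here are made up):

INSERT INTO big_table (user_id, another_id, timestamp) VALUES
  (1001, 42, 1393405200),
  (87234, 7, 1393405200),
  (3310, 512, 1393405201);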

MySQL shows a calculated size of 75.7 GB for that table.

.ibd size on disk: 136,679,784,448 bytes (127.29 GiB)

Counted rows: 2,901,937,966 (47.10 bytes per row)

Two days later, MySQL still shows a calculated size of 75.7 GB for that table.

.ibd size on disk: 144,263,086,080 bytes (135.35 GiB)

Counted rows: 2,921,284,863 (49.38 bytes per row)

Running SHOW TABLE STATUS for the table shows:

Engine | Version | Row_format | Rows       | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Collation 
InnoDB |      10 | Compact    | 2645215723 |             30 | 81287708672 |               0 |            0 |   6291456 | utf8_unicode_ci
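
For completeness, the same figures can be read from information_schema (the schema name below is a placeholder for my database):

SELECT table_rows, avg_row_length, data_length, index_length, data_free
  FROM information_schema.TABLES
 WHERE table_schema = 'my_database'
   AND table_name = 'big_table';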

Here are my questions:

  • Why is the disk usage growing disproportionately to the row count?
  • Why are the Avg_row_length and Data_length values reported by SHOW TABLE STATUS so far off from the measured values?

I hope someone can help me keep the disk usage from growing like this. I did not notice this behavior while the table was smaller.

asked Feb 26 '14 by Stefan



1 Answer

I am assuming that your table hasn't grown to its present ~2.9 billion rows organically, and that you either recently loaded this data or have caused the table to be re-organized (using ALTER TABLE or OPTIMIZE TABLE, for instance). So it starts off quite well-packed on disk.

Based on your table schema (which is fortunately very simple and straightforward), each row (record) is laid out as follows:

(Header)              5 bytes
`user_id`             4 bytes
`another_id`          3 bytes
(Transaction ID)      6 bytes
(Rollback Pointer)    7 bytes
`timestamp`           4 bytes
=============================
Total                29 bytes

InnoDB will never actually fill pages more than approximately 15/16 full (and normally never less than 1/2 full). With all of the extra overhead in various places, the fully-loaded cost of a record is somewhere between 32 and 60 bytes per row in the leaf pages of the index.

When you bulk-load data through an import or through an ALTER TABLE or OPTIMIZE TABLE, the data will normally be loaded (and the indexes created) in order by PRIMARY KEY, which allows InnoDB to very efficiently pack the data on disk. If you then continue writing data to the table in random (or effectively random) order, the efficiently-packed index structures must expand to accept the new data, which in B+Tree terms means splitting pages in half. If you have an ideally-packed 16 KiB page where records consume ~32 bytes on average, and it is split in half to insert a single row, you now have two half-empty pages (~16 KiB wasted) and that new row has "cost" 16 KiB.
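
Some rough back-of-the-envelope arithmetic for that worst case (my numbers, ignoring the page header, page directory, and other fixed per-page overhead):

16,384 bytes × 15/16     ≈ 15,360 usable bytes in an ideally-packed page
15,360 bytes ÷ ~32 bytes ≈ ~480 rows before the page must split
After one split:           two pages of ~240 rows, each roughly half empty
Net cost:                  one new 16 KiB page allocated for a single ~32-byte row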

Of course, it doesn't actually stay that bad. Over time the index tree will settle down with pages somewhere between 1/2 and 15/16 full -- it won't keep splitting pages forever, because the next insert that lands in the same page will find plenty of free space already available.

This can be a bit disconcerting if you initially bulk load (and thus efficiently pack) your data into a table and then switch to growing it organically, though. At first it will seem as though the table is growing at an insane pace, but if you track the growth rate over time it should slow down.
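
If you ever want to re-pack the table after a long stretch of random-order inserts, rebuilding it restores the bulk-loaded packing. A sketch; note that this rewrites the entire table (expensive at ~3 billion rows), and the packing erodes again as random inserts continue:

ALTER TABLE big_table ENGINE=InnoDB;
-- or, equivalently for InnoDB (OPTIMIZE maps to a table rebuild):
OPTIMIZE TABLE big_table;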

You can read more about InnoDB index and record layout in my blog posts: The physical structure of records in InnoDB, The physical structure of InnoDB index pages, and B+Tree index structures in InnoDB.

answered Sep 23 '22 by jeremycole