Why do we still need the InnoDB redo log when the MySQL binlog is enabled?

In my understanding, the MySQL binlog can fully take over the role of InnoDB's redo log.

So, once the binlog is enabled, why does InnoDB still have to write a redo log at the same time instead of simply using the binlog? Doesn't this significantly slow down database write performance?

Apart from simplifying the design and implementation, is there any benefit to doing this?

AFAIK, enabling both logs at the same time while still guaranteeing ACID compliance causes the following problems:

  1. Each log record with the same meaning must be written twice, once to each log.
  2. Both logs must be flushed each time a transaction or transaction group commits.
  3. To keep the two log files consistent with each other, a complex and inefficient mechanism such as XA (2PC) is used.
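
A toy model of the coordination in point 3 may make the cost concrete. It is only a sketch under stated assumptions: `TwoLogServer`, `crash_after`, and the record layout are hypothetical names, not MySQL internals. It assumes the commonly described ordering (prepare in the redo log, write the binlog, then mark the commit in the redo log) and the recovery rule that a prepared transaction is committed if it reached the binlog and rolled back otherwise.

```python
class TwoLogServer:
    """Toy model of a commit coordinated across a redo log and a binlog."""

    def __init__(self):
        self.redo = []    # (xid, state) records, state is "PREPARE" or "COMMIT"
        self.binlog = []  # xids whose logical changes reached the binlog
        self.fsyncs = 0   # flushes counted on the commit path

    def commit(self, xid, crash_after=None):
        # Phase 1: make the engine's PREPARE record durable.
        self.redo.append((xid, "PREPARE"))
        self.fsyncs += 1
        if crash_after == "prepare":
            return
        # Phase 2: make the binlog event durable. This is the commit point.
        self.binlog.append(xid)
        self.fsyncs += 1
        if crash_after == "binlog":
            return
        # The engine COMMIT marker is modelled as not needing its own flush.
        self.redo.append((xid, "COMMIT"))

    def recover(self):
        """Decide the fate of every transaction found in the redo log."""
        committed = {xid for xid, state in self.redo if state == "COMMIT"}
        logged = set(self.binlog)
        outcome = {}
        for xid, state in self.redo:
            if state != "PREPARE":
                continue
            if xid in committed or xid in logged:
                outcome[xid] = "committed"
            else:
                outcome[xid] = "rolled back"
        return outcome


server = TwoLogServer()
server.commit(1)
server.commit(2, crash_after="prepare")  # crashed before the binlog write
server.commit(3, crash_after="binlog")   # crashed before the engine COMMIT
print(server.recover(), "fsyncs:", server.fsyncs)
```

In this model every successful commit pays for two flushes, which is the overhead the question is about.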

Therefore, all other products seem to use only one set of logs (SQL Server calls it the transaction log, Oracle the redo log, and PostgreSQL the WAL) to do all the relevant work. Is MySQL the only one that must keep two sets of logs at the same time to ensure both ACID compliance and strongly consistent master-slave replication?

Is there a way to get ACID compliance and strongly consistent semi-synchronous replication with only one of the two logs enabled?

ASBai asked Sep 18 '19 00:09

1 Answer

This is an interesting topic. For a long time, I have been advocating the idea of merging the InnoDB write-ahead log and the binlog. The biggest motivation for that would be that the need to synchronize two separate logs would go away. But, I am afraid that this might not happen any time soon.

At MariaDB, we are taking some steps to reduce the fsync() overhead. The idea of MDEV-18959 ("Engine transaction recovery through persistent binlog") is to guarantee that the binlog is never behind the InnoDB redo log, and thereby to allow a durable, crash-safe transaction commit with only one fsync() call, on the binlog file.
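
To illustrate the single-fsync idea, here is a minimal sketch. It is not MariaDB's actual design: `OneFsyncServer` and all of its fields are hypothetical, and the recovery step (re-applying binlog events that the engine's durable redo log is missing) is my assumption about what "never behind" makes possible.

```python
class OneFsyncServer:
    """Toy model: only the binlog is flushed on the commit path."""

    def __init__(self):
        self.binlog_durable = []  # xids flushed to the binlog
        self.redo_durable = []    # xids whose redo happened to reach disk
        self.redo_buffered = []   # redo written but not yet flushed
        self.fsyncs = 0

    def commit(self, xid):
        self.redo_buffered.append(xid)   # engine log write, no flush
        self.binlog_durable.append(xid)  # the single durable write
        self.fsyncs += 1

    def background_flush(self):
        # The engine log catches up lazily, outside the commit path.
        self.redo_durable.extend(self.redo_buffered)
        self.redo_buffered.clear()

    def crash_and_recover(self):
        # Unflushed redo is lost. The binlog is never behind the redo log,
        # so anything the engine is missing can be re-applied from it.
        self.redo_buffered.clear()
        missing = [x for x in self.binlog_durable if x not in self.redo_durable]
        self.redo_durable.extend(missing)
        return missing


server = OneFsyncServer()
server.commit(1)
server.commit(2)
server.background_flush()
server.commit(3)
print(server.crash_and_recover(), "fsyncs:", server.fsyncs)
```

The point of the sketch is only the accounting: one flush per commit, with the binlog acting as the source of truth after a crash.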

While the binlog implements logical logging, the InnoDB redo log implements physical logging (covering changes to persistent data pages that implement undo logs and index trees). As I explained in M|18 Deep Dive: InnoDB Transactions and Write Paths, a user transaction is divided into multiple mini-transactions, each of which can atomically modify multiple data pages.

The redo log is the ‘glue’ that makes changes to multiple data pages atomic. I think that the redo log is absolutely essential for implementing atomic changes of update-in-place data structures. Append-only data file structures, such as LSM trees, could be logs by themselves and would not necessarily need a separate log.
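
A small sketch of that "glue", under simplifying assumptions: a mini-transaction's page changes are appended to the log as one group closed by an end marker, and recovery applies only complete groups. The names `log_mtr`, `recover`, and `MTR_END` are made up for illustration and do not correspond to InnoDB's record format.

```python
MTR_END = "MTR_END"  # hypothetical marker that closes one mini-transaction


def log_mtr(redo, changes):
    """Append all page changes of one mini-transaction, then the end marker."""
    for page, value in changes:
        redo.append((page, value))
    redo.append(MTR_END)


def recover(disk_pages, redo):
    """Apply complete groups to the pages; drop an unterminated tail."""
    pages = dict(disk_pages)
    pending = []
    for record in redo:
        if record == MTR_END:
            for page, value in pending:
                pages[page] = value
            pending = []
        else:
            pending.append(record)
    return pages  # whatever is left in `pending` was never completed


redo = []
log_mtr(redo, [("leaf_7", "row A"), ("leaf_8", "row B")])  # e.g. a page split
redo.append(("leaf_9", "half-written"))                    # crash before MTR_END
print(recover({"leaf_7": "old"}, redo))
```

Either both `leaf_7` and `leaf_8` change or neither does, no matter which of the pages had been written back to the data file before the crash.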

For an InnoDB table that contains secondary indexes, every single row operation is actually divided into multiple mini-transactions, operating on each index separately. Thus, the transaction layer requires more ‘glue’ that makes the indexes of a table consistent with each other. That ‘glue’ is provided by the undo log, which is implemented in persistent data pages.

InnoDB performs changes to the index pages upfront, and commit is a quick operation, merely changing the state of the transaction in the undo log header. But rollback is very expensive, because the undo log will have to be replayed backwards (and more redo log will be written to cover those index page changes).
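
The cost asymmetry can be sketched with a toy update-in-place transaction. `InPlaceTxn` is a hypothetical class and a plain dict stands in for the index pages; real undo records and the redo written during rollback are not modelled.

```python
class InPlaceTxn:
    """Toy update-in-place transaction with an undo log."""

    def __init__(self, table):
        self.table = table
        self.undo = []  # (key, previous value); None means the key was absent
        self.state = "ACTIVE"

    def write(self, key, value):
        # The change is applied immediately; the old value goes to the undo log.
        self.undo.append((key, self.table.get(key)))
        self.table[key] = value

    def commit(self):
        # Constant work: only the transaction state changes.
        self.state = "COMMITTED"

    def rollback(self):
        # Work proportional to the number of writes: replay the undo log backwards.
        steps = 0
        while self.undo:
            key, old = self.undo.pop()
            if old is None:
                self.table.pop(key, None)
            else:
                self.table[key] = old
            steps += 1
        self.state = "ROLLED_BACK"
        return steps


table = {"a": 1}
txn = InPlaceTxn(table)
txn.write("a", 2)
txn.write("b", 3)
txn.write("a", 4)
print(txn.rollback(), table)
```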

In MariaDB Server, MyRocks is another transactional storage engine, which does the opposite: it buffers changes in memory until the very end and applies them to the data files at commit. This makes rollback very cheap, but the size of a transaction is limited by the amount of available memory. As I understand it, MyRocks could be made to work in the way that you propose.
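
For contrast, a minimal sketch of the buffer-until-commit approach. `BufferedTxn` and `max_buffered` are hypothetical; the limit merely stands in for available memory and says nothing about how MyRocks actually accounts for it.

```python
class BufferedTxn:
    """Toy transaction that keeps its changes in memory until commit."""

    def __init__(self, table, max_buffered=1000):
        self.table = table
        self.buffer = {}
        self.max_buffered = max_buffered  # stand-in for the memory limit

    def write(self, key, value):
        if key not in self.buffer and len(self.buffer) >= self.max_buffered:
            raise MemoryError("transaction too large to buffer")
        self.buffer[key] = value  # the table itself is not touched yet

    def commit(self):
        # All the work happens here: apply the buffered changes.
        self.table.update(self.buffer)
        self.buffer.clear()

    def rollback(self):
        # Cheap: just forget the buffer.
        self.buffer.clear()


table = {"a": 1}
txn = BufferedTxn(table, max_buffered=2)
txn.write("a", 2)
txn.write("b", 3)
txn.rollback()
print(table)
```

Compared with the update-in-place model, the costs are reversed: rollback touches nothing on disk, while commit does all the writing and the transaction must fit in memory.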

Marko Mäkelä answered Oct 14 '22 03:10