In one of my previous posts, “How to resize InnoDB logs?”, I gave the advice on how to safely change the size of transaction logs. This time, I will explain why doing it may become necessary.
A brief introduction to InnoDB transaction logs
The transaction logs handle REDO logging, which means they keep the record of all recent modifications performed by queries in any InnoDB table. But they are a lot more than just an archive of transactions. The logs play important part in the process of handling writes. When a transaction commits, InnoDB synchronously makes a note of any changes into the log, while updating the actual table files happens asynchronously and may take place much later. Each log entry is assigned a Log Sequence Number – an incremental value that always uniquely identifies a change.
Such design serves database in two ways.
First, it optimizes MySQL performance. It enables InnoDB to use light sequential I/O to store the modifications on disk as transactions commit and delay expensive random I/O required for data and index updates for when it is more convenient. Buffered updates may be then rearranged or merged in order to further optimize disk access.
And second, after a crash InnoDB can use the logs contents to perform recovery. As updates to the data files are done asynchronously, in the event of an unclean shut down, any modifications that existed only in the buffer pool would be lost. The ability to replay changes from the log deals with the problem.
InnoDB logs are circular, which means that they create a loop. When database reaches the end of the last file (commonly there are two files), it begins writing into the first one again.
They are also fixed size, so in order to prevent them from filling, InnoDB implements a background mechanism called checkpointing, which manages the process of synchronizing modified pages to disk (step (3) in the first illustration). As it runs, it marks exactly one log record as checkpoint, which states that all modifications carrying a younger LSN value were already safely stored in data files. This means the log contents prior to that LSN is no longer needed for recovery or any other purpose. As the checkpoint progresses, continuously chasing the most recent LSN, it keeps freeing the log space behind it, which can then be re-used by future writes.
What are the main challenges of choosing the right size for InnoDB logs?
Making them small enough to avoid unnecessarily long distance between checkpoint LSN and the most recent LSN to avoid needlessly long recovery times.
With newer MySQL versions this should not be a significant factor anymore as the recovery process has been greatly optimized in MySQL 5.1.46 and in 5.5.
In older versions, however, the process used to be slow and would often needed hours to replay transactions from even medium-sized logs such as 512MB or 1GB, so it is a risk that needs to be evaluated.
Making them large enough to fit writes over sufficiently long period of time, so that InnoDB has some room to maneuver in deciding when to flush some of the buffer pool contents.
When the transaction logs are set too small for given workload, MySQL performance may suffer. During busier periods incoming writes may start pushing LSN faster than checkpoint can progress and the log space will start filling up. After a threshold is exceeded, checkpointing becomes very intense as InnoDB spots the upcoming problem that it may be short of free log space soon. When this doesn’t help, the engine may need to start blocking queries as it is flushing the buffer pool contents and advancing the checkpoint.
What size works?
When installing a new database, the transaction logs often need to be configured without too much knowledge about future workload. One can simply try choosing a reasonable size such as 64M for a pretty average database. There is no reason to go below this value. Of course, the more writes this new database is expected to take, the larger they may need to be. In any case it is extremely important to always change the default size, which is just two files of 5MB. Such configuration is not sufficient for any serious purpose, so do not ever never allow it even on a development server.
With database already running in testing or in production, the necessary size can be calculated based on the rate at which data is written into the transaction logs as this information is available from MySQL. A good rule says that the logs should be able to hold at least one hour worth of changes. In order to come up with a number, simply check how quickly Log Sequence Number progresses. Be sure not to check this during quiet periods as this has to be tuned for the peak usage.
Calculating the size
The information can be found in InnoDB status output, which you can obtain with
SHOW ENGINE INNODB STATUS:
mysql> \P grep 'Log sequence number' PAGER set to 'grep 'Log sequence number'' mysql> SHOW ENGINE INNODB STATUS\G Log sequence number 21060750647056 1 row in set (0.06 sec)
Using the command line pager helped limiting the output to only the relevant information.
In reality, LSN value represents an offset from byte zero of the transaction log, so since the database has been initialized. Seeing how it changes essentially means seeing how much data was written into it over a period of time. So to figure out the amount of changes happening to a database:
- check the most recent LSN using the method shown above
- wait some time
- check the LSN again
- subtract the two values
Here is how to do it in practice:
mysql> \P grep 'Log sequence number' PAGER set to 'grep 'Log sequence number'' mysql> SHOW ENGINE INNODB STATUS\G SELECT SLEEP(60); SHOW ENGINE INNODB STATUS\G 21057170602213 1 row in set (0.06 sec) 1 row in set (1 min 0.00 sec) 21057190468976 1 row in set (0.05 sec)
The two LSN values are 21057170602213 and 21057190468976. Let’s calculate the difference:
mysql> select ROUND((21057190468976 - 21057170602213)/ 1024 / 1024) as MB; +------+ | MB | +------+ | 19 | +------+ 1 row in set (0.00 sec)
19 megabytes were written into the log file in one minute. Using this information, how large the logs should be to keep at least one hour worth of writes? 19MB * 60 minutes / 2 files = 570MB. The division by two comes from the fact that InnoDB uses two log files by default, while we need to set size for each individual file. Therefore we calculated that this database needs
innodb_log_file_size set to at least 512MB.
In MySQL 5.1 or newer, a query against INFORMATION_SCHEMA.GLOBAL_STATUS can be used instead of looking at the InnoDB status output.
SELECT @a1 := variable_value AS a1 FROM information_schema.global_status WHERE variable_name = 'innodb_os_log_written' UNION ALL SELECT Sleep(60) UNION ALL SELECT @a2 := variable_value AS a2 FROM information_schema.global_status WHERE variable_name = 'innodb_os_log_written'; SELECT ROUND((@a2-@a1) * 60 / 1024 / 1024 / @@innodb_log_files_in_group) as MB;