Monday, May 13, 2013

Delayed row-based replication with large tables lacking a primary key

I configure all our master databases to use row-based binary logging where I work. In my opinion it is a much safer option than statement-based replication. The advantages and disadvantages of both types of MySQL replication are detailed in the online documentation here. You can't view the events a slave is applying directly with SHOW PROCESSLIST, but by issuing SHOW OPEN TABLES WHERE In_use > 0 you can detect which table is receiving the attention of the SQL thread. If you need more information, the mysqlbinlog command must be used to decode the slave's relay logs or the master's binary logs.
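For example, decoding a relay log so the row events print as commented pseudo-SQL looks something like this (the relay log file name below is just a placeholder; use whatever name your server writes):

```
# --verbose prints each row image as commented pseudo-SQL;
# DECODE-ROWS suppresses the raw base64 event payload
mysqlbinlog --verbose --base64-output=DECODE-ROWS slave-relay-bin.000123
```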

Our developers often change a lot of rows with a single update statement. This usually results in some amount of replication lag on downstream slaves. Occasionally the lag continues to grow until Nagios complains. Investigating the lag, I sometimes discover the root of the problem: millions of rows updated on a table with no primary key. Putting a primary key constraint on a table is just good practice, especially on InnoDB tables, and it is really necessary for any large table that will be replicated.
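When I find one of these tables, the fix is usually a surrogate key. A minimal sketch, with a hypothetical table name (any unique, non-null column set would also work as a natural key):

```sql
-- Add an auto-increment surrogate primary key to a table that has none
ALTER TABLE app.events
  ADD COLUMN id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  ADD PRIMARY KEY (id);
```

Note that on a large table this ALTER rebuilds the table, so it is best scheduled during a quiet period.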

To show what replication is actually doing, I have cloned the sakila.film table, omitting the indexes, inserted all of its rows, and finally updated the description column for all 1,000 rows in the new table.
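The setup looked roughly like this (the clone's name is my own; CREATE TABLE ... SELECT copies the rows but none of the indexes or the primary key):

```sql
-- Clone sakila.film without any indexes and populate it in one step
CREATE TABLE sakila.film_no_pk
  SELECT * FROM sakila.film;

-- Touch every one of the 1,000 rows with a single statement
UPDATE sakila.film_no_pk
   SET description = CONCAT(description, ' (updated)');
```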



I have edited the output of mysqlbinlog to only show the entries related to the first row created by the insert and update statements above. The @ symbol followed by a number is a mapping to the columns in the table. So @1=film_id, @2=title, @3=description and so on. Note that the update statement records a before and after picture of the row. This can be used in a pinch to fix mistaken updates if the damage is small, instead of having to restore from backups.
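For readers without a server handy, here is a hand-written sketch of what one decoded update event looks like; the values are elided and the table name matches my clone above:

```
### UPDATE `sakila`.`film_no_pk`
### WHERE
###   @1=1
###   @2='ACADEMY DINOSAUR'
###   @3='...'
### SET
###   @1=1
###   @2='ACADEMY DINOSAUR'
###   @3='...'
```

The WHERE block is the before image used to find the row; the SET block is the after image that gets applied.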



So row-based replication is living up to its name and creating a binary log event for each affected row. My single insert statement and single update statement on the master each became 1,000 separate events on the slave.

Digging into the MySQL source code, I was unable to confirm exactly how the SQL thread applies relay log events on a slave. I assume it is similar to what happens when a normal update statement is executed on a table with no indexes: the server must perform a full table scan to locate each row. For a table with a million-plus rows, that means up to a million full table scans, which is expensive! A primary key or suitable unique index will prevent this type of problem.

8 comments:

  1. This is a perfect case to give a warning message from MySQL for a situation like this. However, any such warning needs to be rate-limited to ensure that the warnings do not fill your logs with information you've just been told about.

    1. Thanks Simon, I would use a feature like that. The solution will likely continue to be user education on MySQL best practices.

  2. 5.6 improves this. It can apply rows in batches and use a HASH scan when an index is not available if the option is enabled:
    http://dev.mysql.com/doc/refman/5.6/en/replication-options-slave.html#option_mysqld_slave-rows-search-algorithms

    1. Thanks for the advice Justin. I have done some reading on 5.6 but was not aware of the slave hash algorithm or update batches. Those would be useful options for us in addition to setting the binary log row image to minimal and cutting down on disk space used.

  3. You might find this bug interesting:
    http://bugs.mysql.com/bug.php?id=69223

    Tables without a primary key are known to cause bad performance with MySQL replication, InnoDB and with Galera.

    1. Great bug Daniël! I think sql_mode is a perfect way to handle it.

  4. And I think your second sentence should read "In my opinion it is a much safer option than statement based replication."

    1. Good catch. I have corrected the sentence to properly read statement-based replication now. Thank you very much.
