Sunday, June 2, 2013

DDL statements in MySQL 5.x with row-based replication

In the replication topology I manage there are many layers of replication filters that prune data at the database and in a few places table level. The way MySQL replicates Data Definition Language (create, alter, drop) statements differs from how Data Manipulation Language (insert, update, delete) statements are handled with row-based replication. I often need to fix broken replication due to a lack of understanding of these subtle differences.

With row-based replication DML statements focus directly on the table being modified. DDL on the other hand always uses statement-based replication and is tied to what is known in MySQL as the "default database". The default database is the schema/database currently in use when a DDL statement is executed. The only information I was able to find regarding this in the MySQL documentation is the second to last paragraph here. If do-db or ignore-db filters are configured on slaves, replication can break when DDL statements are used in certain scenarios. Table level or wildcard replication filters can also add to the potential breakage but I will save that for a separate blog post.

A few examples should make things clear. The slave in the following examples is configured with "replicate-ignore-db = test". That means it will replicate all objects from its master except for anything in the "test" schema/database. The test environment is reset between examples. I have changed the prompt via the my.cnf to show the default database and statement counter: prompt = '\h(\d):\c> '


Example #1


On the master we use the test schema and create a table named ddl_test.


The ddl_test table does not replicate to the test schema which is the expected behavior.


Example #2


This time on the master we use the world schema, create the ddl_test table in the test schema by way of a fully qualified table name and insert a row to it.


In this case the table creation is replicated because create is a DDL statement and the world schema was the default database when the statement was issued. Even though the default database was world, an insert does not replicate to the slave because it is a DML statement and the slave is configured to ignore events on objects in the test schema.


Example #3


Now we create the ddl_test table while test is the default database, then alter the table to add a column while world is the default database.


Ouch! The result is broken replication on the slave. The reason is the original table create did not replicate but the additional alter statement does.


The error message shows exactly what went wrong. Table does not exist is pretty clear. The default database is world, the query references the test.ddl_test table and shows the alter table command that was attempted. Show slave status output also shows us the Replicate_Ignore_DB configuration has been set on the slave to ignore all DDL and DML statements against the test schema.

The next course of action is to get replication going again. We have deduced the alter table statement should not have replicated so we simply skip the event using the sql_slave_skip_counter variable.


Not directly related to the theme of this post but worth mentioning is the unique schema in MySQL known as none or null. This schema can not be used directly since it is the absence of having a default database. The only way I know of acquiring null schema is to connect to mysql without the -D option or by dropping the schema that is currently in use. How DDL and DML statements are handled remains the same with or without a default database.

Hopefully the examples above make the rules that govern replication of DDL and DML statements clearer. The potential impact on a dba really depends on the size and configuration of their replication topology. The more databases and filters in place to limit and/or direct what is replicated, the greater the chance replication can break. Number of users manipulating database objects, their knowledge of MySQL and the replication topology they are working within is also a big factor.