Sunday, June 2, 2013

DDL statements in MySQL 5.x with row-based replication

In the replication topology I manage there are many layers of replication filters that prune data at the database and in a few places table level. The way MySQL replicates Data Definition Language (create, alter, drop) statements differs from how Data Manipulation Language (insert, update, delete) statements are handled with row-based replication. I often need to fix broken replication due to a lack of understanding of these subtle differences.

With row-based replication DML statements focus directly on the table being modified. DDL on the other hand always uses statement-based replication and is tied to what is known in MySQL as the "default database". The default database is the schema/database currently in use when a DDL statement is executed. The only information I was able to find regarding this in the MySQL documentation is the second to last paragraph here. If do-db or ignore-db filters are configured on slaves, replication can break when DDL statements are used in certain scenarios. Table level or wildcard replication filters can also add to the potential breakage but I will save that for a separate blog post.

A few examples should make things clear. The slave in the following examples is configured with "replicate-ignore-db = test". That means it will replicate all objects from its master except for anything in the "test" schema/database. The test environment is reset between examples. I have changed the prompt via the my.cnf to show the default database and statement counter: prompt = '\h(\d):\c> '


Example #1


On the master we use the test schema and create a table named ddl_test.
master((none)):1> use test;
Database changed
master(test):2> show tables;
Empty set (0.00 sec)
master(test):3> create table ddl_test(id tinyint unsigned auto_increment primary key);
Query OK, 0 rows affected (0.27 sec)
master(test):4> show tables;
+----------------+
| Tables_in_test |
+----------------+
| ddl_test |
+----------------+
1 row in set (0.00 sec)
master(test):5>


The ddl_test table does not replicate to the test schema which is the expected behavior.
slave((none)):1> use test;
Database changed
slave(test):2> show tables;
Empty set (0.00 sec)
slave(test):3>


Example #2


This time on the master we use the world schema, create the ddl_test table in the test schema by way of a fully qualified table name and insert a row to it.
master((none)):1> use world;
Database changed
master(world):2> create table test.ddl_test(id tinyint unsigned auto_increment primary key);
Query OK, 0 rows affected (0.28 sec)
master(world):3> insert into test.ddl_test(id) values(1);
Query OK, 1 row affected (0.00 sec)
master(world):4> use test;
Database changed
master(test):5> show tables;
+----------------+
| Tables_in_test |
+----------------+
| ddl_test |
+----------------+
1 row in set (0.00 sec)
master(test):6> select * from ddl_test;
+----+
| id |
+----+
| 1 |
+----+
1 row in set (0.00 sec)
master(test):7>


In this case the table creation is replicated because create is a DDL statement and the world schema was the default database when the statement was issued. Even though the default database was world, an insert does not replicate to the slave because it is a DML statement and the slave is configured to ignore events on objects in the test schema.
slave((none)):1> use test;
Database changed
slave(test):2> show tables;
+----------------+
| Tables_in_test |
+----------------+
| ddl_test |
+----------------+
1 row in set (0.00 sec)
slave(test):3> select * from ddl_test;
Empty set (0.00 sec)
slave(test):4>


Example #3


Now we create the ddl_test table while test is the default database, then alter the table to add a column while world is the default database.
master((none)):1> use test;
Database changed
master(test):2> create table test.ddl_test(id tinyint unsigned auto_increment primary key);
Query OK, 0 rows affected (0.30 sec)
master(test):3> use world;
Database changed
master(world):4> alter table test.ddl_test add column newcol varchar(1);
Query OK, 0 rows affected (0.48 sec)
Records: 0 Duplicates: 0 Warnings: 0
master(world):5> show create table test.ddl_test\G
*************************** 1. row ***************************
Table: ddl_test
Create Table: CREATE TABLE `ddl_test` (
`id` tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
`newcol` varchar(1) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.06 sec)
master(world):6>


Ouch! The result is broken replication on the slave. The reason is the original table create did not replicate but the additional alter statement does.
slave((none)):1> use test;
Database changed
slave(test):2> show tables;
Empty set (0.00 sec)
slave(test):3> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.101
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000002
Read_Master_Log_Pos: 1132
Relay_Log_File: slave-relay-bin.000010
Relay_Log_Pos: 1164
Relay_Master_Log_File: master-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB: test
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1146
Last_Error: Error 'Table 'test.ddl_test' doesn't exist' on query. Default database: 'world'. Query: 'alter table test.ddl_test add column newcol varchar(1)'
Skip_Counter: 0
Exec_Master_Log_Pos: 1014
Relay_Log_Space: 1589
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1146
Last_SQL_Error: Error 'Table 'test.ddl_test' doesn't exist' on query. Default database: 'world'. Query: 'alter table test.ddl_test add column newcol varchar(1)'
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)
slave(test):4>


The error message shows exactly what went wrong. Table does not exist is pretty clear. The default database is world, the query references the test.ddl_test table and shows the alter table command that was attempted. Show slave status output also shows us the Replicate_Ignore_DB configuration has been set on the slave to ignore all DDL and DML statements against the test schema.

The next course of action is to get replication going again. We have deduced the alter table statement should not have replicated so we simply skip the event using the sql_slave_skip_counter variable.
slave(test):4> set global sql_slave_skip_counter=1;
Query OK, 0 rows affected (0.00 sec)
slave(test):5> start slave;
Query OK, 0 rows affected (0.00 sec)
slave(test):6> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.101
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000002
Read_Master_Log_Pos: 1132
Relay_Log_File: slave-relay-bin.000010
Relay_Log_Pos: 1282
Relay_Master_Log_File: master-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB: test
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1132
Relay_Log_Space: 1589
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)
slave(test):7>


Not directly related to the theme of this post but worth mentioning is the unique schema in MySQL known as none or null. This schema can not be used directly since it is the absence of having a default database. The only way I know of acquiring null schema is to connect to mysql without the -D option or by dropping the schema that is currently in use. How DDL and DML statements are handled remains the same with or without a default database.

Hopefully the examples above make the rules that govern replication of DDL and DML statements clearer. The potential impact on a dba really depends on the size and configuration of their replication topology. The more databases and filters in place to limit and/or direct what is replicated, the greater the chance replication can break. Number of users manipulating database objects, their knowledge of MySQL and the replication topology they are working within is also a big factor.

Monday, May 13, 2013

Delayed row-based replication with large tables lacking a primary key

I configure all our master databases to use row-based binary logging where I work. In my opinion it is a much safer option than statement-based replication. The advantages and disadvantages of both types of MySQL replication are detailed in the online documentation here. You can't view the events a slave is applying directly with 'show processlist' but by issuing 'show open tables where in use' you can detect what table is receiving the attention of the SQL thread. If you need more information the mysqlbinlog command must be used to decode the slaves relay logs or masters binary logs.

Our developers often change a lot of rows with a single update statement. This usually results in some reasonable replication lag on downstream slaves. Occasionally the lag continues to grow and eventually nagios complains. Investigating the lag I sometimes discover the root of the problem is due to millions of rows updated on a table with no primary key. Putting a primary key constraint on a table is just good practice, especially on InnoDB tables. This is really necessary for any large table that will be replicated.

To show what replication is actually doing I have cloned the sakila.film table, omitting the indexes, inserted all its rows and finally updated the description column for all 1,000 rows in the new table.

beefeater(test)> show master status;
+----------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+----------------------+----------+--------------+------------------+
| beefeater-bin.000001 | 107 | | |
+----------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
beefeater(test)> use test;
Database changed
beefeater(test)> CREATE TABLE test.film (
-> film_id smallint(5) unsigned NOT NULL,
-> title varchar(255) NOT NULL,
-> description text,
-> release_year year(4) DEFAULT NULL,
-> language_id tinyint(3) unsigned NOT NULL,
-> original_language_id tinyint(3) unsigned DEFAULT NULL,
-> rental_duration tinyint(3) unsigned NOT NULL DEFAULT '3',
-> rental_rate decimal(4,2) NOT NULL DEFAULT '4.99',
-> length smallint(5) unsigned DEFAULT NULL,
-> replacement_cost decimal(5,2) NOT NULL DEFAULT '19.99',
-> rating enum('G','PG','PG-13','R','NC-17') DEFAULT 'G',
-> special_features set('Trailers','Commentaries','Deleted Scenes','Behind the Scenes') DEFAULT NULL,
-> last_update timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
-> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.23 sec)
beefeater(test)> insert into test.film select * from sakila.film;
Query OK, 1000 rows affected (0.37 sec)
Records: 1000 Duplicates: 0 Warnings: 0
beefeater(test)> show master status;
+----------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+----------------------+----------+--------------+------------------+
| beefeater-bin.000001 | 137389 | | |
+----------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
beefeater(test)> update test.film set description = '';
Query OK, 1000 rows affected (0.05 sec)
Rows matched: 1000 Changed: 1000 Warnings: 0
beefeater(test)> show master status;
+----------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+----------------------+----------+--------------+------------------+
| beefeater-bin.000001 | 313847 | | |
+----------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
view raw blog_05122013_1 hosted with ❤ by GitHub


I have edited the output of mysqlbinlog to only show the entries related to the first row created by the insert and update statements above. The @ symbol followed by a number is a mapping to the columns in the table. So @1=film_id, @2=title, @3=description and so on. Note that the update statement records a before and after picture of the row. This is can be used in a pinch to fix mistaken updates if the damage is small instead of having to restore from backups.

mysqlbinlog --base64-output=auto beefeater-bin.000001 | more
...
### INSERT INTO test.film
### SET
### @1=1
### @2='ACADEMY DINOSAUR'
### @3='A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian R
ockies'
### @4=2006
### @5=1
### @6=NULL
### @7=6
### @8=990000000
### @9=86
### @10=000000020.990000000
### @11=2
### @12=b'00001100'
### @13=1139997822
...
root@beefeater:/var/lib/mysql# mysqlbinlog --base64-output=auto --start-position=137389 beefeater-bin.000001 | more
...
### UPDATE test.film
### WHERE
### @1=1
### @2='ACADEMY DINOSAUR'
### @3='A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian R
ockies'
### @4=2006
### @5=1
### @6=NULL
### @7=6
### @8=990000000
### @9=86
### @10=000000020.990000000
### @11=2
### @12=b'00001100'
### @13=1139997822
### SET
### @1=1
### @2='ACADEMY DINOSAUR'
### @3=''
### @4=2006
### @5=1
### @6=NULL
### @7=6
### @8=990000000
### @9=86
### @10=000000020.990000000
### @11=2
### @12=b'00001100'
### @13=1368396106
...
view raw blog_05122013_2 hosted with ❤ by GitHub


So row-based replication is performing as named and creating a binary log event for each row affected. My single insert and update statement on the master then became 1,000 separate events on the slave.

Digging in to the MySQL source code I was unable to confirm exactly how the SQL thread applies relay log events on a slave. I assume it is similar to what happens when a normal update statement is executed on a table with no indexes. The server must perform a full table scan to locate a single row. For a table with a million plus rows a million full table scans is expensive! A primary key or suitable unique index will prevent this type of problem.