MySQL replication, a.k.a MySQL database replication provides the facility to make replicas of databases. The ability to make exact copies of databases and keep them in real-time sync as changes are made at the “master” provides a number of advantages.
In summary these are:
- Scaling out a database application
- Reducing database backup impact
- Facilitate reporting without affecting production load
- Failover/High Availability
I’m going to go into detail about these in my next post.
How does MySQL Replication Work?
Firstly we have to understand the architecture. At a minimum, in order to establish MySQL database replication, we have a master and a slave.
Typically, these would be two different servers.
The master is where all the changes happen. All database updates occur here, from adding, updating or deleting table records to creating functions, stored procedures or making table changes.
The slave server receives a copy of the changes applied at the master server. This all happens very quickly in order that the slave is always in sync with the master.
But how does this happen?
Replication events are written to a special log called the binary log on the master. The events are then read by the slave (a.k.a replica) and applied to the databases on slave server.
Replication Events
Two types:
Statement based – these are the write operations. They written to the binary log as SQL statements.
Row based – these are the row changes and they are written to the binary log as complete copies of the rows.
I will cover the differences of these and all of the advantages and disadvantages in a later post. When replication is set up, there is a choice between statement based, row based or mixed for the binlog-format variable to determine how the events are written to the binary log.
On the master
As updates occur, the events are written sequentially to the binary log for the slave to read later. Writes on the master are actually buffered to improve performance for when the slave is reading them as well as being written to the binary log.
The binlog dump thread – when the slave connects to the master, the master opens up a thread for the connection from the slave. This thread is just like any other thread however the difference will be that it will be typically opened using a dedicated user set up for the replication process and it just handles notifying the slave of changes and sending the binary log contents to it as it connects.
On the slave
When replication is started, two threads are created on the slave:
The IO thread – this thread connects to the master and reads the events as they come in. The replication events are then written to a local log file called the relay log which will be read by the second thread created on the slave.
The SQL thread – this thread reads the relay log (written by the IO thread) and applies the changes to the database as quickly as possible.
If you want to check the status of the IO thread, run “show slave status\G” on the slave:
- Master_Log_File – the last binary log file copied from the master.
- Read_Master_Log_Pos – the binary log from the master is copied over to the relay log on the slave up to the position listed.
If you want to see the status of the SQL thread on the slave, again using “show slave status\G” , check the following:
- Relay_Master_Log_File – this is the relay log (binary log from the master) that SQL thread is currently processing.
- Exec_Master_Log_Pos – the position in the binary log currently being executed.
I hope you found this post useful. Please share if you liked it. Thanks 🙂
Rachit Saxena says
Awesome explanation.