On Slave Usage

Slaves can be used for:

  1. Horizontal read scalability — take the load off a master database by spreading reads to a replicated slave.

  2. Disaster recovery — some disasters, such as a hardware failure, can be solved by having a slave ready to propagate to a master. This technique also works to make offline changes to a master/slave pair without having database downtime (see below).

  3. Consistent Backups — without disrupting production usage, a slave can be used to take a consistent backup by stopping the database and copying the database files. This is a free and uncomplicated way to get consistent backups (innodb hot backup is not free, and using a snapshotting tool (such as LVM’s snapshot capability) can be complex. Not everyone wants to manage snapshots.)

Be careful when using a slave for more than one purpose. Using a slave for more than one purpose can be done, but carefully. For example:

If one slave instance serves the needs of 1 and 2 — Where do the reads go when the slave is promoted to a master (or taken offline for maintenance)?

If one slave instance serves the needs of 1 and 3 — can the read slave be down for the length of time it takes to take a backup? If not, this solution is not appropriate.

If one slave instance serves the needs of 2 and 3 — Where does the backup get taken from when the slave is promoted to master (or taken offline for maintenance)?

There are certainly workarounds. We either create a propagation procedure (whether for maintenance or in an emergency) or assess an existing one, to cover these cases to make sure any dual-use of a slave does not leave a client hanging. We regularly test whether slaves are in sync, to ensure accuracy no matter what purposes replication slaves are used for. We also go through actual slave-to-master propagations. This can be useful as a drill, and can also be useful to avoid downtime.

Avoiding Downtime with Replication

In many cases, using replication can avoid downtime during an offline database change (for example, an OPTIMIZE TABLE to defragment a table). To implement this, the slave is taken out of service, the change performed on the slave, the slave is put back into service and propagated to be the master. Then the previous master is taken offline, the change applied to the previous master, and the previous master can return to service (either as a slave of the current master, or by propagating the previous master to be the current master.

Slaves can be used for:

  1. Horizontal read scalability — take the load off a master database by spreading reads to a replicated slave.

  2. Disaster recovery — some disasters, such as a hardware failure, can be solved by having a slave ready to propagate to a master. This technique also works to make offline changes to a master/slave pair without having database downtime (see below).

  3. Consistent Backups — without disrupting production usage, a slave can be used to take a consistent backup by stopping the database and copying the database files. This is a free and uncomplicated way to get consistent backups (innodb hot backup is not free, and using a snapshotting tool (such as LVM’s snapshot capability) can be complex. Not everyone wants to manage snapshots.)

Be careful when using a slave for more than one purpose. Using a slave for more than one purpose can be done, but carefully. For example:

If one slave instance serves the needs of 1 and 2 — Where do the reads go when the slave is promoted to a master (or taken offline for maintenance)?

If one slave instance serves the needs of 1 and 3 — can the read slave be down for the length of time it takes to take a backup? If not, this solution is not appropriate.

If one slave instance serves the needs of 2 and 3 — Where does the backup get taken from when the slave is promoted to master (or taken offline for maintenance)?

There are certainly workarounds. We either create a propagation procedure (whether for maintenance or in an emergency) or assess an existing one, to cover these cases to make sure any dual-use of a slave does not leave a client hanging. We regularly test whether slaves are in sync, to ensure accuracy no matter what purposes replication slaves are used for. We also go through actual slave-to-master propagations. This can be useful as a drill, and can also be useful to avoid downtime.

Avoiding Downtime with Replication

In many cases, using replication can avoid downtime during an offline database change (for example, an OPTIMIZE TABLE to defragment a table). To implement this, the slave is taken out of service, the change performed on the slave, the slave is put back into service and propagated to be the master. Then the previous master is taken offline, the change applied to the previous master, and the previous master can return to service (either as a slave of the current master, or by propagating the previous master to be the current master.