Binlog Server at Facebook

Santosh Banda
Teng Li
Database Engineering Team, Facebook, Inc.
Agenda
1 Motivation

2 Use cases

3 Design

4 Operational Commands
Binlog Storage at Facebook

Diagram: MySQL master -> copy binlog files -> HDFS


Binlog Backups
Binlogs
▪ Shared by replica-set
▪ Always copied from current master
▪ GTID makes replay safer
▪ Retention time: Weeks

Diagram: MySQL slaves, each with its own local binlogs

* Binlog retention time: Hours


Replication Catchup

▪ MySQL slave issues Change Master To against the MySQL master
▪ MySQL master: GTID_PURGED = uuid:1-1000 (binlog retention time: Hours)
▪ MySQL slave: GTID_EXECUTED = uuid:1-500
▪ Result: ER_MASTER_HAS_PURGED_REQUIRED_GTIDS
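
The catch-up failure on this slide comes down to a set comparison: the slave can only replicate from the master if everything the master has purged is already in the slave's executed set. A minimal sketch of that check, assuming normalized GTID sets written as `uuid:first-last`; the function names are hypothetical and not part of MySQL or the binlog server:

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-500,other:1-3:7' into {uuid: [(first, last), ...]}."""
    parsed = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        source, *ranges = item.split(":")
        intervals = parsed.setdefault(source.strip(), [])
        for rng in ranges:
            first, _, last = rng.partition("-")
            # A single transaction number such as '7' means the interval 7-7.
            intervals.append((int(first), int(last or first)))
    return parsed


def is_subset(inner, outer):
    """True if every interval of `inner` lies inside one interval of `outer`.

    Assumption: both sets are normalized (no adjacent or overlapping intervals).
    """
    for source, intervals in inner.items():
        available = outer.get(source, [])
        for first, last in intervals:
            if not any(lo <= first and last <= hi for lo, hi in available):
                return False
    return True


def can_catch_up(master_purged, slave_executed):
    """The slave can catch up only if it already has everything that was purged."""
    return is_subset(parse_gtid_set(master_purged), parse_gtid_set(slave_executed))


# The slide's case: the master purged 1-1000 but the slave only executed 1-500.
print(can_catch_up("uuid:1-1000", "uuid:1-500"))  # False
print(can_catch_up("uuid:1-400", "uuid:1-500"))   # True
```

In the first call the slave is missing transactions 501-1000, which is the situation reported as ER_MASTER_HAS_PURGED_REQUIRED_GTIDS.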


Binlog Replay
▪ Automation tools: "Let’s fetch binary logs" from the MySQL master
▪ MySQL master: GTID_PURGED = uuid:1-1000 (binlog retention time: Hours)
▪ Replay command: mysqlbinlog --exclude-gtids=uuid:1-500
▪ Result: ER_MASTER_HAS_PURGED_REQUIRED_GTIDS


Binlog Server
▪ HDFS holds the binlog backups (retention time: Weeks)
▪ Automation tools and MySQL slaves reach the Binlog Server with Change Master To or mysqlbinlog: "Let’s fetch old binary logs"
▪ The Binlog Server answers with a binlog data stream
▪ Serves binlogs using the MySQL protocol
Motivation
▪ Unified solution for binlog retrieval and replay

▪ Reduce binlog partition size on MySQL machines


Facebook Vs MaxScale

Feature                               Facebook   MaxScale
Binlog proxy (intermediate replica)   Yes        Yes
Easy failover                         Yes        Yes
GTID support                          Yes        No
Pluggable storage systems             Yes        No
Open source                           No         Yes


Use Cases
Online Shard Migration
▪ Moves a shard across replica sets

Diagram: a database is copied from ReplicaSet.1 (db1, db2) to ReplicaSet.2 (db3, db4)
Online Shard Migration
▪ Copies the database using mysqldump

Diagram: dump and load from ReplicaSet.1 (db1, db2) creates db2_copy on ReplicaSet.2 (db3, db4)
Online Shard Migration
▪ Replay the binlogs using mysqlbinlog from ReplicaSet.1

Diagram: binlogs from ReplicaSet.1 (db1, db2) are replayed into db2_copy on ReplicaSet.2 (db3, db4)
Online Shard Migration
▪ Copy time is greater than local binlog retention time
▪ Retry

Diagram: the replay from ReplicaSet.1 (db1, db2) into db2_copy on ReplicaSet.2 (db3, db4) fails with ER_MASTER_HAS_PURGED_REQUIRED_GTIDS
Online Shard Migration
▪ Retry OLM

Diagram: ReplicaSet.1 (db1, db2) -> db2_copy on ReplicaSet.2 (db3, db4): 1. Dump and load, 2. Replay binlog
Online Shard Migration
▪ Retry… Retry…
▪ Failure! Reach out to the on-call

Diagram: the replay from ReplicaSet.1 (db1, db2) into db2_copy on ReplicaSet.2 (db3, db4) fails again with ER_MASTER_HAS_PURGED_REQUIRED_GTIDS
Online Shard Migration
▪ Replay using the Binlog Server
▪ Copy time doesn’t affect migration

Diagram: the Binlog Server replays binlogs into ReplicaSet.2 (db2, db3, db4): Success
Creating New Replicas
▪ Replay using the Binlog Server, then switch back to the actual MySQL master
▪ No retries!

Diagram: the new MySQL slave replays from the Binlog Server (Success), then switches to the MySQL master
Binlog Server in Failover
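▪ Binlog servers are used as semi-sync log tailers

Diagram: node A hosts the MySQL master and a Binlog Server; nodes B and C each host a MySQL slave and a Binlog Server. A "Semi-sync tailer" link connects the master to the Binlog Servers on B and C.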
Binlog Server in Failover
▪ Dead master promotion is triggered

Diagram: node A (dead master, Binlog Server); nodes B and C (MySQL slave and Binlog Server each); the "Semi-sync tailer" link from A is still shown.
Binlog Server in Failover
▪ Stop the binlog servers’ tailing to node-fence the dead master

Diagram: node A (dead master, Binlog Server); nodes B and C (MySQL slave and Binlog Server each); the "Semi-sync tailer" link from A is being stopped.
Binlog Server in Failover
▪ Pick a MySQL slave to promote

Diagram: node A (dead master, Binlog Server); nodes B and C (MySQL slave and Binlog Server each); no tailing link remains.
Binlog Server in Failover
▪ Catch up server C from a binlog server using CHANGE MASTER

Diagram: node A (dead master, Binlog Server); nodes B and C (MySQL slave and Binlog Server each); a CHANGE MASTER link points the MySQL slave on C at a Binlog Server.
Binlog Server in Failover
▪ Promote server C as the new master

Diagram: node C (MySQL master, Binlog Server); node B (MySQL slave, Binlog Server); a "Semi-sync tailer" link runs from the new master.
Binlog Server in Failover
▪ Recover dead master

Diagram: node C (MySQL master, Binlog Server); nodes B and A (MySQL slave and Binlog Server each); a "Semi-sync tailer" link runs from the master.
And more …
▪ Point-in-time recovery of a single shard

▪ Disaster recovery of full MySQL instances

▪ Binlog replay through replication is simpler, safer, and more reliable

▪ Binlog replay during Online Schema Change

▪ Currently we use table triggers to track deltas. With RBR, it is possible to replay per-table binlog updates
Design of Binlog Server
Binlog Server Design
Binlog Server Architecture
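Diagram: Client Connection <-> Frontend Connection Handler <-> MySQL Parser. The parser feeds the MySQL Replication Handler (which returns command responses and binlog event packets) and the Query Processor (which returns query results). The MySQL Replication Handler sends Request Binlog Dump GTID to the MySQL Binlog Storage Service.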


Binlog Server Design
Handling MySQL Client Connections
Diagram: Client Connection <-> Frontend Connection Handler <-> MySQL Parser

▪ Built on the existing framework


▪ MySQL connection/handshake handler
▪ A compact MySQL parser
Binlog Server Design
Processing Replication Queries
Diagram: Client Connection <-> Frontend Connection Handler <-> MySQL Parser -> Process Replication Queries -> Query Results


▪ SELECT Query Processor
▪ SERVER_ID, UNIX_TIMESTAMP, GTID_MODE, etc…
▪ SHOW
▪ rpl_semi_sync_master_enabled, SERVER_UUID
▪ SET
▪ SLAVE_UUID, MASTER_HEARTBEAT_PERIOD
Binlog Server Design
Processing Replication Commands
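Diagram: Client Connection <-> Frontend Connection Handler <-> MySQL Parser. Replication commands go to the Replication Commands Handler (Handling MySQL Replication), which returns command responses; queries go to the Query Processor, which returns query results.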

▪ Enabling MySQL replication protocol


▪ COM_REGISTER_SLAVE and COM_BINLOG_DUMP_GTID
Binlog Server Design
Handling Binlog Dump Requests
Diagram: Client Connection <-> Frontend Connection Handler <-> MySQL Parser. The MySQL Replication Handler sends Request Binlog Dump GTID to the MySQL Binlog Storage Service and returns command responses and binlog event packets to the client; the Query Processor returns query results.


Binlog Server Design
MySQL Binlog Storage Service
▪ A library for plugging in binlog storage features
▪ Implements most of the MySQL replication protocol in GTID mode
▪ Components:
▪ Binlog reader to fetch binlogs from different storage media
▪ Binlog locator
▪ Binlog writer in semi-sync/async mode
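
One way to picture the pluggable reader component is as an interface that each storage backend implements. The sketch below is illustrative only: the class and function names are hypothetical, the slides do not show the library's real API, and an in-memory backend stands in for HDFS or local disk.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, Iterator, List


class BinlogReader(ABC):
    """Hypothetical interface: yields raw binlog events from one binlog file."""

    @abstractmethod
    def read_events(self, path: str) -> Iterator[bytes]:
        ...


class InMemoryBinlogReader(BinlogReader):
    """Stand-in backend for the sketch; a real one would read HDFS or disk."""

    def __init__(self, files: Dict[str, List[bytes]]) -> None:
        self._files = files

    def read_events(self, path: str) -> Iterator[bytes]:
        yield from self._files[path]


def stream_binlogs(reader: BinlogReader, paths: Iterable[str]) -> Iterator[bytes]:
    """Concatenate the events of several binlogs, in the order given."""
    for path in paths:
        yield from reader.read_events(path)


reader = InMemoryBinlogReader({"binlog-1": [b"ev1", b"ev2"], "binlog-2": [b"ev3"]})
events = list(stream_binlogs(reader, ["binlog-1", "binlog-2"]))
```

The sender code depends only on `BinlogReader`, so swapping the storage medium means adding another subclass rather than touching the replication logic.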
Binlog Server Design
Binlog Server Operation Modes
▪ HDFS mode
▪ Binlogs are backed up to HDFS with a long retention time
▪ Serves the binlog backups on HDFS, acting as a master


▪ Log-tailer mode
▪ Backs up each MySQL instance’s binlogs as a semi-sync tailer
▪ Serves the log-tailer’s binlogs, acting as a master
Binlog Server Design
Components in HDFS mode
▪ Binlog reader/sender from HDFS
▪ A customized HDFS version of “binlog dump thread”
▪ HDFS binlog locator
▪ Uses info stored in locator DB for each replicaset
▪ HDFS binlog paths
▪ Previous GTID sets of each binlog
▪ Locates the list of required HDFS binlogs
▪ With a given GTID set
Binlog Server Design
Binlog Server in HDFS mode

Replicaset 12345

Diagram: a greatly lagged slave replicates from the master. Its slave status shows the error message "Master has purged the required binary logs".
Binlog Server Design
Binlog Server in HDFS mode

Replicaset 12345

(1) On the greatly lagged slave: change master to Binlog Server; start slave;

Diagram: the greatly lagged slave connects to the Binlog Server (Binlog Reader/Sender, Binlog Locator)
Binlog Server Design
Binlog Server in HDFS mode

Replicaset 12345

(2) The Binlog Locator locates the list of binlog paths to send, based on the slave’s GTID set

Binlog Locator DB:

HDFS file path            Prev GTID set
hdfs://***.binlog-1.gz    UUID:1-20
hdfs://***.binlog-2.gz    UUID:1-50
hdfs://***.binlog-3.gz    UUID:1-70
hdfs://***.binlog-4.gz    UUID:1-90
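
One plausible reading of step (2): start from the newest binlog whose previous-GTID set the slave has already fully executed, and send that file plus every later one. The sketch below simplifies heavily and is not the locator's actual code: it assumes a single source UUID and contiguous sets starting at 1, so each set is represented by its highest transaction number; `locate_binlogs` is a hypothetical name.

```python
# (HDFS path, highest transaction number in the file's previous-GTID set),
# oldest file first, taken from the locator table on the slide.
LOCATOR_ROWS = [
    ("hdfs://***.binlog-1.gz", 20),
    ("hdfs://***.binlog-2.gz", 50),
    ("hdfs://***.binlog-3.gz", 70),
    ("hdfs://***.binlog-4.gz", 90),
]


def locate_binlogs(rows, slave_executed_upto):
    """Return the paths to send to a slave that has executed 1..slave_executed_upto."""
    # Scan from newest to oldest for the first file whose previous-GTID set
    # is already covered by the slave; nothing before that file is needed.
    for index in range(len(rows) - 1, -1, -1):
        _, prev_upto = rows[index]
        if prev_upto <= slave_executed_upto:
            return [path for path, _ in rows[index:]]
    raise LookupError("slave is behind the oldest binlog known to the locator")


paths = locate_binlogs(LOCATOR_ROWS, 60)
```

With the table above, a slave that has executed UUID:1-60 would be sent binlog-2, binlog-3 and binlog-4, since binlog-2 is the newest file that starts at or before transaction 61.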
Binlog Server Design
Binlog Server in HDFS mode

Replicaset 12345

(3) The Binlog Reader/Sender reads the binlogs from the HDFS cluster and prepares binlog packet streams
Binlog Server Design
Binlog Server in HDFS mode

Replicaset 12345

(4) The Binlog Reader/Sender sends the binlogs to the greatly lagged slave, packet by packet
Binlog Server Design
Components in Log-tailer mode
▪ Binlog writer with acknowledgment capability
▪ Connects to the MySQL master as a semi-sync slave
▪ Writes binlogs to local disk
▪ Acknowledges events when the master requests it
▪ Binlog reader/sender from disk
▪ A customized version of “binlog dump thread”
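
A minimal sketch of the acknowledgment step, assuming the binlog server follows the stock MySQL semi-sync wire format: each event is preceded by a 0xEF magic byte and a flag byte whose low bit requests an ack, and the ack packet is the magic byte, an 8-byte little-endian binlog position, and the binlog file name. The slides do not show the server's actual implementation, and the function names below are hypothetical.

```python
import struct

SEMI_SYNC_MAGIC = 0xEF          # magic byte used by the semi-sync plugin
SEMI_SYNC_ACK_REQUESTED = 0x01  # flag bit: the master waits for an ack


def needs_ack(semi_sync_header: bytes) -> bool:
    """Inspect the 2-byte semi-sync header that precedes a binlog event."""
    return (
        len(semi_sync_header) >= 2
        and semi_sync_header[0] == SEMI_SYNC_MAGIC
        and bool(semi_sync_header[1] & SEMI_SYNC_ACK_REQUESTED)
    )


def build_ack(log_file: str, log_pos: int) -> bytes:
    """Build the ack payload: magic byte, 8-byte LE position, file name."""
    return struct.pack("<BQ", SEMI_SYNC_MAGIC, log_pos) + log_file.encode("ascii")


def parse_ack(payload: bytes):
    """Inverse of build_ack, useful for checking the layout."""
    magic, log_pos = struct.unpack_from("<BQ", payload)
    if magic != SEMI_SYNC_MAGIC:
        raise ValueError("not a semi-sync ack")
    return payload[9:].decode("ascii"), log_pos


ack = build_ack("binary-logs-3306.007965", 13366)
```

A writer built this way would persist the event to disk first and only then send the ack, so that an acknowledged transaction is guaranteed to be on the tailer.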
Binlog Server Design
Binlog Server in Log-tailer mode

Diagram: inside the Binlog Server log-tailer, the Binlog Writer/Acker receives semi-sync replication from the semi-sync master and writes binlogs to local disk (log.index, 00001.bin, 00002.bin, 00003.bin). When the master dies, the Binlog Reader serves those binlogs for promotion catchup to the slave that is to be promoted.



Operational Commands
Operational commands
Show Master Status
▪ HDFS mode
binlog_server> show master status\G
*************************** 1. row ***************************
File: hdfs://******.binary-logs-xxxxxx.xxxxxx.gz
Position: 4
Executed_Gtid_Set: 6c597fb0-d3a4-4aab-ba93-2286a75727ed:1-81669,
765a6781-d959-492b-8091-e6adeac313ee:1-53168

▪ Log-tailer mode
binlog_server> show master status\G
*************************** 1. row ***************************
File: binary-logs-3306.007965
Position: 13366
Executed_Gtid_Set: 49f5e0ca-80d2-4616-be83-d1aeb5e973bc:1-902909,
73707584-d9d1-49f1-b2bf-0ffb5e603b2d:1-81669

Operational commands
Show Slave Status in Log-tailer mode
binlog_server> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: HOSTNAME
Master_Port: PORT
Connect_Retry: 0
Master_Log_File: binary-logs-xxxxxx.007964
Read_Master_Log_Pos: 97115
Binlog_File: binary-logs-xxxxxx.007964
Binlog_Pos: 97115
Last_IO_Errno: 0
Master_Server_Id: 3695980966
Executed_Gtid_Set: ea4a5e01-b3e4-4273-a25e-88d06db8d1a5:1-902842,
b29a87bd-d60b-4455-9ab8-90d7b720f169:1-81669
Mysql_Replicaset: REPLICA_SET_NAME
Replicaset_Tier_Version: VERSION_NUM
Semisync_Slave: Yes
Operational commands
Show Master Logs in Log-tailer mode

binlog_server> show master logs;
+-------------------------+-----------+
| Log_name | File_size |
+-------------------------+-----------+
| binary-logs-3306.007962 | 124002 |
| binary-logs-3306.007963 | 131261 |
| binary-logs-3306.007964 | 15707 |
| binary-logs-3306.007964 | 110983 |
| binary-logs-3306.007965 | 127464 |
| binary-logs-3306.007966 | 135975 |
+-------------------------+-----------+

binlog_server> show master logs with gtid\G
*************************** 1. row
Log_name: binary-logs-3306.007963
File_size: 131261
Prev_gtid_set: 561d1725-ed2e-458a-a496-77c65701e6d7:1-902253,
1e407547-ca35-4838-a19c-e3c90e33ebd4:1-81669
*************************** 2. row
Log_name: binary-logs-3306.007964
File_size: 110983
Prev_gtid_set: 561d1725-ed2e-458a-a496-77c65701e6d7:1-902590,
1e407547-ca35-4838-a19c-e3c90e33ebd4:1-81669

……
Operational commands
Purging Logs in Log-tailer mode
binlog_server> show master logs;
+-------------------------+-----------+
| Log_name | File_size |
+-------------------------+-----------+
| binary-logs-3306.007962 | 124002 |
| binary-logs-3306.007963 | 131261 |
| binary-logs-3306.007964 | 15707 |
+-------------------------+-----------+

binlog_server> purge logs to binary-logs-3306.007963;


Query OK, 0 rows affected (0.00 sec)

binlog_server> show master logs;


+-------------------------+-----------+
| Log_name | File_size |
+-------------------------+-----------+
| binary-logs-3306.007963 | 131261 |
| binary-logs-3306.007964 | 69083 |
+-------------------------+-----------+
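
The listing above suggests that purge logs to keeps the named file and drops everything older, as PURGE BINARY LOGS TO does in stock MySQL. A small sketch of that behaviour on an in-memory log index; `purge_logs_to` is a hypothetical helper, not the binlog server's code:

```python
def purge_logs_to(log_index, target):
    """Return the log index with every file listed before `target` removed."""
    if target not in log_index:
        raise ValueError(f"unknown binlog: {target}")
    # The target itself is kept; only strictly older files are dropped.
    return log_index[log_index.index(target):]


before = [
    "binary-logs-3306.007962",
    "binary-logs-3306.007963",
    "binary-logs-3306.007964",
]
after = purge_logs_to(before, "binary-logs-3306.007963")
```

Here `after` holds 007963 and 007964, matching the second show master logs output on the slide.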
Operational commands
Start/Stop Slave in Log-tailer mode
binlog_server> start slave
binlog_server> show slave status\G
*************************** 1. row
Slave_IO_State: Waiting for master to send event
Master_Host: HOSTNAME
Master_Port: 3336
Connect_Retry: 0

……

binlog_server> stop slave


binlog_server> show slave status\G
*************************** 1. row
Slave_IO_State: Stopped
Master_Host: HOSTNAME
Master_Port: 3336
Connect_Retry: 0

……


Questions?
