
MySQL-MMM High Availability Configuration Walkthrough


Original work. Reproduction is permitted, provided the original source, author information, and this notice are clearly indicated with a hyperlink; otherwise legal liability will be pursued. http://koumm.blog.51cto.com/703525/1768381

Official MMM documentation: http://mysql-mmm.org/mmm2:guide

Environment used in this article:
Master:  CentOS 6.7 x64  192.168.0.201  mysql-5.6.29
Master:  CentOS 6.7 x64  192.168.0.202  mysql-5.6.29
Slave:   CentOS 6.7 x64  192.168.0.203  mysql-5.6.29
Slave:   CentOS 6.7 x64  192.168.0.204  mysql-5.6.29
Monitor: CentOS 6.7 x64  192.168.0.205

MMM server roles (host, IP, MMM name, server-id):
node1 (master 1)  192.168.0.201   db1    1
node2 (master 2)  192.168.0.202   db2    2
node3 (slave 1)   192.168.0.203   db3    11
node4 (slave 2)   192.168.0.204   db4    12
node5 (monitor)   192.168.0.205   mon    -

Once configuration is complete, access the MySQL cluster through the following VIPs:
192.168.0.211  writer  
192.168.0.212  reader  
192.168.0.213  reader     
192.168.0.214  reader     
192.168.0.215  reader

 

I. Preparing the dual-master environment and the slave servers

1. node1 (master 1) /etc/my.cnf

[mysqld]    
server-id              = 1     
binlog-format          = ROW     
log-bin                = master-bin     
log-bin-index          = master-bin.index     
log-slave-updates      = true     
auto_increment_offset  = 1     
auto_increment_increment = 2
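With auto_increment_increment = 2 and offsets of 1 and 2, the two masters hand out interleaved AUTO_INCREMENT values that can never collide. A quick sanity check (t is a hypothetical scratch table, assuming the default test schema exists):

mysql> create table test.t (id int auto_increment primary key, v int);
mysql> insert into test.t (v) values (1),(2),(3);
mysql> select id from test.t;   -- node1 generates 1, 3, 5; node2 would generate 2, 4, 6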


mysql> show master status;     
+-------------------+----------+--------------+------------------+-------------------+     
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |     
+-------------------+----------+--------------+------------------+-------------------+     
| master-bin.000001 |      120 |              |                  |                   |     
+-------------------+----------+--------------+------------------+-------------------+     
1 row in set (0.00 sec)

mysql>

 

2. node2 (master 2) /etc/my.cnf

[mysqld]    
server-id              = 2     
binlog-format          = ROW     
log-bin                = master-bin     
log-bin-index          = master-bin.index     
log-slave-updates      = true     
auto_increment_offset  = 2     
auto_increment_increment = 2


mysql> show master status;     
+-------------------+----------+--------------+------------------+-------------------+     
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |     
+-------------------+----------+--------------+------------------+-------------------+     
| master-bin.000001 |      120 |              |                  |                   |     
+-------------------+----------+--------------+------------------+-------------------+     
1 row in set (0.00 sec)

mysql>

 

3. Create the replication account on both masters (node1 and node2)

grant replication slave on *.* to 'repl'@'%' identified by '123456';    
flush privileges;
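Before running CHANGE MASTER you can verify the account works from the peer (a quick connectivity check; select 1 needs no table privileges):

[root@node1 ~]# mysql -urepl -p123456 -h192.168.0.202 -e "select 1;"
[root@node2 ~]# mysql -urepl -p123456 -h192.168.0.201 -e "select 1;"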

 

4. node3 (slave) /etc/my.cnf

[mysqld]    
binlog-format                = ROW     
log-bin                      = mysql-bin     
relay-log                    = slave-relay-bin     
relay-log-index              = slave-relay-bin.index     
log-slave-updates            = true     
server-id                    = 11     
skip-name-resolve

 

5. node4 (slave) /etc/my.cnf

[mysqld]    
binlog-format                = ROW     
log-bin                      = mysql-bin     
relay-log                    = slave-relay-bin     
relay-log-index              = slave-relay-bin.index     
log-slave-updates            = true     
server-id                    = 12     
skip-name-resolve

 

6. Configure node1 to replicate from node2

(1) Set up replication by running CHANGE MASTER manually; these settings are written to the master.info file.

mysql >

CHANGE MASTER TO    
MASTER_HOST='192.168.0.202',     
MASTER_PORT=3306,     
MASTER_USER='repl',     
MASTER_PASSWORD='123456',     
MASTER_LOG_FILE='master-bin.000001',     
MASTER_LOG_POS=120;

mysql> start slave;

mysql> show slave status\G
*************************** 1. row ***************************     
               Slave_IO_State: Waiting for master to send event     
                  Master_Host: 192.168.0.202     
                  Master_User: repl     
                  Master_Port: 3306     
                Connect_Retry: 60     
              Master_Log_File: master-bin.000001     
          Read_Master_Log_Pos: 407     
               Relay_Log_File: node1-relay-bin.000002     
                Relay_Log_Pos: 571     
        Relay_Master_Log_File: master-bin.000001     
             Slave_IO_Running: Yes     
            Slave_SQL_Running: Yes     
              Replicate_Do_DB:

 

7. Configure node2 to replicate from node1

(1) Set up replication by running CHANGE MASTER manually; these settings are written to the master.info file.

mysql >

CHANGE MASTER TO    
MASTER_HOST='192.168.0.201',     
MASTER_PORT=3306,     
MASTER_USER='repl',     
MASTER_PASSWORD='123456',     
MASTER_LOG_FILE='master-bin.000001',     
MASTER_LOG_POS=120;

mysql> start slave;    
mysql> show slave status\G
*************************** 1. row ***************************     
               Slave_IO_State: Waiting for master to send event     
                  Master_Host: 192.168.0.201     
                  Master_User: repl     
                  Master_Port: 3306     
                Connect_Retry: 60     
              Master_Log_File: master-bin.000001     
          Read_Master_Log_Pos: 694     
               Relay_Log_File: node2-relay-bin.000002     
                Relay_Log_Pos: 571     
        Relay_Master_Log_File: master-bin.000001     
             Slave_IO_Running: Yes     
            Slave_SQL_Running: Yes     
              Replicate_Do_DB:

 

8. Test master-master replication in both directions

(1) Create a database on node1 (the user table queried below is created on node2 in step (2) and replicates back):

mysql> create database mydb;    
Query OK, 1 row affected (0.00 sec)

mysql> use mydb;    
Database changed     
mysql> show tables;     
+----------------+     
| Tables_in_mydb |     
+----------------+     
| user           |     
+----------------+     
1 row in set (0.01 sec)

mysql> select * from user;    
+----+----------+----------+     
| id | username | password |     
+----+----------+----------+     
| 1  | koumm    | 123456   |     
+----+----------+----------+     
1 row in set (0.01 sec)

mysql>


(2) On node2, the operations are as follows:
mysql> show databases;     
+--------------------+     
| Database           |     
+--------------------+     
| information_schema |     
| mydb               |     
| mysql              |     
| performance_schema |     
+--------------------+     
4 rows in set (0.00 sec)

mysql> use mydb;    
Database changed     
mysql> CREATE TABLE `user` (     
    ->   `id` varchar(20) NOT NULL,     
    ->   `username` varchar(20) NOT NULL,     
    ->   `password` char(32) NOT NULL,     
    ->   PRIMARY KEY (`id`)     
    -> ) ;     
Query OK, 0 rows affected (0.14 sec)

mysql> INSERT INTO user  VALUES ('1', 'koumm', '123456');    
Query OK, 1 row affected (0.00 sec)

mysql>

 

II. Configure node3 and node4 to replicate from node1 (the master)

1. Lock tables on node1 and dump the database

(1) Lock tables on the master

mysql> flush tables with read lock;    
Query OK, 0 rows affected (0.00 sec)


mysql> show master status;     
+-------------------+----------+--------------+------------------+-------------------+     
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |     
+-------------------+----------+--------------+------------------+-------------------+     
| master-bin.000001 |     1226 |              |                  |                   |     
+-------------------+----------+--------------+------------------+-------------------+     
1 row in set (0.00 sec)


(2) Back up the master

[root@master ~]# mysqldump -uroot -p -B mydb > mydb.sql
Note: the -B option includes the CREATE DATABASE statement in the dump.
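To see exactly what -B adds, you can peek at the head of the dump (output abbreviated):

[root@master ~]# grep -E 'CREATE DATABASE|^USE' mydb.sql
CREATE DATABASE /*!32312 IF NOT EXISTS*/ `mydb` ...;
USE `mydb`;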


(3) Unlock the tables on the master

mysql> unlock tables;    
Query OK, 0 rows affected (0.00 sec)     
mysql>

 

2. Import the database on node3 and node4

# mysql -uroot -padmin < mydb.sql
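A quick check that the import landed (illustrative):

# mysql -uroot -padmin -e "select count(*) from mydb.user;"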

 

3. Configure replication on node3 and node4

(1) Set up replication by running CHANGE MASTER manually; these settings are written to the master.info file.

mysql >

CHANGE MASTER TO    
MASTER_HOST='192.168.0.201',     
MASTER_PORT=3306,     
MASTER_USER='repl',     
MASTER_PASSWORD='123456',     
MASTER_LOG_FILE='master-bin.000001',     
MASTER_LOG_POS=1226;


mysql> start slave;     
Query OK, 0 rows affected (0.04 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.201
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: master-bin.000001
          Read_Master_Log_Pos: 1226
               Relay_Log_File: slave-relay-bin.000002
                Relay_Log_Pos: 284
        Relay_Master_Log_File: master-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:


You can run a quick test to confirm that data is replicating to the slaves, as shown below.
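For example, insert a throwaway row on node1 and read it back on the slaves (hypothetical values):

[node1] mysql> insert into mydb.user values ('2', 'test', 'abcdef');
[node3] mysql> select * from mydb.user where id = '2';   -- the row should also appear on node4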

 

III. Configuring MMM

1. Install the MMM packages

(1) On every node (node1 through node5), install the EPEL repository:
# wget http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm     
# rpm -ivh epel-release-6-8.noarch.rpm

(2) On node5 (the monitor node):
# yum install mysql-mmm mysql-mmm-agent mysql-mmm-tools mysql-mmm-monitor

(3) On node1 through node4 (the MySQL nodes):
# yum install mysql-mmm-agent

 

2. Create the following two users on all MySQL nodes. You can do this on the master; the grants replicate to the other nodes automatically.

mysql> grant replication client on *.* to 'mmm_monitor'@'192.168.0.%' identified by '123456';    
mysql> grant super, replication client, process on *.* to 'mmm_agent'@'192.168.0.%' identified by '123456';     
mysql> flush privileges;
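From node5 you can confirm the monitor account reaches every MySQL node (a small shell loop, assuming the mysql client is installed on node5):

[root@node5 ~]# for i in 201 202 203 204; do
    mysql -ummm_monitor -p123456 -h192.168.0.$i -e "select 1;" > /dev/null && echo "192.168.0.$i OK"
done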

 

3. Create the following file on all hosts (node1 through node5)

[root@node1 ~]# vi /etc/mysql-mmm/mmm_common.conf    
active_master_role      writer

<host default>    
    cluster_interface       eth0     
    pid_path                /var/run/mysql-mmm/mmm_agentd.pid     
    bin_path                /usr/libexec/mysql-mmm/     
    replication_user        repl     
    replication_password    123456     
    agent_user              mmm_agent     
    agent_password          123456     
</host>

<host db1>    
    ip      192.168.0.201     
    mode    master     
    peer    db2     
</host>

<host db2>    
    ip      192.168.0.202     
    mode    master     
    peer    db1     
</host>

<host db3>    
    ip      192.168.0.203     
    mode    slave     
</host>

<host db4>    
    ip      192.168.0.204     
    mode    slave     
</host>

<role writer>    
    hosts   db1, db2     
    ips     192.168.0.211     
    mode    exclusive     
</role>

<role reader>    
    hosts   db1, db2, db3, db4     
    ips     192.168.0.212,192.168.0.213,192.168.0.214,192.168.0.215     
    mode    balanced     
</role>

In the role definitions above, exclusive means only one host holds the writer VIP at any time, while balanced spreads the reader VIPs across all ONLINE hosts. Copy the configuration file to every node:
[root@node1 ~]# scp /etc/mysql-mmm/mmm_common.conf node2:/etc/mysql-mmm/     
[root@node1 ~]# scp /etc/mysql-mmm/mmm_common.conf node3:/etc/mysql-mmm/     
[root@node1 ~]# scp /etc/mysql-mmm/mmm_common.conf node4:/etc/mysql-mmm/     
[root@node1 ~]# scp /etc/mysql-mmm/mmm_common.conf node5:/etc/mysql-mmm/

 

4. Configure the agent nodes; every MySQL master and slave runs an agent

(1) Configure mmm_agent.conf on the MySQL nodes (node1 through node4)

# vi /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf
this db1   # on each of the four MySQL nodes set the matching name: db1, db2, db3, db4; see the sketch below
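Each node must point this at its own entry. For example, on node3 (a sketch; run the matching command on each node):

[root@node3 ~]# sed -i 's/^this .*/this db3/' /etc/mysql-mmm/mmm_agent.conf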


(2) Enabled by default; normally no change is needed
# cat /etc/default/mysql-mmm-agent     
# mysql-mmm-agent defaults     
ENABLED=1

(3) Start the agent service on the MySQL nodes

# chkconfig mysql-mmm-agent on    
# /etc/init.d/mysql-mmm-agent start
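A simple way to confirm the agent came up (the brackets keep grep from matching itself):

# ps -ef | grep [m]mm_agentd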

 

5. Configure the MMM monitor node (node5)

(1) Configure the monitor
[root@node5 ~]# vi /etc/mysql-mmm/mmm_mon.conf     
include mmm_common.conf

<monitor>    
    ip                  127.0.0.1     
    pid_path            /var/run/mysql-mmm/mmm_mond.pid     
    bin_path            /usr/libexec/mysql-mmm     
    status_path         /var/lib/mysql-mmm/mmm_mond.status     
    ping_ips            192.168.0.201,192.168.0.202,192.168.0.203,192.168.0.204     
    auto_set_online     30

    # The kill_host_bin does not exist by default, though the monitor will    
    # throw a warning about it missing.  See the section 5.10 "Kill Host     
    # Functionality" in the PDF documentation.     
    #     
    # kill_host_bin     /usr/libexec/mysql-mmm/monitor/kill_host     
    #     
</monitor>

<host default>    
    monitor_user        mmm_monitor     
    monitor_password    123456     
</host>

debug 0


(2) Start the monitor service on node5

# chkconfig mysql-mmm-monitor on    
# /etc/init.d/mysql-mmm-monitor start
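Newly detected hosts stay in AWAITING_RECOVERY for the auto_set_online window (30 seconds above) before going ONLINE; you can watch this happen in the monitor log:

[root@node5 ~]# tail -f /var/log/mysql-mmm/mmm_mond.log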

 

6. Check the cluster status

(1) Check the status from the monitor server

[root@node5 ~]# mmm_control show    
  db1(192.168.0.201) master/ONLINE. Roles: reader(192.168.0.215), writer(192.168.0.211)     
  db2(192.168.0.202) master/ONLINE. Roles: reader(192.168.0.213)     
  db3(192.168.0.203) slave/ONLINE. Roles: reader(192.168.0.212)     
  db4(192.168.0.204) slave/ONLINE. Roles: reader(192.168.0.214)

[root@node5 ~]# mmm_control checks all    
db4  ping         [last change: 2016/04/27 16:45:49]  OK     
db4  mysql        [last change: 2016/04/27 16:45:49]  OK     
db4  rep_threads  [last change: 2016/04/27 16:45:49]  OK     
db4  rep_backlog  [last change: 2016/04/27 16:45:49]  OK: Backlog is null     
db2  ping         [last change: 2016/04/27 16:45:49]  OK     
db2  mysql        [last change: 2016/04/27 16:45:49]  OK     
db2  rep_threads  [last change: 2016/04/27 16:45:49]  OK     
db2  rep_backlog  [last change: 2016/04/27 16:45:49]  OK: Backlog is null     
db3  ping         [last change: 2016/04/27 16:45:49]  OK     
db3  mysql        [last change: 2016/04/27 16:45:49]  OK     
db3  rep_threads  [last change: 2016/04/27 16:45:49]  OK     
db3  rep_backlog  [last change: 2016/04/27 16:45:49]  OK: Backlog is null     
db1  ping         [last change: 2016/04/27 16:45:49]  OK     
db1  mysql        [last change: 2016/04/27 16:45:49]  OK     
db1  rep_threads  [last change: 2016/04/27 16:45:49]  OK     
db1  rep_backlog  [last change: 2016/04/27 16:45:49]  OK: Backlog is null     
[root@node5 ~]#


(2) VIPs brought up on the servers

[root@node1 ~]# ip a    
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN     
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00     
    inet 127.0.0.1/8 scope host lo     
    inet6 ::1/128 scope host     
       valid_lft forever preferred_lft forever     
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000     
    link/ether 00:0c:29:ff:33:6a brd ff:ff:ff:ff:ff:ff     
    inet 192.168.0.201/24 brd 192.168.0.255 scope global eth0     
    inet 192.168.0.215/32 scope global eth0     
    inet 192.168.0.211/32 scope global eth0     
    inet6 fe80::20c:29ff:feff:336a/64 scope link     
       valid_lft forever preferred_lft forever     
[root@node1 ~]#

[root@node2 ~]# ip a    
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN     
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00     
    inet 127.0.0.1/8 scope host lo     
    inet6 ::1/128 scope host     
       valid_lft forever preferred_lft forever     
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000     
    link/ether 00:0c:29:8c:60:58 brd ff:ff:ff:ff:ff:ff     
    inet 192.168.0.202/24 brd 192.168.0.255 scope global eth0     
    inet 192.168.0.213/32 scope global eth0     
    inet6 fe80::20c:29ff:fe8c:6058/64 scope link     
       valid_lft forever preferred_lft forever     
[root@node2 ~]#

All reads and writes go through the VIPs; when a node fails, its VIPs float to another node, which takes over service.
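For example, an application would write through the writer VIP and read through any reader VIP (app/apppass is a hypothetical application account):

# mysql -h 192.168.0.211 -uapp -papppass mydb    # writes
# mysql -h 192.168.0.212 -uapp -papppass mydb    # reads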


IV. Testing MMM high availability

First check the status of the whole cluster; everything is normal.
[root@node5 ~]# mmm_control show      
  db1(192.168.0.201) master/ONLINE. Roles: reader(192.168.0.212), writer(192.168.0.211)
  db2(192.168.0.202) master/ONLINE. Roles: reader(192.168.0.213)
  db3(192.168.0.203) slave/ONLINE. Roles: reader(192.168.0.215)
  db4(192.168.0.204) slave/ONLINE. Roles: reader(192.168.0.214)
[root@node5 ~]#

1. Simulate a node2 outage by manually stopping the MySQL service, then watch the monitor log as node2 goes down and later comes back:
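For example, on node2 (the init script may be named mysql or mysqld depending on how MySQL was installed):

[root@node2 ~]# /etc/init.d/mysqld stop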

[root@node5 ~]# tail -f /var/log/mysql-mmm/mmm_mond.log 
2016/04/29 09:58:33 FATAL State of host 'db2' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK)
2016/04/29 10:00:54 FATAL State of host 'db2' changed from HARD_OFFLINE to AWAITING_RECOVERY
2016/04/29 10:01:24 FATAL State of host 'db2' changed from AWAITING_RECOVERY to ONLINE because of auto_set_online(30 seconds). It was in state AWAITING_RECOVERY for 30 seconds

Check the latest cluster status again:

[root@node5 ~]# mmm_control show
  db1(192.168.0.201) master/ONLINE. Roles: reader(192.168.0.212), writer(192.168.0.211)
  db2(192.168.0.202) master/HARD_OFFLINE. Roles:   # the log shows db2 went from ONLINE to HARD_OFFLINE
  db3(192.168.0.203) slave/ONLINE. Roles: reader(192.168.0.213), reader(192.168.0.215)
  db4(192.168.0.204) slave/ONLINE. Roles: reader(192.168.0.214)

[root@node5 ~]# mmm_control checks all
db4  ping         [last change: 2016/04/29 09:01:08]  OK
db4  mysql        [last change: 2016/04/29 09:01:08]  OK
db4  rep_threads  [last change: 2016/04/29 09:01:08]  OK
db4  rep_backlog  [last change: 2016/04/29 09:01:08]  OK: Backlog is null
db2  ping         [last change: 2016/04/29 09:01:08]  OK
db2  mysql        [last change: 2016/04/29 09:58:33]  ERROR: Connect error (host = 192.168.0.202:3306, user = mmm_monitor)! Lost connection to MySQL server at 'reading initial communication packet', system error: 111
(db2 still answers ping, so only the MySQL service is down.)
db2  rep_threads  [last change: 2016/04/29 09:01:08]  OK
db2  rep_backlog  [last change: 2016/04/29 09:01:08]  OK: Backlog is null
db3  ping         [last change: 2016/04/29 09:01:08]  OK
db3  mysql        [last change: 2016/04/29 09:01:08]  OK
db3  rep_threads  [last change: 2016/04/29 09:01:08]  OK
db3  rep_backlog  [last change: 2016/04/29 09:01:08]  OK: Backlog is null
db1  ping         [last change: 2016/04/29 09:01:08]  OK
db1  mysql        [last change: 2016/04/29 09:56:33]  OK
db1  rep_threads  [last change: 2016/04/29 09:01:08]  OK
db1  rep_backlog  [last change: 2016/04/29 09:01:08]  OK: Backlog is null

[root@node5 ~]#

Start node2 again: it moves from HARD_OFFLINE through AWAITING_RECOVERY back to ONLINE, and db2 takes over read requests again.

[root@node5 ~]# mmm_control show      
  db1(192.168.0.201) master/ONLINE. Roles: reader(192.168.0.212), writer(192.168.0.211)
  db2(192.168.0.202) master/ONLINE. Roles: reader(192.168.0.213)
  db3(192.168.0.203) slave/ONLINE. Roles: reader(192.168.0.215)
  db4(192.168.0.204) slave/ONLINE. Roles: reader(192.168.0.214)
[root@node5 ~]#


2. Simulate an outage of the node1 master and check the cluster status.

[root@node5 ~]# mmm_control show
  db1(192.168.0.201) master/HARD_OFFLINE. Roles: 
  db2(192.168.0.202) master/ONLINE. Roles: reader(192.168.0.213), writer(192.168.0.211)
  db3(192.168.0.203) slave/ONLINE. Roles: reader(192.168.0.212), reader(192.168.0.215)
  db4(192.168.0.204) slave/ONLINE. Roles: reader(192.168.0.214)

Restart the node1 master:

[root@node5 ~]# mmm_control show
  db1(192.168.0.201) master/ONLINE. Roles: reader(192.168.0.212)
  db2(192.168.0.202) master/ONLINE. Roles: reader(192.168.0.213), writer(192.168.0.211)
  db3(192.168.0.203) slave/ONLINE. Roles: reader(192.168.0.215)
  db4(192.168.0.204) slave/ONLINE. Roles: reader(192.168.0.214)

Note that after the old master restarts it automatically takes only a reader role; it does not take back the writer role until the current master fails again.

[root@node5 ~]# tail -f /var/log/mysql-mmm/mmm_mond.log 
2016/04/29 10:03:25 FATAL State of host 'db1' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK)
2016/04/29 10:06:20 FATAL State of host 'db1' changed from HARD_OFFLINE to AWAITING_RECOVERY
2016/04/29 10:06:51 FATAL State of host 'db1' changed from AWAITING_RECOVERY to ONLINE because of auto_set_online(30 seconds). It was in state AWAITING_RECOVERY for 31 seconds

3. Test summary:
(1) An outage of node2, the standby master, does not affect the cluster; node2's reader role is simply removed.
(2) When node1, the master, goes down, node2 takes over the writer role, and node3 and node4 automatically CHANGE MASTER to node2 and replicate from the new master.
(3) The caveats: if node1 dies while node2's replication lags behind it, node2 still becomes the writable master, so data consistency cannot be guaranteed. Likewise, if node2, node3 and node4 all lag behind node1 when it dies, node3 and node4 wait until they have applied everything they received from db1 before repointing to node2, and consistency still cannot be guaranteed.
(4) If you adopt the MMM architecture, give the master and standby master identical hardware, and enable semi-synchronous replication for extra safety (see the sketch below), or use MariaDB's multi-threaded slave replication to improve replication performance.
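For reference, a minimal MySQL 5.6 semi-synchronous replication setup looks like this sketch; since either master can act as master or slave, install both plugins on node1 and node2:

mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
mysql> SET GLOBAL rpl_semi_sync_master_enabled = 1;
mysql> SET GLOBAL rpl_semi_sync_slave_enabled = 1;
mysql> SET GLOBAL rpl_semi_sync_master_timeout = 1000;  -- ms; replication falls back to async on timeout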

   
Note: MMM (Master-Master Replication Manager for MySQL) offers good stability, high availability, and scalability. When the active master fails, the standby master takes over immediately, and the slaves switch to the new master and continue replicating without manual intervention. On the other hand, MMM requires many nodes and IPs, it can be unstable under very heavy read/write load, and it may suffer replication lag or failed switchovers. MMM is therefore a poor fit for environments that demand strong data safety while being busy with both reads and writes.

This article comes from "koumm's Linux technology blog"; please retain this attribution: http://koumm.blog.51cto.com/703525/1768381