1、MHA的介绍:
由日本DeNA公司youshimaton开发。
是一套优秀的实现MySQL高可用的解决方案。
数据库的自动故障切换操作能做到在0~30秒之内完成。
MHA能确保在故障切换过程中最大限度保证数据的一致性,以达到真正意义上的高可用。
2、MHA集群架构:
数据库服务器至少需要3台以上,才能实现数据的高可用。
主和从服务器之间有一个vip地址,当主出现问题时,从服务器会自动转换成主服务器,为了保证数据库服务器完全正常,使用2台从数据库服务器。
3、MHA工作原理:
当管理数据库集群的服务器manager定时探测集群中的master节点。
当master故障时,manager自动将拥有最新数据的从服务器提升为新的master服务器。相当于manager为一个监控与执行服务器。
4、MHA的缺点:
必须要有vip地址,宕机的主服务器需要手动添加到集群里,还需要手动同步宕机期间的数据,管理服务发现主服务器宕机后,会调用故障切换脚本,把vip地址部署在新的主数据库服务器上。管理服务会自动停止,需要手动启动管理服务器,才能监视新的主数据服务器,故障切换期间会有数据丢失的情况。
5、配置MAH集群
管理端软件(管理节点):安装管理集群主机上的软件
数据端软件(数据节点):安装在数据库服务器上的软件
链接:https://pan.baidu.com/s/1vD-Br7x5Ovs2Ck_apaZJPA
提取码:xbjy
说明:zip格式,可以使用unzip解压
5.1、环境准备
数据服务器(3台):192.168.4.10 | 20 | 30,配置全新的数据库服务
管理服务器(1台):192.168.4.40,不需要有数据库服务
客 户 端(1台):192.168.4.50,有连接命令mysql即可
所有服务器关闭防火墙、selinux服务
5.2、三台数据服务器的部署
配置192.168.4.10服务器
[mysqld]
plugin-load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl_semi_sync_slave_enabled=1
rpl_semi_sync_master_enabled=1
relay_log_purge=0
server_id=10
log_bin=master10
[root@mysql10 ~]# systemctl restart mysqld
[root@mysql10 ~]# grep "password" /var/log/mysqld.log
2022-05-18T03:16:56.610678Z 1 [Note] A temporary password is generated for root@localhost: 2f;?Aha*qNUd
[root@mysql10 ~]# mysql -uroot -p"2f;?Aha*qNUd"
mysql> alter user root@"localhost" identified by "JY1987...zy2011";
Query OK, 0 rows affected (10.01 sec)
mysql> exit
[root@mysql10 ~]# mysql -uroot -pJY1987...zy2011
mysql> grant replication slave on *.* to repluser@"%" identified by "JY1987...zy2011";
Query OK, 0 rows affected, 1 warning (0.01 sec)
配置192.168.4.20服务器
[root@mysql20 ~]# vim /etc/my.cnf
[mysqld]
plugin-load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl_semi_sync_master_enabled=1
rpl_semi_sync_slave_enabled=1
relay_log_purge=0
server_id=20
log_bin=master20
[root@mysql20 ~]# systemctl restart mysqld
[root@mysql20 ~]# grep "password" /var/log/mysqld.log
2022-05-18T03:22:14.083847Z 1 [Note] A temporary password is generated for root@localhost: e>5/0DGNP5.T
[root@mysql20 ~]# mysql -uroot -p"e>5/0DGNP5.T"
mysql> alter user root@"localhost" identified by "JY1987...zy2011";
Query OK, 0 rows affected (10.00 sec)
mysql> exit
[root@mysql20 ~]# mysql -uroot -pJY1987...zy2011
mysql> grant replication slave on *.* to repluser@"%" identified by "JY1987...zy2011";
Query OK, 0 rows affected, 1 warning (0.00 sec)
配置192.168.4.30服务器
[root@mysql30 ~]# vim /etc/my.cnf
[mysqld]
plugin-load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl_semi_sync_master_enabled=1
rpl_semi_sync_slave_enabled=1
relay_log_purge=0
server_id=30
log_bin=master30
[root@mysql30 ~]# systemctl restart mysqld
[root@mysql30 ~]# grep "password" /var/log/mysqld.log
2022-05-18T03:22:31.746573Z 1 [Note] A temporary password is generated for root@localhost: fo448kwl=&4O
[root@mysql30 ~]# mysql -uroot -p"fo448kwl=&4O"
mysql> alter user root@"localhost" identified by "JY1987...zy2011";
Query OK, 0 rows affected (10.01 sec)
mysql> exit
[root@mysql30 ~]# mysql -uroot -pJY1987...zy2011
mysql> grant replication slave on *.* to repluser@"%" Identified by "JY1987...zy2011";
Query OK, 0 rows affected, 1 warning (0.00 sec)
5.3、数据库服务器配置ssh免密登录
[root@mysql10 ~]# ssh-keygen 遇到提示就回车
[root@mysql10 ~]# ssh-copy-id root@192.168.4.20
[root@mysql10 ~]# ssh-copy-id root@192.168.4.30
[root@mysql20 ~]# ssh-keygen 遇到提示就回车
[root@mysql20 ~]# ssh-copy-id root@192.168.4.10
[root@mysql20 ~]# ssh-copy-id root@192.168.4.30
[root@mysql30 ~]# ssh-keygen 遇到提示就回车
[root@mysql30 ~]# ssh-copy-id root@192.168.4.10
[root@mysql30 ~]# ssh-copy-id root@192.168.4.20
5.4、部署一主从多的架构,mysql20和mysq30都做如下配置,配置为mysql10的从
mysql> change master to master_host="192.168.4.10";
Query OK, 0 rows affected (0.12 sec)
mysql> change master to master_host="192.168.4.10",
-> master_user="repluser",
-> master_password="JY1987...zy2011",
-> master_log_file="master10.000002",
-> master_log_pos=685;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.4.10
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master10.000002
Read_Master_Log_Pos: 685
Relay_Log_File: mysql20-relay-bin.000002
Relay_Log_Pos: 319
Relay_Master_Log_File: master10.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
5.4、管理服务器配置ssh免密登录
[root@mah40 ~]# ssh-keygen 遇到提示就回车
[root@mah40 ~]# ssh-copy-id root@192.168.4.10
[root@mah40 ~]# ssh-copy-id root@192.168.4.20
[root@mah40 ~]# ssh-copy-id root@192.168.4.30
5.5、管理服务器安装相关软件及配置
[root@mah40 ~]# ls
anaconda-ks.cfg mha.zip
[root@mah40 ~]# unzip mha.zip
[root@mah40 ~]# cd mha/
[root@mah40 mha]# yum -y install mha4mysql-node-0.56-0.el6.noarch.rpm
[root@mah40 mha]# yum -y install perl-*.rpm
[root@mah40 mha]# yum -y install perl-ExtUtils-* perl-CPAN*
[root@mah40 ~]# tar -xf mha4mysql-manager-0.56.tar.gz
[root@mah40 ~]# cd mha4mysql-manager-0.56/
[root@mah40 mha4mysql-manager-0.56]# perl Makefile.PL
*** Module::AutoInstall version 1.03
*** Checking for Perl dependencies...
[Core Features]
- DBI ...loaded. (1.627)
- DBD::mysql ...loaded. (4.023)
- Time::HiRes ...loaded. (1.9725)
- Config::Tiny ...loaded. (2.14)
- Log::Dispatch ...loaded. (2.41)
- Parallel::ForkManager ...loaded. (1.18)
- MHA::NodeConst ...loaded. (0.56)
*** Module::AutoInstall configuration finished.
Checking if your kit is complete...
Looks good
Writing Makefile for mha4mysql::manager
Writing MYMETA.yml and MYMETA.json
[root@mah40 mha4mysql-manager-0.56]# make && make install
[root@mah40 mha4mysql-manager-0.56]# masterha_
masterha_check_repl masterha_check_status masterha_manager masterha_master_switch masterha_stop
masterha_check_ssh masterha_conf_host masterha_master_monitor masterha_secondary_check
5.6、创建并编辑管理服务的主配置文件
[root@mah40 ~]# mkdir /etc/mha
[root@mah40 ~]# cd mha4mysql-manager-0.56/
[root@mah40 mha4mysql-manager-0.56]# cp samples/conf/app1.cnf /etc/mha/
[root@mah40 ~]# vim /etc/mha/app1.cnf # 编辑配置文件
[server default]
manager_workdir=/etc/mha
manager_log=/etc/mha/manager.log
master_ip_failover_script=/etc/mha/master_ip_failover
ssh_user=root
ssh_port=22
repl_user=repluser
repl_password=JY1987...zy2011
user=jy
password=JY1987...zy2011
[server1]
hostname=192.168.4.10
port=3306
candidate_master=1
[server2]
hostname=192.168.4.20
port=3306
candidate_master=1
[server3]
hostname=192.168.4.30
port=3306
candidate_master=1
# 创建故障切换脚本并指定vip地址部署在哪块网卡上
[root@mah40 ~]# cd mha
[root@mah40 mha]# mv master_ip_failover /etc/mha/
[root@mah40 mha]# chmod +x /etc/mha/master_ip_failover
[root@mah40 mha]# vim +35 /etc/mha/master_ip_failover
my $vip = '192.168.4.100/24'; # Virtual IP
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";
5.7、配置三台数据库服务器,数据库都有ifconfig 命令
1)把故障切换脚本里指定的vip地址,配置在当前主从结构种的主数据库服务器mysql10主机上
[root@mysql10 ~]# which ifconfig || yum -y install net-tools
[root@mysql10 ~]# ifconfig eth0:1 192.168.4.100/24
[root@mysql10 ~]# ifconfig eht0:1
eth0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.4.100 netmask 255.255.255.0 broadcast 192.168.4.255
ether 00:0c:29:34:46:2e txqueuelen 1000 (Ethernet)
2)三台数据库服务器都要安装以下安软件
[root@mysql10 mha]# yum -y install perl-*.rpm
[root@mysql10 mha]# yum -y install mha4mysql-node-0.56-0.el6.noarch.rpm
3)添加监控用户,在master服务器添加用户,在slave服务器查看用户
[root@mysql10 ~]# mysql -uroot -pJY1987...zy2011
mysql> grant all on *.* to jy@"%" identified by "JY1987...zy2011";
Query OK, 0 rows affected, 1 warning (0.00 sec)
[root@mysql20 ~]# mysql -uroot -pJY1987...zy2011 -e "select user from mysql.user where user='jy'"
mysql: [Warning] Using a password on the command line interface can be insecure.
+------+
| user |
+------+
| jy |
+------+
[root@mysql30 ~]# mysql -uroot -pJY1987...zy2011 -e "select user from mysql.user where user='jy'"
mysql: [Warning] Using a password on the command line interface can be insecure.
+------+
| user |
+------+
| jy |
+------+
5.8、在管理主机对配置做测试
[root@mah ~]# masterha_check_ssh --conf=/etc/mha/app1.cnf # 测试免密登录
Wed May 18 00:17:51 2022 - [info] All SSH connection tests passed successfully. # 提示成功
[root@mah ~]# masterha_check_repl --conf=/etc/mha/app1.cnf # 测试主从同步
MySQL Replication Health is NOT OK! # 提示成功
1)测试成功,才能启动服务
[root@mah ~]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover 2> /dev/null &
[1] 5575
[root@mah ~]# jobs
[1]+ 运行中 nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover 2> /dev/null &
[root@mah40 ~]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:11846) is running(0:PING_OK), master:192.168.4.10
[root@mysql10 ~]# ifconfig eth0:1
eth0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.4.100 netmask 255.255.255.0 broadcast 192.168.4.255
ether 00:0c:29:cc:dc:7a txqueuelen 1000 (Ethernet)
5.9、集群测试
1)主服务器mysql10添加访问数据的连接用户
mysql> create database xb;
Query OK, 1 row affected (0.00 sec)
mysql> create table xb.zy(id int);
Query OK, 0 rows affected (0.01 sec)
mysql> grant select ,insert on xb.* to jj@"%" identified by "JY1987...zy2011";
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> insert into xb.zy(id) values(99);
Query OK, 1 row affected (10.07 sec)
mysql> select * from xb.zy;
+------+
| id |
+------+
| 99 |
+------+
2)客户端50连接vip地址访问集群
[root@host50 ~]# mysql -h192.168.4.100 -ujj -pJY1987...zy2011;
MySQL [(none)]> select * from xb.zy;
Empty set (0.00 sec)
MySQL [(none)]> exit
3)两台从服务器查看数据
[root@mysql20 ~]# mysql -uroot -pJY1987...zy2011 -e "select * from xb.zy"
mysql: [Warning] Using a password on the command line interface can be insecure.
+------+
| id |
+------+
| 99 |
+------+
[root@mysql30 ~]# mysql -uroot -pJY1987...zy2011 -e "select * from xb.zy"
mysql: [Warning] Using a password on the command line interface can be insecure.
+------+
| id |
+------+
| 99 |
+------+
4)停止主服务器mysql10的数据库服务
[root@mysql10 ~]# systemctl stop mysqld.service # 停止主服务的数据库服务
5) 客户端50再次连接vip地址访问集群,验证集群的高可靠
[root@host50 ~]# mysql -h192.168.4.100 -ujj -pJY1987...zy2011;
MySQL [(none)]> select * from xb.zy;
+------+
| id |
+------+
| 99 |
+------+
1 row in set (0.01 sec)
6)在两台从服务器查看vip地址,vip地址192.168.4.100会切换到新的主数据库服务器上
[root@mysql20 ~]# ifconfig eth0:1
eth0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.4.100 netmask 255.255.255.0 broadcast 192.168.4.255
ether 00:0c:29:69:e0:6a txqueuelen 1000 (Ethernet)
[root@mysql30 ~]# ifconfig eth0:1
eth0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
ether 00:0c:29:77:20:31 txqueuelen 1000 (Ethernet)
[root@mysql30 mha]# mysql -uroot -pJY1987...zy2011 -e 'show slave status\G' | grep -i yes
mysql: [Warning] Using a password on the command line interface can be insecure.
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[root@mysql30 mha]# mysql -uroot -pJY1987...zy2011 -e 'show slave status\G' | grep -i master_host
mysql: [Warning] Using a password on the command line interface can be insecure.
Master_Host: 192.168.4.20
7)在mha管理主机40上查看配置文件和服务状态
[root@mah40 ~]# grep "server[1,2,3]" /etc/mha/app1.cnf # 主数据库服务器的信息没有了
[server2]
[server3]
[root@mah40 ~]# masterha_check_status --conf=/etc/mha/app1.cnf # 出现故障后,服务会停掉,要重新启动服务
app1 is stopped(2:NOT_RUNNING).
[root@mah40 ~]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover 2> /dev/null &
[1] 12893
[root@mah40 ~]# jobs
[1]+ 运行中 nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover 2> /dev/null &
[root@mah40 ~]# masterha_check_status --conf=/etc/mha/app1.cnf # 服务监控新的主数据库服务器
app1 (pid:12893) is running(0:PING_OK), master:192.168.4.20
5.10、修复宕机的10数据库服务器
1)把192.168.4.10配置为当前master服务器192.168.4.20的slave
# 在master主机上做数据备份,拷贝给10服务器
[root@mysql20 mha]# mysqldump -uroot -pJY1987...zy2011 --master-data -B xb > xb.sql
[root@mysql20 mha]# scp xb.sql 192.168.4.10:/root/
# 数据同步后,把10服务器配置为master数据库服务的slave
[root@mysql10 ~]# grep master20 xb.sql
CHANGE MASTER TO MASTER_LOG_FILE='master20.000003', MASTER_LOG_POS=154;
[root@mysql10 ~]# systemctl restart mysqld
[root@mysql10 ~]# mysql -uroot -pJY1987...zy2011 < xb.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
[root@mysql10 ~]# mysql -uroot -pJY1987...zy2011
mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)
mysql> change master to master_host="192.168.4.20",
-> master_user="repluser",
-> master_password="JY1987...zy2011",
-> master_log_file="master20.000003",
-> master_log_pos=154;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.4.20
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master20.000003
Read_Master_Log_Pos: 403
Relay_Log_File: mysql10-relay-bin.000002
Relay_Log_Pos: 568
Relay_Master_Log_File: master20.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes