MySQL MHA主库高可用方案学习
环境
- VMWare 3台虚拟机,CentOS 7
- MySQL 5.7.12
- MHA 0.56
安装
MySQL采用RPM包直接装,初始密码为grep 'temporary password' /var/log/mysqld.log
。如果需要的话,可以清空/var/lib/mysql
目录,然后执行mysqld --initialize
重新初始化数据库目录。
MHA源码安装,RPM安装会出问题:Can't locate MHA/NodeConst.pm in @INC
。另外需要安装一些perl的依赖:yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager cpan
。
MHA配置
1、我的配置文件:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
[server default]
manager_workdir=/home/roketyyang/Learn/MHA/work
manager_log=/home/roketyyang/Learn/MHA/work/manager.log
user=mha_manager
password=mhapass
ssh_user=roketyyang
ping_interval=1
repl_user=repl
repl_password=slavepass
[server1]
hostname=192.168.244.128
candidate_master=1
[server2]
hostname=192.168.244.129
candidate_master=1
[server3]
hostname=192.168.244.130
candidate_master=1
MHA会使用mha_manager连上数据库,做一些操作,其创建:
1
2
3
4
CREATE USER 'mha_manager'@'192.168.%' IDENTIFIED BY 'mhapass';
GRANT ALL ON *.* TO 'mha_manager'@'192.168.%';
CREATE USER 'repl'@'192.168.%' IDENTIFIED BY 'slavepass';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.%';
mha_manager需要有STOP SLAVE;CHANGE MASTER TO;RESET SLAVE等的权限,由于RESET SLAVE这个权限从官方文档中没找到要怎么GRANT,所以这里只能选择GRANT ALL了。所以mha_manager的权限级别算是很高的了= =
2、配置3台机器SSH免密互登录,MHA的Manager会先逐个ssh上Node机器,然后再在Node机器上ssh其他Node机器来测试是否相通。SSH公钥免密码登录
接着执行masterha_check_ssh --conf=./app.cnf
,结果:
3、配置主从同步,MySQL GTID的主从同步配置可参考我之前的文章:MySQL Gtid复制方案学习,然后执行masterha_check_repl --conf=./app.cnf
,结果:
4、启动masterha_manager
5、简单模拟failover,停掉master
6、从work/manager.log中可以看到failover的整个过程,如果成功的话,最后显示的会是:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
……
----- Failover Report -----
app: MySQL Master failover 192.168.244.128(192.168.244.128:3306) to 192.168.244.129(192.168.244.129:3306) succeeded
Master 192.168.244.128(192.168.244.128:3306) is down!
Check MHA Manager logs at localhost.localdomain:/home/roketyyang/Learn/MHA/work/manager.log for details.
Started automated(non-interactive) failover.
Selected 192.168.244.129(192.168.244.129:3306) as a new master.
192.168.244.129(192.168.244.129:3306): OK: Applying all logs succeeded.
192.168.244.130(192.168.244.130:3306): OK: Slave started, replicating from 192.168.244.129(192.168.244.129:3306)
192.168.244.129(192.168.244.129:3306): Resetting slave info succeeded.
Master failover to 192.168.244.129(192.168.244.129:3306) completed successfully
登录128机器查看slave信息:
1
2
3
4
5
6
7
8
9
10
11
12
13
Slave_IO_State | Waiting for master to send event
Master_Host | 192.168.244.129
Master_User | repl
Master_Port | 3306
Connect_Retry | 60
Master_Log_File | master.000003
Read_Master_Log_Pos | 194
Relay_Log_File | slave-2.000002
Relay_Log_Pos | 361
Relay_Master_Log_File | master.000003
Slave_IO_Running | Yes
Slave_SQL_Running | Yes
……
关于mha的实现原理,可查看其官方文档,也可看看这个PPT:Automated master failover。
结合 VIP 实现 failover 时,能继续提供写服务
在MHA的配置文件中配置master_ip_failover_script。
1、如何配置VIP:Create a Virtual IP Address in Linux
2、配置SSH用户有权限执行ifconfig、arping命令
在visudo中,添加roketyyang ALL= NOPASSWD: /sbin/ifconfig, /sbin/arping
。
这里需要注意的一点是ssh_start_vip中的 arping 命令,该命令向网段广播 ARP 请求包,并告知自己是 VIP 的拥有者。这样,同一网段的其他服务器就能够更新 ARP 缓存了。否则会因为 ARP 缓存的原因导致failover 后,一段时间内无法连接到 New Master 上。
参考:Allow A Normal User To Run Commands As root Under Linux / UNIX Operating Systems
3、master_ip_failover_script脚本
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
#!/usr/bin/env perl
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
use MHA::DBHelper;
my (
$command, $ssh_user, $orig_master_host,
$orig_master_ip, $orig_master_port, $new_master_host,
$new_master_ip, $new_master_port, $new_master_user,
$new_master_password
);
# 增加的部分代码
my $vip = '192.168.244.200';
my $gateway = '192.168.244.2';
my $ssh_start_vip = "sudo ifconfig eno16777736:1 $vip;sudo arping -c 3 -I eno16777736 -s $vip $gateway";
my $ssh_stop_vip = 'sudo ifconfig eno16777736:1 down';
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
);
exit &main();
sub main {
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
&stop_vip();
# updating global catalog, etc
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
# If you want to continue failover, exit 10.
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the status of the script: ssh -t $ssh_user\@$orig_master_host \"$ssh_start_vip\"\n";
`ssh -t $ssh_user\@$orig_master_host \"$ssh_start_vip\"`;
exit 0;
}
else {
&usage();
exit 1;
}
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
sub start_vip() {
print "Checking the start of the script: ssh -t $ssh_user\@$new_master_host \"$ssh_start_vip\"\n";
`ssh -t $ssh_user\@$new_master_host \"$ssh_start_vip\"`;
}
sub stop_vip() {
print "Checking the stop/stopssh of the script: ssh -t $ssh_user\@$orig_master_host \"$ssh_stop_vip\"\n";
`ssh -t $ssh_user\@$orig_master_host \"$ssh_stop_vip\"`;
}
模拟一遍failover,可看到master_ip_failover_script的调用流程:
1、masterha_manager启动时,会调用
1
Checking the status of the script: ssh -t roketyyang@192.168.244.129 "sudo ifconfig eno16777736:1 192.168.244.200;sudo arping -c 3 -I eno16777736 -s 192.168.244.200 192.168.244.2"
这时,在master设置VIP。
2、在failover阶段Phase 2: Dead Master Shutdown Phase..
,会调用
1
Checking the stop/stopssh of the script: ssh -t roketyyang@192.168.244.129 "sudo ifconfig eno16777736:1 down"
这时,是停掉了Dead Master上的VIP。
3、在failover阶段* Phase 3.3: New Master Recovery Phase..
,会调用
1
Checking the start of the script: ssh -t roketyyang@192.168.244.128 "sudo ifconfig eno16777736:1 192.168.244.200;sudo arping -c 3 -I eno16777736 -s 192.168.244.200 192.168.244.2"
这时,在New Master上设置VIP。
以上,整个流程都可以从manager.log看到。