02 May 2016

环境


  • VMWare 3台虚拟机,CentOS 7
  • MySQL 5.7.12
  • MHA 0.56

安装


MySQL采用RPM包直接装,初始密码为grep 'temporary password' /var/log/mysqld.log。如果需要的话,可以清空/var/lib/mysql目录,然后执行mysqld --initialize重新初始化数据库目录。

MHA源码安装,RPM安装会出问题:Can't locate MHA/NodeConst.pm in @INC。另外需要安装一些perl的依赖:yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager cpan

MHA配置


1、我的配置文件:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
[server default]
manager_workdir=/home/roketyyang/Learn/MHA/work
manager_log=/home/roketyyang/Learn/MHA/work/manager.log
user=mha_manager
password=mhapass
ssh_user=roketyyang
ping_interval=1
repl_user=repl
repl_password=slavepass

[server1]
hostname=192.168.244.128
candidate_master=1

[server2]
hostname=192.168.244.129
candidate_master=1

[server3]
hostname=192.168.244.130
candidate_master=1

MHA会使用mha_manager连上数据库,做一些操作,其创建:

1
2
3
4
CREATE USER 'mha_manager'@'192.168.%' IDENTIFIED BY 'mhapass';
GRANT ALL ON *.* TO 'mha_manager'@'192.168.%';
CREATE USER 'repl'@'192.168.%' IDENTIFIED BY 'slavepass';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.%';

mha_manager需要有STOP SLAVE;CHANGE MASTER TO;RESET SLAVE等的权限,由于RESET SLAVE这个权限从官方文档中没找到要怎么GRANT,所以这里只能选择GRANT ALL了。所以mha_manager的权限级别算是很高的了= =

2、配置3台机器SSH免密互登录,MHA的Manager会先逐个ssh上Node机器,然后再在Node机器上ssh其他Node机器来测试是否相通。SSH公钥免密码登录

接着执行masterha_check_ssh --conf=./app.cnf,结果:

3、配置主从同步,MySQL GTID的主从同步配置可参考我之前的文章:MySQL Gtid复制方案学习,然后执行masterha_check_repl --conf=./app.cnf,结果:

4、启动masterha_manager

5、简单模拟failover,停掉master

6、从work/manager.log中可以看到failover的整个过程,如果成功的话,最后显示的会是:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
……
----- Failover Report -----

app: MySQL Master failover 192.168.244.128(192.168.244.128:3306) to 192.168.244.129(192.168.244.129:3306) succeeded

Master 192.168.244.128(192.168.244.128:3306) is down!

Check MHA Manager logs at localhost.localdomain:/home/roketyyang/Learn/MHA/work/manager.log for details.

Started automated(non-interactive) failover.
Selected 192.168.244.129(192.168.244.129:3306) as a new master.
192.168.244.129(192.168.244.129:3306): OK: Applying all logs succeeded.
192.168.244.130(192.168.244.130:3306): OK: Slave started, replicating from 192.168.244.129(192.168.244.129:3306)
192.168.244.129(192.168.244.129:3306): Resetting slave info succeeded.
Master failover to 192.168.244.129(192.168.244.129:3306) completed successfully

登录128机器查看slave信息:

1
2
3
4
5
6
7
8
9
10
11
12
13
Slave_IO_State                | Waiting for master to send event
Master_Host                   | 192.168.244.129
Master_User                   | repl
Master_Port                   | 3306
Connect_Retry                 | 60
Master_Log_File               | master.000003
Read_Master_Log_Pos           | 194
Relay_Log_File                | slave-2.000002
Relay_Log_Pos                 | 361
Relay_Master_Log_File         | master.000003
Slave_IO_Running              | Yes
Slave_SQL_Running             | Yes
……

关于mha的实现原理,可查看其官方文档,也可看看这个PPT:Automated master failover

结合 VIP 实现 failover 时,能继续提供写服务


在MHA的配置文件中配置master_ip_failover_script

1、如何配置VIP:Create a Virtual IP Address in Linux

2、配置SSH用户有权限执行ifconfig、arping命令

在visudo中,添加roketyyang ALL= NOPASSWD: /sbin/ifconfig, /sbin/arping

这里需要注意的一点是ssh_start_vip中的 arping 命令,该命令向网段广播 ARP 请求包,并告知自己是 VIP 的拥有者。这样,同一网段的其他服务器就能够更新 ARP 缓存了。否则会因为 ARP 缓存的原因导致failover 后,一段时间内无法连接到 New Master 上。

参考:Allow A Normal User To Run Commands As root Under Linux / UNIX Operating Systems

3、master_ip_failover_script脚本

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;

my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);

# 增加的部分代码
my $vip = '192.168.244.200';
my $gateway = '192.168.244.2';
my $ssh_start_vip = "sudo ifconfig eno16777736:1 $vip;sudo arping -c 3 -I eno16777736 -s $vip $gateway";
my $ssh_stop_vip = 'sudo ifconfig eno16777736:1 down';
GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

exit &main();

sub main {
  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {
          &stop_vip();
      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {

    # all arguments are passed.
    # If you manage master ip address at global catalog database,
    # activate new_master_ip here.
    # You can also grant write access (create user, set read_only=0, etc) here.
    my $exit_code = 10;
    eval {
          &start_vip();
      $exit_code = 0;
    };
    if ($@) {
      warn $@;

      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {
        print "Checking the status of the script: ssh -t $ssh_user\@$orig_master_host \"$ssh_start_vip\"\n";
        `ssh -t $ssh_user\@$orig_master_host \"$ssh_start_vip\"`;
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

sub start_vip() {
    print "Checking the start of the script: ssh -t $ssh_user\@$new_master_host \"$ssh_start_vip\"\n";
    `ssh -t $ssh_user\@$new_master_host \"$ssh_start_vip\"`;
}
sub stop_vip() {
    print "Checking the stop/stopssh of the script: ssh -t $ssh_user\@$orig_master_host \"$ssh_stop_vip\"\n";
    `ssh -t $ssh_user\@$orig_master_host \"$ssh_stop_vip\"`;
}

模拟一遍failover,可看到master_ip_failover_script的调用流程:

1、masterha_manager启动时,会调用

1
Checking the status of the script: ssh -t roketyyang@192.168.244.129 "sudo ifconfig eno16777736:1 192.168.244.200;sudo arping -c 3 -I eno16777736 -s 192.168.244.200 192.168.244.2"

这时,在master设置VIP。

2、在failover阶段Phase 2: Dead Master Shutdown Phase..,会调用

1
Checking the stop/stopssh of the script: ssh -t roketyyang@192.168.244.129 "sudo ifconfig eno16777736:1 down"

这时,是停掉了Dead Master上的VIP。

3、在failover阶段* Phase 3.3: New Master Recovery Phase..,会调用

1
Checking the start of the script: ssh -t roketyyang@192.168.244.128 "sudo ifconfig eno16777736:1 192.168.244.200;sudo arping -c 3 -I eno16777736 -s 192.168.244.200 192.168.244.2"

这时,在New Master上设置VIP。

以上,整个流程都可以从manager.log看到。