Getting Started with Keepalived



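Keepalived implements high availability through VRRP (Virtual Router Redundancy Protocol). VRRP removes the single point of failure inherent in a statically configured route, so the network keeps running without interruption when an individual node goes down.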



1 Installation and Basic Operation



1.1 Preparation

1. Install CentOS 7 with a graphical user interface in VMware. Download page:

https://www.centos.org/download/

Version: CentOS-7-x86_64-Everything-1804.iso

2. Download the Keepalived source package from:

http://www.keepalived.org/download.html

Version: keepalived-2.0.8.tar.gz

This article uses two CentOS virtual machines, keepalived1 (IP: 192.168.197.146) and keepalived2 (IP: 192.168.197.147), with 192.168.197.148 as the virtual IP.



1.2 Installation

Upload the Keepalived package to the Linux server and extract it:

[root@keepalived1 ~]# tar -zxvf keepalived-2.0.8.tar.gz
[root@keepalived1 ~]# cd keepalived-2.0.8

Run the configure script:

[root@keepalived1 keepalived-2.0.8]# ./configure --prefix=/usr/local/keepalived

Here /usr/local/keepalived is the installation path.

The configure step may fail with the following error:

configure: error: 
  !!! OpenSSL is not properly installed on your system. !!!
  !!! Can not include OpenSSL headers files.            !!!

The error is caused by the missing openssl-devel package. Installing it resolves the problem:

yum install openssl-devel

Once configure succeeds, compile and install:

[root@keepalived1 keepalived-2.0.8]# make
[root@keepalived1 keepalived-2.0.8]# make install



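1.3 Configuration

After installation, the etc folder under the installation directory contains sample configuration files for a variety of scenarios.

This article starts with the simplest possible configuration. Later sections add options step by step to demonstrate what each one does, ending with a practical Keepalived configuration.

keepalived1 server: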

! Configuration File for keepalived

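global_defs {
   # Name identifying this node, used to tell nodes apart in alert notifications
   router_id SERVER_146
}

vrrp_instance VI_1 {
    # Initial state, either MASTER or BACKUP (must be upper case); MASTER is the active role, BACKUP the standby role
    state MASTER

    # Network interface that serves traffic, i.e. the interface the virtual IP is bound to; verify the name carefully, for example with the ifconfig command
    interface ens32

    # Virtual router ID (VRID), valid range 1-255; it must be identical on every node of a group (the last octet of the virtual IP is a convenient choice). Nodes sharing a VRID form one group, and under VRRP the VRID also determines the virtual router's MAC address
    virtual_router_id 148

    # Node priority; the VRRP specification reserves 0 and 255, so use 1-254, and give the MASTER a higher value than the BACKUP
    priority 100

    # Interval between the VRRP advertisements exchanged by MASTER and BACKUP nodes, in seconds
    advert_int 1

    # Virtual IP address pool; several IPs are allowed, one per line, and the netmask may be omitted
    virtual_ipaddress {
        192.168.197.148
    }
}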

keepalived2 server:

! Configuration File for keepalived

global_defs {
   router_id SERVER_147
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens32
    virtual_router_id 148
    priority 90
    advert_int 1
    virtual_ipaddress {
        192.168.197.148
    }
}
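
The relationship between the VRID and the virtual router MAC address mentioned in the comments can be made concrete. The sketch below assumes the IPv4 address format defined by the VRRP specification, 00-00-5E-00-01-{VRID}; the function name vrrp_vmac is made up for this illustration. As far as I know, Keepalived only puts this address on the wire when use_vmac is configured, and otherwise announces the virtual IP with the interface's own MAC, so treat the output as the protocol-level value rather than what this setup necessarily shows.

```shell
#!/bin/bash
# Derive the IPv4 VRRP virtual MAC address for a VRID (format 00-00-5E-00-01-{VRID}).
# vrrp_vmac is a hypothetical helper name used only for this illustration.
vrrp_vmac() {
    local vrid="$1"
    # Accept only plain decimal digits.
    if ! [[ "$vrid" =~ ^[0-9]+$ ]]; then
        echo "VRID must be an integer between 1 and 255" >&2
        return 1
    fi
    # Force base 10 so that a value such as 08 is not read as octal.
    vrid=$((10#$vrid))
    # The VRID is a single byte and 0 is not a valid virtual router ID.
    if [ "$vrid" -lt 1 ] || [ "$vrid" -gt 255 ]; then
        echo "VRID must be an integer between 1 and 255" >&2
        return 1
    fi
    printf '00:00:5e:00:01:%02x\n' "$vrid"
}

# VRID used throughout this article.
vrrp_vmac 148   # prints 00:00:5e:00:01:94
```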



1.4 Starting and Stopping

Go to the sbin folder under the installation directory and run keepalived directly to start it:

[root@keepalived1 sbin]# keepalived

Or start it with the service command:

service keepalived start

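Both methods start Keepalived with the default configuration file path, /etc/keepalived/keepalived.conf. Create the configuration file at that path by hand; otherwise Keepalived starts with its default settings.

To start with a specific configuration file, add the -f option: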

[root@keepalived1 sbin]# keepalived -f /usr/local/keepalived/etc/keepalived/keepalived.conf

Stop Keepalived:

pkill keepalived

While Keepalived starts and stops, related messages are written to the Linux system log /var/log/messages. Adding the -D option at startup records more detailed logs:

[root@keepalived1 sbin]# keepalived -D -f /usr/local/keepalived/etc/keepalived/keepalived.conf

Using the configuration from section 1.3, start Keepalived on the two servers one after the other (turn off the firewall on both servers first).
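
Turning the firewall off is the quickest route in a test lab. As a possible alternative, the sketch below assumes the servers run firewalld (the CentOS 7 default) and admits only VRRP traffic with a rich rule. The function name allow_vrrp and the DRY_RUN switch are made up for this illustration; by default the function only prints the commands so they can be reviewed first.

```shell
#!/bin/bash
# Print (default) or execute the firewalld commands that admit VRRP traffic.
# allow_vrrp and DRY_RUN are hypothetical names used only for this illustration.
# DRY_RUN=1 prints the commands; DRY_RUN=0 runs them and requires root.
allow_vrrp() {
    local rule='rule protocol value="vrrp" accept'
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "firewall-cmd --permanent --add-rich-rule='${rule}'"
        printf '%s\n' "firewall-cmd --reload"
    else
        firewall-cmd --permanent --add-rich-rule="${rule}" && firewall-cmd --reload
    fi
}

# Review the commands before applying them on both servers.
allow_vrrp
```

Running the function with DRY_RUN=0 on both nodes would apply the rule permanently and reload firewalld.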

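Connect to 192.168.197.148 with XShell and you will land on keepalived1 (the host name shows which server you are connected to). Stop the Keepalived process on keepalived1, or reboot that server, and the XShell session drops. Connect to 192.168.197.148 again and you land on keepalived2: the virtual IP now points to the backup.

In real deployments, the master and the backup run the same program and serve clients through the virtual IP. When the master fails, the backup takes over the virtual IP immediately, so external programs see no interruption (or only a very short one). This is how high availability is achieved.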
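
Besides checking the host name over SSH, the node that currently holds the virtual IP can be identified by looking at its interface addresses. The helper below is a minimal sketch: has_vip is a made-up name, and the sample lines only imitate the one-line format of "ip -o -4 addr show" (the /32 prefix on the virtual IP is an assumption about how an address configured without a netmask is displayed).

```shell
#!/bin/bash
# Report whether a virtual IP appears in "ip -o -4 addr show" style output read from stdin.
# has_vip is a hypothetical helper name used only for this illustration.
has_vip() {
    local vip="$1"
    # Escape the dots and require the prefix length right after the address,
    # so that 192.168.197.148 does not match 192.168.197.1480.
    grep -qE "inet ${vip//./\\.}/[0-9]+"
}

# Illustrative sample input, not captured from a real system.
sample='2: ens32    inet 192.168.197.146/24 brd 192.168.197.255 scope global ens32
2: ens32    inet 192.168.197.148/32 scope global ens32'

if printf '%s\n' "$sample" | has_vip 192.168.197.148; then
    echo "VIP present"
else
    echo "VIP absent"
fi
```

On a live node the input would come from the real command, for example: ip -o -4 addr show dev ens32 | has_vip 192.168.197.148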



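2 Common Configuration



2.1 Non-Preemptive Mode

By default, when the master fails, the backup becomes master and takes over the virtual IP. When the original master recovers, the nodes compare their priority values; if the original master's priority is higher than the original backup's, a second switchover occurs: the original master becomes master again and the original backup drops back to backup.

This is usually undesirable, because another switchover means another interruption of service. The preferred behavior is that, after the original master recovers, the current roles stay as they are: the original master runs as backup and the original backup continues as master.

This is what the non-preemptive option nopreempt is for. A server configured with nopreempt does not take back the MASTER role after it recovers.

Server keepalived1: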

! Configuration File for keepalived

global_defs {
   router_id SERVER_146
}

vrrp_instance VI_1 {
    state BACKUP
    # non-preemptive mode
    nopreempt
    interface ens32
    virtual_router_id 148
    priority 100
    advert_int 1
    virtual_ipaddress {
        192.168.197.148
    }
}

Server keepalived2:

! Configuration File for keepalived

global_defs {
   router_id SERVER_147
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens32
    virtual_router_id 148
    priority 90
    advert_int 1
    virtual_ipaddress {
        192.168.197.148
    }
}

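If keepalived1 fails and keepalived2 becomes master, keepalived1 does not preempt when it recovers, because it is configured with nopreempt. If keepalived2 fails and keepalived1 becomes master, keepalived2 does not become master when it recovers, because its priority is lower. Either way, a recovering server never triggers a switchover.

Note: non-preemptive mode only takes effect on a server whose initial state is BACKUP, which is why both keepalived1 and keepalived2 are set to BACKUP.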
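
The two recovery cases above follow one small decision rule, sketched below. This is a simplified model for illustration, not Keepalived's actual code: will_take_over is a made-up function, and it only considers the case where a live master already exists.

```shell
#!/bin/bash
# Decide whether a recovering BACKUP node takes over from the current master.
# Arguments: own priority, current master's priority, "yes" if nopreempt is set on this node.
# will_take_over is a hypothetical name; the rule is a simplified model of the behavior above.
will_take_over() {
    local own_prio="$1" master_prio="$2" nopreempt="$3"
    if [ "$nopreempt" = "yes" ]; then
        echo "no"      # nopreempt: never displace a live master
    elif [ "$own_prio" -gt "$master_prio" ]; then
        echo "yes"     # default preemptive mode: the higher priority wins
    else
        echo "no"      # a lower or equal priority stays BACKUP
    fi
}

will_take_over 100 90 yes   # keepalived1 recovers while keepalived2 is master: no
will_take_over 90 100 no    # keepalived2 recovers while keepalived1 is master: no
will_take_over 100 90 no    # keepalived1 without nopreempt (section 1.3 behavior): yes
```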



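2.2 Password Authentication

A high-availability cluster built with Keepalived usually needs some protection. Otherwise any server on the network could join the group and compete for the master role, disrupting the cluster.

You can set an authentication password on each node; only nodes with the same password can communicate normally.

Server keepalived1: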

! Configuration File for keepalived

global_defs {
   router_id SERVER_146
}

vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface ens32
    virtual_router_id 148
    priority 100
    advert_int 1
    virtual_ipaddress {
        192.168.197.148
    }
    authentication {
        auth_type PASS
        # No need for a long password: Keepalived only uses the first 8 characters
        auth_pass abc123
    }
}

Server keepalived2:

! Configuration File for keepalived

global_defs {
   router_id SERVER_147
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens32
    virtual_router_id 148
    priority 90
    advert_int 1
    virtual_ipaddress {
        192.168.197.148
    }
    authentication {
        auth_type PASS
        auth_pass abc123
    }
}

You can verify for yourself what happens when the two servers use different passwords.
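
One practical consequence of the 8-character limit: two passwords that differ only after the eighth character count as the same password. The sketch below merely imitates that truncation with bash substring expansion; effective_auth_pass is a made-up name and the sample passwords are invented.

```shell
#!/bin/bash
# Show the part of auth_pass that is actually significant (the first 8 characters).
# effective_auth_pass is a hypothetical helper name used only for this illustration.
effective_auth_pass() {
    local pass="$1"
    printf '%s\n' "${pass:0:8}"
}

effective_auth_pass "abc123"           # prints abc123
effective_auth_pass "abc12345-node1"   # prints abc12345
effective_auth_pass "abc12345-node2"   # prints abc12345, same as the line above
```

When testing mismatched passwords, make sure they differ within the first 8 characters.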



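2.3 Monitoring Scripts

In the previous sections, keepalived1 was set up as master and keepalived2 as backup. The network and hardware condition of the two servers can change over time. If keepalived1's network degrades, keepalived2, with its better connectivity, should take over as master. With the configuration so far, however, keepalived1 holds on to the master role purely because of its higher configured priority, which is clearly not what we want.

The vrrp_script block addresses this. Keepalived runs the specified script at the configured interval and adjusts the server's priority according to the result.

Create the script check.sh on both the master and the backup: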

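#!/bin/bash
# Send a single ICMP echo request to the test target.
ping -c 1 192.168.197.137
# Pass ping's exit status (0 on success, non-zero on failure) back to Keepalived.
exit $?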

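In this script, "ping -c 1" sends one ping to the target address, and "exit $?" passes on ping's exit status: 0 if the ping succeeds, non-zero if it fails (1 when no reply is received). Keepalived treats an exit code of 0 as success and any non-zero exit code as failure. 192.168.197.137 is the IP of another virtual machine used for testing.

Note: after creating the script, grant it execute permission with "chmod +x".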
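
A slightly more defensive variant of the check is sketched below. It assumes the iputils ping shipped with CentOS, whose -W option limits how long to wait for a reply, so an unreachable target makes the script fail quickly instead of running into the vrrp_script timeout. check_target and the PING_CMD override are made-up names; the override exists only so the function can be exercised without network access.

```shell
#!/bin/bash
# Probe a target with a single ping and map the result to 0 (healthy) or 1 (unhealthy).
# check_target and PING_CMD are hypothetical names used only for this illustration.
check_target() {
    local target="$1"
    # -c 1: send one echo request; -W 2: wait at most 2 seconds for the reply.
    if "${PING_CMD:-ping}" -c 1 -W 2 "$target" >/dev/null 2>&1; then
        return 0
    fi
    return 1
}

# In check.sh the last lines would be:
#   check_target 192.168.197.137
#   exit $?
```

Collapsing every failure to exit code 1 is a design choice for readability; Keepalived only distinguishes zero from non-zero.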

keepalived1 configuration file:

! Configuration File for keepalived

global_defs {
   router_id SERVER_146
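   # User that runs the scripts
   script_user root
}

# Declare the script
vrrp_script check {
   # Path of the script to run
   script "/usr/local/keepalived/script/check.sh"
   # Interval between script runs, in seconds; the script runs every 10 seconds
   interval 10
   # Script timeout, in seconds; a run longer than 10 seconds counts as a failure
   timeout 10
   # Amount by which this node's priority is reduced when the script fails
   weight -20
}

vrrp_instance VI_1 {
    state BACKUP
    # Use preemptive mode
    # nopreempt
    interface ens32
    virtual_router_id 148
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass abc123
    }
    virtual_ipaddress {
        192.168.197.148
    }
    # Declare the scripts to track; a script only runs periodically when it is tracked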
    track_script {
        check
    }
}

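Leave keepalived2 on the configuration from section 2.2 for now and power on the server 192.168.197.137. After Keepalived is started on keepalived1 and then on keepalived2, keepalived1 acts as master and keepalived2 as backup.

Shut down the server 192.168.197.137 and check.sh starts failing. The /var/log/messages log on keepalived1 shows the whole switchover: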

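# The initial state is BACKUP, so the node enters the backup state
Nov 16 21:47:50 keepalived1 Keepalived_vrrp[3825]: (VI_1) Entering BACKUP STATE (init)
# The node sees that keepalived2 has a lower priority
Nov 16 21:47:51 keepalived1 Keepalived_vrrp[3825]: (VI_1) received lower priority (90) advert from 192.168.197.147 - discarding
# The node becomes master
Nov 16 21:47:54 keepalived1 Keepalived_vrrp[3825]: (VI_1) Entering MASTER STATE

# ... server 192.168.197.137 is shut down
# The script times out because the ping gets no reply (the name check_network in this line presumably dates from when the log was captured; it corresponds to the vrrp_script block called check above)
Nov 16 21:55:47 keepalived1 Keepalived_vrrp[4160]: VRRP_Script(check_network) timed_out
# The script failed, so the priority is reduced by the configured 20 and becomes 80
Nov 16 21:55:47 keepalived1 Keepalived_vrrp[4160]: (VI_1) Changing effective priority from 100 to 80
# The node sees that the backup now has a higher priority
Nov 16 21:55:51 keepalived1 Keepalived_vrrp[4160]: (VI_1) Master received advert from 192.168.197.147 with higher priority 90, ours 80
# The node enters the backup state and the former backup becomes master
Nov 16 21:55:51 keepalived1 Keepalived_vrrp[4160]: (VI_1) Entering BACKUP STATE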
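
The priority change in the log (100 to 80, which is below keepalived2's 90) is plain arithmetic on the weight. The sketch below models it under the usual vrrp_script semantics: a negative weight is subtracted while the script fails, a positive weight is added while it succeeds. effective_priority is a made-up name, and the clamping of the result to the valid priority range is not modeled.

```shell
#!/bin/bash
# Compute the effective priority from the base priority, the script weight and the script result.
# effective_priority is a hypothetical name; result is "ok" or "fail".
effective_priority() {
    local base="$1" weight="$2" result="$3"
    local prio="$base"
    if [ "$weight" -lt 0 ] && [ "$result" = "fail" ]; then
        prio=$((base + weight))    # negative weight: applied while the script fails
    elif [ "$weight" -gt 0 ] && [ "$result" = "ok" ]; then
        prio=$((base + weight))    # positive weight: applied while the script succeeds
    fi
    echo "$prio"
}

effective_priority 100 -20 fail   # keepalived1 once the ping target is down: 80
effective_priority 100 -20 ok     # keepalived1 while the ping target answers: 100
effective_priority 90 -20 fail    # keepalived2 with the same script and a failing check: 70
```

Note what the last line implies: if both nodes track the same target and it fails for both, both priorities drop by 20 and their order does not change, so no switchover happens.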

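In practice, check.sh can monitor network health (for example by pinging the gateway), application health and so on, so that a node hands over the master role when it is unhealthy.

The backup gets the same monitoring script, so the final configuration of keepalived2 becomes: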

! Configuration File for keepalived

global_defs {
   router_id SERVER_147
   script_user root
}

vrrp_script check {
   script "/usr/local/keepalived/script/check.sh"
   interval 10
   timeout 10
   weight -20
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens32
    virtual_router_id 148
    priority 90
    advert_int 1
    virtual_ipaddress {
        192.168.197.148
    }
    authentication {
        auth_type PASS
        auth_pass abc123
    }
    track_script {
        check
    }
}



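2.4 Interface Monitoring

The network interface is essential for a server's communication; if it fails, communication is directly affected. Keepalived can monitor interfaces with the track_interface block. When a tracked interface goes down, the instance enters the FAULT state and advertises priority 0.

keepalived1 configuration: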

! Configuration File for keepalived

global_defs {
   router_id SERVER_146
   script_user root
}

vrrp_script check {
   script "/usr/local/keepalived/script/check.sh"
   interval 10
   timeout 10
   weight -20
}

vrrp_instance VI_1 {
    state BACKUP
    # nopreempt
    interface ens32
    virtual_router_id 148
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass abc123
    }
    virtual_ipaddress {
        192.168.197.148
    }
    track_script {
        check
    }

    # Interfaces to monitor
    track_interface {
        ens32
        lo
    }
}

keepalived2 configuration:

! Configuration File for keepalived

global_defs {
   router_id SERVER_147
   script_user root
}

vrrp_script check {
   script "/usr/local/keepalived/script/check.sh"
   interval 10
   timeout 10
   weight -20
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens32
    virtual_router_id 148
    priority 90
    advert_int 1
    virtual_ipaddress {
        192.168.197.148
    }
    authentication {
        auth_type PASS
        auth_pass abc123
    }
    track_script {
        check
    }
    track_interface {
        ens32
        lo
    }
}

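ens32 is listed in track_interface here, but even if it were not, the interface used by the vrrp_instance would still put the instance into the FAULT state when it goes down. To test the effect of this setting, you can use the loopback interface lo:

# Bring the interface down
[root@keepalived1 ~]# ifdown lo
# Bring the interface up
[root@keepalived1 ~]# ifup lo

After lo is brought down, the keepalived1 log shows: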

Nov 19 10:45:51 keepalived1 Keepalived_vrrp[6652]: Netlink reports lo down
Nov 19 10:45:51 keepalived1 Keepalived_vrrp[6652]: (VI_1) Entering FAULT STATE
Nov 19 10:45:51 keepalived1 Keepalived_vrrp[6652]: (VI_1) sent 0 priority
Nov 19 10:45:51 keepalived1 Keepalived_vrrp[6652]: Netlink: error: data remnant size 1148
Nov 19 10:45:51 keepalived1 avahi-daemon[675]: Withdrawing address record for 192.168.197.148 on ens32.

Because lo went down, keepalived1 entered the FAULT state and advertised priority 0. The keepalived2 log shows that, with keepalived1's priority now below its own, keepalived2 became master:

Nov 19 10:45:51 keepalived2 Keepalived_vrrp[4161]: (VI_1) Backup received priority 0 advertisement
Nov 19 10:45:51 keepalived2 Keepalived_vrrp[4161]: (VI_1) Backup received priority 0 advertisement
Nov 19 10:45:52 keepalived2 Keepalived_vrrp[4161]: (VI_1) Entering MASTER STATE
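
When following a failover across two servers, it helps to reduce the logs to the state transitions alone. The filter below is a small sketch that assumes the log line layout shown in this article (a 15-character timestamp, the host name, then the Keepalived_vrrp message); vrrp_transitions is a made-up name.

```shell
#!/bin/bash
# Extract VRRP state transitions as "timestamp host instance state" from syslog lines on stdin.
# vrrp_transitions is a hypothetical helper name; the pattern assumes the log format shown above.
vrrp_transitions() {
    sed -nE 's/^(.{15}) ([^ ]+) Keepalived_vrrp\[[0-9]+\]: \(([^)]+)\) Entering ([A-Z]+) STATE.*$/\1 \2 \3 \4/p'
}

# Log lines taken from this article.
vrrp_transitions <<'EOF'
Nov 19 10:45:51 keepalived1 Keepalived_vrrp[6652]: Netlink reports lo down
Nov 19 10:45:51 keepalived1 Keepalived_vrrp[6652]: (VI_1) Entering FAULT STATE
Nov 19 10:45:52 keepalived2 Keepalived_vrrp[4161]: (VI_1) Entering MASTER STATE
EOF
```

On a live server the input would be the system log itself, for example: vrrp_transitions < /var/log/messages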



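2.5 Notify Scripts

In practice, a master/backup switchover usually has to trigger a series of actions. For example, the application on the backup is normally stopped; when the backup becomes master, the application must be started so that it can serve requests. To bring it up promptly, Keepalived should launch it automatically after the switchover.

Keepalived provides notify scripts for this: whenever its state changes, it runs the configured scripts. Keepalived configuration file: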

! Configuration File for keepalived

global_defs {
   router_id SERVER_146
   script_user root
}

vrrp_script check {
   script "/usr/local/keepalived/script/check.sh"
   interval 10
   timeout 10
   weight -20
}

vrrp_instance VI_1 {
    state BACKUP
    # nopreempt
    interface ens32
    virtual_router_id 148
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass abc123
    }
    virtual_ipaddress {
        192.168.197.148
    }
    track_script {
        check
    }
    track_interface {
        ens32
        lo
    }

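    # Script run when this server enters the MASTER state
    notify_master "/usr/local/keepalived/script/notify_master.sh"
    # Script run when this server enters the BACKUP state
    notify_backup "/usr/local/keepalived/script/notify_backup.sh"
    # Script run when this server enters the FAULT state
    notify_fault "/usr/local/keepalived/script/notify_fault.sh"
    # Script run when Keepalived stops on this server
    notify_stop "/usr/local/keepalived/script/notify_stop.sh"
    # Runs after every state transition, once the scripts above have finished. Three arguments are passed in automatically: $1 = GROUP|INSTANCE, whether the transition concerns a VRRP instance group or a VRRP instance; $2 = the name of the VRRP instance (or group); $3 = MASTER|BACKUP|FAULT, the target state
    notify "/usr/local/keepalived/script/notify.sh"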
    notify "/usr/local/keepalived/script/notify.sh"
}
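
The article does not show the notify scripts themselves. The sketch below illustrates how a notify.sh could dispatch on the three arguments described in the comment above. Everything in it is an assumption for illustration: handle_notify, start_app and stop_app are made-up names, and the two placeholders would have to be replaced with the real commands that start and stop the application.

```shell
#!/bin/bash
# Sketch of a notify.sh handler: $1 = GROUP|INSTANCE, $2 = name, $3 = target state.
# start_app and stop_app are hypothetical placeholders for the real application commands.
start_app() { echo "starting application"; }
stop_app()  { echo "stopping application"; }

handle_notify() {
    local type="$1" name="$2" state="$3"
    case "$state" in
        MASTER)       start_app ;;   # this node now holds the virtual IP
        BACKUP|FAULT) stop_app ;;    # this node must not serve traffic
        *)
            echo "unhandled state '$state' for $type $name" >&2
            return 1
            ;;
    esac
}

# Example call with the arguments Keepalived would pass for instance VI_1 becoming master.
handle_notify INSTANCE VI_1 MASTER   # prints: starting application
```

In a real notify.sh the example call would be replaced by handle_notify "$@", so that the arguments supplied by Keepalived are forwarded.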

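Beyond these, Keepalived offers more advanced options that support a rich set of features. For the systems I have worked with, the configuration above has been sufficient. For more options, see the Keepalived website: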

http://www.keepalived.org/manpage.html



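Copyright notice: this is an original article by weixin_43533358, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.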