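Background
The data collection pipeline in production fails from time to time (which is unavoidable). The current mitigation is that every data interface server (the server that holds raw data waiting to be collected) is deployed as two identical machines that receive identical data. When the collector fails to pull data from the current node, an operator edits the configuration and switches to the other node by hand.
The problem with this approach is obvious. The first issue is timeliness: even if an alert fires as soon as the data stops, at least tens of minutes pass before the operator has finished the switch. A real high-availability setup is therefore well worth having.
After a quick survey, Keepalived turned out to be the only practical option for this kind of software load balancing, since the customer is unwilling to pay (a real penny-pincher) for hardware load balancers such as F5 devices. So that is what we set out to build.
Because the goal is high availability and our scenario is not master/standby but two servers with the same role, we do not use Keepalived's preemptive mechanism and run in non-preemptive mode instead.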
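To make the difference between preemptive and non-preemptive behaviour concrete, here is a toy model in Python. It is only a sketch and not Keepalived's actual election code: the `Node` class, the `elect` function and the priorities 100 and 90 are made up for illustration, and details such as tie-breaking are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    priority: int
    alive: bool = True


def elect(nodes: List[Node], current: Optional[str], preempt: bool) -> Optional[str]:
    """Return the name of the node that should hold the VIP (toy model)."""
    alive = [n for n in nodes if n.alive]
    if not alive:
        return None
    best = max(alive, key=lambda n: n.priority)
    holder = next((n for n in alive if n.name == current), None)
    if holder is None:
        # Nobody alive holds the VIP: the highest-priority live node takes it.
        return best.name
    if preempt and best.priority > holder.priority:
        # Preemptive mode: a stronger node takes the VIP back.
        return best.name
    # Otherwise the current holder keeps the VIP.
    return holder.name
```

In this model, if A (priority 100) fails, B (priority 90) takes the VIP. When A comes back, `elect(nodes, "B", preempt=False)` keeps the VIP on B, while `preempt=True` moves it back to A and causes a second interruption for clients. Avoiding that second switch is the reason for choosing non-preemptive mode here.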
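Preparation
Servers and VIP
Prepare two servers and one VIP:
- Server A: 172.18.0.26, with sftp and the other services installed in advance
- Server B: 172.18.0.27, with sftp and the other services installed in advance
- VIP: 172.18.0.78, the virtual IP used for external access, which drifts between A and B
Keepalived software
The official site only offers a source tarball. To make installation easier for whoever deploys this later, packaging it as an RPM is the safer route. The RPM in the official CentOS 7 repository is also very old (1.3.x, as far as I remember), while the latest release is already 2.2.8, so we build our own package here. Start by writing the SPEC file (the build command later refers to it as keepalived.spec):
- %bcond_without snmp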
- %bcond_without vrrp
- %bcond_without sha1
- %bcond_with profile
- %bcond_with debug
- %if 0%{?rhel} && 0%{?rhel} <= 6
- %bcond_with nftables
- %bcond_with track_process
- %bcond_with libiptc
- %else
- %bcond_without nftables
- %bcond_without track_process
- %bcond_without libiptc
- %endif
- %global _hardened_build 1
- Name: keepalived
- Summary: High Availability monitor built upon LVS, VRRP and service pollers
- Version: 2.2.8
- Release: 1%{?dist}
- License: GPLv2+
- URL: http://www.keepalived.org/
- Group: System Environment/Daemons
- Source0: http://www.keepalived.org/software/keepalived-%{version}.tar.gz
- Source1: keepalived.service
- Source2: keepalived.init
- # distribution specific definitions
- %define use_systemd (0%{?fedora} && 0%{?fedora} >= 18) || (0%{?rhel} && 0%{?rhel} >= 7) || (0%{?suse_version} == 1315)
- %if %{use_systemd}
- Requires(post): systemd
- Requires(preun): systemd
- Requires(postun): systemd
- %else
- Requires(post): /sbin/chkconfig
- Requires(preun): /sbin/chkconfig
- Requires(preun): /sbin/service
- Requires(postun): /sbin/service
- %endif
- BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root-%(%{__id_u} -n)
- %if %{with snmp}
- BuildRequires: net-snmp-devel
- %endif
- %if %{use_systemd}
- BuildRequires: systemd-units
- %endif
- BuildRequires: openssl-devel
- BuildRequires: libnl3-devel
- BuildRequires: ipset-devel
- BuildRequires: iptables-devel
- BuildRequires: libnfnetlink-devel
- %if (0%{?rhel} && 0%{?rhel} >= 7)
- Requires: ipset-libs
- %endif
- %description
- Keepalived provides simple and robust facilities for load balancing
- and high availability to Linux system and Linux based infrastructures.
- The load balancing framework relies on well-known and widely used
- Linux Virtual Server (IPVS) kernel module providing Layer4 load
- balancing. Keepalived implements a set of checkers to dynamically and
- adaptively maintain and manage load-balanced server pool according
- their health. High availability is achieved by VRRP protocol. VRRP is
- a fundamental brick for router failover. In addition, keepalived
- implements a set of hooks to the VRRP finite state machine providing
- low-level and high-speed protocol interactions. Keepalived frameworks
- can be used independently or all together to provide resilient
- infrastructures.
- %prep
- %setup -q
- %build
- %configure \
- %{?with_debug:--enable-debug} \
- %{?with_profile:--enable-profile} \
- %{!?with_vrrp:--disable-vrrp} \
- %{?with_snmp:--enable-snmp --enable-snmp-rfc} \
- %{?with_sha1:--enable-sha1} \
- %{!?with_nftables:--disable-nftables} \
- %{!?with_track_process:--disable-track-process} \
- %{!?with_libiptc:--disable-libiptc}
- %{__make} %{?_smp_mflags} STRIP=/bin/true
- %install
- rm -rf %{buildroot}
- make install DESTDIR=%{buildroot}
- rm -rf %{buildroot}%{_sysconfdir}/keepalived/samples/
- rm -rf %{buildroot}%{_defaultdocdir}/keepalived/
- %if %{use_systemd}
- rm -rf %{buildroot}%{_initrddir}/
- %{__install} -p -D -m 0644 %{SOURCE1} %{buildroot}%{_unitdir}/keepalived.service
- %else
- rm %{buildroot}%{_sysconfdir}/init/keepalived.conf
- %{__install} -p -D -m 0755 %{SOURCE2} %{buildroot}%{_initrddir}/keepalived
- %endif
- mkdir -p %{buildroot}%{_libexecdir}/keepalived
- %clean
- rm -rf %{buildroot}
- %post
- %if %{use_systemd}
- %systemd_post keepalived.service
- %else
- /sbin/chkconfig --add keepalived
- %endif
- %preun
- %if %{use_systemd}
- %systemd_preun keepalived.service
- %else
- if [ "$1" -eq 0 ]; then
- /sbin/service keepalived stop >/dev/null 2>&1
- /sbin/chkconfig --del keepalived
- fi
- %endif
- %postun
- %if %{use_systemd}
- %systemd_postun_with_restart keepalived.service
- %else
- if [ "$1" -eq 1 ]; then
- /sbin/service keepalived condrestart >/dev/null 2>&1 || :
- fi
- %endif
- %files
- %defattr(-,root,root,-)
- %attr(0755,root,root) %{_sbindir}/keepalived
- %config(noreplace) %attr(0644,root,root) %{_sysconfdir}/sysconfig/keepalived
- %config(noreplace) %attr(0644,root,root) %{_sysconfdir}/keepalived/keepalived.conf.sample
- %doc AUTHOR ChangeLog CONTRIBUTORS COPYING README README.md TODO
- %doc doc/keepalived.conf.SYNOPSIS doc/samples/keepalived.conf.*
- %dir %{_sysconfdir}/keepalived/
- %dir %{_libexecdir}/keepalived/
- %if %{with snmp}
- %{_datadir}/snmp/mibs/KEEPALIVED-MIB.txt
- %{_datadir}/snmp/mibs/VRRP-MIB.txt
- %{_datadir}/snmp/mibs/VRRPv3-MIB.txt
- %endif
- %{_bindir}/genhash
- %if %{use_systemd}
- %{_unitdir}/keepalived.service
- %else
- %{_initrddir}/keepalived
- %endif
- %{_mandir}/man1/genhash.1*
- %{_mandir}/man5/keepalived.conf.5*
- %{_mandir}/man8/keepalived.8*
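Put this spec file at the path used by the build command below and put the source tarball downloaded from the official site into the sources directory. The spec also declares keepalived.service and keepalived.init as sources, so those files need to be in the sources directory as well. Then run the build:
- rpmbuild -bb ~/rpmbuild/SPECS/keepalived.spec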
After the command succeeds, two RPM packages are generated in the output directory. We only need this one:
- keepalived-2.2.8-1.el7.x86_64.rpm
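Implementation
Installing Keepalived
Copy the RPM to server A and server B and install it. If you know how to set up a yum repository, serving the package from one is more convenient, because you do not have to scp the file around:
- rpm -ivh keepalived-2.2.8-1.el7.x86_64.rpm
If the install reports missing dependencies, install these packages:
- yum install -y net-snmp-libs net-snmp-agent-libs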
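Preparing the health check script
Prepare the following health check script to monitor the service. Save it at the path the vrrp_script block in the configuration refers to (/etc/keepalived/scripts/check_sftp.sh), make it executable, and adapt it to your situation:
- #!/bin/bash
- # Check whether the SSH service is running
- ssh_status=$(systemctl is-active sshd)
- # Act on the SSH service state
- if [ "$ssh_status" = "active" ]; then
-     exit 0
- else
-     # sshd is down: stop keepalived so this node gives up the VIP
-     systemctl stop keepalived
-     exit 1
- fi
When sshd is unhealthy, sftp is unavailable as well, so there is no point in keeping keepalived running on that node, and the script stops it. Note that nothing in this script starts keepalived again once sshd recovers.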
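Since the script is meant to be adapted, one possible variation is to test whether port 22 actually accepts TCP connections instead of only asking systemd for the unit state. The following is a minimal Python sketch of that idea, not the check used in this article. The function name `port_is_open` and the 3-second default timeout are assumptions.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A small wrapper could call `port_is_open("127.0.0.1", 22)` and exit with status 0 or 1, which is the convention a keepalived `vrrp_script` relies on to tell success from failure.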
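Configuring the servers
As mentioned above, we use non-preemptive mode, so the configuration file is written as follows. Only the fields with explicit comments need to be adapted:
- ! Configuration File for keepalived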
- global_defs {
- router_id LVS_DEVEL
- vrrp_skip_check_adv_addr
- vrrp_garp_interval 0
- vrrp_gna_interval 0
- }
- vrrp_script check_sftp {
- script "/etc/keepalived/scripts/check_sftp.sh"
- interval 2
- timeout 5
- fall 2
- rise 1
- }
- # Node configuration
- vrrp_instance VI_1 {
- state BACKUP
- interface p1p2 # NIC the VIP is bound to
- nopreempt # non-preemptive mode
- virtual_router_id 53
- mcast_src_ip 172.18.0.26 # this node's own IP (172.18.0.27 on server B)
- priority 100
- advert_int 1
- authentication {
- auth_type PASS
- auth_pass 1111
- }
- virtual_ipaddress {
- 172.18.0.78
- }
- track_script {
- check_sftp
- }
- }
- # Note this section
- virtual_server 172.18.0.78 22 { # virtual service
- delay_loop 6
- lb_algo rr
- lb_kind DR
- nat_mask 255.255.255.0
- persistence_timeout 0
- protocol TCP
-
- real_server 172.18.0.26 22 { # the real service, on server A
- weight 1
- TCP_CHECK {
- connect_timeout 8
- nb_get_retry 3
- delay_before_retry 3
- connect_port 22 # service port
- }
- }
- real_server 172.18.0.27 22 { # the real service, on server B
- weight 1
- TCP_CHECK {
- connect_timeout 8
- nb_get_retry 3
- delay_before_retry 3
- connect_port 22 # service port
- }
- }
- }
Configure both servers as above, then start the keepalived service on each of them:
- systemctl start keepalived
- systemctl status keepalived
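Because the two nodes share almost the same file, the per-node copy can be derived from one base configuration instead of being edited by hand. The Python sketch below illustrates that under some assumptions: `config_for_node` is a hypothetical helper, it only rewrites the `mcast_src_ip` and `interface` lines, and any other per-node difference in your environment would need its own rule.

```python
import re


def config_for_node(base_config: str, node_ip: str, interface: str) -> str:
    """Return base_config with the per-node fields rewritten (sketch only)."""
    # Replace the value after "mcast_src_ip" with this node's own address.
    text = re.sub(
        r"^([ \t]*mcast_src_ip[ \t]+)\S+",
        lambda m: m.group(1) + node_ip,
        base_config,
        flags=re.M,
    )
    # Replace the NIC name after "interface"; trailing comments are kept.
    text = re.sub(
        r"^([ \t]*interface[ \t]+)\S+",
        lambda m: m.group(1) + interface,
        text,
        flags=re.M,
    )
    return text
```

For server B in this article, the call would be `config_for_node(base, "172.18.0.27", "p1p2")`, where `base` is the text of server A's configuration file.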
Running ip addr on each node shows which machine the VIP is currently bound to. Here it is server B.
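If you want to script this check rather than read the output by eye, a small parser is enough. The following Python sketch assumes the usual `ip addr` text format, in which IPv4 addresses follow the keyword `inet`. The function name `vip_is_bound` is hypothetical.

```python
import re


def vip_is_bound(ip_addr_output: str, vip: str) -> bool:
    """Return True if vip is listed as an IPv4 address in `ip addr` output."""
    # Collect every address that follows the keyword "inet" (inet6 is skipped).
    addresses = re.findall(r"\binet\s+(\d+\.\d+\.\d+\.\d+)", ip_addr_output)
    return vip in addresses
```

The output text can be obtained with `subprocess.run(["ip", "addr", "show"], capture_output=True, text=True).stdout` on each node. Exactly one of the two servers should report the VIP 172.18.0.78 at any given time.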
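Testing and verification
Next, test the failover. To tell the two servers apart, first put different files in the root user's home directory on each server (if you test with another user, use that user's default directory). Then write the following test script:
- import time
- import paramiko
- host = "172.18.0.78"  # the VIP
- username = "root"
- password = "xxxxx"
- print("Starting the test script")
- ssh_client = paramiko.SSHClient()
- ssh_client.set_missing_host_key_policy(paramiko.WarningPolicy())
- print("Opening the initial SSH and SFTP connections")
- ssh_client.connect(hostname=host, username=username, password=password)
- sftp = ssh_client.open_sftp()
- while True:
-     try:
-         tran = ssh_client.get_transport()
-         if tran is not None and tran.is_active():
-             # The connection is still up: run the test logic
-             print("SSH connection is up, listing the current directory")
-             print(sftp.listdir())
-         else:
-             # The connection is gone: reconnect through the VIP
-             print("SSH connection is down, reconnecting")
-             ssh_client.connect(
-                 hostname=host, username=username, password=password)
-             sftp = ssh_client.open_sftp()
-     except Exception as e:
-         # Most likely a failover; close and retry on the next pass so that
-         # a failed reconnect attempt cannot crash the script
-         print("SSH connection lost, probably a failover:", e)
-         ssh_client.close()
-     time.sleep(10)
The script lists the current directory over sftp every ten seconds. Start it and leave it running.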
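Then, in another terminal, stop the keepalived service on the master node (currently server B):
- systemctl stop keepalived
The VIP should then move to server A, and once the script has reconnected, its listing should show server A's files.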
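For a rough idea of how quickly the other node reacts, the timers defined for VRRPv2 in RFC 3768 can be computed from the `priority` and `advert_int` values in the configuration. This is only a back-of-the-envelope sketch: the function names are made up, it assumes the VRRPv2 formulas apply unchanged, and the real switchover time also depends on the health check interval and on the 10-second sleep in the test script.

```python
def skew_time(priority: int) -> float:
    """VRRPv2 skew time in seconds: (256 - Priority) / 256 (RFC 3768)."""
    return (256 - priority) / 256.0


def master_down_interval(priority: int, advert_int: float = 1.0) -> float:
    """Seconds a backup waits without advertisements before taking over."""
    return 3 * advert_int + skew_time(priority)


if __name__ == "__main__":
    # Values from the configuration in this article: priority 100, advert_int 1
    print(master_down_interval(100, 1.0))
```

With priority 100 and advert_int 1, a backup that simply stops hearing advertisements (for example after a crash of the master) waits about 3.6 seconds. On a clean `systemctl stop keepalived`, the master should announce priority 0 first, in which case RFC 3768 has the backup wait only the skew time, roughly 0.6 seconds here.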
Summary
OK, job done.
The above is my personal experience; I hope it serves as a useful reference.