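To finish a task at work, I recently picked up the basics of disaster recovery and high availability. These are my notes.

Installing Keepalived

Download the source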

https://www.keepalived.org/software/keepalived-2.3.2.tar.gz

tar -zxvf keepalived-2.3.2.tar.gz

Configure the build options

cd keepalived-2.3.2
./configure --prefix=/usr/local/keepalived

Compile and install

make && make install

Set up the startup files

Run the following from the directory that contains the extracted keepalived-2.3.2 source tree:

mkdir -p /etc/keepalived
mkdir -p /etc/sysconfig
cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
cp keepalived-2.3.2/keepalived/etc/init.d/keepalived /etc/init.d/
cp keepalived-2.3.2/keepalived/keepalived.service /lib/systemd/system/
cp keepalived-2.3.2/keepalived/etc/sysconfig/keepalived /etc/sysconfig/keepalived
cp keepalived-2.3.2/keepalived/etc/keepalived/keepalived.conf.sample /etc/keepalived/keepalived.conf

Nginx + keepalived

Environment setup

hostname      ip               os                   role
nginx1.local  192.168.250.101  Ubuntu Server 22.04  master
nginx2.local  192.168.250.102  Ubuntu Server 22.04  backup

Virtual IP: 192.168.250.130

Install the base dependencies

  1. make
  2. gcc, g++
  3. libpcre, libpcre-dev
  4. zlib, zlib-dev
  5. openssl, libssl-dev
  6. libnl, libnl-dev

On Ubuntu 22.04 the corresponding packages are installed with:

apt install make gcc g++ libpcre3 libpcre3-dev zlib1g zlib1g-dev openssl libssl-dev libnl-3-200 libnl-3-dev

Installing Nginx

Download the source

https://nginx.org/download/nginx-1.27.3.tar.gz

tar -zxvf nginx-1.27.3.tar.gz

Create the temporary directory

mkdir /var/temp/nginx -p

Configure the build options

cd nginx-1.27.3
./configure \
    --prefix=/usr/local/nginx \
    --pid-path=/var/run/nginx/nginx.pid \
    --lock-path=/var/lock/nginx.lock \
    --error-log-path=/var/log/nginx/error.log \
    --http-log-path=/var/log/nginx/access.log \
    --with-http_gzip_static_module \
    --http-client-body-temp-path=/var/temp/nginx/client \
    --http-proxy-temp-path=/var/temp/nginx/proxy \
    --http-fastcgi-temp-path=/var/temp/nginx/fastcgi \
    --http-uwsgi-temp-path=/var/temp/nginx/uwsgi \
    --http-scgi-temp-path=/var/temp/nginx/scgi

Compile and install

make && make install

Start Nginx

/usr/local/nginx/sbin/nginx

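Configuration file

/usr/local/nginx/conf/nginx.conf

Create a log directory: mkdir /usr/local/nginx/logs

Edit the configuration file: uncomment the pid directive and set it to /usr/local/nginx/logs/nginx.pid

Deploy a test site

Add a marker identifying the host to /usr/local/nginx/html/index.html, so that it is clear which node served the page.

Check script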

/usr/local/src/nginx_check.sh
#!/bin/bash

# Exit non-zero when no nginx process is running, so keepalived treats the check as failed.
if [ -z "$(ps -C nginx --no-headers)" ]; then
    exit 1
fi
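
As an alternative to scanning the process table, the check could read the pid file that Nginx writes. The following is only a sketch: the helper name pid_alive is made up for illustration, and the default path assumes the pid directive was set to /usr/local/nginx/logs/nginx.pid as described earlier.

```shell
#!/bin/bash
# Sketch: report whether the process recorded in a pid file is still alive.
# pid_alive is a hypothetical helper name, not part of nginx or keepalived.
pid_alive() {
    local pidfile=$1 pid
    [ -r "$pidfile" ] || return 1          # no readable pid file: treat as down
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || return 1              # empty pid file: treat as down
    kill -0 "$pid" 2>/dev/null             # signal 0 only tests that the process exists
}

# Assumed pid path, taken from the nginx.conf change described earlier.
NGINX_PID_FILE="${NGINX_PID_FILE:-/usr/local/nginx/logs/nginx.pid}"

# In a real check script the last line would be:
#   pid_alive "$NGINX_PID_FILE" || exit 1
```

This variant does not depend on the process name, but it does rely on the pid file being kept up to date, so it is a trade-off rather than a strict improvement.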

keepalived configuration file

/etc/keepalived/keepalived.conf

Master

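global_defs {
    script_user root
    enable_script_security
    notification_email {
        yvling.cn@outlook.com
    }
    notification_email_from 1111111111@qq.com
    smtp_server 192.168.250.101       # SMTP server address (here, the master's IP)
    smtp_connection_timeout 30
    router_id LVS_DEVEL               # string identifying this keepalived node; it does not have to resolve via /etc/hosts
}

vrrp_script chk_http_port {
    script "/usr/local/src/nginx_check.sh"        # path to the check script
    interval 2                                    # how often the check runs, in seconds
    weight -20                                    # on failure, lower the priority by 20 so the backup (90) outranks this node (80)
}

vrrp_instance VI_1 {
    state MASTER              # keepalived role: MASTER or BACKUP
    interface ens33           # network interface used for VRRP traffic
    virtual_router_id 01      # virtual router ID, must be the same on master and backup
    priority 100              # election priority; the node with the higher value holds the VIP
    advert_int 1              # VRRP advertisement interval in seconds (default 1)
    authentication {
        auth_type PASS        # authentication type and password; MASTER and BACKUP must use the same password
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.250.130       # virtual IP (VIP); several may be listed, one per line
    }
    track_script {
        chk_http_port         # without this block the check script defined above is never run
    }
}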

Backup

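global_defs {
    script_user root
    enable_script_security
    notification_email {
        yvling.cn@outlook.com
    }
    notification_email_from 2222222222@qq.com
    smtp_server 192.168.250.101       # SMTP server address (here, the master's IP)
    smtp_connection_timeout 30
    router_id LVS_DEVEL               # string identifying this keepalived node; it does not have to resolve via /etc/hosts
}

vrrp_script chk_http_port {
    script "/usr/local/src/nginx_check.sh"        # path to the check script
    interval 2                                    # how often the check runs, in seconds
    weight -20                                    # on failure, lower this node's priority by 20
}

vrrp_instance VI_1 {
    state BACKUP              # keepalived role: MASTER or BACKUP
    interface ens33           # network interface used for VRRP traffic
    virtual_router_id 01      # virtual router ID, must be the same on master and backup
    priority 90               # election priority; the node with the higher value holds the VIP
    advert_int 1              # VRRP advertisement interval in seconds (default 1)
    authentication {
        auth_type PASS        # authentication type and password; MASTER and BACKUP must use the same password
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.250.130       # virtual IP (VIP); several may be listed, one per line
    }
    track_script {
        chk_http_port         # without this block the check script defined above is never run
    }
}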

Start keepalived

systemctl enable keepalived
systemctl start keepalived

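Test the virtual IP

By default the virtual IP is served by Nginx1, the Master node.

Stop the Nginx service on Nginx1:

/usr/local/nginx/sbin/nginx -s stop

Once the check script tracked by keepalived on the Master node reports that Nginx is no longer running, the Master gives up its claim to the virtual IP: depending on the script's weight, it either advertises a lower priority or enters the FAULT state and stops sending advertisements. The Backup node then takes over the virtual IP.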

Redis + keepalived

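Basic idea

The approach relies on keepalived's notify_master option: when the backup node is promoted to master, keepalived calls the specified script, which promotes the Redis instance on that node from replica to master. This works around replicas being read-only in Redis master/replica mode and so provides high availability for Redis.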
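
Besides notify_master, keepalived has a generic notify hook that calls a script with four arguments: the type (GROUP or INSTANCE), the name, the target state, and the priority. The sketch below only illustrates how such a handler could map a state to a Redis command. The helper name redis_args_for_state and the PEER_IP variable are assumptions made for this example, and the script prints the command instead of running it.

```shell
#!/bin/bash
# Sketch: map a keepalived state to the redis-cli arguments that would match it.
# redis_args_for_state and PEER_IP are hypothetical names used for illustration.
PEER_IP="${PEER_IP:-192.168.250.101}"    # assumed address of the other Redis node

redis_args_for_state() {
    case "$1" in
        MASTER) echo "replicaof no one" ;;           # promote the local Redis
        BACKUP) echo "replicaof $PEER_IP 6379" ;;    # follow the other node again
        *)      echo "" ;;                           # FAULT, STOP, ...: leave Redis alone
    esac
}

# keepalived invokes a notify script as: <script> TYPE NAME STATE PRIORITY
state="${3:-}"
args=$(redis_args_for_state "$state")
if [ -n "$args" ]; then
    echo "would run: redis-cli $args"
fi
```

Printing instead of executing keeps the sketch safe to try by hand, for example with the arguments INSTANCE VI_1 MASTER 100.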

Environment setup

hostname      ip               os                   role
redis1.local  192.168.250.101  Ubuntu Server 22.04  master
redis2.local  192.168.250.102  Ubuntu Server 22.04  backup

Install the base dependencies

  1. make
  2. gcc, g++
  3. libsystemd-dev
  4. openssl, libssl-dev

Installing Redis

Download the source

https://download.redis.io/redis-stable.tar.gz

tar -zxvf redis-stable.tar.gz
cd redis-stable

Compile and install

make USE_SYSTEMD=yes BUILD_TLS=yes && make install

Configuration files

mkdir /etc/redis
cp redis.conf sentinel.conf /etc/redis

Edit /etc/redis/redis.conf and change the following settings:

  1. daemonize to yes
  2. bind to 0.0.0.0
  3. protected-mode to no
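
If the same change has to be made on both nodes, it can be scripted. This is a minimal sketch, not part of Redis itself: set_option is a hypothetical helper, it assumes GNU sed (as shipped with Ubuntu), and it only rewrites uncommented lines that start with the given key.

```shell
#!/bin/bash
# Sketch: set "key value" in a redis.conf-style file.
# set_option is a hypothetical helper; it assumes GNU sed for the -i flag.
set_option() {
    local file=$1 key=$2 value=$3
    if grep -q "^$key " "$file"; then
        # Rewrite every active (uncommented) line for this key.
        sed -i "s|^$key .*|$key $value|" "$file"
    else
        # Key not set yet: append it.
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage against the file edited in this section (run as root):
#   set_option /etc/redis/redis.conf daemonize yes
#   set_option /etc/redis/redis.conf bind 0.0.0.0
#   set_option /etc/redis/redis.conf protected-mode no
```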

Test run

sysctl vm.overcommit_memory=1
redis-server /etc/redis/redis.conf

Add a service

Add a service unit named redis.service under /etc/systemd/system/ with the following content:

[Unit]
Description=Redis Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/redis-server /etc/redis/redis.conf --supervised systemd
ExecStop=/usr/local/bin/redis-cli shutdown
Type=notify
User=root

[Install]
WantedBy=multi-user.target

systemctl daemon-reload
systemctl enable redis
systemctl start redis

Data persistence

Add the following to the Redis configuration file, then restart Redis:

save 900 1
save 300 10
save 60 10000

systemctl restart redis

Set up master/replica replication

On the backup node, add the following to the Redis configuration file, then restart Redis:

replicaof 192.168.250.101 6379

systemctl restart redis
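
To confirm that replication is working, the output of redis-cli info replication can be inspected on the backup node. The helper below is a small sketch for pulling a single field out of that output; info_field is a made-up name, while role and master_link_status are fields of the INFO replication section.

```shell
#!/bin/bash
# Sketch: extract one field from `redis-cli info` output read on stdin.
# info_field is a hypothetical helper name.
info_field() {
    # INFO lines look like "key:value" and end with CRLF, so strip the trailing \r.
    awk -F: -v key="$1" '$1 == key { sub(/\r$/, "", $2); print $2 }'
}

# Usage on the backup node:
#   redis-cli info replication | info_field role
#   redis-cli info replication | info_field master_link_status
```

On a healthy replica the first command should report slave and the second up. If master_link_status is down, the replica cannot reach the master.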

High-availability configuration

Check scripts

/usr/local/src/redis_check.sh

This script checks whether Redis is responding.

#!/bin/bash

response=$(redis-cli PING)
if [[ "$response" == "PONG" ]]; then
    exit 0
else
    exit 1
fi

/usr/local/src/redis_master.sh

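This script runs when the server is switched to master. It first checks whether replication from the previous master is still in progress, then promotes the local Redis instance with slaveof no one. Demoting the Redis instance on the other server to a replica, for example by logging in with expect, is not covered by the script below.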

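#!/bin/bash
# Promote the local Redis to master when keepalived makes this node MASTER.
rediscli="/usr/local/bin/redis-cli"
# master_sync_in_progress exists only on replicas: 0 = sync finished, 1 = still syncing.
sync=$($rediscli info replication | grep master_sync_in_progress | awk -F: '{print $2}' | tr -d '\r')

if [ "$sync" = "0" ]; then
    $rediscli slaveof no one
elif [ "$sync" = "1" ]; then
    # Give a running sync a few seconds to finish before promoting.
    sleep 10
    $rediscli slaveof no one
else
    echo "this host is master, do nothing"
fi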

keepalived configuration file

/etc/keepalived/keepalived.conf

Master

global_defs {
    script_user root
    enable_script_security
    notification_email {
        yvling.cn@outlook.com
    }
    notification_email_from 1111111111@qq.com
    smtp_server 192.168.250.101       # SMTP server address (here, the master's IP)
    smtp_connection_timeout 30
    router_id LVS_DEVEL               # string identifying this keepalived node; it does not have to resolve via /etc/hosts
}

vrrp_script chk_redis {
  script "/usr/local/src/redis_check.sh"
  interval 2    # health check interval, in seconds
  weight 30     # priority bonus while the check succeeds
  fall 2        # two consecutive failures mark the check as failed
  rise 2        # two consecutive successes mark the check as healthy
}

vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 100
    priority 100
    advert_int 1

    authentication {
        auth_type PASS
        auth_pass 1111
    }

    virtual_ipaddress {
        192.168.250.130
    }

    track_script {
        chk_redis
    }

    notify_master "/usr/local/src/redis_master.sh" # script to run when this node becomes master
}

Backup

global_defs {
    script_user root
    enable_script_security
    notification_email {
        yvling.cn@outlook.com
    }
    notification_email_from 1111111111@qq.com
    smtp_server 192.168.250.101       # SMTP server address (here, the master's IP)
    smtp_connection_timeout 30
    router_id LVS_DEVEL               # string identifying this keepalived node; it does not have to resolve via /etc/hosts
}

vrrp_script chk_redis {
  script "/usr/local/src/redis_check.sh"
  interval 2    # health check interval, in seconds
  weight 30     # priority bonus while the check succeeds
  fall 2        # two consecutive failures mark the check as failed
  rise 2        # two consecutive successes mark the check as healthy
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 100
    priority 90
    advert_int 1

    authentication {
        auth_type PASS
        auth_pass 1111
    }

    virtual_ipaddress {
        192.168.250.130
    }

    track_script {
        chk_redis
    }

    notify_master "/usr/local/src/redis_master.sh" # script to run when this node becomes master
}
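
The interplay of priority and weight in these two files can be checked with a little arithmetic. The sketch below models the documented behaviour for non-zero weights only: a positive weight is added while the check passes, and a negative weight is applied while it fails. The function name effective_priority is made up for illustration, and a weight of 0, which puts the instance into the FAULT state instead, is not modelled.

```shell
#!/bin/bash
# Sketch of how keepalived adjusts priority for a tracked script.
# effective_priority BASE WEIGHT STATUS   (STATUS: 0 = check passing)
# effective_priority is a hypothetical name used only for this illustration.
effective_priority() {
    local base=$1 weight=$2 status=$3
    if [ "$weight" -gt 0 ] && [ "$status" -eq 0 ]; then
        echo $((base + weight))    # positive weight: bonus while the check passes
    elif [ "$weight" -lt 0 ] && [ "$status" -ne 0 ]; then
        echo $((base + weight))    # negative weight: penalty while the check fails
    else
        echo "$base"
    fi
}

# Both nodes healthy: master 130, backup 120, so the master keeps the VIP.
effective_priority 100 30 0
effective_priority 90 30 0
# Redis down on the master: it drops back to 100, below the backup's 120.
effective_priority 100 30 1
```

With weight 30 and base priorities 100 and 90, a failed check on the master is enough for the backup to win the election, which is what triggers notify_master there.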

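Test high availability

In normal operation the master node serves requests.

The backup runs as a replica and synchronizes the master's data.

After the service on the master is stopped, keepalived switches to the backup node and the script promotes the backup to master, so reads and writes continue to work.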

RabbitMQ + keepalived

Environment setup

hostname         ip               os                   role
rabbitmq1.local  192.168.250.101  Ubuntu Server 22.04  master
rabbitmq2.local  192.168.250.102  Ubuntu Server 22.04  backup

Add the name resolution records to /etc/hosts on both machines:

192.168.250.101 rabbitmq1
192.168.250.102 rabbitmq2

Set the hostname in /etc/hostname, using rabbitmq1 on the first machine and rabbitmq2 on the second:

rabbitmq1
rabbitmq2

Installing RabbitMQ

#!/bin/sh

sudo apt-get install curl gnupg apt-transport-https -y

## Team RabbitMQ's main signing key
curl -1sLf "https://keys.openpgp.org/vks/v1/by-fingerprint/0A9AF2115F4687BD29803A206B73A36E6026DFCA" | sudo gpg --dearmor | sudo tee /usr/share/keyrings/com.rabbitmq.team.gpg > /dev/null
## Community mirror of Cloudsmith: modern Erlang repository
curl -1sLf https://github.com/rabbitmq/signing-keys/releases/download/3.0/cloudsmith.rabbitmq-erlang.E495BB49CC4BBE5B.key | sudo gpg --dearmor | sudo tee /usr/share/keyrings/rabbitmq.E495BB49CC4BBE5B.gpg > /dev/null
## Community mirror of Cloudsmith: RabbitMQ repository
curl -1sLf https://github.com/rabbitmq/signing-keys/releases/download/3.0/cloudsmith.rabbitmq-server.9F4587F226208342.key | sudo gpg --dearmor | sudo tee /usr/share/keyrings/rabbitmq.9F4587F226208342.gpg > /dev/null

## Add apt repositories maintained by Team RabbitMQ
sudo tee /etc/apt/sources.list.d/rabbitmq.list <<EOF
## Provides modern Erlang/OTP releases
##
deb [arch=amd64 signed-by=/usr/share/keyrings/rabbitmq.E495BB49CC4BBE5B.gpg] https://ppa1.rabbitmq.com/rabbitmq/rabbitmq-erlang/deb/ubuntu jammy main
deb-src [signed-by=/usr/share/keyrings/rabbitmq.E495BB49CC4BBE5B.gpg] https://ppa1.rabbitmq.com/rabbitmq/rabbitmq-erlang/deb/ubuntu jammy main

# another mirror for redundancy
deb [arch=amd64 signed-by=/usr/share/keyrings/rabbitmq.E495BB49CC4BBE5B.gpg] https://ppa2.rabbitmq.com/rabbitmq/rabbitmq-erlang/deb/ubuntu jammy main
deb-src [signed-by=/usr/share/keyrings/rabbitmq.E495BB49CC4BBE5B.gpg] https://ppa2.rabbitmq.com/rabbitmq/rabbitmq-erlang/deb/ubuntu jammy main

## Provides RabbitMQ
##
deb [arch=amd64 signed-by=/usr/share/keyrings/rabbitmq.9F4587F226208342.gpg] https://ppa1.rabbitmq.com/rabbitmq/rabbitmq-server/deb/ubuntu jammy main
deb-src [signed-by=/usr/share/keyrings/rabbitmq.9F4587F226208342.gpg] https://ppa1.rabbitmq.com/rabbitmq/rabbitmq-server/deb/ubuntu jammy main

# another mirror for redundancy
deb [arch=amd64 signed-by=/usr/share/keyrings/rabbitmq.9F4587F226208342.gpg] https://ppa2.rabbitmq.com/rabbitmq/rabbitmq-server/deb/ubuntu jammy main
deb-src [signed-by=/usr/share/keyrings/rabbitmq.9F4587F226208342.gpg] https://ppa2.rabbitmq.com/rabbitmq/rabbitmq-server/deb/ubuntu jammy main
EOF

## Update package indices
sudo apt-get update -y

## Install Erlang packages
sudo apt-get install -y erlang-base \
                        erlang-asn1 erlang-crypto erlang-eldap erlang-ftp erlang-inets \
                        erlang-mnesia erlang-os-mon erlang-parsetools erlang-public-key \
                        erlang-runtime-tools erlang-snmp erlang-ssl \
                        erlang-syntax-tools erlang-tftp erlang-tools erlang-xmerl

## Install rabbitmq-server and its dependencies
sudo apt-get install rabbitmq-server -y --fix-missing

# Start the rabbitmq service
service rabbitmq-server start

# Stop the rabbitmq service
service rabbitmq-server stop

# Restart the rabbitmq service
service rabbitmq-server restart

# Show the rabbitmq status
service rabbitmq-server status

Enable the management plugin

rabbitmq-plugins enable rabbitmq_management

The management UI is then available on port 15672.

Configure a basic cluster

Stop the rabbitmq service on all nodes:

service rabbitmq-server stop

On rabbitmq1 (192.168.250.101), run:

chmod +777 /var/lib/rabbitmq/.erlang.cookie

On rabbitmq2 (192.168.250.102), run:

chmod +777 /var/lib/rabbitmq/.erlang.cookie
scp -r 192.168.250.101:/var/lib/rabbitmq/.erlang.cookie /var/lib/rabbitmq/.erlang.cookie

Once the copy has finished, run the following on both machines:

chmod 400 /var/lib/rabbitmq/.erlang.cookie

Start the service again on both machines with service rabbitmq-server start, then check the cluster status with:

rabbitmqctl cluster_status

Join the cluster

On rabbitmq2 (192.168.250.102), run:

rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@rabbitmq1
rabbitmqctl start_app

Set up an account

rabbitmqctl add_user admin 123456
rabbitmqctl set_permissions -p / admin ".*" ".*" ".*"
rabbitmqctl set_user_tags admin administrator

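The basic cluster is now set up.

Configure quorum queues

Quorum queues were introduced in version 3.8 as a replacement for mirrored queues and follow a leader/follower model. All requests are still handled by the leader and then replicated to the followers, but any given node may be the leader of one quorum queue and a follower of others. If a leader goes down, one of its followers becomes the new leader and, where possible, a replacement replica is created on another node so that the number of replicas stays the same.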
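
A quorum queue stays available only while a majority of its members is online, so the number of nodes matters. The arithmetic below is a small illustration; the function names are made up for this example.

```shell
#!/bin/bash
# Majority needed for a group of N members.
quorum_size() { echo $(( $1 / 2 + 1 )); }
# Members that can fail while a majority is still online.
tolerated_failures() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 2 3 5; do
    echo "members=$n quorum=$(quorum_size "$n") tolerated=$(tolerated_failures "$n")"
done
```

For the two-node cluster built in this section the majority is 2, so a quorum queue cannot survive the loss of either node. Three nodes are the smallest setup that tolerates one failure.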

To use one, simply select the quorum type when creating the queue.
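
The same can be done without the web UI through the management plugin's HTTP API, where the queue type is passed as the x-queue-type argument. This is a sketch under assumptions: the host, the admin account created above, and the queue name orders are placeholders, and the function names are made up for illustration.

```shell
#!/bin/bash
# Sketch: declare a quorum queue through the management HTTP API.
# Host, credentials and queue name are assumptions for illustration.
RABBIT_HOST="${RABBIT_HOST:-192.168.250.101}"
RABBIT_USER="${RABBIT_USER:-admin}"
RABBIT_PASS="${RABBIT_PASS:-123456}"

# JSON body: quorum queues must be durable; the type is chosen with x-queue-type.
quorum_queue_body() {
    printf '%s' '{"durable":true,"arguments":{"x-queue-type":"quorum"}}'
}

# declare_quorum_queue NAME
# Creates the queue in the default vhost "/", which is URL-encoded as %2F.
declare_quorum_queue() {
    curl -sS -u "$RABBIT_USER:$RABBIT_PASS" \
         -X PUT -H 'content-type: application/json' \
         -d "$(quorum_queue_body)" \
         "http://$RABBIT_HOST:15672/api/queues/%2F/$1"
}

# Usage:
#   declare_quorum_queue orders
```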