
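What is Redis Cluster

  1. Redis Cluster is an installation in which data is shared across multiple Redis nodes.
  2. Redis Cluster does not support commands that operate on multiple keys when those keys live in different hash slots, because executing them would require moving data between nodes; under high load this would degrade cluster performance and lead to unpredictable behavior.
  3. Redis Cluster provides a degree of availability through partitioning: even if some nodes fail or become unreachable, the cluster can continue to process commands.
  4. Redis Cluster can automatically split data across multiple nodes.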

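Features of Redis Cluster

  • High performance

    1. The 16384 hash slots are distributed evenly across the shard nodes.

    2. When a key is stored, the cluster computes crc16(key) and takes the result modulo 16384 to obtain the slot number (0 to 16383).

    3. The slot number determines which shard owns the key; the data is stored in that slot on the shard's master node.

    4. If the client is connected to a node that does not own the slot, the cluster redirects the client to the node that does, and the data is stored there.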

      image-20230529201636627

      image-20230529201639958

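  • High availability

    • When the cluster is built, each shard's master node is paired with a replica node that replicates it (as with slaveof). If a master goes down, the cluster performs an automatic failover, similar to Sentinel.
    • A Redis Cluster client can connect to any node.
    • Example: name xxx, slot 5
    • As the figure shows, when a client connected to shard A asks for a key whose slot does not belong to shard A,
      the cluster automatically redirects the request to the shard that owns it.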

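Redis Cluster data sharing (design concept)

Redis Cluster uses sharding rather than consistent hashing: a cluster
contains 16384 hash slots, and every key in the database belongs to exactly one of them. The cluster
uses the formula CRC16(key) % 16384 to decide which slot a key belongs to, where CRC16(key) is the
CRC16 checksum of the key.

For example, in a cluster with three nodes:

  1. Node A handles hash slots 0 to 5500.
  2. Node B handles hash slots 5501 to 11000.
  3. Node C handles hash slots 11001 to 16383.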
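
The slot formula can be tried without a running cluster. The following is a minimal sketch in POSIX shell; the function names crc16 and key_slot are made up for this illustration, and hash tags ({...}) are not handled. It assumes the CRC16 variant that Redis Cluster documents (XMODEM: polynomial 0x1021, initial value 0).

```shell
#!/bin/sh
# Sketch only: crc16 and key_slot are illustrative helper names.
# Hash tags such as {user}:1 are not handled here.

# CRC16/XMODEM: polynomial 0x1021, initial value 0x0000, no bit reflection.
crc16() (
  crc=0
  # od prints the key's bytes as unsigned decimal numbers.
  for byte in $(printf '%s' "$1" | od -An -v -tu1); do
    crc=$(( crc ^ (byte << 8) ))
    bit=0
    while [ "$bit" -lt 8 ]; do
      if [ $(( crc & 0x8000 )) -ne 0 ]; then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
      bit=$(( bit + 1 ))
    done
  done
  echo "$crc"
)

# Slot = CRC16(key) mod 16384, so the result is always in 0..16383.
key_slot() {
  echo $(( $(crc16 "$1") % 16384 ))
}

key_slot name
key_slot age
```

Because the result is taken modulo 16384, the highest slot a key can land in is 16383.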

image-20230529201847390

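How Redis Cluster works

All Redis nodes are interconnected (PING-PONG mechanism) and use a binary protocol internally to optimize transfer speed and bandwidth.
A node is marked as failed only when more than half of the master nodes in the cluster detect the failure.
Clients connect to Redis nodes directly, with no proxy layer in between. A client does not need to connect to every node; connecting to any one available node is enough.
The cluster maps all physical nodes onto the slots [0-16383] and maintains the node<->slot<->key mapping.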
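
To make the "more than half of the masters" rule concrete, here is a small sketch that computes how many masters must agree for a given cluster size. The name fail_quorum is hypothetical and is not a Redis command; this only illustrates the arithmetic, not the detection protocol itself.

```shell
#!/bin/sh
# Sketch only: fail_quorum is an illustrative name, not a Redis command.
# Prints the smallest number of masters that is strictly more than half.
fail_quorum() {
  echo $(( $1 / 2 + 1 ))
}

fail_quorum 3   # three masters, as in the node A/B/C example
fail_quorum 4
```

With three masters, two must agree; adding a fourth master raises the requirement to three.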

How Redis Cluster handles replication

image-20230529201938009

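Redis Cluster failover

  1. Nodes in the cluster monitor each other to detect nodes that have gone offline.
  2. When a master goes offline, the other masters in the cluster are responsible for failing it over.
  3. In other words, cluster nodes integrate Sentinel-like functions such as offline detection and failover.
  4. Sentinel is a standalone monitoring program, whereas the cluster's offline detection and failover are built into the nodes themselves. Because the two run in very different ways, the cluster implementation does not reuse Sentinel's code, even though the functionality is similar.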

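Two cases when executing commands in Redis Cluster

  1. The command reaches the correct node: the slot of the key is handled by the node that received the command, so the node executes it just like a standalone Redis server would.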

    image-20230529202031799

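  2. The command reaches the wrong node: the receiving node does not handle the key's slot, so it returns a
    redirection error telling the client which node should execute the command. The client then uses the
    information in the error to resend the command to the correct node.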

    image-20230529202041806
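
The redirection error is a line of the form "MOVED <slot> <host>:<port>". As a rough sketch of what a client does with it, the helper below (parse_moved, a name invented here) splits such a reply into slot, host and port. The sample values are the ones that appear in the test session later in this article.

```shell
#!/bin/sh
# Sketch only: parse_moved is an illustrative helper, not part of redis-cli.
# Input:  a redirection error such as "MOVED 5798 127.0.0.1:7001"
# Output: "<slot> <host> <port>" on success, non-zero status otherwise.
parse_moved() (
  set -f          # no filename expansion while splitting
  set -- $1       # split the reply into space-separated fields
  [ "$#" -eq 3 ] && [ "$1" = "MOVED" ] || exit 1
  # Host is everything before the last colon, port everything after it.
  echo "$2 ${3%:*} ${3##*:}"
)

parse_moved "MOVED 5798 127.0.0.1:7001"
```

A cluster-aware client would reconnect to the printed host and port and resend the same command there.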

Installing and deploying Redis Cluster

Preparing the environment

# Prepare 6 Redis instances

# Change to the /data directory
cd /data

# Create a directory for each instance
mkdir {7000..7005}
ll
drwxr-xr-x 2 root root 6 May 29 11:50 7000
drwxr-xr-x 2 root root 6 May 29 11:50 7001
drwxr-xr-x 2 root root 6 May 29 11:50 7002
drwxr-xr-x 2 root root 6 May 29 11:50 7003
drwxr-xr-x 2 root root 6 May 29 11:50 7004
drwxr-xr-x 2 root root 6 May 29 11:50 7005

# Edit the configuration file
vim 7000/redis.conf
port 7000
daemonize yes
pidfile /data/7000/redis.pid
loglevel notice
logfile "/data/7000/redis.log"
dbfilename dump.rdb
dir /data/7000
protected-mode no
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes

# Create the same file for every other instance, replacing 7000 with that instance's port
# (700* stands for 7001 to 7005), for example in vim:
:%s#7000#700*#g

# Start all instances
redis-server /data/7000/redis.conf
redis-server /data/7001/redis.conf
redis-server /data/7002/redis.conf
redis-server /data/7003/redis.conf
redis-server /data/7004/redis.conf
redis-server /data/7005/redis.conf
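
Editing six nearly identical files by hand is error-prone, so the configuration step could also be scripted. The sketch below writes the same settings as the 7000 example for each port; write_cluster_conf is a made-up helper name, and the demo writes into a temporary directory rather than /data so that it is safe to try.

```shell
#!/bin/sh
# Sketch only: write_cluster_conf is an illustrative helper name.
# Usage: write_cluster_conf <base_dir> <port>
write_cluster_conf() (
  base=$1
  port=$2
  mkdir -p "$base/$port"
  cat > "$base/$port/redis.conf" <<EOF
port $port
daemonize yes
pidfile $base/$port/redis.pid
loglevel notice
logfile "$base/$port/redis.log"
dbfilename dump.rdb
dir $base/$port
protected-mode no
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
EOF
)

# Demo in a throwaway directory; pass /data instead to match the layout above.
demo_dir=$(mktemp -d)
for port in 7000 7001 7002 7003 7004 7005; do
  write_cluster_conf "$demo_dir" "$port"
done
ls "$demo_dir"
```

Each instance would then be started with redis-server and the path to its own redis.conf, as in the commands above.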

Installing the Ruby redis gem and switching the gem source

# Install the Ruby environment
yum install ruby rubygems -y

# Switch the gem source: list the current sources first
gem sources -l
*** CURRENT SOURCES ***
https://rubygems.org/

# Remove the old source
gem sources --remove https://rubygems.org/
https://rubygems.org/ removed from sources

# Add the Aliyun mirror
gem sources -a http://mirrors.aliyun.com/rubygems/
http://mirrors.aliyun.com/rubygems/ added to sources

# Install the redis gem (used by the legacy redis-trib.rb script)
gem install redis -v 3.3.3

Adding the nodes to the cluster

# Old syntax (redis-trib.rb; no longer works on newer versions)
redis-trib.rb create --replicas 1 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005

# New syntax
redis-cli --cluster create 127.0.0.1:7000 127.0.0.1:7001 \
127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 \
--cluster-replicas 1
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 127.0.0.1:7004 to 127.0.0.1:7000
Adding replica 127.0.0.1:7005 to 127.0.0.1:7001
Adding replica 127.0.0.1:7003 to 127.0.0.1:7002
>>> Trying to optimize slaves allocation for anti-affinity
[WARNING] Some slaves are in the same host as their master
# M = master
M: d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 127.0.0.1:7000
slots:[0-5460] (5461 slots) master
M: f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3 127.0.0.1:7001
slots:[5461-10922] (5462 slots) master
M: 558e80ab1811b1bf03a66c3adb5cdcf112c45712 127.0.0.1:7002
slots:[10923-16383] (5461 slots) master
# S = replica (slave)
S: 3397e08cf9700c6640137b60a3dfeaeaefd269ce 127.0.0.1:7003
replicates d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d
S: 755d27462fa13bc1e8d5af4658b512d3129f6746 127.0.0.1:7004
replicates f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3
S: d7b13a06207313bf1c18f7ce9b14514ddc4e4545 127.0.0.1:7005
replicates 558e80ab1811b1bf03a66c3adb5cdcf112c45712
Can I set the above configuration? (type 'yes' to accept): yes

# Check the status of the cluster's master nodes
redis-cli -p 7000 cluster nodes | grep master
f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3 127.0.0.1:7001@17001 master - 0 1685333907544 2 connected 5461-10922
558e80ab1811b1bf03a66c3adb5cdcf112c45712 127.0.0.1:7002@17002 master - 0 1685333908258 3 connected 10923-16383
d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 127.0.0.1:7000@17000 myself,master - 0 1685333907000 1 connected 0-5460

# Check the status of the cluster's replica nodes
redis-cli -p 7000 cluster nodes | grep slave
755d27462fa13bc1e8d5af4658b512d3129f6746 127.0.0.1:7004@17004 slave f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3 0 1685333950000 2 connected
3397e08cf9700c6640137b60a3dfeaeaefd269ce 127.0.0.1:7003@17003 slave d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 0 1685333950452 1 connected
d7b13a06207313bf1c18f7ce9b14514ddc4e4545 127.0.0.1:7005@17005 slave 558e80ab1811b1bf03a66c3adb5cdcf112c45712 0 1685333949000 3 connected
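
The three slot ranges in the output come from dividing 16384 slots as evenly as possible among the masters. The sketch below reproduces those ranges with integer arithmetic. It is an illustration of the even split, not a copy of redis-cli's code, and slot_ranges is a hypothetical name.

```shell
#!/bin/sh
# Sketch only: slot_ranges is an illustrative helper, not redis-cli's code.
# Splits slots 0..16383 across <masters> nodes, rounding each boundary to the
# nearest integer and giving the last master whatever remains.
slot_ranges() (
  masters=$1
  total=16384
  first=0
  i=0
  while [ "$i" -lt "$masters" ]; do
    if [ "$i" -eq $(( masters - 1 )) ]; then
      last=$(( total - 1 ))
    else
      # round((i + 1) * total / masters) - 1, using integer arithmetic only
      last=$(( (2 * (i + 1) * total + masters) / (2 * masters) - 1 ))
    fi
    echo "Master[$i] -> Slots $first - $last"
    first=$(( last + 1 ))
    i=$(( i + 1 ))
  done
)

slot_ranges 3
```

Since 16384 is not divisible by 3, one master ends up with one slot more than the others (5462 instead of 5461), which matches the output above.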

Testing

# Connect to the cluster
redis-cli -c -p 7000
# -c  enable cluster mode, so redis-cli follows redirections
# -p  port to connect to (7000)

# Write data
127.0.0.1:7000> set name zzz
-> Redirected to slot [5798] located at 127.0.0.1:7001
OK
127.0.0.1:7001> get name
"zzz"
127.0.0.1:7001>

# Connect to a replica and read the data
redis-cli -c -p 7005
127.0.0.1:7005> get name
-> Redirected to slot [5798] located at 127.0.0.1:7001
"zzz"
127.0.0.1:7001> set age 18
-> Redirected to slot [741] located at 127.0.0.1:7000
OK
127.0.0.1:7000>

# Connect to another replica and check the data
redis-cli -c -p 7004
127.0.0.1:7004> get age
-> Redirected to slot [741] located at 127.0.0.1:7000
"18"
127.0.0.1:7000>
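
The redirections in this session follow directly from the slot ranges assigned when the cluster was created. As a small sketch, the lookup below maps a slot number to the master that owns it. The ranges are hard-coded from the "cluster create" output rather than read from a live cluster, and slot_owner is a made-up name.

```shell
#!/bin/sh
# Sketch only: slot_owner is an illustrative helper. The ranges are copied
# from the "cluster create" output above, not read from a live cluster.
slot_owner() (
  slot=$1
  while read -r first last node; do
    if [ "$slot" -ge "$first" ] && [ "$slot" -le "$last" ]; then
      echo "$node"
      exit 0
    fi
  done <<EOF
0 5460 127.0.0.1:7000
5461 10922 127.0.0.1:7001
10923 16383 127.0.0.1:7002
EOF
  exit 1   # slot outside 0..16383
)

slot_owner 5798   # slot of the key "name"
slot_owner 741    # slot of the key "age"
```

Slot 5798 falls in the range owned by 7001 and slot 741 in the range owned by 7000, which is where redis-cli was redirected.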

Scaling out the Redis Cluster

Preparing the environment

# Prepare directories for the new nodes (run from /data)
mkdir 7006 7007

# Prepare the configuration file for instance 7006
vim 7006/redis.conf
port 7006
daemonize yes
pidfile /data/7006/redis.pid
loglevel notice
logfile "/data/7006/redis.log"
dbfilename dump.rdb
dir /data/7006
protected-mode no
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes

# Prepare the configuration file for instance 7007
vim 7007/redis.conf
port 7007
daemonize yes
pidfile /data/7007/redis.pid
loglevel notice
logfile "/data/7007/redis.log"
dbfilename dump.rdb
dir /data/7007
protected-mode no
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes

# Start the instances
redis-server /data/7006/redis.conf
redis-server /data/7007/redis.conf

Adding the new node to the cluster

# Add the new node to the cluster (new node first, then any existing node)
redis-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000

# Check the masters; 7006 has joined as a master but owns no slots yet
redis-cli -p 7000 cluster nodes | grep master
f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3 127.0.0.1:7001@17001 master - 0 1685334460740 2 connected 5461-10922
558e80ab1811b1bf03a66c3adb5cdcf112c45712 127.0.0.1:7002@17002 master - 0 1685334460538 3 connected 10923-16383
d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 127.0.0.1:7000@17000 myself,master - 0 1685334461000 1 connected 0-5460
4624c0afa81cb1dc444781c65ec90270db2c5909 127.0.0.1:7006@17006 master - 0 1685334461564 0 connected

Assigning slots to the new node

# Reshard
redis-cli --cluster reshard 127.0.0.1:7000

# Prompts explained
## How many slots to move
How many slots do you want to move (from 1 to 16384)? 4096
## Which node receives them
What is the receiving node ID? 4624c0afa81cb1dc444781c65ec90270db2c5909   ## ID of 7006

image-20230529202224424

Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1: all   ## type all
## Whether to go ahead with the plan shown above
Do you want to proceed with the proposed reshard plan (yes/no)? yes

# Check the slots of 7006 after resharding
redis-cli -p 7000 cluster nodes | grep master
f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3 127.0.0.1:7001@17001 master - 0 1685334778000 2 connected 6827-10922
558e80ab1811b1bf03a66c3adb5cdcf112c45712 127.0.0.1:7002@17002 master - 0 1685334776620 3 connected 12288-16383
d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 127.0.0.1:7000@17000 myself,master - 0 1685334776000 1 connected 1365-5460
4624c0afa81cb1dc444781c65ec90270db2c5909 127.0.0.1:7006@17006 master - 0 1685334778000 7 connected 0-1364 5461-6826 10923-12287
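
A quick way to confirm that the reshard moved the requested 4096 slots is to add up the ranges listed for 7006. The helper below is a sketch for that arithmetic; count_slots is a name made up here, and the ranges are typed in from the output above.

```shell
#!/bin/sh
# Sketch only: count_slots is an illustrative helper name.
# Arguments: slot ranges "first-last" or single slots, as printed at the end
# of a "cluster nodes" line.
count_slots() (
  total=0
  for range in "$@"; do
    first=${range%-*}
    last=${range#*-}
    total=$(( total + last - first + 1 ))
  done
  echo "$total"
)

count_slots 0-1364 5461-6826 10923-12287   # slots now held by 7006
count_slots 1365-5460                      # slots left on 7000
```

Both calls print 4096, so after resharding the four masters hold equal shares of the 16384 slots.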

Adding a replica node

# Check the replica nodes
redis-cli -p 7000 cluster nodes | grep slave
755d27462fa13bc1e8d5af4658b512d3129f6746 127.0.0.1:7004@17004 slave f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3 0 1685334867000 2 connected
3397e08cf9700c6640137b60a3dfeaeaefd269ce 127.0.0.1:7003@17003 slave d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 0 1685334867590 1 connected
d7b13a06207313bf1c18f7ce9b14514ddc4e4545 127.0.0.1:7005@17005 slave 558e80ab1811b1bf03a66c3adb5cdcf112c45712 0 1685334867000 3 connected

# Add 7007 as a replica of 7006 (the master ID is the ID of 7006)
redis-cli --cluster add-node 127.0.0.1:7007 127.0.0.1:7006 \
--cluster-slave --cluster-master-id 4624c0afa81cb1dc444781c65ec90270db2c5909

# Check the replica nodes
redis-cli -p 7000 cluster nodes | grep slave
5cc4180e7c631efd1ff4940318e5bffd0fb29831 127.0.0.1:7007@17007 slave 4624c0afa81cb1dc444781c65ec90270db2c5909 0 1685335023522 7 connected
755d27462fa13bc1e8d5af4658b512d3129f6746 127.0.0.1:7004@17004 slave f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3 0 1685335024046 2 connected
3397e08cf9700c6640137b60a3dfeaeaefd269ce 127.0.0.1:7003@17003 slave d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 0 1685335023000 1 connected
d7b13a06207313bf1c18f7ce9b14514ddc4e4545 127.0.0.1:7005@17005 slave 558e80ab1811b1bf03a66c3adb5cdcf112c45712 0 1685335022000 3 connected

# Check the master nodes
redis-cli -p 7000 cluster nodes | grep master
f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3 127.0.0.1:7001@17001 master - 0 1685335029000 2 connected 6827-10922
558e80ab1811b1bf03a66c3adb5cdcf112c45712 127.0.0.1:7002@17002 master - 0 1685335029142 3 connected 12288-16383
d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 127.0.0.1:7000@17000 myself,master - 0 1685335028000 1 connected 1365-5460
4624c0afa81cb1dc444781c65ec90270db2c5909 127.0.0.1:7006@17006 master - 0 1685335030000 7 connected 0-1364 5461-6826 10923-12287
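
Copying a 40-character node ID by hand is easy to get wrong. One possible approach is to pull it out of the "cluster nodes" output by port. The function below is a sketch of that idea using awk; node_id_for_port is a hypothetical name, and the sample input is two lines taken from the output above.

```shell
#!/bin/sh
# Sketch only: node_id_for_port is an illustrative helper name.
# Reads "cluster nodes" output on stdin and prints the ID of the node whose
# address (second field, host:port@busport) uses the given port.
node_id_for_port() {
  awk -v port="$1" '{
    split($2, addr, /[:@]/)   # addr[1]=host, addr[2]=port, addr[3]=bus port
    if (addr[2] == port) print $1
  }'
}

# Sample lines copied from the output above; on a live cluster one could pipe
# "redis-cli -p 7000 cluster nodes" into the function instead.
node_id_for_port 7006 <<'EOF'
d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 127.0.0.1:7000@17000 myself,master - 0 1685335028000 1 connected 1365-5460
4624c0afa81cb1dc444781c65ec90270db2c5909 127.0.0.1:7006@17006 master - 0 1685335030000 7 connected 0-1364 5461-6826 10923-12287
EOF
```

The printed ID is the value to pass to --cluster-master-id when attaching a replica to 7006.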

Removing nodes

# Reshard: move the slots off 7006 before removing it
redis-cli --cluster reshard 127.0.0.1:7000

# Prompts explained
## How many slots to reassign (4096, the number given to 7006 earlier)
How many slots do you want to move (from 1 to 16384)? 4096
## Which node receives the slots: 7000
What is the receiving node ID? d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d   ## ID of 7000
## ID of the source node: 7006
Source node #1: 4624c0afa81cb1dc444781c65ec90270db2c5909
## Finish the list of source nodes
Source node #2: done
## Whether to go ahead with the proposed plan
Do you want to proceed with the proposed reshard plan (yes/no)? yes
# Check the master nodes
redis-cli -p 7000 cluster nodes | grep master
f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3 127.0.0.1:7001@17001 master - 0 1685335331067 2 connected 6827-10922
558e80ab1811b1bf03a66c3adb5cdcf112c45712 127.0.0.1:7002@17002 master - 0 1685335331590 3 connected 12288-16383
d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 127.0.0.1:7000@17000 myself,master - 0 1685335331000 8 connected 0-6826 10923-12287

# Check the replica nodes
redis-cli -p 7000 cluster nodes | grep slave
5cc4180e7c631efd1ff4940318e5bffd0fb29831 127.0.0.1:7007@17007 slave d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 0 1685335356075 8 connected
4624c0afa81cb1dc444781c65ec90270db2c5909 127.0.0.1:7006@17006 slave d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 0 1685335356796 8 connected
755d27462fa13bc1e8d5af4658b512d3129f6746 127.0.0.1:7004@17004 slave f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3 0 1685335354746 2 connected
3397e08cf9700c6640137b60a3dfeaeaefd269ce 127.0.0.1:7003@17003 slave d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 0 1685335356000 8 connected
d7b13a06207313bf1c18f7ce9b14514ddc4e4545 127.0.0.1:7005@17005 slave 558e80ab1811b1bf03a66c3adb5cdcf112c45712 0 1685335355261 3 connected
# Remove nodes 7006 and 7007 (both are replicas at this point); del-node takes the node's address and its ID
redis-cli --cluster del-node 127.0.0.1:7006 4624c0afa81cb1dc444781c65ec90270db2c5909   # ID of 7006
redis-cli --cluster del-node 127.0.0.1:7007 5cc4180e7c631efd1ff4940318e5bffd0fb29831   # ID of 7007

# Check the replica nodes
redis-cli -p 7000 cluster nodes | grep slave
755d27462fa13bc1e8d5af4658b512d3129f6746 127.0.0.1:7004@17004 slave f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3 0 1685335517000 2 connected
3397e08cf9700c6640137b60a3dfeaeaefd269ce 127.0.0.1:7003@17003 slave d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 0 1685335517680 8 connected
d7b13a06207313bf1c18f7ce9b14514ddc4e4545 127.0.0.1:7005@17005 slave 558e80ab1811b1bf03a66c3adb5cdcf112c45712 0 1685335518000 3 connected

# Check the master nodes
redis-cli -p 7000 cluster nodes | grep master
f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3 127.0.0.1:7001@17001 master - 0 1685335523000 2 connected 6827-10922
558e80ab1811b1bf03a66c3adb5cdcf112c45712 127.0.0.1:7002@17002 master - 0 1685335522518 3 connected 12288-16383
d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 127.0.0.1:7000@17000 myself,master - 0 1685335523000 8 connected 0-6826 10923-12287
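
After nodes have been removed, it may be worth checking that every one of the 16384 slots is still assigned to some master. The sketch below adds up the slot ranges in "cluster nodes" output; assigned_slots is a made-up helper name, and the input is the three master lines from the final output above.

```shell
#!/bin/sh
# Sketch only: assigned_slots is an illustrative helper name.
# Reads "cluster nodes" output on stdin and adds up the slot ranges listed
# from the 9th field onwards. Entries in brackets (slots being migrated)
# are skipped.
assigned_slots() {
  awk '{
    for (i = 9; i <= NF; i++) {
      if ($i ~ /^\[/) continue
      n = split($i, r, "-")
      total += (n == 2) ? r[2] - r[1] + 1 : 1
    }
  } END { print total + 0 }'
}

# Master lines copied from the final output above.
assigned_slots <<'EOF'
f2642160d58fafbb3d52f9cf26d7ba8e1f96d6e3 127.0.0.1:7001@17001 master - 0 1685335523000 2 connected 6827-10922
558e80ab1811b1bf03a66c3adb5cdcf112c45712 127.0.0.1:7002@17002 master - 0 1685335522518 3 connected 12288-16383
d5ce6172ab9daa77e7004e18fb7dd30a5bd43f3d 127.0.0.1:7000@17000 myself,master - 0 1685335523000 8 connected 0-6826 10923-12287
EOF
```

A total of 16384 means no slot was lost; note that 7000 now holds twice as many slots as the other two masters, so a further reshard could be used to even things out.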