[Erlang 0086] RabbitMQ Cluster: From Scratch

Updated: 2022-08-16 20:41:32

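 An earlier post covered RabbitMQ metadata, which can be kept in RAM or on disc. From that angle, the nodes in a RabbitMQ cluster fall into two kinds: RAM nodes and disc nodes. A RAM node keeps metadata only in RAM, while a disc node persists it to disk.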

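   A single-node system has no choice: it must be a disc node, because with no redundant copy of the data a restart would lose all configuration. In a multi-node environment, however, you can choose which nodes are RAM nodes.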

 

 


Check First

 

  Deploying a RabbitMQ cluster is straightforward. There are a few details to watch, all of them familiar to anyone who has connected Erlang nodes before:

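[1] Use the same Erlang cookie everywhere. The official site describes editing .erlang.cookie, but I have never done it that way; I always pass -setcookie explicitly when starting the Erlang node. The side effect is that rabbitmqctl, which does not specify the cookie, stops working; the fix is to add -setcookie there as well. For convenience I copied rabbitmqctl into a new tool, rabbitmq-util, with the cookie set, as follows: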

exec erl \
    -pa "${RABBITMQ_HOME}/ebin" \
    -noinput \
    -hidden \
    ${RABBITMQ_CTL_ERL_ARGS} \
    -setcookie  zen_rabbitmq \
    -name rabbitmqctl@zen.com \
    -s rabbit_control \
    -nodename $RABBITMQ_NODENAME \
    -extra "$@"

   The scripts that need the same Erlang cookie are rabbitmqctl and rabbitmq-server.

 

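[2] A node created with sname carries no domain in its host part; a node created with name needs a domain, so that domain must resolve, for example through a hosts entry: 127.0.0.1  zen.com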

On Windows the hosts file is c:\Windows\System32\drivers\etc\hosts

On CentOS it is /etc/hosts
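
Before starting nodes with -name, it can help to confirm that the hosts file really has the entry. The following is only a sketch: the helper name has_host_entry is made up for illustration, and it assumes the usual hosts format of an address followed by one or more names.

```bash
# Succeeds if the hosts file ($1) maps some address to the host name ($2).
# has_host_entry is a hypothetical helper, not part of RabbitMQ or Erlang.
has_host_entry() {
    awk -v h="$2" '
        /^[[:space:]]*#/ { next }            # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # stop at a trailing comment
                if ($i == h) found = 1
            }
        }
        END { exit !found }
    ' "$1"
}

if has_host_entry /etc/hosts zen.com; then
    echo "zen.com is present in /etc/hosts"
else
    echo "zen.com is missing from /etc/hosts"
fi
```

On Windows the same check would need the path given above and a shell that provides awk.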

  

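[3] To start several nodes on one machine, they must differ in port and node name. Even when deploying across several machines we usually avoid RabbitMQ's default port. There are many ways to do this; the quickest is to set variables when starting the node, though a production environment should use a configuration file. Here is the set of start commands I used for testing: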

RABBITMQ_NODE_PORT=9991 RABBITMQ_NODENAME=z_91@zen.com ./rabbitmq-server -detached

RABBITMQ_NODE_PORT=9992 RABBITMQ_NODENAME=z_92@zen.com ./rabbitmq-server -detached

RABBITMQ_NODE_PORT=9993 RABBITMQ_NODENAME=z_93@zen.com ./rabbitmq-server -detached

RABBITMQ_NODE_PORT=9994 RABBITMQ_NODENAME=z_94@zen.com ./rabbitmq-server -detached
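
The four commands differ only in a two-digit suffix, so they can be generated. This is a dry-run sketch that only prints the commands; start_cmd is a hypothetical helper, and the mapping from suffix to port 99NN and node z_NN@zen.com is simply the convention used in the commands above.

```bash
# Print the start command for the test node with the given two-digit suffix.
# start_cmd is a hypothetical helper; it prints and does not start anything.
start_cmd() {
    local suffix="$1"
    printf 'RABBITMQ_NODE_PORT=99%s RABBITMQ_NODENAME=z_%s@zen.com ./rabbitmq-server -detached\n' \
        "$suffix" "$suffix"
}

for suffix in 91 92 93 94; do
    start_cmd "$suffix"
done
```

To actually start a node, the printed line can be passed to eval, for example eval "$(start_cmd 91)" from the scripts directory.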

 

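[4] When testing across several physical machines, make sure port 4369 is open so that EPMD works. A different port can be used by setting the environment variable ERL_EPMD_PORT.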
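
A small sketch of that port rule: the hypothetical helper epmd_port returns ERL_EPMD_PORT when it is set and falls back to 4369 otherwise, which is the value a firewall rule would need to allow.

```bash
# Print the EPMD port that applies in the current environment:
# ERL_EPMD_PORT if set and non-empty, otherwise the default 4369.
# epmd_port is a hypothetical helper name.
epmd_port() {
    echo "${ERL_EPMD_PORT:-4369}"
}

echo "EPMD port to open: $(epmd_port)"

# With Erlang installed, the nodes registered with the local EPMD
# can be listed with:
#   epmd -names
```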

 

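[5] Type commands carefully. In particular, the command that stops the application is stop_app; if you run stop instead, the whole node shuts down and every later step goes wrong.

   Some procedures do not need reset, so take care here too and be clear about what you intend before you act.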

 

  Next we go step by step through some common operations for building a RabbitMQ cluster.


Just do it

 

Creating a cluster from scratch

 

  1. Start node z_91@zen.com
  2. Start node z_92@zen.com
  3. Stop the application on node 91:  ./rabbitmq-util -n z_91@zen.com stop_app
  4. Reset the node's configuration and metadata (think of it as a factory reset):  ./rabbitmq-util -n z_91@zen.com reset
  5. Cluster node 91 with node 92:  ./rabbitmq-util -n z_91@zen.com cluster z_92@zen.com
  6. Start the application on node 91:  ./rabbitmq-util -n z_91@zen.com start_app
  7. Check the cluster status:  ./rabbitmq-util -n z_91@zen.com cluster_status
[root@localhost scripts]#
[root@localhost scripts]# RABBITMQ_NODE_PORT=9991 RABBITMQ_NODENAME=z_91@zen.com ./rabbitmq-server -detached
Activating RabbitMQ plugins ...
0 plugins activated:

[root@localhost scripts]# RABBITMQ_NODE_PORT=9992 RABBITMQ_NODENAME=z_92@zen.com ./rabbitmq-server -detached
Activating RabbitMQ plugins ...
0 plugins activated:

[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com stop_app
Stopping node 'z_91@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com reset
Resetting node 'z_91@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com cluster z_92@zen.com
Clustering node 'z_91@zen.com' with ['z_92@zen.com'] ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com start_app
Starting node 'z_91@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com cluster_status
Cluster status of node 'z_91@zen.com' ...
[{nodes,[{disc,['z_92@zen.com']},{ram,['z_91@zen.com']}]},
{running_nodes,['z_92@zen.com','z_91@zen.com']}]
...done.

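   You will have noticed something odd in this result: node 91 issued the cluster command that pulled node 92 into a cluster, yet 92 is the disc node and 91 is a RAM node! What is going on? We will set that aside and come back to it later; first let us finish the experiment.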
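
The sequence just run (stop_app, reset, cluster, start_app, cluster_status) can be written down once as a dry-run helper. This is a sketch only: join_steps is a made-up name, rabbitmq-util is the author's cookie-aware copy of rabbitmqctl, and the function prints the commands rather than running them.

```bash
# Print the commands that make NODE ($1) join a cluster through the
# remaining arguments, in the order used in the transcript above.
# join_steps is a hypothetical helper; nothing is executed.
join_steps() {
    local node="$1"
    shift
    echo "./rabbitmq-util -n $node stop_app"
    echo "./rabbitmq-util -n $node reset"
    echo "./rabbitmq-util -n $node cluster $*"
    echo "./rabbitmq-util -n $node start_app"
    echo "./rabbitmq-util -n $node cluster_status"
}

join_steps z_91@zen.com z_92@zen.com
```

Printing first makes it easy to check for a stray stop where stop_app was meant; once the list looks right it can be piped to sh -e so that execution halts at the first failing step.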

  

Leaving the cluster

 

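  I remember a documentary about 9/11 mentioning that the hijackers, in their flying lessons, learned only how to take off and never how to land; they plainly never meant to come back.

  Having learned to add an Erlang node to a cluster, we should also learn how to take it out. The steps in detail: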

  1. Stop the application on node 91:  ./rabbitmq-util -n z_91@zen.com stop_app
  2. Reset the node's configuration and metadata:  ./rabbitmq-util -n z_91@zen.com reset
  3. Start the application on node 91:  ./rabbitmq-util -n z_91@zen.com start_app
  4. Check the cluster status:  ./rabbitmq-util -n z_92@zen.com cluster_status
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com stop_app
Stopping node 'z_91@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com reset
Resetting node 'z_91@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com start_app
Starting node 'z_91@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_92@zen.com cluster_status                                      
Cluster status of node 'z_92@zen.com' ...
[{nodes,[{disc,['z_92@zen.com']}]},{running_nodes,['z_92@zen.com']}]
...done.

   As you can see, node 91 is no longer in the cluster.

 

Another way to build the cluster

 

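   Now let us build the cluster a different way, to observe how RabbitMQ chooses disc nodes when a cluster is formed. The only difference from the first approach is this command:
   ./rabbitmq-util -n z_91@zen.com cluster z_92@zen.com z_91@zen.com
   After clustering this way, check the cluster status and note that the disc nodes are now:
   [{nodes,[{disc,['z_91@zen.com','z_92@zen.com']}]},{running_nodes,['z_92@zen.com','z_91@zen.com']}]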

[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com stop_app     
Stopping node 'z_91@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com reset
Resetting node 'z_91@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com cluster z_92@zen.com z_91@zen.com
Clustering node 'z_91@zen.com' with ['z_92@zen.com','z_91@zen.com'] ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com start_app
Starting node 'z_91@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com cluster_status
Cluster status of node 'z_91@zen.com' ...
[{nodes,[{disc,['z_91@zen.com','z_92@zen.com']}]},
{running_nodes,['z_92@zen.com','z_91@zen.com']}]
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_92@zen.com cluster_status
Cluster status of node 'z_92@zen.com' ...
[{nodes,[{disc,['z_91@zen.com','z_92@zen.com']}]},
{running_nodes,['z_91@zen.com','z_92@zen.com']}]
...done.
[root@localhost scripts]# 

 

Changing node type: converting 92 from a disc node to a RAM node

 

   The cluster built above has two nodes, 91 and 92, and both are disc nodes. We would like to change a node's type on the fly, for example converting 92 from a disc node to a RAM node. The steps:

  1. Stop the application on node 92:  ./rabbitmq-util -n z_92@zen.com stop_app
  2. Run cluster again:  ./rabbitmq-util -n z_92@zen.com cluster z_91@zen.com
  3. Start the application on node 92:  ./rabbitmq-util -n z_92@zen.com start_app
  4. Check the cluster status:  ./rabbitmq-util -n z_92@zen.com cluster_status

  The cluster status shows that 92 has become a RAM node. Note that no reset was run after stopping the application on 92.

[root@localhost scripts]#  ./rabbitmq-util -n z_92@zen.com stop_app
Stopping node 'z_92@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_92@zen.com cluster z_91@zen.com
Clustering node 'z_92@zen.com' with ['z_91@zen.com'] ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_92@zen.com start_app          
Starting node 'z_92@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_92@zen.com cluster_status
Cluster status of node 'z_92@zen.com' ...
[{nodes,[{disc,['z_91@zen.com']},{ram,['z_92@zen.com']}]},
{running_nodes,['z_91@zen.com','z_92@zen.com']}]
...done.
[root@localhost scripts]# 

  

Changing node type: converting 92 back to a disc node

 

   The procedure above converted 92 from a disc node to a RAM node. Now we do the reverse and turn 92 back into a disc node:

  1. Stop the application on node 92:  ./rabbitmq-util -n z_92@zen.com stop_app
  2. Run cluster again:  ./rabbitmq-util -n z_92@zen.com cluster z_91@zen.com z_92@zen.com
  3. Start the application on node 92:  ./rabbitmq-util -n z_92@zen.com start_app
  4. Check the cluster status:  ./rabbitmq-util -n z_91@zen.com cluster_status
  The key is the cluster command, ./rabbitmq-util -n z_92@zen.com cluster z_91@zen.com z_92@zen.com: when the node list that 92 passes to cluster includes 92 itself, 92 becomes a disc node.
 
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com cluster_status                                  
Cluster status of node 'z_91@zen.com' ...
[{nodes,[{disc,['z_91@zen.com']},{ram,['z_92@zen.com']}]},
{running_nodes,['z_92@zen.com','z_91@zen.com']}]
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_92@zen.com stop_app
Stopping node 'z_92@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_92@zen.com cluster z_91@zen.com z_92@zen.com
Clustering node 'z_92@zen.com' with ['z_91@zen.com','z_92@zen.com'] ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_92@zen.com start_app
Starting node 'z_92@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com cluster_status
Cluster status of node 'z_91@zen.com' ...
[{nodes,[{disc,['z_92@zen.com','z_91@zen.com']}]},
{running_nodes,['z_92@zen.com','z_91@zen.com']}]
...done.
[root@localhost scripts]# 
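
The rule observed in these experiments fits in a few lines: a node that names itself in its own cluster command ends up as a disc node, otherwise as a RAM node. The sketch below only restates what the transcripts show for the cluster command used in this post; node_type is a hypothetical helper and is not taken from RabbitMQ's source.

```bash
# Predict the type NODE ($1) gets after "cluster <remaining arguments>":
# "disc" if NODE appears in its own list, "ram" otherwise.
# node_type is a hypothetical helper that mirrors the observed behaviour.
node_type() {
    local node="$1" member
    shift
    for member in "$@"; do
        if [ "$member" = "$node" ]; then
            echo disc
            return 0
        fi
    done
    echo ram
}

node_type z_91@zen.com z_92@zen.com                # first experiment
node_type z_91@zen.com z_92@zen.com z_91@zen.com   # second experiment
```

The two calls print ram and disc, matching the cluster_status results of the first and second experiments. This also answers the earlier puzzle: in the first experiment 91 did not list itself, so it joined as a RAM node.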

  

Adding node 93

 

   Building on the above, we add a new node, 93:

[root@localhost scripts]# ./rabbitmq-util -n z_93@zen.com cluster_status
Cluster status of node 'z_93@zen.com' ...
[{nodes,[{disc,['z_93@zen.com']}]},{running_nodes,['z_93@zen.com']}]
...done.
[root@localhost scripts]# ./rabbitmq-util -n z_93@zen.com stop_app
Stopping node 'z_93@zen.com' ...
...done.
[root@localhost scripts]# ./rabbitmq-util -n z_93@zen.com reset
Resetting node 'z_93@zen.com' ...
...done.

[root@localhost scripts]# ./rabbitmq-util -n z_93@zen.com cluster z_91@zen.com z_92@zen.com      
Clustering node 'z_93@zen.com' with ['z_91@zen.com','z_92@zen.com'] ...
...done.
[root@localhost scripts]# ./rabbitmq-util -n z_93@zen.com start_app
Starting node 'z_93@zen.com' ...
...done.
[root@localhost scripts]# ./rabbitmq-util -n z_93@zen.com cluster_status
Cluster status of node 'z_93@zen.com' ...
[{nodes,[{disc,['z_92@zen.com','z_91@zen.com']},{ram,['z_93@zen.com']}]},
{running_nodes,['z_92@zen.com','z_91@zen.com','z_93@zen.com']}]
...done.

 

  Now restart the nodes: stop all three, then start the RAM node 93 first:

[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com stop
Stopping and halting node 'z_91@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_92@zen.com stop
Stopping and halting node 'z_92@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_93@zen.com stop
Stopping and halting node 'z_93@zen.com' ...
...done.
[root@localhost scripts]# RABBITMQ_NODE_PORT=9993 RABBITMQ_NODENAME=z_93@zen.com ./rabbitmq-server -detached
Activating RabbitMQ plugins ...
0 plugins activated:

[root@localhost scripts]#  ./rabbitmq-util -n z_93@zen.com cluster_status
Cluster status of node 'z_93@zen.com' ...
Error: unable to connect to node 'z_93@zen.com': nodedown

 

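  Starting 93 failed. 93 is a RAM node and has not persisted the cluster metadata, so at startup it must connect to a disc node to fetch it; since no other node was running, startup failed. Let us try starting node 92 first: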

[root@localhost scripts]# RABBITMQ_NODE_PORT=9992 RABBITMQ_NODENAME=z_92@zen.com ./rabbitmq-server -detached
Activating RabbitMQ plugins ...
0 plugins activated:

[root@localhost scripts]#  ./rabbitmq-util -n z_92@zen.com cluster_status                                   
Cluster status of node 'z_92@zen.com' ...
[{nodes,[{disc,['z_92@zen.com','z_91@zen.com']},{ram,['z_93@zen.com']}]},
{running_nodes,['z_92@zen.com']}]
...done.

 

  That works. Now start node 93:

[root@localhost scripts]# RABBITMQ_NODE_PORT=9993 RABBITMQ_NODENAME=z_93@zen.com ./rabbitmq-server -detached
Activating RabbitMQ plugins ...
0 plugins activated:

[root@localhost scripts]#  ./rabbitmq-util -n z_93@zen.com cluster_status                                   
Cluster status of node 'z_93@zen.com' ...
[{nodes,[{disc,['z_92@zen.com','z_91@zen.com']},{ram,['z_93@zen.com']}]},
{running_nodes,['z_92@zen.com','z_93@zen.com']}]
...done.
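
The startup dependency seen here can be stated as a check: a RAM node should only be started once at least one disc node is running. A minimal sketch, with disc_node_running as a hypothetical helper that takes two space-separated node lists, such as those read off a cluster_status result:

```bash
# Succeeds if at least one node from the disc list ($1) also appears
# in the running list ($2). Both arguments are space-separated names.
# disc_node_running is a hypothetical helper.
disc_node_running() {
    local disc running
    for disc in $1; do
        for running in $2; do
            if [ "$disc" = "$running" ]; then
                return 0
            fi
        done
    done
    return 1
}

# Situation after starting 92: a disc node is up, so 93 may start.
if disc_node_running "z_92@zen.com z_91@zen.com" "z_92@zen.com"; then
    echo "a disc node is running; RAM nodes can be started"
else
    echo "no disc node is running; start a disc node first"
fi
```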

 

 Pressing on, start node 91 as well, and watch how running_nodes changes in cluster_status:

[root@localhost scripts]# RABBITMQ_NODE_PORT=9991 RABBITMQ_NODENAME=z_91@zen.com ./rabbitmq-server -detached
Activating RabbitMQ plugins ...
0 plugins activated:

[root@localhost scripts]#  ./rabbitmq-util -n z_93@zen.com cluster_status                                  
Cluster status of node 'z_93@zen.com' ...
[{nodes,[{disc,['z_92@zen.com','z_91@zen.com']},{ram,['z_93@zen.com']}]},
{running_nodes,['z_91@zen.com','z_92@zen.com','z_93@zen.com']}]
...done.
[root@localhost scripts]# 
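
To follow running_nodes without reading the whole Erlang term each time, the list can be pulled out of the cluster_status text. This is a text-level sketch, not a real Erlang term parser: running_nodes is a hypothetical helper that assumes the output format shown in the transcripts above.

```bash
# Read cluster_status output on stdin and print the running nodes,
# one per line. Works on the text format shown in this post only.
# running_nodes is a hypothetical helper.
running_nodes() {
    tr -d ' \n' \
        | sed -n 's/.*{running_nodes,\[\([^]]*\)]}.*/\1/p' \
        | tr -d "'" \
        | tr ',' '\n'
}

running_nodes <<'EOF'
Cluster status of node 'z_93@zen.com' ...
[{nodes,[{disc,['z_92@zen.com','z_91@zen.com']},{ram,['z_93@zen.com']}]},
{running_nodes,['z_91@zen.com','z_92@zen.com','z_93@zen.com']}]
...done.
EOF
```

With the sample above it prints the three node names, one per line. Against a live node the input would come from ./rabbitmq-util -n z_93@zen.com cluster_status piped into the function.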

 

Publishing messages to the cluster

 

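   We use a short C# program to connect to node 92, declare the queues, and publish two messages. Although the client connected only to node 92, the other nodes in the cluster show the queue information as well.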

[root@localhost scripts]#  ./rabbitmq-util -n z_93@zen.com list_queues
Listing queues ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_93@zen.com list_queues
Listing queues ...
zen_qp_pic_queue        1
qp_pic_queue2   1
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_92@zen.com list_queues 
Listing queues ...
zen_qp_pic_queue        1
qp_pic_queue2   1
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com list_queues
Listing queues ...
zen_qp_pic_queue        1
qp_pic_queue2   1
...done.
[root@localhost scripts]#

 

Adding and removing nodes while no disc node is running

 

 To prepare for the next experiment we add node 94 to the cluster (steps omitted) and check the cluster status:

[root@localhost scripts]#  ./rabbitmq-util -n z_94@zen.com cluster_status
Cluster status of node 'z_94@zen.com' ...
[{nodes,[{disc,['z_92@zen.com','z_91@zen.com']},
         {ram,['z_94@zen.com','z_93@zen.com']}]},
{running_nodes,['z_92@zen.com','z_93@zen.com','z_91@zen.com',
                 'z_94@zen.com']}]
...done.

 

 The cluster now has two disc nodes, 91 and 92, and two RAM nodes, 93 and 94. We stop both disc nodes and check the status from 93:

[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com stop
Stopping and halting node 'z_91@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_92@zen.com stop
Stopping and halting node 'z_92@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_93@zen.com cluster_status
Cluster status of node 'z_93@zen.com' ...
[{nodes,[{disc,['z_92@zen.com','z_91@zen.com']},
         {ram,['z_94@zen.com','z_93@zen.com']}]},
{running_nodes,['z_94@zen.com','z_93@zen.com']}]
...done.
[root@localhost scripts]#

 

Next we add a new node, 95, to the cluster (steps omitted) and look at the result:

[root@localhost scripts]#  ./rabbitmq-util -n z_95@zen.com cluster_status
Cluster status of node 'z_95@zen.com' ...
[{nodes,[{disc,['z_92@zen.com','z_91@zen.com']},
         {ram,['z_95@zen.com','z_94@zen.com','z_93@zen.com']}]},
{running_nodes,['z_94@zen.com','z_93@zen.com','z_95@zen.com']}]
...done.

 

  Now we shut down every node in the cluster and start them again to see what happens:

[root@localhost scripts]#  ./rabbitmq-util -n z_95@zen.com stop
Stopping and halting node 'z_95@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_94@zen.com stop
Stopping and halting node 'z_94@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_93@zen.com stop
Stopping and halting node 'z_93@zen.com' ...
...done.

 

Starting the nodes in the cluster again:

[root@localhost scripts]# RABBITMQ_NODE_PORT=9991 RABBITMQ_NODENAME=z_91@zen.com ./rabbitmq-server -detached
Activating RabbitMQ plugins ...
0 plugins activated:

[root@localhost scripts]#  ./rabbitmq-util -n z_91@zen.com cluster_status
Cluster status of node 'z_91@zen.com' ...
[{nodes,[{disc,['z_92@zen.com','z_91@zen.com']},
         {ram,['z_94@zen.com','z_93@zen.com']}]},
{running_nodes,['z_91@zen.com']}]
...done.
[root@localhost scripts]# RABBITMQ_NODE_PORT=9992 RABBITMQ_NODENAME=z_92@zen.com ./rabbitmq-server -detached
Activating RabbitMQ plugins ...
0 plugins activated:

[root@localhost scripts]# RABBITMQ_NODE_PORT=9993 RABBITMQ_NODENAME=z_93@zen.com ./rabbitmq-server -detached
Activating RabbitMQ plugins ...
0 plugins activated:

[root@localhost scripts]# RABBITMQ_NODE_PORT=9994 RABBITMQ_NODENAME=z_94@zen.com ./rabbitmq-server -detached
Activating RabbitMQ plugins ...
0 plugins activated:

[root@localhost scripts]#  ./rabbitmq-util -n z_93@zen.com cluster_status                                   
Cluster status of node 'z_93@zen.com' ...
[{nodes,[{disc,['z_92@zen.com','z_91@zen.com']},
         {ram,['z_94@zen.com','z_93@zen.com']}]},
{running_nodes,['z_94@zen.com','z_92@zen.com','z_93@zen.com']}]
...done.

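     At this point everything is normal. But if we also start node 95, things go wrong: nodes 91 and 92 may go down, and further restarts end in a very confused state. Note that 95 joined while both disc nodes were stopped, so the status reported by 91 above does not list it. Likewise, removing a node while no disc node is running causes the same confusion and can even prevent the disc nodes from starting; the node has to be added back before the disc nodes will start normally.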

 

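   With all disc nodes shut down, the cluster can still be used as though nothing had happened, but any new exchanges, queues, and so on declared in the meantime vanish when the nodes restart.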


Problems you may run into

 

  Problem 1: How do I start over?

 

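          If your tests have left the node relationships in a mess, every restart fails, and you want to start over but even the reset command fails, don't panic. Go to the /var/lib/rabbitmq/mnesia directory, delete the files that belong to the broken node, and restart it.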
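
Before deleting anything it is worth listing exactly what belongs to the broken node. The sketch below assumes that a node's data sits in entries whose names begin with the node name under the mnesia base directory, which is the usual layout but should be checked on your installation; mnesia_entries is a hypothetical helper and it only lists, it never deletes.

```bash
# List the entries under BASE ($1) whose names start with NODE ($2).
# Assumption: per-node data is stored under names beginning with the
# node name. mnesia_entries is a hypothetical helper; it deletes nothing.
mnesia_entries() {
    local base="$1" node="$2" entry
    for entry in "$base/$node"*; do
        if [ -e "$entry" ]; then
            echo "$entry"
        fi
    done
    return 0
}

mnesia_entries /var/lib/rabbitmq/mnesia z_91@zen.com
```

Once the list has been reviewed and the node is stopped, the listed entries are the ones to remove by hand.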

 

  Problem 2: "Incompatible schema cookies. Please, restart from old backup"

 

   While building a RabbitMQ cluster you may hit the error "Incompatible schema cookies. Please, restart from old backup". The usual cause is running cluster against several nodes that do not themselves form a cluster. To reproduce it, start three independent nodes, 95, 96, and 97, then run ./rabbitmq-util -n z_97@zen.com cluster z_95@zen.com z_96@zen.com. Because 95 and 96 are not clustered with each other, the error above is raised.

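 This mistake is easy to make when you are in a hurry to see results. While you are still learning, working through the steps one at a time is well worth it and avoids some pitfalls.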

 

[root@localhost scripts]#  ./rabbitmq-util -n z_95@zen.com cluster_status
Cluster status of node 'z_95@zen.com' ...
[{nodes,[{disc,['z_95@zen.com']}]},{running_nodes,['z_95@zen.com']}]
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_96@zen.com cluster_status
Cluster status of node 'z_96@zen.com' ...
[{nodes,[{disc,['z_96@zen.com']}]},{running_nodes,['z_96@zen.com']}]
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_97@zen.com cluster_status                                   
Cluster status of node 'z_97@zen.com' ...
[{nodes,[{disc,['z_97@zen.com']}]},{running_nodes,['z_97@zen.com']}]
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_97@zen.com stop_app
Stopping node 'z_97@zen.com' ...
...done.
[root@localhost scripts]#  ./rabbitmq-util -n z_97@zen.com cluster z_95@zen.com z_96@zen.com
Clustering node 'z_97@zen.com' with ['z_95@zen.com','z_96@zen.com'] ...
Error: {unable_to_join_cluster,
           ['z_95@zen.com','z_96@zen.com'],
           {merge_schema_failed,
               "Incompatible schema cookies. Please, restart from old backup.'z_95@zen.com' = [{name,schema},{type,set},{ram_copies,[]},{disc_copies,['z_95@zen.com']},{disc_only_copies,[]},{load_order,0},{access_mode,read_write},{majority,false},{index,[]},{snmp,[]},{local_content,false},{record_name,schema},{attributes,[table,cstruct]},{user_properties,[]},{frag_properties,[]},{storage_properties,[]},{cookie,{{1352,726635,757709},'z_95@zen.com'}},{version,{{3,0},{'z_95@zen.com',{1352,727291,753066}}}}], 'z_97@zen.com' = [{name,schema},{type,set},{ram_copies,['z_97@zen.com']},{disc_copies,['z_96@zen.com']},{disc_only_copies,[]},{load_order,0},{access_mode,read_write},{majority,false},{index,[]},{snmp,[]},{local_content,false},{record_name,schema},{attributes,[table,cstruct]},{user_properties,[]},{frag_properties,[]},{storage_properties,[]},{cookie,{{1352,727184,282322},'z_96@zen.com'}},{version,{{3,0},{'z_97@zen.com',{1352,727291,780429}}}}]\n"}}
[root@localhost scripts]# 
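
One way to avoid this error is to check, before running cluster against several targets, that each target already lists the others in its own cluster_status. The sketch below is a plain text match on that output; status_mentions is a hypothetical helper. It should only be used to look for a node other than the one being queried, because the first line of the output always names the queried node.

```bash
# Succeeds if the cluster_status text on stdin mentions NODE ($1).
# Use it for a node other than the one that produced the status.
# status_mentions is a hypothetical helper.
status_mentions() {
    grep -qF "'$1'"
}

# Status of 95 as shown in the transcript above: 96 is absent, so
# 95 and 96 must not be named together in one cluster command.
if printf '%s\n' "[{nodes,[{disc,['z_95@zen.com']}]},{running_nodes,['z_95@zen.com']}]" \
        | status_mentions z_96@zen.com; then
    echo "95 and 96 are already clustered"
else
    echo "95 and 96 are not clustered; join through one of them only"
fi
```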

 Note: others have reported the same problem to RabbitMQ [link]

 

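   With the hands-on experiments above we can now create and manage a RabbitMQ cluster. But should a node be a RAM node or a disc node, and how do we choose? That is for next time.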

 

   Appendix: RabbitMQ clustering documentation: http://www.rabbitmq.com/clustering.html#auto-config

 

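Also, tests often run several instances on a single machine, for which the following commands from the documentation come in handy: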

A cluster on a single machine

Under some circumstances it can be useful to run a cluster of RabbitMQ nodes on a single machine. This would typically be useful for experimenting with clustering on a desktop or laptop without the overhead of starting several virtual machines for the cluster. The two main requirements for running more than one node on a single machine are that each node should have a unique name and bind to a unique port / IP address combination for each protocol in use.

You can start multiple nodes on the same host manually by repeated invocation of rabbitmq-server (rabbitmq-server.bat on Windows). You must ensure that for each invocation you set the environment variables RABBITMQ_NODENAME and RABBITMQ_NODE_PORT to suitable values.

For example:

$ RABBITMQ_NODE_PORT=5672 RABBITMQ_NODENAME=rabbit rabbitmq-server -detached
$ RABBITMQ_NODE_PORT=5673 RABBITMQ_NODENAME=hare rabbitmq-server -detached
$ rabbitmqctl -n hare stop_app
$ rabbitmqctl -n hare join_cluster rabbit@`hostname -s`
$ rabbitmqctl -n hare start_app

will set up a two node cluster with one disc node and one ram node. Note that if you have RabbitMQ opening any ports other than AMQP, you'll need to configure those not to clash as well - for example:

$ RABBITMQ_NODE_PORT=5672 RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,15672}]" RABBITMQ_NODENAME=rabbit rabbitmq-server -detached
$ RABBITMQ_NODE_PORT=5673 RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,15673}]" RABBITMQ_NODENAME=hare rabbitmq-server -detached

will start two nodes (which can then be clustered) when the management plugin is installed.

 

 

Finally, a small picture: Maggie Q, MQ for short. She and I share the surname Li. Nikita!

 
