
Cassandra seed nodes and clients connecting to nodes

Updated: 2022-10-25 09:59:22

I'm a little confused about Cassandra seed nodes and how clients are meant to connect to the cluster. I can't seem to find this bit of information in the documentation.

Do clients only hold a list of the seed nodes, with each node delegating a new host for the client to connect to? Or are seed nodes really just for node-to-node discovery, rather than being special nodes for clients?

Should each client use a small sample of random nodes in the DC to connect to?

Or, should each client use all the nodes in the DC?

Answering my own question:

Seeds

From the Cassandra FAQ (http://wiki.apache.org/cassandra/FAQ#seed):

Seeds are used during startup to discover the cluster.

Also from the DataStax documentation on "Gossip":

The seed node designation has no purpose other than bootstrapping the gossip process for new nodes joining the cluster. Seed nodes are not a single point of failure, nor do they have any other special purpose in cluster operations beyond the bootstrapping of nodes.

From these details it seems that a seed is nothing special to clients.
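For reference, the seed list is plain server-side configuration: every node names the same few seeds in its cassandra.yaml, and clients never see it. A minimal excerpt, with placeholder addresses:

    # cassandra.yaml (excerpt) -- addresses below are placeholders
    seed_provider:
      - class_name: org.apache.cassandra.locator.SimpleSeedProvider
        parameters:
          # comma-separated list; a couple of seeds per datacenter is typical
          - seeds: "192.168.1.10,192.168.1.11"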

Clients

From the DataStax documentation on client requests:

All nodes in Cassandra are peers. A client read or write request can go to any node in the cluster. When a client connects to a node and issues a read or write request, that node serves as the coordinator for that particular client operation.

The job of the coordinator is to act as a proxy between the client application and the nodes (or replicas) that own the data being requested. The coordinator determines which nodes in the ring should get the request based on the cluster configured partitioner and replica placement strategy.
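One way to observe this: the DataStax Java driver reports which node coordinated each request. A small sketch, assuming driver 4.x and a node reachable at the default localhost:9042 (the class name and query here are illustrative):

    import com.datastax.oss.driver.api.core.CqlSession;
    import com.datastax.oss.driver.api.core.cql.ResultSet;

    public class CoordinatorDemo {
        public static void main(String[] args) {
            // With no explicit contact points, the driver tries localhost:9042.
            try (CqlSession session = CqlSession.builder().build()) {
                ResultSet rs = session.execute(
                        "SELECT release_version FROM system.local");
                // The node that served this request acted as its coordinator;
                // repeated requests get coordinated by different nodes.
                System.out.println("Coordinator: "
                        + rs.getExecutionInfo().getCoordinator().getEndPoint());
            }
        }
    }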

I gather that the pool of nodes that a client connects to can just be a handful of (random?) nodes in the DC to allow for potential failures.
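That is also how the drivers are built to work: contact points are used only for the very first connection, after which the driver discovers the rest of the cluster on its own, so a handful per DC is plenty. A minimal sketch with the DataStax Java driver 4.x (addresses and datacenter name are placeholders):

    import com.datastax.oss.driver.api.core.CqlSession;
    import java.net.InetSocketAddress;

    public class ConnectDemo {
        public static void main(String[] args) {
            // Two or three contact points guard against one being down;
            // they do not need to be seed nodes.
            try (CqlSession session = CqlSession.builder()
                    .addContactPoint(new InetSocketAddress("192.168.1.10", 9042))
                    .addContactPoint(new InetSocketAddress("192.168.1.11", 9042))
                    // required by driver 4.x when contact points are explicit
                    .withLocalDatacenter("dc1")
                    .build()) {
                System.out.println(session.execute(
                        "SELECT release_version FROM system.local").one().getString(0));
            }
        }
    }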