且构网


CouchDB - Filtered Replication - Can the speed be improved?

Updated: 2023-12-01 15:30:34

Filtered replication is slow because, for every fetched document, CouchDB has to run fairly involved logic to decide whether to replicate it:


  1. CouchDB fetches the next document;
  2. Because a filter function has to be applied, the document is converted to JSON;
  3. The JSON-encoded document is passed over stdio to the query server;
  4. The query server receives the document and decodes it from JSON;
  5. The query server looks up and runs your filter function, which returns true or false to CouchDB;
  6. If the result is true, the document is replicated;
  7. Go back to step 1 and repeat for every document.
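
The loop above can be modelled roughly as follows. This is only a sketch in plain JavaScript: `postsOnly`, `filterViaQueryServer` and `selectForReplication` are made-up names, the `type` field is an assumed document attribute, and the real query server exchanges JSON with CouchDB over stdio rather than through a function call.

```javascript
// Hypothetical filter: replicate only documents whose type is "post".
// CouchDB filter functions take (doc, req) and return true or false.
function postsOnly(doc, req) {
  return doc.type === "post";
}

// Rough model of steps 2-5 for a single document: encode it to JSON,
// decode it on the "query server" side, then run the filter on the copy.
function filterViaQueryServer(filterFn, doc, req) {
  const wire = JSON.stringify(doc);        // step 2: document becomes JSON text
  const decoded = JSON.parse(wire);        // step 4: query server decodes it
  return filterFn(decoded, req) === true;  // step 5: true/false goes back
}

// Steps 1, 6 and 7: walk all documents and keep the ones that pass.
function selectForReplication(docs, filterFn) {
  const req = { query: {} };
  return docs.filter((doc) => filterViaQueryServer(filterFn, doc, req));
}

const docs = [
  { _id: "a", type: "post" },
  { _id: "b", type: "comment" },
  { _id: "c", type: "post" },
];
const replicatedIds = selectForReplication(docs, postsOnly).map((d) => d._id);
console.log(replicatedIds); // [ 'a', 'c' ]
```

The encode/decode pair runs once per document regardless of what the filter itself does, which is why the cost grows with the number of documents rather than with the complexity of the filter.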

For a non-filtered replication, take the same list, drop steps 2-5, and treat step 6 as always returning true. That extra per-document work is what slows the whole replication process down.

To speed up filtered replication significantly, you can write the filters in Erlang and run them on the native Erlang query server. They run inside CouchDB itself, so nothing passes through a stdio interface and there is no JSON encode/decode overhead.

Note that the Erlang query server does not run in a sandbox the way the JavaScript one does, so you need to really trust any code you run with it.
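
As an illustration, a design document carrying an Erlang filter might look like the sketch below, built here as a JavaScript object ready to be sent to CouchDB as JSON. The names `_design/app_erl`, `posts_only`, `source_db` and `target_db` are hypothetical, the `type` field is an assumed document attribute, and the native Erlang query server has to be enabled in the server configuration first (the exact setting depends on the CouchDB version).

```javascript
// Erlang source of the filter, stored as a string inside the design document.
// The document arrives as an Erlang term ({Doc} wraps a property list),
// so no JSON text is produced or parsed along the way.
const erlangFilterSource = [
  "fun({Doc}, _Req) ->",
  "  case proplists:get_value(<<\"type\">>, Doc) of",
  "    <<\"post\">> -> true;",
  "    _ -> false",
  "  end",
  "end.",
].join("\n");

// Hypothetical design document; "language" tells CouchDB which query
// server should run the functions it contains.
const designDoc = {
  _id: "_design/app_erl",
  language: "erlang",
  filters: { posts_only: erlangFilterSource },
};

// Replication request body that refers to the filter as "ddoc/filtername".
const replicationRequest = {
  source: "source_db",
  target: "target_db",
  filter: "app_erl/posts_only",
};

console.log(JSON.stringify(designDoc, null, 2));
console.log(JSON.stringify(replicationRequest, null, 2));
```

Keeping the Erlang filters in their own design document is a deliberate choice in this sketch: a design document has a single `language`, so existing JavaScript functions would otherwise have to be rewritten too.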

Another option is to optimize the filter function itself, e.g. reduce object creation and method calls, but in practice you will not gain much from this.
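
To make that kind of micro-optimization concrete, here is a hedged before/after sketch. Both functions are hypothetical and assume documents carry a `type` field. They accept exactly the same documents, and the leaner one only avoids an array allocation and a method call per document, so the serialization overhead described earlier is unchanged.

```javascript
// Before: builds a new array and calls indexOf on every invocation.
function filterNaive(doc, req) {
  var allowed = ["post", "page"];
  return allowed.indexOf(doc.type) !== -1;
}

// After: same decision with plain comparisons and no allocation.
function filterLean(doc, req) {
  var t = doc.type;
  return t === "post" || t === "page";
}

// Quick check that both versions agree on a few sample documents.
var samples = [{ type: "post" }, { type: "page" }, { type: "comment" }, {}];
var agree = samples.every(function (doc) {
  return filterNaive(doc, {}) === filterLean(doc, {});
});
console.log(agree); // true
```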