Finding CouchDB documents that are missing an arbitrary field

Updated: 2023-02-26 17:56:00

This technique is called the Thai massage. Use it to efficiently find documents not in a view if (and only if) the view is keyed on the document id.

function(doc) {
    // _view/fields map, showing all fields of all docs
    // In principle you could emit e.g. "foo.bar.baz"
    // for nested objects. Obviously I do not.
    for (var field in doc)
        emit(field, doc._id);
}

function(keys, vals, is_rerun) {
    // _view/fields reduce; could also be the string "_count"
    // On a rereduce, vals holds partial counts, so add them up.
    return is_rerun ? sum(vals) : vals.length;
}
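To see what the view ends up holding, the map function can be run outside CouchDB with a stand-in for emit. This is only a local sketch: `runMap`, `fieldsMap` and the sample documents are made up for illustration, and inside CouchDB emit is a global rather than a parameter.

```javascript
// Local stand-in for a view build: collect rows the way CouchDB would,
// attaching the id of the document that emitted each row.
function runMap(mapFn, docs) {
  const rows = [];
  for (const doc of docs) {
    mapFn(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  return rows;
}

// Same logic as the map function above, with emit passed in explicitly.
function fieldsMap(doc, emit) {
  for (const field in doc) {
    emit(field, doc._id);
  }
}

// Hypothetical sample documents.
const sampleDocs = [
  { _id: "a", _rev: "1-x", name: "Ann", email: "ann@example.com" },
  { _id: "b", _rev: "1-y", name: "Bob" },
];

const rows = runMap(fieldsMap, sampleDocs);

// Equivalent of querying the view with key="email" and reduce=false.
const idsWithEmail = rows.filter((row) => row.key === "email").map((row) => row.id);
console.log(idsWithEmail); // only document "a" has an email field
```

Note that every document also emits rows for _id and _rev, since those are ordinary fields as far as the loop is concerned.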


To find the documents that do not have a given field:

  1. GET /db/_all_docs and remember all the ids
  2. GET /db/_design/ex/_view/fields?reduce=false&key="some_field"
  3. Compare the ids from _all_docs vs the ids from the query.

The ids in _all_docs but not in the view are those missing that field.
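
As a minimal sketch of that comparison, assuming both responses have already been fetched and parsed into objects with a rows array whose entries carry an id (the function name `missingIds` and the sample data are made up for illustration):

```javascript
// Return the ids that appear in the _all_docs response
// but not in the view response for the chosen field.
function missingIds(allDocsResponse, viewResponse) {
  const has = new Set(viewResponse.rows.map((row) => row.id));
  return allDocsResponse.rows
    .map((row) => row.id)
    .filter((id) => !has.has(id));
}

// Hypothetical parsed responses, trimmed to the parts used here.
const allDocs = { rows: [{ id: "a" }, { id: "b" }, { id: "c" }] };
const withField = {
  rows: [
    { id: "a", key: "some_field", value: "a" },
    { id: "c", key: "some_field", value: "c" },
  ],
};

const missing = missingIds(allDocs, withField);
console.log(missing); // ids of documents lacking "some_field"
```

This version holds the view's ids in a Set, so it is the simple in-memory form of the idea. Design documents are listed in _all_docs but are normally not passed to map functions, so they will likely show up as "missing" and may need to be filtered out.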

It sounds bad to keep the ids in memory, but you don't have to! You can use a merge strategy, as in the merge step of merge sort, iterating through both queries simultaneously. This relies on both result sets arriving in the same id order. You start with the first id of the "has" list (from the view) and the first id of the "full" list (from _all_docs).

  1. If full < has, it is missing the field, redo with the next full element
  2. If full = has, it has the field, redo with the next full element
  3. If full > has, redo with the next has element
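
The three rules above can be sketched as a generator over two id sequences. This assumes both sequences are sorted ascending in the same order and that plain JavaScript string comparison matches that order, which may not hold for every id scheme. The name `missingFromSorted` is made up for illustration. One case the rules leave implicit is handled explicitly: once the "has" list runs out, every remaining "full" id is missing the field.

```javascript
// Walk two sorted id sequences in step and yield each id from `full`
// that never appears in `has`. Only one id from each side is held at a time.
function* missingFromSorted(full, has) {
  const fullIt = full[Symbol.iterator]();
  const hasIt = has[Symbol.iterator]();
  let f = fullIt.next();
  let h = hasIt.next();
  while (!f.done) {
    if (h.done || f.value < h.value) {
      yield f.value;      // full < has (or has is exhausted): field is missing
      f = fullIt.next();
    } else if (f.value === h.value) {
      f = fullIt.next();  // full = has: the document has the field
    } else {
      h = hasIt.next();   // full > has: advance the view side
    }
  }
}

// Hypothetical sorted id lists standing in for the two query results.
const fullIds = ["a", "b", "c", "d"];
const hasIds = ["a", "c"];
const missingSorted = [...missingFromSorted(fullIds, hasIds)];
console.log(missingSorted); // ids "b" and "d" lack the field
```

Because it accepts any iterable, the same function could be fed rows as they are parsed from the two responses instead of arrays.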

Depending on your language, that might be difficult. But it is pretty easy in JavaScript, for example, or in other event-driven programming frameworks.