FieldCache with a frequently updated index

Updated: 2023-12-04 15:38:07

The FieldCache uses weak references to index readers as the keys for its cache. (It gets them by calling IndexReader.GetCacheKey, which has been un-obsoleted.) A standard call to IndexReader.Open with an FSDirectory will use a pool of readers, one for every segment.
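
To illustrate why weak keys matter here, the following is a minimal sketch of a cache keyed weakly by reader. It is not Lucene.Net's implementation: FakeSegmentReader, WeakFieldCacheSketch and the Loads counter are hypothetical names made up for this example, and the standard .NET ConditionalWeakTable stands in for whatever weak map the library uses internally. The point it shows is that values are computed once per reader instance and become collectable together with that reader.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for a per-segment reader; not a Lucene.Net type.
sealed class FakeSegmentReader
{
    public readonly int[] NewsDates;
    public FakeSegmentReader(int[] newsDates) { NewsDates = newsDates; }
}

static class WeakFieldCacheSketch
{
    // The table holds its keys weakly: once a reader is unreachable,
    // its cached array can be garbage collected along with it.
    static readonly ConditionalWeakTable<FakeSegmentReader, int[]> Cache =
        new ConditionalWeakTable<FakeSegmentReader, int[]>();

    // Counts how often values were actually loaded (cache misses).
    public static int Loads;

    public static int[] GetInts(FakeSegmentReader reader)
    {
        // The callback runs only when this reader has no entry yet.
        return Cache.GetValue(reader, r =>
        {
            Loads++;
            return (int[])r.NewsDates.Clone();
        });
    }
}
```

Calling GetInts twice with the same reader instance loads the values once; a different reader instance over the same data triggers a second load. That is why reusing the pooled per-segment readers keeps the cache warm across reopens.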

You should always pass the innermost reader to the FieldCache. Check out ReaderUtil for helpers that retrieve the individual reader a document is contained in. Document ids won't change within a segment; when they are described as unpredictable or volatile, what is meant is that they can change between two index commits. Deleted documents may have been pruned, segments may have been merged, and so on.

A commit has to remove the segment from disk (because it was merged or optimized away) before its reader can go away. New readers then no longer include the pooled segment reader, and the garbage collector will remove it as soon as all older readers are closed.

Never, ever, call FieldCache.PurgeAllCaches(). It's meant for testing, not production use.

Added 2011-04-03: example code using subreaders.

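using System.Collections.Generic;
using System.IO;
using System.Linq;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Lucene.Net.Util;

// Open a read-only reader over the index directory.
var directory = FSDirectory.Open(new DirectoryInfo("index"));
var reader = IndexReader.Open(directory, readOnly: true);
var documentId = 1337;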

// Grab all subreaders.
var subReaders = new List<IndexReader>();
ReaderUtil.GatherSubReaders(subReaders, reader);

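// Walk the subreaders in order. While subReaderId is higher than the
// highest document id in the subreader (MaxDoc() - 1), subtract that
// subreader's MaxDoc() and go to the next one.
var subReaderId = documentId;
var subReader = subReaders.First(sub => {
    // Ids are zero-based, so an id equal to MaxDoc() already belongs
    // to a later subreader; hence <= rather than <.
    if (sub.MaxDoc() <= subReaderId) {
        subReaderId -= sub.MaxDoc();
        return false;
    }

    return true;
});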

var values = FieldCache_Fields.DEFAULT.GetInts(subReader, "newsdate");
var value = values[subReaderId];
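
The subreader lookup is plain offset arithmetic, which can be checked without an index. The sketch below is a hedged, Lucene-free illustration: DocIdMath and Resolve are hypothetical names, and the list of per-segment document counts is assumed to be the MaxDoc() values of the subreaders in order.

```csharp
using System;
using System.Collections.Generic;

static class DocIdMath
{
    // Maps a top-level document id to (segment index, id local to that segment).
    // maxDocs is assumed to hold each subreader's MaxDoc() value, in order.
    public static KeyValuePair<int, int> Resolve(IList<int> maxDocs, int globalDocId)
    {
        if (globalDocId < 0) throw new ArgumentOutOfRangeException("globalDocId");

        int local = globalDocId;
        for (int i = 0; i < maxDocs.Count; i++)
        {
            // Local ids run from 0 to maxDocs[i] - 1.
            if (local < maxDocs[i]) return new KeyValuePair<int, int>(i, local);
            local -= maxDocs[i];
        }

        // The id lies beyond the last segment.
        throw new ArgumentOutOfRangeException("globalDocId");
    }
}
```

With segment sizes 1000, 500 and 200, document 1337 resolves to segment 1 with local id 337, document 999 stays in segment 0, and document 1000 is the first document (local id 0) of segment 1. The last case is the boundary where an off-by-one comparison would pick the wrong segment.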