且构网


What are good autowarming queries in Solr, and how do they work?

Updated: 2023-11-22 23:46:40

There are two types of warming: query cache warming and document cache warming. (There is also filter cache warming, but it works much like query cache warming.) Query cache warming can be done through a setting (autowarmCount in solrconfig.xml) that, when the index is reloaded, re-runs the X most recent queries issued before the reload. Document cache warming is different.
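To make the "re-run the X most recent queries" idea concrete, here is a minimal sketch in Python. It is a toy model, not Solr's actual implementation: the `QueryCache` class, the `autowarm` function, and the idea of a searcher as a plain callable are all assumptions made for illustration.

```python
from collections import OrderedDict


class QueryCache:
    """Toy LRU cache keyed by query string (hypothetical, not Solr code)."""

    def __init__(self, size):
        self.size = size
        self.entries = OrderedDict()  # oldest first, most recent last

    def get(self, query, searcher):
        if query in self.entries:
            self.entries.move_to_end(query)  # mark as most recently used
            return self.entries[query]
        result = searcher(query)  # cache miss: run the query
        self.entries[query] = result
        if len(self.entries) > self.size:
            self.entries.popitem(last=False)  # evict the least recently used
        return result


def autowarm(old_cache, new_searcher, autowarm_count):
    """Fill a fresh cache by re-running the most recent queries on the new searcher."""
    new_cache = QueryCache(old_cache.size)
    if autowarm_count <= 0:
        return new_cache  # warming disabled
    recent = list(old_cache.entries)[-autowarm_count:]
    for query in recent:
        new_cache.get(query, new_searcher)
    return new_cache
```

The point of the model is that warming re-executes the queries against the new index view rather than copying old results, so the warmed entries reflect the reloaded index.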

The goal of document cache warming is to get a large number of your most frequently accessed documents into the document cache so they don't have to be read from disk. Your warming queries should focus on that: figure out which documents are returned most often and load those, preferably with a minimal number of queries. This has nothing to do with the actual content of the fields. To clarify, when warming the document cache your primary interest is the documents that turn up in search RESULTS most often, regardless of how they are queried.
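One way to find the documents that turn up in results most often is to count them from a log of result pages. The sketch below assumes you have already collected such a log yourself (Solr does not record returned document IDs by default); `result_log` and `most_returned_docs` are hypothetical names.

```python
from collections import Counter


def most_returned_docs(result_log, top_n):
    """Return the IDs that appear in the most result pages, most frequent first.

    result_log is assumed to be an iterable of result pages, each a list of
    document IDs that one search returned.
    """
    counts = Counter()
    for doc_ids in result_log:
        counts.update(set(doc_ids))  # count each document once per result page
    return [doc_id for doc_id, _ in counts.most_common(top_n)]
```

For example, with the pages `["d1", "d2"]`, `["d1", "d3"]`, and `["d1", "d2"]`, asking for the top two gives `d1` followed by `d2`. Whatever the top documents have in common (country, year, genre) is a candidate for a warming query.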

Personally, I'd run searches for things like:

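  • Load by country, if most of your searches are for US films.
  • Load by year, if most of your searches are for newer films.
  • Load by genre, if you have a short list of heavily searched genres.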
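The three kinds of searches above could be generated along the following lines. This is only a sketch: the field names `country`, `year`, and `genre`, the default `rows` value, and the `warming_queries` helper are assumptions, since the actual schema is not shown here. The resulting query strings could be sent to the select handler after a reload, or the same parameters could be listed in a newSearcher listener in solrconfig.xml.

```python
from urllib.parse import urlencode


def warming_queries(country=None, min_year=None, genres=(), rows=1000):
    """Build URL-encoded Solr query strings for document cache warming.

    Field names are assumed; adjust them to the real schema.
    """
    queries = []
    if country:
        queries.append({"q": f'country:"{country}"', "rows": rows})
    if min_year is not None:
        # Open-ended range: everything from min_year onward.
        queries.append({"q": f"year:[{min_year} TO *]", "rows": rows})
    for genre in genres:
        queries.append({"q": f'genre:"{genre}"', "rows": rows})
    return [urlencode(params) for params in queries]
```

The `rows` value matters more than the query text here, because only the documents actually returned get loaded. A large `rows` with a few broad queries fits the "minimal number of queries" advice.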

A last possibility is to load them all. Your documents look small, and 70,000 of them is nothing in terms of server memory nowadays. If your document cache is large enough and you have enough memory available, go for it. As a side note, some of your biggest gains will come from the document cache. A query cache only helps with repeated queries, and the share of repeated queries can be disappointingly low. You almost always benefit from a large document cache.
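A quick back-of-the-envelope check shows why loading everything is plausible. The 2 KiB average document size below is purely an assumed figure for illustration, and real JVM memory use will be higher because of object overhead.

```python
def document_cache_estimate_mib(doc_count, avg_doc_bytes):
    """Rough size of the stored fields for doc_count documents, in MiB."""
    return doc_count * avg_doc_bytes / (1024 * 1024)


# 70,000 documents at an assumed 2 KiB each
estimate = document_cache_estimate_mib(70_000, 2048)
print(f"{estimate:.1f} MiB")  # prints "136.7 MiB"
```

Even if the real average is several times larger, the total stays well within what a typical server heap can hold.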