且构网

Lucene - Querying multiple terms in a field

Updated: 2022-01-19 05:35:24

Depending on the analyser you use for the field (it would need to tokenise and remove the punctuation), you could use a sloppy phrase query.

"manchester paris"~2 should find just 12345. Depending on the number and order of values in each field, you may need to use a larger slop.

The slop defines the number of "operations" on the phrase that are allowed for it to still match. An operation can be a reordering of terms or an additional term within the phrase.

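So "x y"~1 could match

  • "x fred y" (one additional term: one operation)
  • but not "y x" (Lucene counts a swap of two adjacent terms as two operations, so it needs a slop of at least 2)
  • and not "y fred x" (a swap plus an additional term: three operations)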
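
To check examples like these by hand, the slop a phrase needs can be estimated with a small model. The following is only a sketch of that idea and not Lucene's own code: SlopSketch, requiredSlop and words are hypothetical names, and it assumes every term occurs at most once in the field. It follows the rule in Lucene's PhraseQuery documentation that switching the order of two words costs two moves.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified model of sloppy phrase matching (hypothetical helper, not Lucene API). */
final class SlopSketch {

    /**
     * Smallest slop at which the phrase would match the field under this model,
     * or -1 if a phrase term is missing from the field.
     * Assumption: every term occurs at most once in the field.
     */
    static int requiredSlop(List<String> phrase, List<String> field) {
        if (phrase.isEmpty()) {
            return 0; // nothing to match, nothing to move
        }
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int offset = 0; offset < phrase.size(); offset++) {
            int position = field.indexOf(phrase.get(offset));
            if (position < 0) {
                return -1; // term absent: no slop can help
            }
            // How far the term sits from where an exact phrase match would put it.
            int shift = position - offset;
            min = Math.min(min, shift);
            max = Math.max(max, shift);
        }
        // The spread of the shifts is the number of moves needed.
        return max - min;
    }

    /** Splits a space-separated string into terms (stands in for a real analyser). */
    static List<String> words(String text) {
        return Arrays.asList(text.split(" "));
    }

    public static void main(String[] args) {
        String[] fields = {"x y", "x fred y", "y x", "y fred x"};
        for (String field : fields) {
            System.out.println("\"x y\" against \"" + field + "\": slop "
                    + requiredSlop(words("x y"), words(field)));
        }
    }
}
```

Under this model "x fred y" needs a slop of 1, "y x" needs 2 and "y fred x" needs 3. Repeated terms and multi-valued fields with position gaps are deliberately left out, so treat the numbers as a guide rather than a guarantee.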

For your needs, the slop probably ought to be equal to the maximum number of terms allowed in a field. I haven't worked it through, but I think that would suffice even if you query for more than two terms.