Updated: 2022-11-14 09:06:16
It sounds like you want to use ShingleFilter in your analysis so that you index word bigrams. To do that, add ShingleFilterFactory at both index time and query time.
At index time, your documents are then indexed as follows:
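As a rough illustration of what word-bigram shingling produces, here is a minimal Python sketch. It is not Lucene's ShingleFilter, only a simplified model that assumes a maximum shingle size of 2 with unigrams emitted as well. The function name `shingle`, its parameters, and the sample sentence are made up for this example.

```python
def shingle(tokens, max_shingle_size=2, output_unigrams=True):
    """Return (term, position) pairs, roughly as a word-shingle filter would emit them.

    Simplified model (an assumption, not Lucene code): a shingle shares the
    position of the first word it starts from.
    """
    out = []
    for pos, tok in enumerate(tokens):
        if output_unigrams:
            out.append((tok, pos))
        # Build shingles of 2..max_shingle_size words starting at this token.
        for size in range(2, max_shingle_size + 1):
            if pos + size <= len(tokens):
                out.append((" ".join(tokens[pos:pos + size]), pos))
    return out


# Hypothetical document text, split on whitespace for simplicity.
indexed = shingle("please divide this sentence".split())
for term, pos in indexed:
    print(pos, term)
```

Each bigram is indexed as a single term, so a later query for that exact bigram is an ordinary term lookup rather than a positional match.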
At query time, your query becomes:
This is still not enough: by default, the query parser will form a phrase query from these tokens. So, in your query analyzer only, add PositionFilterFactory after the ShingleFilterFactory. This "flattens" the positions in the query so that the query parser treats the output as synonyms, which yields a BooleanQuery with these subqueries (all SHOULD clauses, so it is basically an OR query):
BooleanQuery:
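The effect of flattening positions can be sketched in Python as follows. This is a simplified model of the behavior described above, not Lucene's query parser: `flatten_positions`, `describe_query`, and the sample tokens are hypothetical names and data chosen for illustration. The model assumes that tokens spread over several positions lead to a phrase-style query, while tokens that all share one position are treated like synonyms and OR-ed together.

```python
def flatten_positions(tokens):
    """Move every (term, position) pair to the position of the first token."""
    if not tokens:
        return []
    first = tokens[0][1]
    return [(term, first) for term, _ in tokens]


def describe_query(tokens):
    """Hypothetical stand-in for the query parser's decision.

    More than one distinct position -> phrase-style query.
    A single shared position        -> boolean query of SHOULD term clauses.
    """
    positions = sorted({pos for _, pos in tokens})
    if len(positions) > 1:
        return ("phrase", positions)
    return ("boolean_should", [term for term, _ in tokens])


# Shingled form of a made-up two-word query "foo bar".
query_tokens = [("foo", 0), ("foo bar", 0), ("bar", 1)]

print(describe_query(query_tokens))
print(describe_query(flatten_positions(query_tokens)))
```

In this model the unflattened tokens occupy two positions and so produce a phrase-style query, whereas the flattened tokens collapse into one set of optional term clauses.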
This should be the most performant way, because the result is really just a BooleanQuery of TermQuery clauses.