且构网

Filter Elasticsearch results to contain only unique documents based on one field value

Updated: 2022-12-20 11:53:25

All my documents have a uid field with an ID that links the document to a user. There are multiple documents with the same uid.

I want to search over all the documents and return only the highest-scoring document for each unique uid.

The query selecting the relevant documents is a simple multi_match query.

You need a top_hits aggregation.

And for your specific case:

{
  "query": {
    "multi_match": {
      ...
    }
  },
  "aggs": {
    "top-uids": {
      "terms": {
        "field": "uid"
      },
      "aggs": {
        "top_uids_hits": {
          "top_hits": {
            "sort": [
              {
                "_score": {
                  "order": "desc"
                }
              }
            ],
            "size": 1
          }
        }
      }
    }
  }
}

The query above runs your multi_match query and aggregates the matching documents by uid. For each uid bucket, it returns only one result: the top document after all the documents in that bucket have been sorted by _score in descending order.
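
As a rough illustration of how the response to this request could be consumed, here is a minimal Python sketch. It only builds the request body and walks the response as plain dictionaries, so it does not depend on any client library. The helper names `build_search_body` and `best_hit_per_uid` and the `max_uids` parameter are hypothetical and not part of the original answer. The `size` setting on the terms aggregation is an added assumption: by default a terms aggregation returns only 10 buckets, so you may want to raise it if you expect more distinct uid values. The multi_match clause itself is left to the caller, since the original elides it.

```python
from typing import Any, Dict


def build_search_body(multi_match: Dict[str, Any], max_uids: int = 10) -> Dict[str, Any]:
    """Return the request body from the answer, using the caller's multi_match clause."""
    return {
        "query": {"multi_match": multi_match},
        "aggs": {
            "top-uids": {
                # Assumption: "size" caps how many uid buckets come back
                # (a terms aggregation returns 10 buckets by default).
                "terms": {"field": "uid", "size": max_uids},
                "aggs": {
                    "top_uids_hits": {
                        "top_hits": {
                            "sort": [{"_score": {"order": "desc"}}],
                            "size": 1,
                        }
                    }
                },
            }
        },
    }


def best_hit_per_uid(response: Dict[str, Any]) -> Dict[Any, Dict[str, Any]]:
    """Map each uid bucket key to its single highest-scoring hit."""
    buckets = response["aggregations"]["top-uids"]["buckets"]
    best: Dict[Any, Dict[str, Any]] = {}
    for bucket in buckets:
        hits = bucket["top_uids_hits"]["hits"]["hits"]
        if hits:  # skip a bucket that carries no hits
            best[bucket["key"]] = hits[0]
    return best
```

The deduplicated documents live under the aggregation, not in the top-level hits list, which is why the second helper reads from `aggregations` rather than `hits`.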