Elasticsearch reverse match_phrase

Updated: 2022-06-22 22:48:17

I know Stefan has given a simple and efficient solution in the comments, but you may also want to look at span queries, for reference.

I've created a sample mapping, documents, a query, and the response:

PUT my_span_index
{
  "mappings": {
    "properties": {
      "Title":{
        "type": "text"
      }
    }
  }
}



Sample documents:

POST my_span_index/_doc/1
{
  "Title": "Western Europe"
}

POST my_span_index/_doc/2
{
  "Title": "Eastern Europe"
}

# slop: 13 other words sit between "Western" and "Europe" in this document
POST my_span_index/_doc/3
{
  "Title": "As far as Western culture is America, we see gradually more and more of the same in Europe"
}



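Span Query:

POST my_span_index/_search
{
    "query": {
        "span_near" : {
            "clauses" : [
                { "span_term" : { "Title": "western" } },
                { "span_term" : { "Title": "europe" } }
            ],
            "slop" : 13,
            "in_order" : true
        }
    }
}

Note that I used the Span Near and Span Term queries. Two parameters matter here: slop is the maximum number of intervening words allowed between the terms (13, so that document 3 still matches), and in_order should be true only if the terms must appear in the order listed. The span_term values are lowercase because span_term is not analyzed, while the text field indexes its tokens in lowercase.

Response: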

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.5420371,
    "hits" : [
      {
        "_index" : "my_span_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.5420371,
        "_source" : {
          "Title" : "Western Europe"
        }
      },
      {
        "_index" : "my_span_index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.028773852,
        "_source" : {
          "Title" : "As far as Western culture is America, we see gradually more and more of the same in Europe"
        }
      }
    ]
  }
}

Note that the document with id 3 is returned as well, because the gap between its two terms fits within the slop. With a smaller slop it would no longer match.
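To pick a slop value without counting words by hand, the gap can be computed from the text. This is only a rough sketch: tokenize and required_slop are hypothetical helper names, and the regex split merely approximates what the default analyzer does for plain English text like the samples above.

```python
import re


def tokenize(text):
    # Assumption: lowercasing and splitting on non-word characters is close
    # enough to the default analyzer for simple English titles.
    return re.findall(r"\w+", text.lower())


def required_slop(text, first, second):
    """Return the smallest number of words between `first` and a later
    `second`, or None if the two never occur in that order."""
    best = None
    last_first = None  # position of the most recent occurrence of `first`
    for pos, token in enumerate(tokenize(text)):
        if token == second and last_first is not None:
            gap = pos - last_first - 1
            if best is None or gap < best:
                best = gap
        if token == first:
            last_first = pos
    return best


doc3 = ("As far as Western culture is America, we see gradually "
        "more and more of the same in Europe")
print(required_slop(doc3, "western", "europe"))  # 13
print(required_slop("Western Europe", "western", "europe"))  # 0
```

A result of None means the terms never appear in that order, so an in_order query would not match at any slop.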

The downside is that if your request has more tokens, you end up writing or generating a long query on the application side, with one span_term clause per token.
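Generating that query in the application can be fairly short. The following is a minimal sketch, not a definitive implementation: build_span_near is a hypothetical helper, and it assumes that lowercasing and splitting on whitespace yields the same tokens that were indexed, which holds for the simple titles above but not for every analyzer.

```python
import json


def build_span_near(field, phrase, slop=0, in_order=True):
    """Build a span_near search body with one span_term clause per token."""
    # Assumption: span_term is not analyzed, so tokens are lowercased here
    # to match what a default text field indexes.
    terms = phrase.lower().split()
    if not terms:
        raise ValueError("phrase must contain at least one token")
    clauses = [{"span_term": {field: term}} for term in terms]
    if len(clauses) == 1:
        # A single token needs no proximity constraint.
        return {"query": clauses[0]}
    return {
        "query": {
            "span_near": {
                "clauses": clauses,
                "slop": slop,
                "in_order": in_order,
            }
        }
    }


body = build_span_near("Title", "Western Europe", slop=13)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the body of POST my_span_index/_search with whatever HTTP or Elasticsearch client the application already uses.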

Hope this helps!