
Custom type for partial word search in Elasticsearch



I am on a tight schedule for a demo with Elasticsearch.

I have set up my database, used the JDBC input with Logstash, and started indexing database tables. It's all working fine to some extent, but partial word search is not working. A quick search of the Elasticsearch documentation suggested nGram filters and stemmers. Since I didn't really get stemmers, I followed the path of nGram filters.

Following is my index mapping:

{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "translation_index_analyzer" : {
          "type" : "custom",
          "tokenizer" : "standard",
          "filter" : ["standard", "lowercase", "translation"]
        },
        "translation_search_analyzer" : {
          "type" : "custom",
          "tokenizer" : "standard",
          "filter" : ["standard", "lowercase"]
        }
      },
      "filter" : {
        "translation" : {
          "type" : "nGram",
          "min_gram" : 3,
          "max_gram" : 12
        }
      }
    }
  },
  "mappings" : {
    "type" : {
      "properties" : {
        "myType" : {
          "type" : "string",
          "index_analyzer" : "translation_index_analyzer",
          "search_analyzer" : "translation_search_analyzer"
        }
      }
    },
    "employee" : {
      "properties" : {
        "birth_date" : {
          "type" : "date",
          "format" : "yyyy-MM-dd HH:mm:ss ZZ"
        },
        "emp_no" : {
          "type" : "long",
          "index" : "not_analyzed"
        },
        "first_name" : {
          "type" : "myType"
        },
        "gender" : {
          "type" : "string",
          "index" : "not_analyzed"
        },
        "hire_date" : {
          "type" : "date",
          "format" : "yyyy-MM-dd HH:mm:ss ZZ"
        },
        "last_name" : {
          "type" : "string"
        }
      }
    }
  }
}

I rather guessed the type mapping, as I desperately need something like that. Not sure whether that's correct.

When I POST this, I get the following error:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "analyzer on field [myType] must be set when search_analyzer is set"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [type]: analyzer on field [myType] must be set when search_analyzer is set",
    "caused_by": {
      "type": "mapper_parsing_exception",
      "reason": "analyzer on field [myType] must be set when search_analyzer is set"
    }
  },
  "status": 400
}

Any clue where I have gone wrong?

UPDATE: My Elasticsearch version is 2.2.
Changing index_analyzer to analyzer gave a different error:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "No handler for type [myType] declared on field [first_name]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [employee]: No handler for type [myType] declared on field [first_name]",
    "caused_by": {
      "type": "mapper_parsing_exception",
      "reason": "No handler for type [myType] declared on field [first_name]"
    }
  },
  "status": 400
}

FINAL SOLUTION: Well, my issue was with defining a custom type, which I will ignore for the moment, but partial word search works with the following setup:

{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "my_ngram_analyzer" : {
          "tokenizer" : "my_ngram_tokenizer"
        }
      },
      "tokenizer" : {
        "my_ngram_tokenizer" : {
          "type" : "nGram",
          "min_gram" : "3",
          "max_gram" : "12",
          "token_chars" : [ "letter", "digit" ]
        }
      }
    }
  },
  "mappings" : {
    "employee" : {
      "properties" : {
        "birth_date" : {
          "type" : "date",
          "format" : "yyyy-MM-dd HH:mm:ss ZZ"
        },
        "emp_no" : {
          "type" : "long",
          "index" : "not_analyzed"
        },
        "first_name" : {
          "type" : "string",
          "analyzer" : "my_ngram_analyzer",
          "search_analyzer" : "my_ngram_analyzer"
        },
        "gender" : {
          "type" : "string",
          "index" : "not_analyzed"
        },
        "hire_date" : {
          "type" : "date",
          "format" : "yyyy-MM-dd HH:mm:ss ZZ"
        },
        "last_name" : {
          "type" : "string",
          "analyzer" : "my_ngram_analyzer",
          "search_analyzer" : "my_ngram_analyzer"
        }
      }
    }
  }
}
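
As a quick sanity check that partial matching works with this setup, requests along these lines can be used (the index name employees and the sample document are illustrative assumptions, not part of the original data):

# Create the index with the settings and mappings above (saved as mapping.json)
curl -XPUT 'localhost:9200/employees' -d @mapping.json

# Index a sample employee (refresh so it is searchable immediately)
curl -XPUT 'localhost:9200/employees/employee/1?refresh=true' -d '{
  "first_name" : "Georgi",
  "last_name" : "Facello",
  "gender" : "M"
}'

# A partial term such as "geor" now matches, because the nGram tokenizer
# indexed the 3- to 12-character grams of "georgi"
curl -XGET 'localhost:9200/employees/employee/_search?pretty' -d '{
  "query" : { "match" : { "first_name" : "geor" } }
}'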

My guess: change index_analyzer to analyzer.

  • It depends on your Elasticsearch version.

EDIT

I'm not familiar with creating custom types, but you can initialize first_name with the relevant analyzers:

"first_name": {
  "type" : "string",
  "analyzer" : "translation_index_analyzer",
  "search_analyzer" : "translation_search_analyzer"
},
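
To see what each analyzer actually produces, the _analyze API can be used to inspect the tokens. This is only a sketch, assuming Elasticsearch 2.x (where _analyze still accepts query-string parameters) and an index named my_index created with the settings from the question:

# Index-time analyzer: emits every 3- to 12-character gram of "georgi"
curl -XGET 'localhost:9200/my_index/_analyze?analyzer=translation_index_analyzer&text=Georgi&pretty'

# Search-time analyzer: standard tokenizer + lowercase only, no ngrams -> "geor"
curl -XGET 'localhost:9200/my_index/_analyze?analyzer=translation_search_analyzer&text=Geor&pretty'

Keeping the search analyzer free of ngrams means the query term is matched as typed; applying the ngram analyzer at search time as well (as in the final solution above) also works, but tends to match more loosely, since any shared 3-gram is enough to produce a hit.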
