且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

SQL Server全文搜索包含连字符的短语不会返回预期的结果

更新时间:2023-02-07 07:38:35

http://support.microsoft.com/default.aspx?scid=kb;en-us;200043



在搜索标准中必须使用非字母数字字符(主要是短划线' - '字符)时,请使用Transact-SQL LIKE子句而不是FULLTEXT或CONTAINS谓词。

We have an application that using a SQL Server 2008 database, and full-text search. I'm trying to understand why the following searches behave differently:

First, a phrase containing a hyphenated word, like this:

contains(column_name, '"one two-three-four five"')

And second, an identical phrase, where the hyphens are replaced by spaces:

contains(column_name, '"one two three four five"')

The full-text index uses the ENGLISH (1033) locale, and the default system stoplist.

From my observations of other full-text searches containing hyphenated words, the first one should allow for matches on either one two three four five or one twothreefour five. Instead, it only matches one twothreefour five (and not one two-three-four five).


Test Case

Setup:

create table ftTest 
(
    Id int identity(1,1) not null, 
    Value nvarchar(100) not null, 
    constraint PK_ftTest primary key (Id)
);

insert ftTest (Value) values ('one two-three-four five');
insert ftTest (Value) values ('one twothreefour five');

create fulltext catalog ftTest_catalog;
create fulltext index on ftTest (Value language 1033)
    key index PK_ftTest on ftTest_catalog;
GO

Queries:

--returns one match
select * from ftTest where contains(Value, '"one two-three-four five"')

--returns two matches
select * from ftTest where contains(Value, '"one two three four five"')
select * from ftTest where contains(Value, 'one and "two-three-four five"')
select * from ftTest where contains(Value, '"one two-three-four" and five')
GO

Cleanup:

drop fulltext index on ftTest
drop fulltext catalog ftTest_catalog;
drop table ftTest;

http://support.microsoft.com/default.aspx?scid=kb;en-us;200043

"Where non-alphanumeric character must be used in the search critera (primarily the dash '-' character), use the Transact-SQL LIKE clause instead of the FULLTEXT or CONTAINS predicates."