且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

如何在 SQL Server 中替换正则表达式 HTML 标签?

更新时间:2023-02-25 16:28:25

确实,T-SQL 本身并不支持正则表达式,在这种问题中,正则表达式将成为首选工具.首先,我会说解决方案的复杂程度在很大程度上取决于您的数据的一致性.例如,假设我们搜索具有以下标题的项目:

Indeed T-SQL does not natively support regular expressions and this is the sort of problem in which regular expressions would be the tool of choice. First, I'll say that the level of complication in the solution depends greatly on how consistent your data is. For example, suppose we search for items with the heading:

Select ..
From ...
Where HtmlContent Like '<span class="heading-2">%'

这假定 spanclass 之间没有额外的间距,并且在结束括号之前的最后一个双引号之后没有额外的间距.我们可以编写 '%<span%class="heading-2"%>%' 来说明空格,但也会发现 div 标记为 heading-2 在与任何 span 标签相同的内容中.如果后面的场景不应该发生,但你可能有不同的空间,那么使用这个修改后的模式.我们真正会遇到麻烦的是结束标签.假设我们的内容如下所示:

This assumes no additional spacing between span and class as well as no additional spacing after the final double quote before the end bracket. We could write '%<span%class="heading-2"%>%' to account for the spaces but that would also find div tags marked as heading-2 in the same content as any span tag. If this later scenario shouldn't happen but you might have varying spaces, then use this revised pattern. Where we will really run into troubles is the closing tag. Suppose our content looks like so:

<span class="heading-2"> Foo <span class="heading-3">Bar</span> And Gamma Too</span> .... <span class="heading-4">Fubar Is the right way!</span>...

找到正确的结束 span 标记以更改为 </h2> 并不是那么简单.您不能简单地找到第一个 并将其更改为 .如果你知道你没有嵌套的 span 标签,那么你可以编写一个用户定义的函数来做到这一点:

It is not so simple to find the correct closing span tag to change to </h2>. You cannot simply find the first </span> and change it to </h2>. If you knew that you had no nested span tags, then you could write a user-defined function that would do it:

Create Function ReplaceSpanToH2( @HtmlContent nvarchar(max) )
Returns nvarchar(max)
As
Begin
    Declare @StartPos int
    Declare @EndBracket int

    Set @StartPos = CharIndex('<span class="heading-2">', @HtmlContent)
    If @StartPos = 0
        Return @HtmlContent

    Set @HtmlContent = Replace(@HtmlContent, '<span class="heading-2">', '<h2>')

    -- find next </span>
    Set @StartPos = CharIndex('</span>', @HtmlContent, @StartPos)

    Set @HtmlContent = Stuff(@HtmlContent, @StartPos, 7, '</h2>')
    Return @HtmlContent
End