且构网

Ruby regex for stripping BBCode

Updated: 2023-02-21 19:57:08

This article looks at how BBCode is interpreted, which should be taken into account when stripping BBCode tags while keeping the content.

This will only remove BB code tags as defined by this page.

It may remove more than what is considered a valid BBCode tag, though. For example, [b ]Bold[/b] is not rendered bold by this BBCode tester, so by rights those tags should be left alone, yet the regex below removes the [/b]. It also removes things that are clearly not BBCode, such as [/b=something].

Another example is [url=http://example.com/ ][/url] (note the space). Whether this is accepted depends on the BBCode parser. The regex below ignores the opening tag, but removes the closing tag.

/\[\/?(?:b|u|i|s|size|color|center|quote|url|img|ul|ol|list|li|\*|code|table|tr|th|td|***|gvideo)(?:=[^\]\s]+)?\]/
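As a rough illustration, the pattern above can be applied with String#gsub to reproduce the edge cases just described. This is only a sketch: the names BBCODE_TAG and strip_bbcode are made up for the example, and the alternative printed as *** in the listing is left out because its original spelling cannot be recovered from it.

```ruby
# Pattern copied from the listing above, minus the unreadable `***` alternative.
BBCODE_TAG = /\[\/?(?:b|u|i|s|size|color|center|quote|url|img|ul|ol|list|li|\*|code|table|tr|th|td|gvideo)(?:=[^\]\s]+)?\]/

# Hypothetical helper: deletes every substring that matches the pattern.
def strip_bbcode(text)
  text.gsub(BBCODE_TAG, "")
end

p strip_bbcode("[b]Bold[/b] and [url=http://example.com/]link[/url]")
# => "Bold and link"
p strip_bbcode("[b ]Bold[/b]")
# => "[b ]Bold"   (the opening tag has a space, so only [/b] is removed)
p strip_bbcode("[/b=something]")
# => ""           (not real BBCode, but removed anyway)
p strip_bbcode("[url=http://example.com/ ][/url]")
# => "[url=http://example.com/ ]"   (opening tag ignored, closing tag removed)
```

The second and fourth calls show the asymmetry described above: a tag that the pattern rejects stays in the text while its closing partner is still removed.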

The regex also does not treat the [code] tag correctly, as seen in this demo. The replacement should leave everything between the [code] tags alone.
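One possible way around the [code] problem is to match a whole [code]...[/code] block before trying the generic tag pattern, so the inner text is copied through untouched. The sketch below assumes that the outer [code] tags themselves should still be stripped and that code blocks do not nest. CODE_OR_TAG and strip_bbcode_keep_code are hypothetical names, and the *** alternative is again left out.

```ruby
# Same tag pattern as in the answer, minus the unreadable `***` alternative.
BBCODE_TAG = /\[\/?(?:b|u|i|s|size|color|center|quote|url|img|ul|ol|list|li|\*|code|table|tr|th|td|gvideo)(?:=[^\]\s]+)?\]/

# Try a complete [code]...[/code] block first; fall back to a single tag.
# Group 1 is set only when the code-block alternative matched.
CODE_OR_TAG = /\[code\](.*?)\[\/code\]|#{BBCODE_TAG}/m

# Hypothetical helper: strips tags, but keeps the body of code blocks verbatim.
def strip_bbcode_keep_code(text)
  text.gsub(CODE_OR_TAG) do
    inner = Regexp.last_match(1)
    inner.nil? ? "" : inner # code block: keep body; plain tag: drop it
  end
end

p strip_bbcode_keep_code("[i]See[/i] [code][b]x[/b][/code]")
# => "See [b]x[/b]"
```

An unclosed [code] falls through to the second alternative and is removed like any other tag, which may or may not match what a given BBCode parser does.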

This BBCode tester parses [b][b][b]Text[/b][/b][/b] into a bolded Text, but the other one interprets it as [b][b]Text[/b][/b], with the part [b][b]Text in bold and the rest not. If nested tags are allowed, a regex is not a good choice.
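To make the nesting point concrete, here is a minimal stack-based sketch that removes a tag only when it pairs up with a partner, which a single gsub cannot express. It is an assumption-laden illustration, not how either tester works: it handles only a reduced, hypothetical tag set (b, i, u, s) without attributes, and strip_balanced is an invented name.

```ruby
# Hypothetical reduced tag set for illustration: b, i, u, s, no attributes.
SIMPLE_TAG = /(\[\/?(?:b|i|u|s)\])/

# Removes only tags that form properly nested open/close pairs.
def strip_balanced(text)
  tokens = text.split(SIMPLE_TAG)        # the capture group keeps tags as tokens
  keep = Array.new(tokens.size, true)
  stack = []                             # entries: [tag name, token index]

  tokens.each_with_index do |token, index|
    m = token.match(/\A\[(\/?)(b|i|u|s)\]\z/)
    next unless m                        # plain text token

    if m[1].empty?
      stack.push([m[2], index])          # opening tag: remember it
    elsif !stack.empty? && stack.last[0] == m[2]
      _, open_index = stack.pop          # closing tag matching the innermost open
      keep[open_index] = false
      keep[index] = false
    end                                  # stray closing tags are left in place
  end

  tokens.zip(keep).select { |_, kept| kept }.map(&:first).join
end

p strip_balanced("[b][b][b]Text[/b][/b][/b]")
# => "Text"
p strip_balanced("[b]Text")
# => "[b]Text"   (unpaired tag is left alone)
```

Which pairing rule is right depends on the parser being mimicked, as the two testers above show.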