且构网

Why is my PHP regular expression for parsing Markdown links broken?

Updated: 2023-02-23 13:34:47

If I understand you correctly, all you really need to do is also allow any number of spaces between the closing square bracket and the opening parenthesis, for example:

/\[([^]]*)\] *\(([^)]*)\)/i

Explanation:

\[             # Matches the opening square bracket (escaped)
([^]]*)        # Captures any number of characters that aren't closing square brackets
\]             # Matches the closing square bracket (escaped)
 *             # Matches any number of spaces
\(             # Matches the opening parenthesis (escaped)
([^)]*)        # Captures any number of characters that aren't closing parentheses
\)             # Matches the closing parenthesis (escaped)
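
As a rough illustration of the pattern in use, here is a minimal sketch with preg_match_all. The sample text and the variable names are my own assumptions, not part of the original question:

```php
<?php
// Pattern from the answer: link text in group 1, URL in group 2,
// with optional spaces allowed between "]" and "(".
$pattern = '/\[([^]]*)\] *\(([^)]*)\)/i';

// Hypothetical sample input (not from the original question).
$text = 'See [docs]   (http://example.com/docs) and [home](http://example.com/).';

// PREG_SET_ORDER groups the results per match: $m[0] full match, $m[1] text, $m[2] URL.
preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    echo $m[1], ' => ', $m[2], PHP_EOL;
}
```

Both links should be found, including the one with spaces between the closing square bracket and the opening parenthesis.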

Justification:

I should probably justify why I changed your .*? into [^]]*

The second version is more efficient because it avoids the large amount of backtracking that .*? can cause. Additionally, once an opening [ is encountered, the .*? version keeps scanning until it finds some way to match, rather than failing when the text turns out not to be a link, as we would want. For example, if we match the .*? version of the expression against:

Sad face :[ blah [LINK1](http://sub.example.com/) blah

it will match:

[ blah [LINK1]

http://sub.example.com/

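Using the [^]]* approach stops the link text from running past a closing ]. Note that [^]]* still accepts a stray [, so for the sample above the character class also has to exclude the opening bracket, e.g. [^\[\]]*, for the match to start at [LINK1].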
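
One way to check this behaviour is to run the candidate patterns side by side. The sketch below is only an illustration: the lazy pattern is my assumption about what the original expression looked like (the question's regex is not shown here), and the 'strict' variant and the second input are hypothetical additions:

```php
<?php
$patterns = [
    // Assumed shape of the original lazy pattern (not shown in the source).
    'lazy'    => '/\[(.*?)\] *\((.*?)\)/',
    // Pattern from the answer.
    'negated' => '/\[([^]]*)\] *\(([^)]*)\)/',
    // Hypothetical stricter variant: link text may contain neither "[" nor "]".
    'strict'  => '/\[([^\[\]]*)\] *\(([^)]*)\)/',
];

$inputs = [
    // Sample input from the answer.
    'Sad face :[ blah [LINK1](http://sub.example.com/) blah',
    // Hypothetical input with a stray "]" before the real link.
    'Sad face :[ blah ] [LINK1](http://sub.example.com/) blah',
];

$results = [];
foreach ($inputs as $i => $input) {
    foreach ($patterns as $name => $pattern) {
        // Record the captured link text (group 1) of the first match, or null.
        $results[$i][$name] = preg_match($pattern, $input, $m) ? $m[1] : null;
    }
}

var_export($results);
```

On the answer's own sample, 'lazy' and 'negated' both capture " blah [LINK1", because [^]] still permits an opening bracket; only 'strict' captures "LINK1". On the second input, 'lazy' runs across the stray ] and captures " blah ] [LINK1", while 'negated' and 'strict' both capture "LINK1".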