How do I find all spaces except those between quotes?

Updated: 2023-02-19 21:14:00

With the help of user MizardX from the #regex IRC channel (irc.freenode.net), a solution was found. It even supports single quotes.

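// Sample input mixing bare words with single- and double-quoted phrases.
$str = 'word1 word2 \'this is a phrase\' word3 word4 "this is a second phrase" word5 word1 word2 "this is a phrase" word3 word4 "this is a second phrase" word5';

// \G anchors at the end of the previous match, the group consumes whole tokens
// (quoted phrases or bare words), and \K drops them so only the whitespace is matched.
$regexp = '/\G(?:"[^"]*"|\'[^\']*\'|[^"\'\s]+)*\K\s+/';

$arr = preg_split($regexp, $str);

print_r($arr);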

The result is:

Array (
    [0] => word1
    [1] => word2
    [2] => 'this is a phrase'
    [3] => word3
    [4] => word4
    [5] => "this is a second phrase"
    [6] => word5
    [7] => word1
    [8] => word2
    [9] => "this is a phrase"
    [10] => word3
    [11] => word4
    [12] => "this is a second phrase"
    [13] => word5  
)
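
To make the pattern easier to read, here is a sketch of the same regexp written with the x (extended) flag, so each part can carry a comment. The shorter input string and the variable name $parts are illustrative assumptions, not part of the original answer.

```php
<?php
// Same pattern as above, spelled out with the x flag so that
// whitespace and "#" comments inside the pattern are ignored.
$regexp = '/
    \G                  # start exactly where the previous match ended
    (?:
        "[^"]*"         # a double-quoted phrase
      | \'[^\']*\'      # a single-quoted phrase
      | [^"\'\s]+       # a run of characters that are neither quotes nor whitespace
    )*
    \K                  # drop everything consumed so far from the reported match
    \s+                 # the whitespace that separates two tokens
/x';

// Hypothetical short input, chosen only for illustration.
$str = 'a "b c" \'d e\' f';

$parts = preg_split($regexp, $str);
print_r($parts);
```

Because \K resets the start of the match, preg_split only sees the separating whitespace, and the quoted phrases stay intact in the returned pieces.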

P.S. The only drawback is that this regexp works only with PCRE 7.

It turned out that my production server does not support PCRE 7; only PCRE 6 is installed there. A regexp that works there, although it is less flexible than the PCRE 7 version, gets rid of \G and \K. Because it matches the tokens themselves rather than the whitespace between them, it has to be used with preg_match_all instead of preg_split:

/(?:"[^"]*"|\'[^\']*\'|[^"\'\s]+)+/

For the given input, the result is the same as above.
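
As a minimal sketch of how the PCRE 6 variant might be applied: since that pattern matches tokens rather than separators, it is paired here with preg_match_all. The shortened input and the variable names $matches and $tokens are assumptions made for illustration.

```php
<?php
// Hypothetical shortened input, chosen only for illustration.
$str = 'word1 \'this is a phrase\' word3 "this is a second phrase" word5';

// Token-matching pattern from above (no \G or \K required).
$regexp = '/(?:"[^"]*"|\'[^\']*\'|[^"\'\s]+)+/';

// preg_match_all collects every token; $matches[0] holds the full matches.
preg_match_all($regexp, $str, $matches);
$tokens = $matches[0];

print_r($tokens);
```

One difference from the preg_split approach: a stray, unbalanced quote character matches none of the alternatives, so it is silently skipped instead of appearing in the output.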