且构网

Find and replace text with slash characters

Updated: 2023-02-23 17:52:50

So I looked around on *** and I understand finding and replacing text works something like this:

perl -pi -w -e 's/www.example.com/www.pressbin.com/g;' *.html

However, what if the text I want to find and replace is a filepath that has slashes? How do I do it then?

perl -pi -w -e 's/path/to/file/new/path/to/file/g;' *.html

With Perl regexes, you can use any non-whitespace character as the regex delimiter, although

  • characters in \w (so s xfooxbarx is the same as s/foo/bar/) and
  • question marks ? (implicitly activate match-only-once behaviour, deprecated) and
  • single quotes '...' (turn off variable interpolation)

should be avoided. I prefer curly braces:

perl -pi -w -e 's{path/to/file}{new/path/to/file}g;' *.html

The delimiter may not occur inside the pattern or the replacement unless it is properly escaped or, for bracketing delimiters, appears in balanced pairs. So you could also say

perl -pi -w -e 's/path\/to\/file/new\/path\/to\/file/g;' *.html

but that is downright ugly.

When using braces, parens, etc., there can be whitespace between the regex and the replacement, allowing for beautiful code like

$string =~ s {foo}
             {bar}g;

Another interesting regex feature in this context is the quotemeta function. If your search expression contains many characters that would usually be interpreted with a special meaning, you can enclose that string inside \Q...\E, which applies quotemeta to it. So

m{\Qx*+\E}

matches the exact string x*+, even though characters like *, + or | would normally have a special meaning.