且构网

Python regex - replace bracketed text with the contents of the brackets

Updated: 2022-12-27 16:57:32

I'm trying to write a Python function that replaces instances of text surrounded with curly braces with the contents of the braces, while leaving empty brace-pairs alone. For example:

foo {} bar {baz} would become foo {} bar baz.

The pattern that I've created to match this is {[^{}]+}, i.e. some text that doesn't contain curly braces (to prevent overlapping matches) surrounded by a set of curly braces.

The obvious solution is to use re.sub with my pattern, and I've found that I can reference the matched text with \g<0>:

>>> re.sub(r"{[^{}]+}", r"A \g<0> B", "foo {} bar {baz}")
'foo {} bar A {baz} B'

So that's no problem. However, I'm stuck on how to trim the braces from the referenced text. If I try slicing the replacement string:

>>> re.sub(r"{[^{}]+}", r"\g<0>"[1:-1], "foo{}bar{baz}")
'foo{}barg<0'

The slice is applied before the \g<0> is resolved to the matched text, so it trims the leading \ and trailing >, leaving just g<0, which has no special meaning.
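
The evaluation order described above can be checked directly. This is only a minimal sketch; the variable names replacement and result are illustrative and not part of the original code:

```python
import re

# The slice runs first, on the literal five-character template string.
replacement = r"\g<0>"[1:-1]
print(replacement)

# re.sub only ever receives the already-sliced string. It contains no
# backslash, so it is inserted literally in place of each match.
result = re.sub(r"\{[^{}]+\}", replacement, "foo{}bar{baz}")
print(result)
```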

I also tried defining a function to perform the trimming:

def trimBraces(string):
    return string[1:-1]

But, unsurprisingly, that didn't change anything:

>>> re.sub(r"{[^{}]+}", trimBraces(r"\g<0>"), "foo{}bar{baz}")
'foo{}barg<0'

What am I missing here? Many thanks in advance.

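You can use a capturing group to replace a part of the match. The parentheses capture the text between the braces, and \1 in the replacement refers to that captured text: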

>>> re.sub(r"{([^{}]+)}", r"\1", "foo{}bar{baz}")
'foo{}barbaz'
>>> re.sub(r"{([^{}]+)}", r"\1", "foo {} bar {baz}")
'foo {} bar baz'
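
If you would rather keep the function-based approach from the question, re.sub also accepts a callable as the replacement. It is called once per match with the match object, so the trimming happens on the matched text rather than on the template string. A minimal sketch, where the names BRACED, trim_braces and unwrap_braces are made up for illustration:

```python
import re

# Same pattern as above, with the braces escaped for readability.
BRACED = re.compile(r"\{([^{}]+)\}")

def trim_braces(match):
    # match.group(0) is the whole "{baz}"; group(1) is just "baz".
    return match.group(1)

def unwrap_braces(text):
    # Passing the function itself (not its result) lets re.sub call it per match.
    return BRACED.sub(trim_braces, text)

print(unwrap_braces("foo {} bar {baz}"))
```

Note that a single pass only unwraps the innermost pair of nested braces, since the pattern excludes braces from the captured text.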