且构网

Sharing all things programming and development...

Regular expression for handling escape characters in items such as string literals

Updated: 2023-02-26 10:46:36

I think this will work:

import re

# Guard (start of input or a non-backslash character), an opening quote, then
# group 1: any run of ordinary characters, escaped quotes, or escaped backslashes.
regexc = re.compile(r"(?:^|[^\\])'(([^\\']|\\'|\\\\)*)'")

def check(test, base, target):
    """Assert that the first quoted literal found in base has the body target."""
    match = regexc.search(base)
    assert match is not None, test + ": regex didn't match for " + base
    assert match.group(1) == target, test + ": " + target + " not found in " + base
    print("test %s passed" % test)

check("Empty", "''", "")
check("single escape1", r""" Example: 'Foo \' Bar'  End. """, r"Foo \' Bar")
check("single escape2", r"""'\''""", r"\'")
check("double escape", r""" Example2: 'Foo \\' End. """, r"Foo \\")
check("First quote escaped", r"not matched\''a'", "a")
check("First quote escaped beginning", r"\''a'", "a")

The regular expression r"(?:^|[^\\])'(([^\\']|\\'|\\\\)*)'" scans forward and accepts only the things we want inside the string:

  1. A character that is neither a backslash nor a quote.
  2. An escaped quote.
  3. An escaped backslash.
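
To make the three alternatives easier to see, here is a sketch of the same pattern written with re.VERBOSE. The names LITERAL and literal_body are made up for this illustration, and the inner group is made non-capturing, which should not change what group 1 holds.

```python
import re

# Same pattern as in the answer, spread out with re.VERBOSE so that
# each alternative inside the literal carries its own comment.
LITERAL = re.compile(r"""
    (?:^|[^\\])      # start of input, or a character that is not a backslash
    '                # opening quote
    (                # group 1: the body of the literal
      (?:
          [^\\']     # 1. a character that is neither a backslash nor a quote
        | \\'        # 2. an escaped quote
        | \\\\       # 3. an escaped backslash
      )*
    )
    '                # closing quote
""", re.VERBOSE)

def literal_body(text):
    """Return the body of the first quoted literal in text, or None."""
    match = LITERAL.search(text)
    return match.group(1) if match else None

print(literal_body(r" Example: 'Foo \' Bar'  End. "))  # Foo \' Bar
```

Because a lone backslash matches none of the three alternatives, a body such as 'a\b' is rejected rather than silently accepted.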

The extra (?:^|[^\\]) at the front checks that the opening quote is not itself escaped.
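
A small comparison may show what the prefix buys. This is only a sketch: the names BODY, without_prefix and with_prefix are invented for the example, and the sample string is the one used in the "First quote escaped" test.

```python
import re

# Quoted-literal pattern without the leading guard (hypothetical helper name).
BODY = r"'((?:[^\\']|\\'|\\\\)*)'"

without_prefix = re.compile(BODY)
with_prefix = re.compile(r"(?:^|[^\\])" + BODY)

sample = r"not matched\''a'"

# Without the guard, the escaped quote is taken as the opening quote,
# so the literal appears to be empty.
print(repr(without_prefix.search(sample).group(1)))  # ''

# With the guard, the escaped quote is skipped and the real literal is found.
print(repr(with_prefix.search(sample).group(1)))     # 'a'
```

Note that the guard only looks at one preceding character, so a quote that follows an escaped backslash, as in \\'a', is also treated as escaped and the pattern finds no literal in that input.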