且构网


Regular expression to match specific words and ignore specific text

Updated: 2023-02-25 16:11:06




I have the following list of error messages:

def errorMessages = ["Line : 1 Invoice does not foot Reported",
                     "Line : 2 Could not parse INVOICE_DATE value",
                     "Line 3 : Could not parse ADJUSTMENT_AMOUNT value",
                     "Line 4 : MATH ERROR",
                     "cl_id is a required field",
                     "File Error : The file does not contain delimiters",
                     "lf_name is a required field"]

I am trying to create a new list of the messages that do not match the regex "^Line\\s(?:(\\d+)\\s)?\\s*:\\s+(\\d+)?.+", plus any message that contains the text Invoice does not foot Reported.

The desired new list looks like this:

def headErrors = ["Line : 1 Invoice does not foot Reported",
                  "cl_id is a required field",
                  "File Error : The file does not contain delimiters",
                  "lf_name is a required field"]

This is what I am doing for now:

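regex = "^Line\\s(?:(\\d+)\\s)?\\s*:\\s+(\\d+)?.+"
errorMessages.each {
    // Keep a message if it carries the special text ...
    if (it.contains('Invoice does not foot Reported'))
        headErrors.add(it)
    // ... or if it is not a "Line ..." error at all
    else if (!it.matches(regex))
        headErrors.add(it)
}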

Is there a way to do this with a regex alone, instead of if/else?

  1. First, match any line that contains the text Invoice does not foot Reported in its message part.

  2. Then use a negative lookahead assertion at the start of the pattern so that a line is not matched if it starts with the characters matched by the Line\\s(?:(\\d+)\\s)?\\s*:\\s+(\\d+)? pattern.
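
The two steps can be tried on their own before they are joined into one pattern. Below is a minimal sketch in plain Java, on the assumption that this is acceptable here because Groovy's String.matches goes through the same java.util.regex engine. The names TwoStepCheck, hasInvoiceText and isNotLineError are hypothetical and exist only for illustration.

```java
import java.util.regex.Pattern;

class TwoStepCheck {
    // Prefix used by both steps: "Line", optional number, colon, optional number.
    static final String LINE_PREFIX = "Line\\s(?:(\\d+)\\s)?\\s*:\\s+(\\d+)?";

    // Step 1: a "Line ..." message whose message part contains the special text.
    static final Pattern STEP_ONE =
        Pattern.compile("^" + LINE_PREFIX + ".*?Invoice does not foot Reported.*");

    // Step 2: any non-empty message that does NOT start with the "Line ..." prefix.
    static final Pattern STEP_TWO =
        Pattern.compile("^(?!" + LINE_PREFIX + ".*).+");

    static boolean hasInvoiceText(String msg) {
        return STEP_ONE.matcher(msg).matches();
    }

    static boolean isNotLineError(String msg) {
        return STEP_TWO.matcher(msg).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Line : 1 Invoice does not foot Reported",
            "Line 4 : MATH ERROR",
            "cl_id is a required field"
        };
        for (String s : samples) {
            System.out.println(s + " -> step1=" + hasInvoiceText(s)
                + ", step2=" + isNotLineError(s));
        }
    }
}
```

A message belongs in the new list when either check succeeds, which is what the alternation (|) in the combined regex expresses.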

Regex:

"^Line\\s(?:(\\d+)\\s)?\\s*:\\s+(\\d+)?.*?Invoice does not foot Reported.*|^(?!Line\\s(?:(\\d+)\\s)?\\s*:\\s+(\\d+)?.*).+"
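
To check the combined pattern against the list from the question, here is a small sketch in plain Java (assumed to be a fair stand-in, since Groovy uses the same java.util.regex engine). The class name HeadErrorFilter and the method filter are hypothetical and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class HeadErrorFilter {
    // The combined pattern, written as a Java string literal:
    // first alternative = "Line ..." message containing the special text,
    // second alternative = any message that does not start with the "Line ..." prefix.
    static final Pattern HEAD_ERROR = Pattern.compile(
        "^Line\\s(?:(\\d+)\\s)?\\s*:\\s+(\\d+)?.*?Invoice does not foot Reported.*"
        + "|^(?!Line\\s(?:(\\d+)\\s)?\\s*:\\s+(\\d+)?.*).+");

    // Keep only the messages that the combined pattern matches in full.
    static List<String> filter(List<String> messages) {
        List<String> kept = new ArrayList<>();
        for (String msg : messages) {
            if (HEAD_ERROR.matcher(msg).matches()) {
                kept.add(msg);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> errorMessages = Arrays.asList(
            "Line : 1 Invoice does not foot Reported",
            "Line : 2 Could not parse INVOICE_DATE value",
            "Line 3 : Could not parse ADJUSTMENT_AMOUNT value",
            "Line 4 : MATH ERROR",
            "cl_id is a required field",
            "File Error : The file does not contain delimiters",
            "lf_name is a required field");
        // Prints the four messages of the desired list, one per line.
        filter(errorMessages).forEach(System.out::println);
    }
}
```

In Groovy the loop would likely shrink to a single call such as errorMessages.findAll { it.matches(regex) }, with regex holding the combined pattern.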
