且构网

Extract a substring from a long string using a start string and an end string?

Updated: 2022-03-10 18:06:24

This worked for me, based on your (single) example. Learn to use the reluctant (lazy) quantifiers for regular expressions, such as "+?" and "*?". They'll help you a lot in situations like this.
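To make the difference concrete, here is a minimal sketch comparing a greedy group with a reluctant one. The class name, the `firstGroup` helper, and the sample text are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctant {

    // Hypothetical helper: returns group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "start A end start B end";

        // Greedy: .+ takes as much as it can, so the group runs up to the LAST " end".
        System.out.println(firstGroup("start (.+) end", text));   // prints "A end start B"

        // Reluctant: .+? takes as little as it can, so the group stops at the FIRST " end".
        System.out.println(firstGroup("start (.+?) end", text));  // prints "A"
    }
}
```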

For example, to match the first part, use "Home address (.+?) \+\d+ Last Updated:". This regex will not skip over the "Last Updated" string or the "+dd" (digits) that we don't want in the result. The expression "(.+?)" is reluctant (not greedy): it stops at the earliest point where the rest of the pattern can match, so it won't swallow the + sign or the digits, leaving them to be matched by the rest of the expression.
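As a small sketch of that first-part pattern on its own, the snippet below applies it to the opening of the sample input. The class name `HomeAddressDemo` and the `homeAddress` helper are illustrative assumptions; only the regex and the sample text come from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HomeAddressDemo {

    // The first-part regex from the text: static prefix, reluctant group, phone number, static suffix.
    static final Pattern FIRST_PART =
            Pattern.compile("Home address (.+?) \\+\\d+ Last Updated:");

    // Hypothetical helper: returns the captured address, or null if the pattern is not found.
    static String homeAddress(String input) {
        Matcher m = FIRST_PART.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "Home address H.NO- 12 SECTOR- 12 GAUTAM BUDH NAGAR "
                + "NOIDA- 121212, UTTAR PRADESH INDIA +911112121212 "
                + "Last Updated: 12-JUN-12 Semester/Term-time";

        // The phone number and "Last Updated:" are matched by the pattern but stay outside group 1.
        System.out.println(homeAddress(input));
        // prints "H.NO- 12 SECTOR- 12 GAUTAM BUDH NAGAR NOIDA- 121212, UTTAR PRADESH INDIA"
    }
}
```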

You can use this technique to match substrings that are surrounded by static text. Here I'm using capturing groups to locate the text I want. (Capturing groups are the parts in parentheses.)

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Goofy
{

   public static void main( String[] args )
   {
      final String input
              = "Home address H.NO- 12 SECTOR- 12 GAUTAM BUDH NAGAR " +
              "NOIDA- 121212, UTTAR PRADESH INDIA +911112121212 " +
              "Last Updated: 12-JUN-12 Semester/Term-time " +
              "Accommodation Type: Hall of residence (private " +
              "provider) Semester/Term-time address A121A SOME " +
              "APPARTMENT SOME LANE CITY COUNTY OX3 7FJ +91 " +
              "1212121212 Last Updated: 12-SEP-12 Mobile Telephone " +
              "Number : 01212121212";

      // Static text anchors each field; the reluctant groups (.+?) capture what lies between.
      final String regex = "Home address (.+?) \\+\\d+ Last Updated: " +
              "\\S+ Semester/Term-time Accommodation Type: (.+?) " +
              "Semester/Term-time address (.+?) \\+\\d\\d \\d+ " +
              "Last Updated.+ Number : (\\d+)";

      Pattern pattern = Pattern.compile( regex );
      Matcher matcher = pattern.matcher( input );
      if( matcher.find() ) {
         System.out.println("Found: "+matcher.group() );
         for( int i = 1; i <= matcher.groupCount(); i++ ) {
            System.out.println( "   Match " + i + ": " + matcher.group( i ));
         }
      }
   }
}
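
For the more general question in the title, the same idea can be wrapped in a small helper. This is only a sketch under assumptions: the class name `Between` and the method `extractBetween` are hypothetical, and `Pattern.quote` is used so that the start and end strings are treated as literal text rather than as regex syntax.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Between {

    // Hypothetical helper: collects every substring found between a start and an end string.
    static List<String> extractBetween(String text, String start, String end) {
        // Pattern.quote makes the delimiters literal, so '+', '(' or '[' need no manual escaping.
        // DOTALL lets the reluctant group span line breaks as well.
        Pattern p = Pattern.compile(
                Pattern.quote(start) + "(.+?)" + Pattern.quote(end), Pattern.DOTALL);
        Matcher m = p.matcher(text);
        List<String> results = new ArrayList<>();
        while (m.find()) {
            results.add(m.group(1).trim()); // drop the spaces next to the delimiters
        }
        return results;
    }

    public static void main(String[] args) {
        String text = "Accommodation Type: Hall of residence (private provider) "
                + "Semester/Term-time address A121A";
        System.out.println(
                extractBetween(text, "Accommodation Type:", "Semester/Term-time address"));
        // prints "[Hall of residence (private provider)]"
    }
}
```

Because the group is reluctant, each match ends at the nearest occurrence of the end string, so repeated start/end pairs in one input yield separate results.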