Java regex for UUID

Updated: 2022-05-02 22:27:33

Your example of a faster regex uses a < where the input has &lt;, which is confusing.

Regarding speed: first, a UUID is hexadecimal, so match a-f rather than A-Z. Second, you give no indication that the case is mixed, so drop the case-insensitive flag and write the range in the case your input actually uses.

You don't say whether you need the part preceding the UUID. If not, leave out the .*?, and you may as well write the literals for re1 and re2 directly into your final Pattern. Nothing indicates you need DOTALL either.

private static final Pattern splitter =
  Pattern.compile("([a-f0-9]{8}(-[a-f0-9]{4}){4}[a-f0-9]{8})");
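
As a rough sketch of how that pattern might be used, the class below wraps it in a helper that returns the first UUID found in a string. The class name UuidFinder, the method findFirstUuid, and the sample input are made up for illustration. The pattern text is the one above: eight hex digits, four "-xxxx" groups, then eight more hex digits, which adds up to the usual 8-4-4-4-12 layout.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question.
final class UuidFinder {
    // Lowercase hex only, no flags: 8 + 4 * "-xxxx" + 8 = 36 characters.
    private static final Pattern SPLITTER =
        Pattern.compile("([a-f0-9]{8}(-[a-f0-9]{4}){4}[a-f0-9]{8})");

    // Returns the first UUID-shaped substring, or empty if there is none.
    static Optional<String> findFirstUuid(String input) {
        Matcher m = SPLITTER.matcher(input);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed sample input, modeled on the "&lt;urn:uuid:...&gt;" shape.
        String sample = "&lt;urn:uuid:123e4567-e89b-12d3-a456-426614174000&gt;";
        System.out.println(findFirstUuid(sample).orElse("no match"));
    }
}
```

Because the range is lowercase only, an uppercase UUID would not match; that is the trade-off of dropping the case-insensitive flag.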

Alternatively, if you have measured the regular expression and found it too slow, you might try another approach. For example, is each UUID preceded by "uuid:" as in your example? If so, you can:

1. find the first index of "uuid:" as i, then
2. take substring 0 to i+5 [assuming you need that part at all], and
3. take substring i+5 to i+41 (a UUID is 36 characters long).
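
The index arithmetic above could be sketched roughly as follows. The class and method names are hypothetical, and the code assumes the 36 characters after "uuid:" really are a UUID; it checks only that the string is long enough, not that the content is valid hex.

```java
import java.util.Optional;

// Hypothetical helper illustrating the indexOf/substring approach.
final class UuidSlicer {
    private static final String MARKER = "uuid:";   // 5 characters
    private static final int UUID_LENGTH = 36;

    // Returns the 36 characters following the first "uuid:", if present.
    static Optional<String> sliceUuid(String input) {
        int i = input.indexOf(MARKER);
        if (i < 0) {
            return Optional.empty();                 // marker not found
        }
        int start = i + MARKER.length();             // i + 5
        int end = start + UUID_LENGTH;               // i + 41
        if (end > input.length()) {
            return Optional.empty();                 // input too short
        }
        return Optional.of(input.substring(start, end));
    }

    public static void main(String[] args) {
        // Assumed sample input for illustration.
        String sample = "&lt;urn:uuid:123e4567-e89b-12d3-a456-426614174000&gt;";
        System.out.println(sliceUuid(sample).orElse("no uuid"));
    }
}
```

This trades validation for speed, so it only makes sense when the input format is trusted.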

Along similar lines, your faster regex could be:

private static final Pattern URN_UUID_PATTERN =
    Pattern.compile("^&lt;urn:uuid:(.{36})&gt;");
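
A minimal sketch of applying that anchored pattern, assuming the input literally begins with the characters "&lt;urn:uuid:" and the UUID is followed by "&gt;". The class name UrnUuidExtractor and the method extract are invented for the example. Note that (.{36}) accepts any 36 characters, so this checks the position of the UUID, not its content.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question.
final class UrnUuidExtractor {
    private static final Pattern URN_UUID_PATTERN =
        Pattern.compile("^&lt;urn:uuid:(.{36})&gt;");

    // Returns the 36 characters between the fixed prefix and "&gt;".
    static Optional<String> extract(String input) {
        Matcher m = URN_UUID_PATTERN.matcher(input);
        // The pattern is anchored with ^, so find() only succeeds at index 0.
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed sample input for illustration.
        String sample = "&lt;urn:uuid:123e4567-e89b-12d3-a456-426614174000&gt; rest";
        System.out.println(extract(sample).orElse("no match"));
    }
}
```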

OTOH, if all your input strings start with exactly those characters, there is no need for step 1 of the previous suggestion ("&lt;urn:uuid:" is 13 characters long); just use input.substring(13, 49).
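
For what it's worth, the fixed offsets can be derived from the prefix rather than hard-coded. The sketch below assumes every input starts with the literal text "&lt;urn:uuid:"; the class name and the startsWith guard are additions for illustration, and the guard can be dropped if the format is guaranteed.

```java
import java.util.Optional;

// Hypothetical helper illustrating the fixed-offset approach.
final class FixedPrefixSlicer {
    private static final String PREFIX = "&lt;urn:uuid:";    // 13 characters
    private static final int START = PREFIX.length();        // 13
    private static final int END = START + 36;               // 49

    // Equivalent to input.substring(13, 49) once the prefix is confirmed.
    static Optional<String> slice(String input) {
        if (!input.startsWith(PREFIX) || input.length() < END) {
            return Optional.empty();
        }
        return Optional.of(input.substring(START, END));
    }

    public static void main(String[] args) {
        // Assumed sample input for illustration.
        String sample = "&lt;urn:uuid:123e4567-e89b-12d3-a456-426614174000&gt;";
        System.out.println(slice(sample).orElse("unexpected format"));
    }
}
```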
