Updated: 2022-05-02 22:27:33
Your example of a faster regex uses a literal < where the input contains &lt;, so that's confusing.
Regarding speed: first, your UUID is hexadecimal, so don't match A-Z; match a-f instead. Second, you give no indication that case is mixed, so don't use case-insensitive matching; write the correct case in the range.
You don't explain whether you need the part preceding the UUID. If not, don't include .*?, and you may as well write the literals for re1 and re2 together in your final Pattern. There's no indication you need DOTALL either.
// Lower-case hex only: 8 digits, four "-xxxx" groups, then 8 more digits (8-4-4-4-12 overall).
private static final Pattern splitter =
    Pattern.compile("([a-f0-9]{8}(-[a-f0-9]{4}){4}[a-f0-9]{8})");
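As a rough illustration of how such a pattern might be used, here is a minimal sketch. The class name UuidFinder, the method findUuid, and the sample input are made up for this example, and the inner group is written as non-capturing, which is a small variation on the pattern above rather than something the question requires.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class UuidFinder {
    // Lower-case hex only, no CASE_INSENSITIVE, no DOTALL, no leading ".*?".
    // 8-4-4-4-12 is written as 8 digits, four "-xxxx" groups, then 8 more digits.
    private static final Pattern UUID_PATTERN =
            Pattern.compile("[a-f0-9]{8}(?:-[a-f0-9]{4}){4}[a-f0-9]{8}");

    // Returns the first UUID found anywhere in the input, if there is one.
    static Optional<String> findUuid(String input) {
        Matcher m = UUID_PATTERN.matcher(input);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed sample input, modelled on the "&lt;urn:uuid:...&gt;" shape discussed above.
        String sample = "&lt;urn:uuid:123e4567-e89b-12d3-a456-426614174000&gt;";
        System.out.println(findUuid(sample).orElse("no match"));
    }
}
```

Because find() scans for the pattern, nothing before the UUID has to be matched explicitly. An upper-case UUID would not match here; that is the trade-off of dropping case-insensitive matching.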
Alternatively, if you have measured your regular expression and found it too slow, you might try another approach, for example:
Is each UUID preceded by "uuid:" as in your example? If so you can
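The steps of that suggestion are not preserved here, so the following is only one plausible reading of it: locate the "uuid:" marker with indexOf and take the 36 characters that follow. The class and method names are hypothetical, and the sketch assumes a UUID is always 36 characters long.

```java
// Hypothetical helper; a sketch of a regex-free approach, not the original steps.
class UuidAfterMarker {
    private static final String MARKER = "uuid:";
    private static final int UUID_LENGTH = 36; // 32 hex digits plus 4 hyphens

    // Returns the 36 characters after the first "uuid:", or null if they are not there.
    static String extract(String input) {
        int i = input.indexOf(MARKER);
        if (i < 0) {
            return null; // marker not found
        }
        int start = i + MARKER.length();
        int end = start + UUID_LENGTH;
        if (end > input.length()) {
            return null; // input too short to hold a full UUID
        }
        return input.substring(start, end);
    }

    public static void main(String[] args) {
        // Assumed sample input.
        String sample = "&lt;urn:uuid:123e4567-e89b-12d3-a456-426614174000&gt;";
        System.out.println(extract(sample));
    }
}
```

Note that this does not validate the characters it returns; it trusts the input format, which is what makes it cheaper than a regex.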
Along similar lines, your faster regex could be:
// The input starts with the literal text "&lt;urn:uuid:" (13 characters), not with "<".
private static final Pattern URN_UUID_PATTERN =
    Pattern.compile("^&lt;urn:uuid:(.{36})&gt;");
OTOH, if all your input strings are going to start with those exact characters, there's no need to do step 1 of the previous suggestion; just use input.substring(13, 49);
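To make the offsets concrete: the prefix "&lt;urn:uuid:" is 13 characters long, and a UUID is 36 characters, which gives the range 13 to 49. The sketch below assumes that exact prefix (inferred from the offsets) and adds a guard the one-liner does not have; the class and method names are hypothetical.

```java
// Hypothetical helper; assumes every input begins with the literal text "&lt;urn:uuid:".
class FixedOffsetUuid {
    private static final String PREFIX = "&lt;urn:uuid:"; // 4 + 9 = 13 characters
    private static final int UUID_LENGTH = 36;

    static String extract(String input) {
        int start = PREFIX.length();     // 13
        int end = start + UUID_LENGTH;   // 49
        // Guard added for illustration; drop it if the input format is guaranteed.
        if (!input.startsWith(PREFIX) || input.length() < end) {
            throw new IllegalArgumentException("unexpected input: " + input);
        }
        return input.substring(start, end);
    }

    public static void main(String[] args) {
        // Assumed sample input.
        String sample = "&lt;urn:uuid:123e4567-e89b-12d3-a456-426614174000&gt;";
        System.out.println(extract(sample));
    }
}
```

If the prefix check is removed, this reduces to exactly input.substring(13, 49).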