Updated: 2022-11-15 11:21:18
You have one small mistake in your regex. Try this:
String[] Res = Text.split("[\\p{Punct}\\s]+");
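[\\p{Punct}\\s]+
Move the + from inside the character class to the outside. Inside the class, + is only a literal plus sign, not a quantifier, so each delimiter character splits on its own and a run of delimiters (such as ". ") produces empty strings in the result.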
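To make the difference concrete, here is a minimal sketch that runs both patterns on a short input. The class and method names (SplitPlusDemo, splitPlusInside, splitPlusOutside) are made up for illustration and are not from the question; the pattern with the + inside the class is assumed to be what the question used.

```java
import java.util.Arrays;

class SplitPlusDemo {
    // '+' inside the class is a literal character, so every single
    // delimiter character is a separator of its own.
    static String[] splitPlusInside(String text) {
        return text.split("[\\p{Punct}\\s+]");
    }

    // '+' outside the class is a quantifier, so a whole run of
    // delimiter characters is treated as one separator.
    static String[] splitPlusOutside(String text) {
        return text.split("[\\p{Punct}\\s]+");
    }

    public static void main(String[] args) {
        String text = "know. For";
        // The '.' and the space split separately, leaving an empty string between them.
        System.out.println(Arrays.toString(splitPlusInside(text)));  // [know, , For]
        // ". " is consumed as a single separator.
        System.out.println(Arrays.toString(splitPlusOutside(text))); // [know, For]
    }
}
```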
So I get, for this code:
String Text = "But I know. For example, the word \"can\'t\" should";
String[] Res = Text.split("[\\p{Punct}\\s]+");
System.out.println(Res.length);
for (String s:Res){
System.out.println(s);
}
this result:
10
But
I
know
For
example
the
word
can
t
should
which matches your requirement.
As an alternative you can use
String[] Res = Text.split("\\P{L}+");
\\P{L}
matches any Unicode code point that does not have the property "Letter", so the text is split on every run of non-letter characters.
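One practical difference between the two patterns is that \\P{L}+ also treats digits as separators, while [\\p{Punct}\\s]+ keeps them as tokens. A small sketch, with a made-up class name (LetterSplitDemo), helper names, and sample sentence:

```java
import java.util.Arrays;

class LetterSplitDemo {
    // Split on runs of anything that is not a Unicode letter.
    static String[] splitOnNonLetters(String text) {
        return text.split("\\P{L}+");
    }

    // Split on runs of ASCII punctuation and whitespace only.
    static String[] splitOnPunctAndSpace(String text) {
        return text.split("[\\p{Punct}\\s]+");
    }

    public static void main(String[] args) {
        String text = "the word \"can't\" has 5 letters";
        // The digit 5 is not a letter, so it disappears with the separators.
        System.out.println(Arrays.toString(splitOnNonLetters(text)));
        // prints [the, word, can, t, has, letters]
        // The digit 5 is neither punctuation nor whitespace, so it stays a token.
        System.out.println(Arrays.toString(splitOnPunctAndSpace(text)));
        // prints [the, word, can, t, has, 5, letters]
    }
}
```

Which one fits depends on whether numbers should count as words in the result.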