且构网

Sharing what programmers build...

How do I replace special characters in a string?

Updated: 2023-01-22 19:24:59

That depends on what you mean by "special". If you just want to get rid of them, do this:
(Update: apparently you want to keep digits as well; use the second line in that case.)

String alphaOnly = input.replaceAll("[^a-zA-Z]+","");
String alphaAndDigits = input.replaceAll("[^a-zA-Z0-9]+","");

Or the equivalent:

String alphaOnly = input.replaceAll("[^\\p{Alpha}]+","");
String alphaAndDigits = input.replaceAll("[^\\p{Alpha}\\p{Digit}]+","");

(All of these can be made noticeably faster by precompiling the regex into a Pattern and storing it in a constant, because String.replaceAll compiles the pattern again on every call.)
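
As a rough sketch of that precompilation advice: the class name `Sanitizer` and its method names are made up for illustration, and only `java.util.regex.Pattern` from the standard library is used. The patterns are the same two character classes shown above, compiled once instead of on every call.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name and methods are illustrative only.
final class Sanitizer {
    // Compiled once when the class loads. Pattern objects are immutable
    // and safe to share between threads.
    private static final Pattern NON_ALPHA = Pattern.compile("[^a-zA-Z]+");
    private static final Pattern NON_ALNUM = Pattern.compile("[^a-zA-Z0-9]+");

    private Sanitizer() {}

    // Removes every character that is not an ASCII letter.
    static String alphaOnly(String input) {
        return NON_ALPHA.matcher(input).replaceAll("");
    }

    // Removes every character that is not an ASCII letter or digit.
    static String alphaAndDigits(String input) {
        return NON_ALNUM.matcher(input).replaceAll("");
    }
}
```

With this in place, `Sanitizer.alphaOnly("He!!o, W0rld")` gives `HeoWrld` and `Sanitizer.alphaAndDigits("He!!o, W0rld")` gives `HeoW0rld`. The gain only matters when the replacement runs often; for a one-off call the plain `replaceAll` is fine.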

Or, with Guava:

private static final CharMatcher ALNUM =
  CharMatcher.inRange('a', 'z').or(CharMatcher.inRange('A', 'Z'))
  .or(CharMatcher.inRange('0', '9')).precomputed();
// ...
String alphaAndDigits = ALNUM.retainFrom(input);

But if you want to turn accented characters into something sensible that's still ASCII, look at these questions:

  • Converting Java String to ASCII
  • Java change áéőűú to aeouu
  • ń ǹ ň ñ ṅ ņ ṇ ṋ ṉ ̈ ɲ ƞ ᶇ ɳ ȵ --> n or Remove diacritical marks from unicode chars
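
One common approach to the accent problem, sketched here as an assumption rather than as what the linked answers necessarily recommend, is to decompose the string with `java.text.Normalizer` and then drop the combining marks. The class name `AsciiFolder` and the method `stripDiacritics` are hypothetical.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

// Hypothetical helper; the name and method are illustrative only.
final class AsciiFolder {
    // \p{M} matches Unicode combining marks (accents, diaereses, and so on).
    private static final Pattern MARKS = Pattern.compile("\\p{M}+");

    private AsciiFolder() {}

    static String stripDiacritics(String input) {
        // NFD splits a precomposed letter into base letter + combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Dropping the marks leaves only the base letters.
        return MARKS.matcher(decomposed).replaceAll("");
    }
}
```

For example, `AsciiFolder.stripDiacritics("áéőűú")` gives `aeouu`. Letters that have no canonical decomposition, such as `ø`, `ß` or `ɲ`, pass through unchanged, so the result is not guaranteed to be pure ASCII; if it must be, follow this step with one of the letter-and-digit filters above or map the leftovers by hand.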