Updated: 2022-06-10 04:32:05
You could use regular expression matching to find all the text that sits between an opening and a closing tag, and loop through and process each match. In this example I've used Apache Commons Lang to do the XML escaping.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.commons.lang3.StringEscapeUtils;

public String sanitiseXml(String xml)
{
    // Match the pattern <something>text</something>
    Pattern xmlCleanerPattern = Pattern.compile("(<[^/<>]*>)([^<>]*)(</[^<>]*>)");
    StringBuilder xmlStringBuilder = new StringBuilder();
    Matcher matcher = xmlCleanerPattern.matcher(xml);
    int lastEnd = 0;
    while (matcher.find())
    {
        // Include any non-matching text between this result and the previous result
        if (matcher.start() > lastEnd) {
            xmlStringBuilder.append(xml.substring(lastEnd, matcher.start()));
        }
        lastEnd = matcher.end();
        // Sanitise the characters inside the tags and append the sanitised version
        String cleanText = StringEscapeUtils.escapeXml10(matcher.group(2));
        xmlStringBuilder.append(matcher.group(1)).append(cleanText).append(matcher.group(3));
    }
    // Include any leftover text after the last result
    xmlStringBuilder.append(xml.substring(lastEnd));
    return xmlStringBuilder.toString();
}
This looks for matches of <something>text</something>, captures the tag names and contained text, sanitises the contained text, and then puts it back together.
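If you want to try the approach without adding the Commons Lang dependency, here is a minimal, dependency-free sketch of the same loop. The class name `XmlSanitiserSketch` and the helper `escapeXmlText` are hypothetical names introduced for illustration; the helper is only a rough stand-in for `StringEscapeUtils.escapeXml10` and, as an assumption of this sketch, escapes just the five predefined XML entities rather than also dropping characters that are invalid in XML 1.0.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class XmlSanitiserSketch {
    // Same pattern as in the answer: opening tag, text without angle brackets, closing tag.
    private static final Pattern TAGGED_TEXT =
            Pattern.compile("(<[^/<>]*>)([^<>]*)(</[^<>]*>)");

    // Hypothetical stand-in for StringEscapeUtils.escapeXml10 (assumption:
    // only the five predefined XML entities are escaped).
    static String escapeXmlText(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    static String sanitiseXml(String xml) {
        StringBuilder result = new StringBuilder();
        Matcher matcher = TAGGED_TEXT.matcher(xml);
        int lastEnd = 0;
        while (matcher.find()) {
            // Copy any text between the previous match and this one unchanged.
            result.append(xml, lastEnd, matcher.start());
            lastEnd = matcher.end();
            // Keep the tags as they are and escape only the text between them.
            result.append(matcher.group(1))
                  .append(escapeXmlText(matcher.group(2)))
                  .append(matcher.group(3));
        }
        // Copy whatever follows the last match.
        result.append(xml, lastEnd, xml.length());
        return result.toString();
    }
}
```

Calling `XmlSanitiserSketch.sanitiseXml("<a>Tom & Jerry</a>")` returns `<a>Tom &amp; Jerry</a>`. Two things to keep in mind: text that is already escaped gets escaped again (`&amp;` becomes `&amp;amp;`), and because the pattern excludes `<` and `>` from the captured text, an element whose content contains a stray angle bracket is not matched and passes through unchanged.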