Updated: 2022-06-23 22:21:22
Assuming you are parsing HTML strings rather than whole HTML documents (as in your question), this method will work:
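import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;
import org.jsoup.safety.Whitelist;
import org.jsoup.select.Elements;
public String escapeHtml(String source) {
    Document doc = Jsoup.parseBodyFragment(source);
    Elements elements = doc.select("b");
    for (Element element : elements) {
        // Swap each <b> element for a text node holding its markup, so it is entity-escaped on output.
        element.replaceWith(new TextNode(element.toString(), ""));
    }
    // Keep only <a> tags with the listed attributes; all other markup is stripped.
    return Jsoup.clean(doc.body().toString(), new Whitelist().addTags("a").addAttributes("a", "href", "name", "rel", "target"));
}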
You could turn the hard-coded "b" selector into a parameter, so that callers can pass in the list of tags they want escaped.
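That parameterised variant could be sketched roughly as follows. The class name TagEscaper and the method escapeTags are hypothetical, not part of the original answer. The sketch also assumes a jsoup release in which Safelist has replaced Whitelist (1.14.2 or later, as far as I know) and the single-argument TextNode constructor is available; on older releases, keep Whitelist and the two-argument constructor used above.

```java
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;
import org.jsoup.safety.Safelist;

final class TagEscaper {

    // Hypothetical helper: escapes every tag named in tagsToEscape, then
    // sanitizes the rest so that only <a> links survive as markup.
    static String escapeTags(String source, List<String> tagsToEscape) {
        Document doc = Jsoup.parseBodyFragment(source);
        // An empty selector is rejected by jsoup, so skip the step if no tags were given.
        if (!tagsToEscape.isEmpty()) {
            // Joining with ", " builds a CSS selector group such as "b, i".
            for (Element element : doc.select(String.join(", ", tagsToEscape))) {
                // A text node holding the element's markup is entity-escaped on output.
                element.replaceWith(new TextNode(element.outerHtml()));
            }
        }
        // Assumption: the same allow-list as in the single-tag version above.
        Safelist allowed = new Safelist()
                .addTags("a")
                .addAttributes("a", "href", "name", "rel", "target");
        return Jsoup.clean(doc.body().html(), allowed);
    }
}
```

Calling TagEscaper.escapeTags(source, Arrays.asList("b", "i")) would then escape both bold and italic tags while leaving links intact. Building one selector group keeps it to a single pass over the document rather than one select call per tag.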
The associated passing JUnit test:
@Test
public void testHtmlEscaping() throws Exception {
    String source = "This is <b>REALLY</b> dirty code from <a href=\"www.rubbish.url.zzzz\">haxors-r-us</a>";
    // The <b> tag should come back entity-escaped, while the <a> tag is kept as markup.
    String expected = "This is &lt;b&gt;REALLY&lt;/b&gt; dirty code from \n<a href=\"www.rubbish.url.zzzz\">haxors-r-us</a>";
    String transformed = transformer.escapeHtml(source);
    // JUnit's assertEquals takes the expected value first, then the actual value.
    assertEquals(expected, transformed);
}
Note that I added a line break ("\n") before the "a" tag in the test's "expected" string, because jsoup pretty-prints its output.