How to edit all text values in HTML tags using jsoup

Updated: 2023-12-03 20:22:46

You can try something like the following code, which walks the parsed document breadth-first and visits every non-blank text node:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

import java.util.ArrayDeque;
import java.util.Queue;

String html = "<html><body><div><p>Test Data</p> <div> <p>HELLO World</p></div></div> other text</body></html>";

Document doc = Jsoup.parse(html);

// We will search nodes in a breadth-first way,
// starting from the document's direct children
Queue<Node> nodes = new ArrayDeque<>(doc.childNodes());

while (!nodes.isEmpty()) {
    Node n = nodes.remove();

    if (n instanceof TextNode && !((TextNode) n).text().trim().isEmpty()) {
        // Do whatever you want with n.
        // Here we just print its text...
        System.out.println(n.parent().nodeName() + " contains text: " + ((TextNode) n).text().trim());
    } else {
        // Anything else: enqueue its children (an empty list for leaf nodes)
        nodes.addAll(n.childNodes());
    }
}

And you'll get the following output:

body contains text: other text
p contains text: Test Data
p contains text: HELLO World
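
The snippet above only prints each text node. To actually edit the text, the same traversal can call the `TextNode.text(String)` setter instead of printing. The following is a minimal sketch, not a definitive implementation: the class name `TextNodeEditor`, the helper `upperCaseText`, and the choice of upper-casing as the edit are all assumptions made for illustration. Pretty-printing is switched off so that jsoup does not re-indent the output.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

import java.util.ArrayDeque;
import java.util.Locale;
import java.util.Queue;

// Hypothetical class name, used only for this illustration.
class TextNodeEditor {

    // Hypothetical helper: upper-cases every non-blank text node
    // and returns the resulting HTML of the <body> element.
    static String upperCaseText(String html) {
        Document doc = Jsoup.parse(html);
        // Keep the serialized output free of added indentation and line breaks.
        doc.outputSettings().prettyPrint(false);

        // Same breadth-first walk as in the answer.
        Queue<Node> nodes = new ArrayDeque<>(doc.childNodes());
        while (!nodes.isEmpty()) {
            Node n = nodes.remove();
            if (n instanceof TextNode) {
                TextNode t = (TextNode) n;
                if (!t.isBlank()) {
                    // getWholeText() keeps the original whitespace;
                    // text(String) replaces the node's content in place.
                    t.text(t.getWholeText().toUpperCase(Locale.ROOT));
                }
            } else {
                nodes.addAll(n.childNodes());
            }
        }
        return doc.body().html();
    }

    public static void main(String[] args) {
        String html = "<div><p>Test Data</p> <div><p>HELLO World</p></div></div> other text";
        System.out.println(upperCaseText(html));
    }
}
```

Only the text of each node changes here, so the tree structure stays the same while the queue is being processed. Replace the `toUpperCase` call with whatever transformation you need.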