DOMDocument: removing script tags from HTML source

Updated: 2023-12-05 18:55:10

Your error is actually trivial. DOM collections in PHP are live: a DOMNodeList, such as the one returned by getElementsByTagName(), is updated automatically whenever the underlying document changes, most notably when the number of matching nodes changes. The PHP documentation mentions this in a couple of lines, but it is mostly swept under the carpet.

If you loop over a DOMNodeList by index, using its length property as the bound, and remove elements from the document inside the loop, you'll notice that length actually changes as you go, so some nodes are skipped. I had to write my own library to counteract this and a few other quirks.
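To make the pitfall concrete, here is a minimal sketch of the failing pattern. The sample markup and variable names are assumptions made up for illustration; they are not taken from the original question.

```php
<?php
// Hypothetical sample markup with four <script> elements.
$html = '<html><body><script>a()</script><script>b()</script>'
      . '<script>c()</script><script>d()</script><p>text</p></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// getElementsByTagName() returns a live DOMNodeList.
$scripts = $dom->getElementsByTagName('script');
$before = $scripts->length;

// Naive index loop: every removal shrinks the list and shifts the
// remaining items down, so the loop steps over every other node.
for ($i = 0; $i < $scripts->length; $i++) {
    $node = $scripts->item($i);
    $node->parentNode->removeChild($node);
}

// Count what is still in the document after the loop.
$after = $dom->getElementsByTagName('script')->length;

echo "before: $before, left behind: $after\n";
// With the four scripts above this prints: before: 4, left behind: 2
```

The first pass removes item 0, the second pass removes what is by then item 1 (originally the third script), and the loop then stops because the length has dropped to 2.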

Solution:

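if ($dom->loadHTML($result))
{
    // The list is live, so fetch it and re-check its length on every pass.
    while (($r = $dom->getElementsByTagName("script")) && $r->length) {
        // Always remove the first remaining <script>; the list shrinks by one each time.
        $r->item(0)->parentNode->removeChild($r->item(0));
    }
    echo $dom->saveHTML();
}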

I'm not actually iterating over the list by index, just popping the first element one at a time until none are left. The result: http://sebrenauld.co.uk/domremovescript.php
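A possible alternative to popping the first element is to copy the live list into a plain array before removing anything. The sketch below assumes that approach; removeScriptTags() is a hypothetical helper name, and falling back to the unmodified input when parsing fails is an assumption, not something the original answer specifies.

```php
<?php
// Hypothetical helper for illustration; not part of the original answer.
function removeScriptTags(string $html): string
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally instead of emitting them,
    // since real-world HTML is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return $html; // Assumption: return the input untouched on parse failure.
    }

    // Snapshot the live DOMNodeList into a static array so that
    // removing nodes cannot shift the items still to be visited.
    $scripts = iterator_to_array($dom->getElementsByTagName('script'));
    foreach ($scripts as $script) {
        $script->parentNode->removeChild($script);
    }

    return $dom->saveHTML();
}

$clean = removeScriptTags(
    '<html><body><script>a()</script><p>keep</p><script>b()</script></body></html>'
);
echo $clean;
```

Both approaches give the same result; the snapshot version just makes it explicit that the iteration no longer depends on the live list.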