PHP: remove links to a specific website but keep the text

Updated: 2023-02-21 21:10:54

For example, <a href="http://msdn.microsoft.com/art029nr/">remove links to here but keep text</a> but <a href="http://herpyderp.com">leave all other links alone</a>

I've been trying to solve this using preg_replace. I've searched through here and found answers that solve pieces of the problem.

The answer to "PHP: Remove all hyperlinks of specific domain from text" removes links to a specific URL, but it also removes the link text.

The site at http://php-opensource-help.blogspot.ie/2010/10/how-to-remove-hyperlink-from-string.html removes a hyperlink from a string, but I can't seem to modify the pattern so that it applies only to a specific website.
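For comparison, here is a rough sketch of what a preg_replace version restricted to one website might look like. The function name unwrapLinksTo and its parameters are made up for illustration, and the pattern assumes well-formed anchors with a quoted, absolute http(s) href; regex matching on HTML is brittle, so treat this as a starting point rather than a robust solution.

```php
<?php
// Hypothetical helper (not from the original post): replace every <a> that
// points at $domain, or at one of its subdomains, with the link's inner content.
function unwrapLinksTo(string $html, string $domain): string
{
    $pattern = '~<a\b[^>]*\bhref\s*=\s*(["\'])'  // opening tag up to the quoted href
        . 'https?://(?:[\w.-]+\.)?'               // scheme plus optional subdomain
        . preg_quote($domain, '~')                // the target domain, taken literally
        . '(?:[/?#:][^"\']*)?\1'                  // optional port/path/query, closing quote
        . '[^>]*>(.*?)</a>~is';                   // rest of the tag, then the link content

    // preg_replace() returns null on failure; fall back to the untouched input.
    return preg_replace($pattern, '$2', $html) ?? $html;
}

$html = 'For example, <a href="http://msdn.microsoft.com/art029nr/">remove links to here but keep text</a>'
      . ' but <a href="http://herpyderp.com">leave all other links alone</a>';

echo unwrapLinksTo($html, 'msdn.microsoft.com');
```

Only anchors whose href host is the given domain are unwrapped; links to other hosts are left as they are.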

$html = '...I can haz HTML?...';
$whitelist = array('herpyderp.com', 'google.com');

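$dom = new DOMDocument();
$dom->loadHTML($html);
// getElementsByTagName() returns a live list; copy it to an array first,
// otherwise removing links inside the loop makes the iteration skip elements
$links = iterator_to_array($dom->getElementsByTagName('a'));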

foreach($links as $link){
  $host = parse_url($link->getAttribute('href'), PHP_URL_HOST);

  if($host && !in_array($host, $whitelist)){    

    // create a text node with the contents of the blacklisted link
    $text = new DomText($link->nodeValue);

    // insert it before the link
    $link->parentNode->insertBefore($text, $link);

    // and remove the link
    $link->parentNode->removeChild($link);
  }  

}

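// remove the wrapping tags added by the parser (doctype, <html>, <body>);
// this assumes the parsed <body> ends up with a single child element
$dom->removeChild($dom->firstChild);
$dom->replaceChild($dom->firstChild->firstChild->firstChild, $dom->firstChild);

$html = $dom->saveHTML();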

For those wary of using DOMDocument instead of preg_replace for performance reasons: I ran a quick test comparing this with the code linked in the question (the one that removes the links completely), and DOMDocument was only about 4 times slower.
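The snippet above keeps whitelisted hosts and strips everything else, while the question asks for the opposite: strip links to one site only. A minimal sketch of that variant, wrapped in a function, could look like the following. The name stripLinksToHosts, the $blockedHosts parameter and the wrapper <div> are assumptions added for illustration; character encoding handling is left out, so non-ASCII input may need extra care.

```php
<?php
// Hypothetical helper: unwrap only the links whose host is listed in
// $blockedHosts (lowercase host names), keeping each link's inner content.
function stripLinksToHosts(string $html, array $blockedHosts): string
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    // Wrap the fragment in a known container so it is easy to extract afterwards.
    $dom->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Snapshot the live node list so removing links does not skip any.
    foreach (iterator_to_array($dom->getElementsByTagName('a')) as $link) {
        $host = parse_url($link->getAttribute('href'), PHP_URL_HOST);
        if (!$host || !in_array(strtolower($host), $blockedHosts, true)) {
            continue; // relative link, malformed URL, or a host we keep
        }
        // Move the link's children in front of it, then drop the empty link.
        while ($link->firstChild) {
            $link->parentNode->insertBefore($link->firstChild, $link);
        }
        $link->parentNode->removeChild($link);
    }

    // Serialize only the children of the wrapper <div>.
    $wrapper = $dom->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$html = 'For example, <a href="http://msdn.microsoft.com/art029nr/">remove links to here but keep text</a>'
      . ' but <a href="http://herpyderp.com">leave all other links alone</a>';

echo stripLinksToHosts($html, ['msdn.microsoft.com']);
```

Moving the child nodes instead of copying nodeValue keeps any markup inside the link (for example <b> or <img>), and serializing the wrapper's children avoids relying on the exact shape of the tags the parser adds.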
