且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

PHP preg_match_all限制

更新时间:2021-12-17 14:41:52

增加PCRE回溯和递归限制可能会解决此问题,但是当数据大小达到新限制时仍然会失败. (无法通过更多数据很好地扩展)

increasing the PCRE backtrack and recursion limit may fix the problem, but will still fail when the size of your data hits the new limit. (doesn't scale well with more data)

示例:

<?php 
// essential for huge PCREs
ini_set("pcre.backtrack_limit", "23001337");
ini_set("pcre.recursion_limit", "23001337");
// imagine your PCRE here...
?>

要真正解决潜在的问题,您必须优化表达式并将(如果可能的话)将复杂的表达式拆分为部分",并将一些逻辑移至PHP.我希望您通过阅读示例获得想法.而不是尝试通过单个PCRE直接查找子结构,而是演示了一种更加迭代"的方法,该方法使用PHP越来越深入到结构中.例如:

to really solve the underlying problem, you must optimize your expression and (if possible) split your complex expression into "parts" and move some logic to PHP. I hope you get the idea by reading the example .. instead of trying to find the sub-structure directly with a single PCRE, i demonstrate a more "iterative" approach going deeper and deeper into the structure using PHP. example:

<?php
$html = file_get_contents("huge_input.html");

// first find all tables, and work on those later
$res = preg_match_all("!<table.*>(?P<content>.*)</table>!isU", $html, $table_matches);

if ($res) foreach($table_matches['content'] as $table_match) {  

    // now find all cells in each table that was found earlier ..
    $res = preg_match_all("!<td.*>(?P<content>.*)</td>!isU", $table_match, $cell_matches);

    if ($res) foreach($cell_matches['content'] as $cell_match) {

        // imagine going deeper and deeper into the structure here...
        echo "found a table cell! content: ", $cell_match;

    }    
}