且构网


Scraping from multiple pages using PHP preg_match_all & cURL

Updated: 2021-12-17 14:42:04

This was pretty confusing, because it sounded like you were only interested in saving one image per page. But then the code makes it look like you're actually trying to save every image on each page. So it's entirely possible I completely misunderstood... But here goes.

Looping over each page isn't that difficult:

// Visit pages 1 through 100; get_data() is assumed to return the page HTML as a string.
for ($i = 1; $i <= 100; $i++) {
    $html = get_data('http://somedomain.com/id/'.$i.'/');
    getImages($html);
}
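
The loop relies on get_data(), which this answer never defines (the title suggests it wraps cURL). Here is a minimal sketch, assuming it simply returns the response body as a string. The redirect and timeout settings are illustrative choices, not taken from the original:

```php
<?php
// Hypothetical helper: the answer calls get_data() without showing it.
// Assumption: it is a thin cURL wrapper that returns the body as a string.
function get_data($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);   // stop trying to connect after 10 seconds
    $data = curl_exec($ch);
    // curl_exec() returns false on failure; normalize that to an empty string
    return $data === false ? '' : $data;
}
```

Returning an empty string on failure keeps the caller simple: a failed page just yields no image matches.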

The following then assumes that you're trying to save all the images on that particular page:

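function getImages($html) {
    $matches = array();
    // Dots are escaped so they match literally; the capture stops at quotes and whitespace.
    $regex = '~http://somedomain\.com/images/([^"\'\s]+?)\.jpg~i';
    preg_match_all($regex, $html, $matches);
    // $matches[1] holds the captured names; skip duplicates so each image is fetched once.
    foreach (array_unique($matches[1]) as $img) {
        saveImg($img);
    }
}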

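function saveImg($name) {
    $url = 'http://somedomain.com/images/'.$name.'.jpg';
    $data = get_data($url);
    // Skip failed or empty downloads. The photos/ directory must already exist;
    // basename() keeps the written file inside it.
    if ($data !== false && $data !== '') {
        file_put_contents('photos/'.basename($name).'.jpg', $data);
    }
}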
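
To see why getImages() loops over $matches[1], it may help to run preg_match_all on a small string. The sketch below uses made-up sample markup and a pattern of the same shape as the one in getImages() (dots escaped, capture limited to non-quote, non-space characters). The image names are purely illustrative:

```php
<?php
// Made-up sample markup; "cat" and "dog" are illustrative image names.
$html = '<img src="http://somedomain.com/images/cat.jpg">'
      . '<img src="http://somedomain.com/images/dog.jpg">';

$regex = '~http://somedomain\.com/images/([^"\'\s]+?)\.jpg~i';
$count = preg_match_all($regex, $html, $matches);

// With the default PREG_PATTERN_ORDER flag:
//   $matches[0] lists the full matched URLs,
//   $matches[1] lists what the first capture group matched (the bare names).
print_r($matches[0]);
print_r($matches[1]);
```

Because $matches[1] contains only the names, saveImg() has to rebuild the full URL before downloading.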