且构网

Sharing the things programmers run into during development

PHP Curl: Get a directory listing and download a directory over HTTP

Updated: 2023-12-04 20:35:52

You're going to have to parse a list generated by the server, whether that is by DirectoryListing as above, or another server-side script that generates a list of links.

You'll then parse the HTML and pull out all of the a href tags.
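As a rough illustration of that extraction step, here is a minimal sketch that uses PHP's DOMDocument rather than tidy plus SimpleXML, since DOMDocument tolerates malformed HTML. The helper name extract_hrefs() and the sample markup are assumptions made for this example, not part of the original answer.

```php
<?php
// Sketch: collect every href value from an HTML page with DOMDocument.
// extract_hrefs() is a hypothetical helper name.
function extract_hrefs(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them;
    // loadHTML() still builds a tree from imperfect markup.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        // Anchors without an href (e.g. <a name="...">) are ignored.
        if ($anchor->hasAttribute('href')) {
            $hrefs[] = $anchor->getAttribute('href');
        }
    }
    return $hrefs;
}

// Made-up listing markup, only for demonstration.
$html = '<html><body><a href="../">Parent</a>'
      . '<a href="a.txt">a.txt</a><a name="x">no link</a></body></html>';
print_r(extract_hrefs($html));
```

The returned values are the raw attribute strings, so relative links such as "../" still need to be filtered or resolved against the listing URL before downloading.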

If you're relying on the output of another script (DirectoryListing), you may need to run the HTML through tidy to produce XHTML, then pass that into SimpleXML. You can then write an XPath query like '//a' and retrieve all the attributes.

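$list = array();
// $stringfromcurl must be well-formed XHTML (e.g. the output of tidy)
$x = new SimpleXMLElement($stringfromcurl);
foreach ($x->xpath('//a') as $node) {
    // Read href from the current <a> node, not from the root element
    $href = (string) $node['href'];
    $list[] = $href;
    curl_fetch_href($href); // your own download helper, not a PHP built-in
}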
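The answer leaves curl_fetch_href() undefined. One possible shape for it, as a sketch only, is a pair of helpers: one that decides which listing links are plain files, and one that streams a URL to disk with cURL. The names listing_file_url() and download_file(), the skip rules (sort links, subdirectories, external links), and the explicit base URL and destination directory parameters are all assumptions made for this example.

```php
<?php
// Hypothetical helper: turn a listing href into an absolute file URL,
// or return null for links that should not be downloaded.
function listing_file_url(string $baseUrl, string $href): ?string
{
    if ($href === '' || $href === '..' || in_array($href[0], ['?', '/', '#'], true)) {
        return null; // column-sort links, absolute paths, fragments
    }
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $href) || substr($href, -1) === '/') {
        return null; // links with a scheme, and subdirectories (including "../")
    }
    return rtrim($baseUrl, '/') . '/' . $href;
}

// Hypothetical helper: stream one URL into $destDir using cURL.
function download_file(string $url, string $destDir): bool
{
    $name = basename(rawurldecode((string) parse_url($url, PHP_URL_PATH)));
    $target = $destDir . '/' . $name;
    $fp = fopen($target, 'wb');
    if ($fp === false) {
        return false;
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);            // write the body straight to $fp
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // HTTP >= 400 counts as failure
    $ok = curl_exec($ch);
    unset($ch); // release the cURL handle before closing the file
    fclose($fp);
    if ($ok === false) {
        unlink($target); // do not leave a partial file behind
        return false;
    }
    return true;
}
```

Inside the loop, each href would first go through listing_file_url(); only a non-null result is then passed to download_file().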

Or... generate the list yourself as something a little easier to parse, then do the same sort of deal.
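If you control the server, one easy-to-parse format is JSON. The sketch below assumes a small server-side script (here imagined as list.php) that prints the file names of one directory, plus a client-side decoder; both function names are hypothetical.

```php
<?php
// Server side (hypothetical list.php): describe a directory as a JSON array
// of file names instead of an HTML listing.
function directory_as_json(string $dir): string
{
    $files = [];
    foreach (scandir($dir) as $name) {
        // is_file() skips ".", ".." and subdirectories.
        if (is_file($dir . '/' . $name)) {
            $files[] = $name;
        }
    }
    return json_encode($files);
}

// Client side: the response body decodes straight into an array of names.
function parse_json_listing(string $body): array
{
    $names = json_decode($body, true);
    return is_array($names) ? $names : [];
}
```

The client can then append each name to the directory URL and download it, with no HTML parsing involved.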

This is equivalent to running wget -r -l1