Using curl and preg_match_all in PHP

Updated: 2022-10-14 21:37:54


I want to use preg_match_all to pull the numbers from the table below. I've tried a few regular expressions, but I haven't gotten it working yet. I would like to extract the numbers and print them. Here is what I have so far:

//gets the site
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://site.org');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch); 

//parse the data
preg_match_all('/[0-9]+(?=[^0-9]+(N7:0<|N7:10<|N7:20))/', $response, $matches);

//prints the parsed data
print_r($matches[0]);

Any help would be great.

<html><head><title>Monitor</title></head>
<body bgcolor="#ffffff"><center>
<h2><font face="helvetica">Ethernet Processor</font></h2>
<h2><i>Data Table Monitor</i></h2>
<hr width=25% align=center>
<meta HTTP-EQUIV="refresh" CONTENT="15"><body bgcolor="#ffffff"><center><table border=1><tr><th align=left>Address</th><th width=50>0</th><th width=50>1</th><th width=50>2</th><th width=50>3</th><th width=50>4</th><th width=50>5</th><th width=50>6</th><th width=50>7</th><th width=50>8</th><th width=50>9</th></tr><tr><td>N7:0</td>
<td align=right>1</td>
<td align=right>1</td>
<td align=right>1</td>
<td align=right>99</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
</tr><tr><td>N7:10</td>
<td align=right>0</td>
<td align=right>7300</td>
<td align=right>16400</td>
<td align=right>3300</td>
<td align=right>2200</td>
<td align=right>6100</td>
<td align=right>28000</td>
<td align=right>18000</td>
<td align=right>0</td>
<td align=right>0</td>
</tr><tr><td>N7:20</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
</tr><tr><td>N7:30</td>
<td align=right>16993</td>
<td align=right>29251</td>
<td align=right>28516</td>
<td align=right>25888</td>
<td align=right>20079</td>
<td align=right>29728</td>
<td align=right>18031</td>
<td align=right>30062</td>
<td align=right>25633</td>
<td align=right>0</td>
</tr><tr><td>N7:40</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
</tr><tr><td>N7:50</td>
<td align=right>205</td>
<td align=right>158</td>
<td align=right>152</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>79</td>
<td align=right>7</td>
<td align=right>19</td>
<td align=right>0</td>
<td align=right>0</td>
</tr><tr><td>N7:60</td>
<td align=right>0</td>
<td align=right>4000</td>
<td align=right>18000</td>
<td align=right>2500</td>
<td align=right>1750</td>
<td align=right>2000</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
</tr><tr><td>N7:70</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>14</td>
<td align=right>0</td>
<td align=right>2210</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
</tr><tr><td>N7:80</td>
<td align=right>363</td>
<td align=right>347</td>
<td align=right>361</td>
<td align=right>0</td>
<td align=right>371</td>
<td align=right>379</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
</tr><tr><td>N7:90</td>
<td align=right>6</td>
<td align=right>474</td>
<td align=right>42</td>
<td align=right>114</td>
<td align=right>408</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>0</td>
<td align=right>308</td>
<td align=right>248</td>
</tr></table></center><hr width=25% align=center>

I think the regex you're after looks something like this:

<td align=right>(\d+?)</td>
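
As a rough sketch of how that pattern could be applied with preg_match_all: the helper name `extractCellNumbers` is hypothetical, and the short `$response` string below is a made-up excerpt standing in for the page returned by curl_exec(). The greedy `\d+` is used here in place of `\d+?`; because the digits must be followed directly by `</td>`, both match the same text.

```php
<?php
// Hypothetical helper: returns every number found in a right-aligned <td>.
function extractCellNumbers(string $html): array
{
    // "~" is used as the delimiter so the slash in </td> needs no escaping.
    // The capture group (\d+) holds only the digits of each cell.
    preg_match_all('~<td align=right>(\d+)</td>~', $html, $matches);

    // $matches[0] holds the full matches, $matches[1] the captured numbers.
    return $matches[1];
}

// Shortened stand-in for the $response returned by curl_exec().
$response = "<tr><td>N7:0</td>\n<td align=right>1</td>\n<td align=right>99</td>\n</tr>";
print_r(extractCellNumbers($response));
```

The values come back as strings, so cast them with `(int)` if you need to do arithmetic on them. Note that this only works while the markup stays exactly `<td align=right>`; any change in attributes or spacing breaks the match.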

However, when you're getting data from an XML/HTML structure, you're better off using a parser:

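// The page is not valid HTML, so collect parser warnings instead of printing them
libxml_use_internal_errors(true);
$dd = new DOMDocument();
$dd->loadHTML($response);
$tds = $dd->getElementsByTagName('td');

// Print only the cells whose text is numeric, skipping address labels such as "N7:0"
foreach ($tds as $td) {
    if (is_numeric($td->nodeValue)) {
        echo $td->nodeValue . '<br />';
    }
}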
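
The regex in the question seems to be after specific rows (N7:0, N7:10, N7:20), so a possible extension of the parser approach is to group the numbers by row address. This is only a sketch under that assumption: the function name `tableRowsByAddress` is hypothetical, and it assumes the first `<td>` of each row holds the address label, as in the sample page.

```php
<?php
// Hypothetical helper: maps each row address (e.g. "N7:0") to its numeric cells.
function tableRowsByAddress(string $html): array
{
    // The page is not valid HTML; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];

    // Select only rows that contain <td> cells, which skips the <th> header row.
    foreach ($xpath->query('//tr[td]') as $tr) {
        $cells = [];
        foreach ($xpath->query('td', $tr) as $td) {
            $cells[] = trim($td->nodeValue);
        }

        // Assumption: the first cell of each row is the address label.
        $address = array_shift($cells);

        // Keep the numeric cells and convert them to integers.
        $numbers = array_values(array_filter($cells, 'is_numeric'));
        $rows[$address] = array_map('intval', $numbers);
    }

    return $rows;
}
```

With the page stored in `$response`, `$rows = tableRowsByAddress($response);` would let you read a single row with `$rows['N7:10']` instead of filtering the flat list of numbers afterwards.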