Updated: 2022-01-20 23:40:50
There is a class named r. Observe the following:
Option Explicit

' Early binding: requires a reference to "Microsoft HTML Object Library"
' (VBE > Tools > References).
Public Sub GetLinks()
    Dim html As HTMLDocument, links As Object, i As Long, counter As Long

    Set html = New HTMLDocument
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.google.co.uk/search?q=03701116565", False
        .send
        html.body.innerHTML = StrConv(.responseBody, vbUnicode)
    End With

    ' Matches come back in document order: each link followed by its h3 title.
    Set links = html.querySelectorAll(".r > [href] , .r h3")

    ' Step 2 because the matches are consumed as (link, title) pairs.
    For i = 0 To links.Length - 1 Step 2
        counter = counter + 1
        ActiveSheet.Cells(counter, 1) = links.item(i)               ' the a element
        ActiveSheet.Cells(counter, 2) = links.item(i + 1).innerText ' the h3 title text
    Next
End Sub
The actual href is associated with a child a tag, which precedes the h3 header tag element that you are targeting by class. The r is the class of the parent of the a tag.
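To see how the combined selector pairs up its matches, it can help to run it against a small hard-coded fragment instead of a live response. The sketch below is only an illustration: the HTML string, the example.com URL and the DemoSelector name are made up for this purpose, and the fragment simply mirrors the structure described above (a parent with class r, a child a tag carrying the href, and an h3 title). It assumes the same early-bound reference to the Microsoft HTML Object Library as the code above.

```vba
Option Explicit

' Hypothetical demo, not part of the original answer.
' Requires a reference to "Microsoft HTML Object Library".
Public Sub DemoSelector()
    Dim html As HTMLDocument, links As Object, i As Long

    Set html = New HTMLDocument

    ' Assumed markup: one result block shaped the way the answer describes it.
    html.body.innerHTML = _
        "<div class=""r"">" & _
        "<a href=""https://example.com/""><h3>Example title</h3></a>" & _
        "</div>"

    ' Same selector as above: the a tag directly under .r, plus any h3 inside .r.
    Set links = html.querySelectorAll(".r > [href] , .r h3")

    ' List the tag name of each match in the order the selector returns them.
    For i = 0 To links.Length - 1
        Debug.Print i, links.item(i).tagName
    Next
End Sub
```

If the page structure differs from this assumed fragment, the matches will no longer alternate between link and title, and the Step 2 loop above will pair the wrong items.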
If you want to use late binding and an approach similar to yours, you can use the following, less efficient method. Note that the parent div elements are selected, so both the a tag and the h3 can be accessed for divs with the qualifying class.
Option Explicit

' Late binding: no library reference is needed.
Public Sub GetLinks()
    Dim html As Object, i As Long
    Dim objResultDiv As Object, objH3 As Object, link As Object

    Set html = CreateObject("htmlfile")
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.google.co.uk/search?q=03701116565", False
        .send
        html.body.innerHTML = .responseText
    End With

    ' "rso" is the id of the results container; collect every div inside it.
    Set objResultDiv = html.getElementById("rso")
    Set objH3 = objResultDiv.getElementsByTagName("div")

    For Each link In objH3
        ' Only divs with class "r" hold the a tag and h3 we want.
        If link.className = "r" Then
            i = i + 1
            ' Skip over a div that lacks an a tag or an h3 instead of stopping.
            On Error Resume Next
            ActiveSheet.Cells(i, 2) = link.getElementsByTagName("a")(0).href
            ActiveSheet.Cells(i, 3) = link.getElementsByTagName("h3")(0).innerText
            On Error GoTo 0
        End If
    Next
End Sub