Updated: 2023-09-26 12:19:22
I am working on an ASP.NET and C# project. We are trying to make our ASP.NET portal friendly to the Google search engine (https://developers.google.com/webmasters/ajax-crawling/). The pages on our site are generated dynamically and the DOM is modified with JavaScript, so we use NHtmlUnit to generate an HTML snapshot on the server side when the Google crawler sends a request. The snapshot is generated, but when there is a script error in the page, NHtmlUnit returns a partially rendered page: the content that the page's JavaScript modifies is only partly there. The pages work perfectly in browsers.
I tried the following options:
ThrowExceptionOnScriptError = false,
ThrowExceptionOnFailingStatusCode = false
but had no luck.
Is there a way to force NHtmlUnit to ignore page errors and continue execution?
The code is as follows:
// Create a webclient.
WebClient webClient = new WebClient(BrowserVersion.FIREFOX_17)
{
    ThrowExceptionOnScriptError = false,
    ThrowExceptionOnFailingStatusCode = false
};
webClient.WaitForBackgroundJavaScript(5000);
// Load the Page with the given URL.
HtmlPage htmlPage = webClient.GetHtmlPage(url);
// Return the page for the given URL as Text.
return htmlPage.WebResponse.ContentAsString;
// Create a webclient.
WebClient webClient = new WebClient(BrowserVersion.FIREFOX_17)
{
    JavaScriptEnabled = true,
    ThrowExceptionOnScriptError = false,
    ThrowExceptionOnFailingStatusCode = false
};
webClient.WaitForBackgroundJavaScript(5000);
HtmlPage htmlPage = webClient.GetHtmlPage(url);
// Return the page for the given URL as Text.
return htmlPage.WebResponse.ContentAsString;
I noticed that you didn't enable JavaScript; sorry if I'm wrong.