且构网

Sharing the everyday things of programming and development...

Is there a way to force NHtmlUnit to ignore a page's JavaScript errors and continue script execution?

Updated: 2023-09-26 12:19:22


I am working on an ASP.NET and C# project. We are trying to make our ASP.NET portal friendly to the Google search engine (https://developers.google.com/webmasters/ajax-crawling/). The pages on our site are generated dynamically and the DOM is modified with JavaScript, so we use NHtmlUnit to generate an HTML snapshot on the server side when the Google crawler sends a request. The snapshot is generated, but when there is a script error in the page, NHtmlUnit returns a partially rendered page (the content that the page's JavaScript modifies is only partially rendered). The pages work perfectly in browsers.

I tried the following options:

ThrowExceptionOnScriptError = false,
ThrowExceptionOnFailingStatusCode = false

But no luck.

Is there a way to force NHtmlUnit to ignore page errors and continue execution?

The code is as follows:

    // Create a webclient.
    WebClient webClient = new WebClient(BrowserVersion.FIREFOX_17)
        {
            ThrowExceptionOnScriptError = false,
            ThrowExceptionOnFailingStatusCode = false
        };

    webClient.WaitForBackgroundJavaScript(5000);

    // Load the Page with the given URL.
    HtmlPage htmlPage = webClient.GetHtmlPage(url);

    // Return the page for the given URL as Text.
    return htmlPage.WebResponse.ContentAsString;

    // Create a webclient with JavaScript explicitly enabled.
    WebClient webClient = new WebClient(BrowserVersion.FIREFOX_17)
        {
            JavaScriptEnabled = true,
            ThrowExceptionOnScriptError = false,
            ThrowExceptionOnFailingStatusCode = false,
        };

    webClient.WaitForBackgroundJavaScript(5000);

    // Load the page with the given URL.
    HtmlPage htmlPage = webClient.GetHtmlPage(url);

    // Return the page for the given URL as text.
    return htmlPage.WebResponse.ContentAsString;

I noticed you didn't enable JavaScript; sorry if I'm wrong.