且构网

Sharing the things programmers deal with in development...

JSoup: trouble extracting a single element

Updated: 2023-12-03 21:01:58

The site https://www.coindesk.com/price/bitcoin relies heavily on JavaScript to render its content. Jsoup can't execute JavaScript; it can only parse the raw HTML document that the server returns.
To see what Jsoup sees, try visiting this page with JavaScript disabled. You'll see that the page is missing its main content. Alternatively, open the page and press Ctrl+U to view the page source as it is before JavaScript modifies it.
Using Chrome's debugger (Network tab), you can see that the page makes additional AJAX requests to get the current exchange rates as JSON from this URL: https://production.api.coindesk.com/v1/exchangeRates
JavaScript then uses this data to create the HTML elements dynamically. The page also requests a few other URLs to fetch the graph data.
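
Since the rates come from a JSON endpoint rather than from the HTML, one option is to request that endpoint directly instead of scraping the page. The following is a minimal sketch, not a definitive implementation. It uses only the JDK's built-in HTTP client (Java 11+). The class name `ExchangeRatesFetcher` and its helper methods are hypothetical, and it assumes the endpoint answers a plain GET without extra headers or authentication, which may not hold. The shape of the JSON is not described here, so the sketch only prints the raw body.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper class; the name is illustrative only.
final class ExchangeRatesFetcher {

    // URL observed in the browser's Network tab; it may change over time.
    static final String RATES_URL =
            "https://production.api.coindesk.com/v1/exchangeRates";

    // Builds a plain GET request that asks the server for JSON.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Sends the request and returns the raw response body as text.
    // Assumption: the endpoint needs no cookies, tokens, or special headers.
    static String fetchBody(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status: " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Prints the raw JSON; pass it to a JSON library of your choice to read values.
        System.out.println(fetchBody(RATES_URL));
    }
}
```

If you would rather stay with Jsoup, `Jsoup.connect(url).ignoreContentType(true).execute().body()` returns the response body as a string in the same way. The `ignoreContentType(true)` call is needed because Jsoup rejects non-HTML content types by default.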