且构网

How do I scrape data from a lazy-loaded web page (via its API) using Google Apps Script?

Updated: 2023-11-11 08:36:10

The only way I found to obtain the data was to use your workaround: get the request URL from the browser console and fetch it. In addition, you have to add the "x-xsrf-token" and "cookie" headers to the options passed to the fetch() method [1].

You can get the "x-xsrf-token" and "cookie" request headers from the console as well. The only problem is that the cookies and the XSRF token are valid for up to 2 hours, because the site implements cross-site request forgery protection [2].
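Because the credentials expire, a script that reuses them will eventually start failing. Below is a minimal sketch of surfacing that failure clearly. It assumes the site answers an expired token with a non-200 status, which was not verified here. `fetchWithCredentials` and its `fetcher` parameter are hypothetical names, with `fetcher` standing in for `UrlFetchApp`:

```javascript
// Hypothetical helper, not part of the tested answer.
// `fetcher` is expected to behave like UrlFetchApp: a fetch(url, options)
// method returning an object with getResponseCode() and getContentText().
function fetchWithCredentials(fetcher, url, xsrfToken, cookie) {
  var response = fetcher.fetch(url, {
    method: 'get',
    muteHttpExceptions: true, // return error responses instead of throwing
    headers: { 'x-xsrf-token': xsrfToken, 'cookie': cookie }
  });

  var code = response.getResponseCode();
  if (code !== 200) {
    // Assumption: a non-200 status most likely means the token or cookie expired.
    throw new Error('Request failed with HTTP ' + code +
        '; the x-xsrf-token and cookie may need to be copied from the console again.');
  }
  return JSON.parse(response.getContentText());
}
```

In Apps Script this would be called as fetchWithCredentials(UrlFetchApp, url, token, cookie). Passing the fetcher in as an argument also lets the function be exercised outside Apps Script with a stub.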

Here is the code I tested, which worked:

function testFunction() {
  // Request URL copied from the browser console.
  var url = 'https://www.barchart.com/proxies/core-api/v1/historical/get?symbol=%24AVVN&fields=tradeTime.format(m%2Fd%2Fy)%2CopenPrice%2ChighPrice%2ClowPrice%2ClastPrice%2CpriceChange%2Cvolume%2CsymbolCode%2CsymbolType&startDate=2019-04-16&endDate=2019-07-16&type=eod&orderBy=tradeTime&orderDir=desc&limit=2000&meta=field.shortName%2Cfield.type%2Cfield.description&raw=1';

  // Replace both placeholders with the values copied from the console.
  var headers = {
    "x-xsrf-token": "XXXXX",
    "cookie": "XXXXX"
  };

  var options = {
    "method": "get",
    "muteHttpExceptions": false,
    "headers": headers
  };
  var response = UrlFetchApp.fetch(url, options);

  // Read the body as text explicitly before logging and parsing it.
  var text = response.getContentText();
  Logger.log(text);

  var json = JSON.parse(text);
  Logger.log(json.data[0]);
}
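
The request URL above is one long, pre-encoded string, which makes it awkward to change the symbol or the date range. As a rough sketch, the query string could be assembled from a plain object instead. `buildHistoricalUrl` is a hypothetical helper that is not part of the tested answer, and the example passes only a subset of the original parameters:

```javascript
// Hypothetical helper: builds the request URL from a plain object so that
// the symbol and dates can be changed without editing an encoded string.
function buildHistoricalUrl(params) {
  var base = 'https://www.barchart.com/proxies/core-api/v1/historical/get';
  var query = Object.keys(params).map(function (key) {
    // encodeURIComponent escapes characters such as "$", "/" and ",".
    return encodeURIComponent(key) + '=' + encodeURIComponent(params[key]);
  }).join('&');
  return base + '?' + query;
}

// Example with a subset of the parameters from the URL above.
var exampleUrl = buildHistoricalUrl({
  symbol: '$AVVN',
  startDate: '2019-04-16',
  endDate: '2019-07-16',
  type: 'eod',
  orderBy: 'tradeTime',
  orderDir: 'desc',
  limit: 2000,
  raw: 1
});
```

The resulting string can be passed to UrlFetchApp.fetch() in place of the hard-coded url. The "x-xsrf-token" and "cookie" headers are still required.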

[1] https://developers.google.com/apps-script/reference/url-fetch/url-fetch-app

[2] Difference between CSRF and X-CSRF-Token