且构网

Using wget to fetch Facebook profile/friends pages

Updated: 2023-12-02 18:20:52

First, Facebook has probably set things up so that certain user agents (e.g. wget) cannot crawl its pages. Requests from those user agents are redirected to a different page, which probably says something like "your browser is not supported". They do that to prevent people from doing exactly what you are doing. However, you can tell wget to identify itself as a different agent with the -U option (read the wget man page), e.g. wget -U Mozilla http://....
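
As a minimal sketch of the -U suggestion above, the wget call can be wrapped in a small shell function. The function name fetch_as_browser, the "Mozilla/5.0" agent string, and the example URL are assumptions made for illustration; they are not from the original answer, and a changed user agent does not guarantee the server returns the real page.

```shell
# Hypothetical helper: fetch a URL while identifying as a browser-like agent.
# $1 is the URL to fetch, $2 is the file to save the response to.
fetch_as_browser() {
    # -U sets the User-Agent header; -O writes the response to the given file.
    wget -U "Mozilla/5.0" -O "$2" "$1"
}

# Example call (placeholder URL, left commented out so nothing is fetched):
# fetch_as_browser "http://example.com/" page.html
```

The long form of the same option is --user-agent=, which may read more clearly in scripts.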

Second, Facebook's privacy settings rarely allow you to read any (or much) information unless you are logged in as a user, and probably only as a user who is a friend of the profile you are trying to scrape.

Third, there is a Facebook API, which you need to use to crawl and extract information from Facebook. You are likely in violation of the Acceptable Use policy if you try to obtain information in any other way.
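
For illustration, a request through the API route might look roughly like the sketch below. It assumes the API in question is the Graph API served at graph.facebook.com, and that a valid access token has already been obtained and stored in a variable named ACCESS_TOKEN; the function name graph_me and the variable name are hypothetical, and the exact fields available depend on the permissions granted to the token.

```shell
# Sketch only: assumes the Graph API and an access token in ACCESS_TOKEN
# (a hypothetical variable name; obtaining the token is not shown here).
graph_me() {
    # -q silences progress output; -O - prints the JSON response to stdout.
    wget -q -O - "https://graph.facebook.com/me?fields=id,name&access_token=${ACCESS_TOKEN}"
}

# Example call (commented out; it needs a real token and network access):
# ACCESS_TOKEN="..." graph_me
```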