bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

Updated: 2021-12-05 15:30:29

I suspect this is related to the parser that BS uses to read the HTML. They document it here, but if you're like me (on OSX) you might be stuck with something that requires a bit of work:

You'll notice that on the BS4 documentation page above, they point out that by default BS4 uses Python's built-in HTML parser. Assuming you are on OSX, the Apple-bundled version of Python is 2.7.2, which is not lenient about character formatting. I hit this same problem, so I upgraded my version of Python to work around it. Doing this in a virtualenv will minimize disruption to other projects.

If that sounds like a pain, you can switch over to the lxml parser:

pip install lxml

Then try again:

from bs4 import BeautifulSoup
soup = BeautifulSoup(html, "lxml")

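Depending on your scenario, that might be good enough. I found this annoying enough to warrant upgrading my version of Python. Using virtualenv, you can migrate your packages fairly easily.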
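
If you can't be sure lxml is installed everywhere your script runs, one option is to pick the parser name at runtime instead of hard-coding "lxml". The following is only a rough sketch: the helper name pick_parser and its preferred parameter are made up for illustration, and it assumes that the parser names "lxml" and "html5lib" match the names of the packages that provide them. It only checks that the package can be found, not that it imports cleanly.

```python
import importlib.util


def pick_parser(preferred=("lxml", "html5lib")):
    """Return the first parser name whose package is installed.

    Hypothetical helper, not part of bs4. Falls back to "html.parser",
    the parser that ships with Python itself.
    """
    for name in preferred:
        # find_spec returns None when no top-level package of that name
        # can be located; it does not import the package.
        if importlib.util.find_spec(name) is not None:
            return name
    return "html.parser"
```

You would then pass the result where the parser name goes, for example BeautifulSoup(html, pick_parser()). Keep in mind that different parsers can build slightly different trees from the same broken HTML, so a silent fallback may change what your code sees.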