Updated: 2022-04-29 23:09:41
See Entities in the documentation. BeautifulSoup 4 produces proper Unicode for all entities:
An incoming HTML or XML entity is always converted into the corresponding Unicode character.
Yes, &nbsp; is turned into a non-breaking space character (U+00A0). If you really want those to be ordinary space characters instead, you'll have to do a Unicode replace.
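A minimal sketch of that replace, assuming the `bs4` package is installed and using the built-in `html.parser` backend. The helper name `nbsp_to_space` and the sample markup are made up for illustration and are not part of BeautifulSoup:

```python
from bs4 import BeautifulSoup


def nbsp_to_space(text: str) -> str:
    """Replace every non-breaking space (U+00A0) with an ordinary space.

    Hypothetical helper for illustration; not a BeautifulSoup function.
    """
    return text.replace("\u00a0", " ")


# Sample markup (made up) containing two different entities.
html = "<p>one&nbsp;two &amp; three</p>"
soup = BeautifulSoup(html, "html.parser")

# By the time the text is extracted, entities have already been
# converted to their Unicode characters.
raw = soup.get_text()

# Swap the non-breaking space for a plain one after extraction.
clean = nbsp_to_space(raw)

print(repr(raw))
print(repr(clean))
```

The replace runs on the extracted string rather than on the markup, so it leaves the parsed tree untouched; if other whitespace variants matter too, they would need their own replacements.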