且构网


TypeError: Object of type 'bytes' is not JSON serializable

Updated: 2023-01-17 16:13:19

You are creating those bytes objects yourself:

item['title'] = [t.encode('utf-8') for t in title]
item['link'] = [l.encode('utf-8') for l in link]
item['desc'] = [d.encode('utf-8') for d in desc]
items.append(item)

Each of those t.encode(), l.encode() and d.encode() calls creates a bytes object. Don't do this; leave the values as str and let the JSON encoder serialise them.
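A minimal sketch of the difference, using a made-up title list in place of the scraped data (the values are assumptions for illustration only):

```python
import json

# Hypothetical stand-in for the scraped strings.
title = ["café", "JSON basics"]

# Encoding each string produces bytes objects, which json cannot serialise.
encoded = [t.encode("utf-8") for t in title]
try:
    json.dumps({"title": encoded})
    error = None
except TypeError as exc:
    error = exc

# Leaving the values as str works without any extra steps.
text = json.dumps({"title": title})
print(type(error).__name__)
print(text)
```

The first call fails with a TypeError because of the bytes values; the second succeeds because the list holds only str values.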

Next, you are making several other errors; you are encoding in places where there is no need to. Leave it to the json module and the standard file object returned by the open() call to handle encoding.
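As a rough illustration of that division of labour, here is a sketch that writes one JSON line to a temporary file. The record contents and file name are invented, and ensure_ascii=False is an optional extra not mentioned in the answer; it keeps non-ASCII characters readable in the file instead of emitting \uXXXX escapes.

```python
import json
import os
import tempfile

# Hypothetical record; the real items come from the spider.
record = {"title": ["café", "naïve"]}

path = os.path.join(tempfile.mkdtemp(), "example_utf8.json")

# The text-mode file object encodes str to UTF-8 on write.
with open(path, "w", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Reading back with the same encoding restores the original data.
with open(path, "r", encoding="utf-8") as f:
    restored = json.loads(f.readline())
```

No explicit .encode() or .decode() call appears anywhere; json.dumps() produces a str and the file object handles the bytes.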

You also don't need to convert your items list to a dictionary; it'll already be an object that can be JSON encoded directly:

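import json


class W3SchoolPipeline(object):
    def __init__(self):
        # Text mode with an explicit encoding: the file object encodes str to UTF-8.
        self.file = open('w3school_data_utf8.json', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # json.dumps() returns a str; no .encode() or .decode() calls are needed.
        line = json.dumps(item) + '\n'
        self.file.write(line)
        return item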
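
The class above opens its file in __init__ and never closes it. One possible variant, sketched here as an assumption rather than as part of the original answer, uses Scrapy's optional open_spider and close_spider pipeline hooks. The class name JsonLinesPipeline and the path parameter are hypothetical, and a plain dict stands in for a scraped item so the sketch runs without Scrapy installed.

```python
import json
import os
import tempfile


class JsonLinesPipeline:
    """Hypothetical variant of the pipeline; `path` is an assumed parameter."""

    def __init__(self, path='w3school_data_utf8.json'):
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Called by Scrapy when the spider starts.
        self.file = open(self.path, 'w', encoding='utf-8')

    def close_spider(self, spider):
        # Called by Scrapy when the spider finishes; flushes and closes the file.
        self.file.close()

    def process_item(self, item, spider):
        self.file.write(json.dumps(item) + '\n')
        return item


# Drive the pipeline by hand, with a plain dict standing in for a scraped item.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.json')
pipeline = JsonLinesPipeline(demo_path)
pipeline.open_spider(spider=None)
pipeline.process_item({'title': ['café'], 'link': ['/x'], 'desc': ['d']}, spider=None)
pipeline.close_spider(spider=None)

with open(demo_path, encoding='utf-8') as f:
    written = [json.loads(line) for line in f]
```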

I'm guessing you followed a tutorial that assumed Python 2, while you are using Python 3. I strongly suggest you find a different tutorial; not only is it written for an outdated version of Python, but if it advocates line.decode('unicode_escape'), it is teaching some extremely bad habits that will lead to hard-to-track bugs. I can recommend Think Python, 2nd edition as a good, free book for learning Python 3.
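
To make the unicode_escape warning concrete, here is a small sketch with an invented sample string. In Python 3, str has no decode() method at all, and applying the unicode_escape codec to UTF-8 bytes treats every non-escape byte as Latin-1, which corrupts non-ASCII text.

```python
import json

original = "café"  # hypothetical sample value

# Python 3 str objects cannot be decoded; only bytes can.
str_has_decode = hasattr(str, "decode")

# The default JSON output is ASCII-only and round-trips cleanly.
line = json.dumps(original)
round_tripped = json.loads(line)

# Decoding UTF-8 bytes with unicode_escape reads each byte as Latin-1,
# so the two-byte sequence for "é" becomes two wrong characters.
raw = json.dumps(original, ensure_ascii=False).encode("utf-8")
mangled = raw.decode("unicode_escape")

print(round_tripped == original)
print(mangled == json.dumps(original, ensure_ascii=False))
```

The round trip through json.loads() preserves the text, while the unicode_escape result no longer matches the string that was written.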