且构网


UnicodeDecodeError: 'ascii' codec can't decode byte 0xea in position 8: ordinal not in range(128)

Updated: 2022-10-21 17:22:12

I'm writing data fetched from a jobs API to a Google spreadsheet. Encoding with 'latin-1' works up to page 93, but an exception is raised when it reaches page 94. I've tried the different techniques shown below, but 'latin-1' got the furthest through the pagination; the others are commented out because they fail on page 65. How should I modify the uncommented line (i.e. the one using .encode('latin-1')) so that all 199 pages are written to the spreadsheet safely? The code is given below. Any guidance is appreciated.

  def append_data(self,worksheet,row,start_row, start_col,end_col):
    r = start_row #last_empty_row(worksheet)
    j = 0
    i = start_col
    while (i <= end_col):
        try:
            worksheet.update_cell(r,i,unicode(row[j]).encode('latin-1','ignore'))
            #worksheet.update_cell(r,i,unicode(row[j]).decode('latin-1').encode("utf-16"))
            #worksheet.update_cell(r,i,unicode(row[j]).encode('iso-8859-1'))
            #worksheet.update_cell(r,i,unicode(row[j]).encode('latin-1').decode("utf-8"))
            #worksheet.update_cell(r,i,unicode(row[j]).decode('utf-8'))
            #worksheet.update_cell(r,i,unicode(row[j]).encode('latin-1', 'replace'))
            #worksheet.update_cell(r,i,unicode(row[j]).encode(sys.stdout.encoding, 'replace'))
            #worksheet.update_cell(r,i,row[j].encode('utf8'))
            #worksheet.update_cell(r,i,filter(self.onlyascii(str(row[j]))))

        except Exception as e:  
            self.ehandling_obj.error_handler(self.ehandling_obj.SPREADSHEET_ERROR,[1])
            try:
                worksheet.update_cell(r,i,'N/A')
            except Exception as ee:
                y = 23
        j = j + 1
        i = i + 1

You are calling unicode() on a byte string value, which means Python will have to decode to Unicode first:

>>> unicode('\xea')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xea in position 0: ordinal not in range(128)

It is this decoding that fails, not the encoding from Unicode back to byte strings.
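
To make the implicit step visible, here is a minimal sketch written for Python 3, where byte strings are never decoded implicitly, so the decode that Python 2's unicode() performs behind the scenes (with the default codec, normally ASCII) has to be spelled out. The variable names are illustrative only.

```python
# Python 3 sketch: with no codec given, Python 2's unicode(some_bytes)
# behaves like some_bytes.decode('ascii') under the default settings.
raw = b'\xea'  # a single non-ASCII byte, as in the traceback above

try:
    raw.decode('ascii')
    message = None
except UnicodeDecodeError as exc:
    # The failure comes from decoding; no .encode() call has run yet.
    message = str(exc)

print(message)
```

The printed message matches the one in the traceback, even though no encoding step is involved at all.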

Either your input data is already Latin-1 encoded (in which case no conversion is needed), or you should first decode it using the appropriate codec:

unicode(row[j], 'utf8').encode('latin1')

or using str.decode():

row[j].decode('utf8').encode('latin1')

I picked UTF-8 as an example here because you didn't provide any detail about the input data or its possible encodings. You need to pick the right codec yourself.
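
When the source encoding is uncertain, one possible approach is to decode each value explicitly and try candidate codecs in order. The sketch below targets Python 3; `to_text` and its `encodings` parameter are hypothetical names, and the candidate list (UTF-8 first, then Latin-1) is an assumption, not something known about the jobs API data.

```python
def to_text(value, encodings=('utf-8', 'latin-1')):
    """Return value as text, decoding byte strings explicitly.

    Hypothetical helper: the codecs tried are assumptions about the input.
    """
    if isinstance(value, bytes):
        for codec in encodings:
            try:
                return value.decode(codec)
            except UnicodeDecodeError:
                continue  # try the next candidate codec
        # No candidate matched: substitute U+FFFD instead of raising.
        return value.decode(encodings[0], errors='replace')
    # Text and non-string values such as numbers need no decoding.
    return str(value)


print(to_text(b'caf\xc3\xa9'))  # valid UTF-8 bytes
print(to_text(b'caf\xe9'))      # not valid UTF-8, decoded as Latin-1
```

Because Latin-1 maps every possible byte to a character, it never fails and therefore has to come last in the list; a wrong guess produces garbled text rather than an exception, so knowing the real source encoding is still preferable.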