且构网 - sharing all things programming and development

Python and Hebrew encoding/decoding error

Updated: 2023-02-26 11:56:14

I have an SQLite database into which I would like to insert values in Hebrew.

I keep getting the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xd7 in position 0: ordinal not in range(128)

My code is as follows:

runsql(u'INSERT into personal values(%(ID)d,%(name)s)' % {'ID':1,'name':fabricate_hebrew_name()})

def fabricate_hebrew_name():
    hebrew_names = [u'ירדן',u'יפה',u'תמי',u'ענת',u'רבקה',u'טלי',u'גינה',u'דנה',u'ימית',u'אלונה',u'אילן',u'אדם',u'חווה']
    return random.sample(hebrew_names,1)[0].encode('utf-8')

Note: runsql executes the query on the SQLite database, and fabricate_hebrew_name() should return a string that can be used in my SQL query. Any help is much appreciated.

You are passing the fabricated names into the string formatting parameter for a Unicode string. Ideally, the strings passed this way should also be Unicode.

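But fabricate_hebrew_name isn't returning Unicode: it returns a UTF-8 encoded byte string, which isn't the same thing. When Python 2 formats a byte string into a Unicode string, it first decodes the bytes with the default ASCII codec, and the first byte of a UTF-8 encoded Hebrew letter (0xd7) is outside the ASCII range, hence the UnicodeDecodeError.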

So, get rid of the call to encode('utf-8') and see whether that helps.
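The mismatch described above can be reproduced explicitly. This is only a sketch, written for Python 3, where text and bytes are never mixed implicitly, so the ASCII decode that Python 2 performs behind the scenes is spelled out by hand. The variable names are illustrative and not part of the original code.

```python
# -*- coding: utf-8 -*-
# Sketch (Python 3): why UTF-8 bytes cannot be treated as ASCII text.

name = "ירדן"                    # text: str in Python 3, unicode in Python 2
encoded = name.encode("utf-8")   # bytes: what the original function returned

# Hebrew letters are U+05D0..U+05EA, so their UTF-8 form starts with 0xd7.
first_byte = encoded[0]

# Python 2 attempts this decode implicitly during u'...' % bytes_value.
try:
    encoded.decode("ascii")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(hex(first_byte))
print(error_message)

# Formatting text into text needs no decoding step and raises nothing.
formatted = "%(ID)d,%(name)s" % {"ID": 1, "name": name}
print(formatted)
```

Keeping the value as text from start to finish, as in the last step, is the change the answer suggests.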

The next question is what type runsql is expecting. If it is expecting Unicode, no problem. If it is expecting an ASCII-encoded string, then you will have problems, because Hebrew cannot be represented in ASCII. In the unlikely case that it is expecting a UTF-8 encoded string, then that is the time to convert it: after the substitution is done.
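For the last case, the ordering might look like the sketch below (Python 3 syntax). It assumes, hypothetically, that the consumer wants UTF-8 bytes; the format string is the one from the question, and the snippet illustrates only where the encode step belongs, not how runsql actually behaves.

```python
# -*- coding: utf-8 -*-
# Sketch: do all substitution on text, then encode once at the boundary.

name = "תמי"  # stays text throughout

# Step 1: substitute while every piece is still text.
query_text = "INSERT into personal values(%(ID)d,%(name)s)" % {"ID": 1, "name": name}

# Step 2: encode the finished string a single time, only because the
# (hypothetical) consumer is assumed to require UTF-8 bytes.
query_bytes = query_text.encode("utf-8")

print(type(query_text).__name__, len(query_text))
print(type(query_bytes).__name__, len(query_bytes))
```

Each Hebrew letter takes two bytes in UTF-8, so the byte string is longer than the text it was built from.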

In another answer, Ignacio Vazquez-Abrams warns against string interpolation in queries. The idea is that, instead of substituting values into the SQL text with the % operator, you should generally use a parameterised query and pass the Hebrew strings to it as parameters. This may have some advantages in query optimisation and in security against SQL injection.

Example

# -*- coding: utf-8 -*-
import sqlite3

# create db in memory
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE personal ("
            "id INTEGER PRIMARY KEY,"
            "name VARCHAR(42) NOT NULL)")

# insert random name
import random
fabricate_hebrew_name = lambda: random.choice([
    u'ירדן',u'יפה',u'תמי',u'ענת', u'רבקה',u'טלי',u'גינה',u'דנה',u'ימית',
    u'אלונה',u'אילן',u'אדם',u'חווה'])

cur.execute("INSERT INTO personal VALUES("
            "NULL, :name)", dict(name=fabricate_hebrew_name()))
conn.commit()

id, name = cur.execute("SELECT * FROM personal").fetchone()
print id, name
# -> for example: 1 אלונה (the name is picked at random)
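
The example above uses Python 2 syntax. A rough Python 3 equivalent might look like the sketch below. It uses a fixed list instead of a random choice so the result is predictable, and the second name is a hypothetical value added only to show that a parameterised query also copes with a quote character.

```python
# -*- coding: utf-8 -*-
# Sketch: Python 3 version of the parameterised insert.
import sqlite3

# create db in memory
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE personal ("
            "id INTEGER PRIMARY KEY,"
            "name VARCHAR(42) NOT NULL)")

# "O'Brien" is a hypothetical extra value containing a quote character.
names = ["ירדן", "O'Brien"]

for value in names:
    # The driver binds :name itself; the SQL text never contains the value.
    cur.execute("INSERT INTO personal VALUES (NULL, :name)", {"name": value})
conn.commit()

rows = cur.execute("SELECT id, name FROM personal ORDER BY id").fetchall()
for row_id, stored_name in rows:
    print(row_id, stored_name)

conn.close()
```

In Python 3 every str is already Unicode, so no u'' prefix or explicit encoding is needed when passing values as parameters.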