且构网

How to efficiently insert CSV file data into MySQL using Python?

Updated: 2023-01-21 20:46:26

There are many ways to optimise this insert. Here are some ideas:

  1. You have a for loop over the entire dataset. You can call commit() every 100 rows or so.
  2. You can insert many rows with a single INSERT statement.
  3. You can combine the two and issue one multi-row INSERT for every 100 rows of the CSV.
  4. If Python is not a requirement, you can do it directly in MySQL, as explained here. (If you must use Python, you can still prepare that statement in Python and avoid looping through the file manually.)
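
Items 1 and 3 can be sketched with a small batching helper. This is only an illustration: `batches` and `insert_in_batches` are hypothetical names, and `conn` is assumed to be a DB-API connection such as the one returned by `pymysql.connect()`. It leans on `executemany()`, which PyMySQL documents as improving performance for multi-row INSERT statements, rather than building the multi-row statement by hand.

```python
from itertools import islice


def batches(rows, size=100):
    """Yield lists of at most `size` rows from any iterable (hypothetical helper)."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def insert_in_batches(conn, rows, sql, size=100):
    """Send `rows` to the database `size` rows at a time, committing per batch.

    Assumes `conn` is a DB-API connection (for example from pymysql.connect())
    and `sql` is a single-row INSERT with %s placeholders.
    Returns the number of rows sent.
    """
    total = 0
    cur = conn.cursor()
    for chunk in batches(rows, size):
        cur.executemany(sql, chunk)  # one call per batch instead of one per row
        conn.commit()                # one commit per batch
        total += len(chunk)
    return total
```

With a `csv.reader` as `rows`, a call could look like `insert_in_batches(conn, csv.reader(f), "INSERT INTO table_x (`ID`, `desc`, `desc_version`, `val`, `class`) VALUES (%s, %s, %s, %s, %s)")`. The batch size of 100 follows the list above and may need tuning for a given table and server.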

Example:

For item 2 in the list, the code will have the following structure:

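import csv

import pymysql


def csv_to_DB(xing_csv_input, db_opts, batch_size=100):
    """Insert the rows of a CSV file into table_x, batch_size rows per INSERT."""
    print("Inserting csv file {} to database {}".format(xing_csv_input, db_opts['host']))
    # `desc` is a reserved word in MySQL, so the column names are backtick-quoted.
    insert_str = "INSERT INTO table_x (`ID`, `desc`, `desc_version`, `val`, `class`) VALUES "
    template = '(%s, %s, %s, %s, %s)'

    def flush(cur, rows):
        # One placeholder group per row, joined by commas; the driver escapes the values.
        query = insert_str + ', '.join([template] * len(rows))
        cur.execute(query, [value for row in rows for value in row])

    conn = pymysql.connect(**db_opts)
    try:
        cur = conn.cursor()
        with open(xing_csv_input, newline='') as csvfile:
            csv_data = csv.reader(csvfile, delimiter=',', quotechar='"')
            to_insert = []
            for row in csv_data:
                to_insert.append(tuple(row))
                if len(to_insert) == batch_size:
                    flush(cur, to_insert)
                    conn.commit()
                    to_insert = []
            # Insert whatever is left over after the last full batch.
            if to_insert:
                flush(cur, to_insert)
                conn.commit()
    finally:
        conn.close()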
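
For item 4 in the list, the link target is not preserved here, so the following is a guess at the kind of approach it means: MySQL's `LOAD DATA LOCAL INFILE` statement, prepared in Python and executed once. The helper names `build_load_data_sql` and `load_csv` are hypothetical, `conn` is assumed to be a PyMySQL connection opened with `local_infile=True`, and the server is assumed to permit local infile loading.

```python
def build_load_data_sql(table, columns):
    """Build a LOAD DATA LOCAL INFILE statement (hypothetical helper).

    The file path is left as a %s placeholder so the driver can quote it.
    Table and column names are backtick-quoted and must come from trusted code.
    """
    cols = ", ".join("`{}`".format(c) for c in columns)
    return (
        "LOAD DATA LOCAL INFILE %s INTO TABLE `{}` "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n' ({})"
    ).format(table, cols)


def load_csv(conn, path, table, columns):
    """Let MySQL read the whole CSV in one statement, with no Python loop.

    Assumes `conn` was created with pymysql.connect(..., local_infile=True).
    """
    cur = conn.cursor()
    cur.execute(build_load_data_sql(table, columns), (path,))
    conn.commit()
```

A call could look like `load_csv(conn, xing_csv_input, "table_x", ["ID", "desc", "desc_version", "val", "class"])`. If the file uses Windows line endings, the `LINES TERMINATED BY` clause would need `'\r\n'` instead.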