
SqlBulkCopy and DataTables with a parent/child relation on an identity column

Updated: 2023-12-01 19:44:04

First of all: it is not possible to do what you want with SqlBulkCopy. As the name suggests, it's just a "one-way street": it moves data into SQL Server as quickly as possible. It's the .NET version of the old bulk copy command which imports raw text files into tables. So there is no way to get the identity values back if you are using SqlBulkCopy.
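
To make the one-way nature concrete, here is a minimal SqlBulkCopy sketch; the table and column names are made up for illustration. Note that WriteToServer returns nothing, so identity values assigned by the server never reach the client.

```csharp
using System.Data;
using System.Data.SqlClient;

// Minimal sketch: push a DataTable into SQL Server with SqlBulkCopy.
// "dbo.Orders" and the column names are illustrative placeholders.
static void BulkInsert(string connectionString, DataTable orders)
{
    using (var bulk = new SqlBulkCopy(connectionString))
    {
        bulk.DestinationTableName = "dbo.Orders";

        // Map only the data columns; the IDENTITY column is left out
        // and filled in by the server.
        bulk.ColumnMappings.Add("CustomerName", "CustomerName");
        bulk.ColumnMappings.Add("OrderDate", "OrderDate");

        // WriteToServer returns void - no generated identity values come back.
        bulk.WriteToServer(orders);
    }
}
```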

I have done a lot of bulk data processing and have faced this problem several times. The solution depends on your architecture and data distribution. Here are some ideas:

  • Create one set of target tables for each thread and import into those tables. At the end, join these tables. Most of this can be implemented in a fairly generic way where you generate tables called TABLENAME_THREAD_ID automatically from tables called TABLENAME (see the first sketch after this list).

  • Move ID generation completely out of the database. For example, implement a central web service which generates the IDs. In that case you should not generate one ID per call but rather generate ID ranges; otherwise the network overhead usually becomes a bottleneck (see the second sketch after this list).

  • Try to generate the IDs out of your data. If that is possible, your problem is gone. Don't say "it's not possible" too fast; perhaps you can use string IDs which can be cleaned up in a post-processing step? (See the third sketch after this list.)
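
For the first idea (one set of target tables per thread), a rough sketch of what the generic part could look like; the table names and the WHERE 1 = 0 trick for copying just the schema are illustrative assumptions, not a fixed recipe:

```csharp
using System.Data;
using System.Data.SqlClient;

// Sketch: each thread loads into its own copy of the target table
// (Orders_0, Orders_1, ...) generated from the original table name.
static void CreateThreadTable(string connectionString, string tableName, int threadId)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        // SELECT ... INTO with a false predicate creates an empty copy of the schema.
        var sql = $"SELECT * INTO {tableName}_{threadId} FROM {tableName} WHERE 1 = 0";
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.ExecuteNonQuery();
        }
    }
}

static void LoadIntoThreadTable(string connectionString, string tableName, int threadId, DataTable rows)
{
    using (var bulk = new SqlBulkCopy(connectionString))
    {
        bulk.DestinationTableName = $"{tableName}_{threadId}";
        bulk.WriteToServer(rows);
    }
}

// At the end the per-thread tables are merged back with set-based statements,
// e.g. INSERT INTO Orders (...) SELECT ... FROM Orders_0; one per thread table.
```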
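For the second idea, the client side of an ID-range scheme could look like the following; the fetchNextRange delegate stands in for the hypothetical central web service (or a reserved counter table) and is not a real API:

```csharp
using System;

// Sketch: hand out IDs from locally cached ranges so only one call to the
// central generator is needed per block, not one per row.
sealed class IdRangeAllocator
{
    private readonly Func<int, (long Start, long End)> _fetchNextRange; // hypothetical service call
    private readonly int _blockSize;
    private readonly object _lock = new object();
    private long _next = 1;
    private long _end = 0; // forces a fetch on first use

    public IdRangeAllocator(Func<int, (long Start, long End)> fetchNextRange, int blockSize = 10000)
    {
        _fetchNextRange = fetchNextRange;
        _blockSize = blockSize;
    }

    public long NextId()
    {
        lock (_lock)
        {
            if (_next > _end)
            {
                // One network round trip reserves a whole block of IDs.
                var range = _fetchNextRange(_blockSize);
                _next = range.Start;
                _end = range.End;
            }
            return _next++;
        }
    }
}
```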
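And for the third idea, a tiny sketch of deriving a deterministic string key from the data itself (the column names are invented); parent and child rows built from the same source values link up without any database round trip, and a post-processing step can later swap the strings for real integer IDs:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Sketch: build a deterministic key from the row's natural columns so parent
// and child rows can reference each other before the bulk load.
static string NaturalKey(string customerNumber, string orderNumber)
{
    var raw = customerNumber + "|" + orderNumber;
    using (var sha = SHA256.Create())
    {
        var hash = sha.ComputeHash(Encoding.UTF8.GetBytes(raw));
        return BitConverter.ToString(hash).Replace("-", "");
    }
}
```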

And one more remark: an increase by a factor of 34 when using BulkCopy sounds too small in my opinion. If you want to insert data fast, make sure that your database is configured correctly.
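
As one hedged example of the knobs involved: on the client side, SqlBulkCopy's TableLock option and batch size already make a big difference for large loads; the server-side settings (recovery model, indexes, and so on) depend on your environment. The values below are illustrative, not recommendations.

```csharp
using System.Data;
using System.Data.SqlClient;

// Sketch: bulk-load friendly client options. TableLock takes a bulk update
// lock, which is usually much faster for large loads; BatchSize controls how
// many rows are sent per batch.
static void FastBulkInsert(string connectionString, DataTable rows)
{
    using (var bulk = new SqlBulkCopy(connectionString, SqlBulkCopyOptions.TableLock, null))
    {
        bulk.DestinationTableName = "dbo.Orders"; // illustrative target table
        bulk.BatchSize = 50000;
        bulk.BulkCopyTimeout = 0; // 0 = no timeout, for very large loads
        bulk.WriteToServer(rows);
    }
}
```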