且构网

JPA: EntityManager taking too long to save data

Updated: 2023-11-30 19:39:46

I was having the same problem in my batch application, and we incorporated two techniques that vastly sped up the process of importing the data:

1) Multithreading - You have to take advantage of multiple threads to process your file data and do the saving.

The way we did it was to first read all the data from the file and pack it into a Set of POJO objects.

Then, based on the number of threads we could create, we would split the Set evenly and feed each thread a certain range of the data.

Then each set would be processed in parallel.

I am not going to get into the details, as that is outside the scope of this question. One tip I can give is that you should try to take advantage of java.util.concurrent and the features it offers.
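The split-and-save idea above can be sketched roughly as follows. This is a minimal illustration, not the answer's actual code: `ParallelImport`, `partition`, and `processInParallel` are hypothetical names, and the body of each task is where the real save call would go.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelImport {

    // Split a list into `parts` roughly equal, contiguous sublists.
    static <T> List<List<T>> partition(List<T> data, int parts) {
        List<List<T>> chunks = new ArrayList<>();
        int size = data.size();
        for (int i = 0; i < parts; i++) {
            chunks.add(data.subList(i * size / parts, (i + 1) * size / parts));
        }
        return chunks;
    }

    // Process each chunk on a fixed-size pool from java.util.concurrent.
    // Returns the total number of records handled so callers can verify it.
    static <T> int processInParallel(List<T> data, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<T> chunk : partition(data, threads)) {
                results.add(pool.submit(() -> {
                    // In the real application each worker would open its own
                    // EntityManager here and save its range of POJOs.
                    return chunk.size();
                }));
            }
            int saved = 0;
            for (Future<Integer> f : results) {
                saved += f.get();
            }
            return saved;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

One design note: each worker should use its own EntityManager, since EntityManager instances are not thread-safe.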

2) Batch Saving - The second improvement we made was to take advantage of Hibernate's batch save feature (you have added the Hibernate tag, so I assume it is your underlying persistence provider):

You can try to take advantage of the bulk insert feature.

There is a Hibernate property you can define to enable this feature:

<property name="jdbc.batch_size">250</property>
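For reference, when configuring through JPA's persistence.xml the setting is usually spelled with the `hibernate.` prefix. A sketch of where it would go (the value 250 is just the example above; the two `order_*` properties are optional companions that group statements by entity so more of them can be batched):

```xml
<!-- persistence.xml: inside your <persistence-unit> element -->
<properties>
    <property name="hibernate.jdbc.batch_size" value="250"/>
    <!-- optional: order inserts/updates so batching kicks in more often -->
    <property name="hibernate.order_inserts" value="true"/>
    <property name="hibernate.order_updates" value="true"/>
</properties>
```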

With this batch setting you should get output like:

insert into Table(id, name) values (1, 'na1'), (2, 'na2'), (3, 'na3') ...

instead of:

insert into Table(id, name) values (1, 'na1');
insert into Table(id, name) values (2, 'na2');
insert into Table(id, name) values (3, 'na3');

3) Flush count - You have your count set to 50 before you flush to the DB. Now, with batch inserts enabled, you could perhaps raise it to a few hundred. Try experimenting with this number to find the sweet spot.
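The shape of that flush-count loop can be sketched as below. This is a minimal stand-in so the logic runs without a database: the `Consumer` and `Runnable` represent `entityManager.persist(...)` and the `flush()`/`clear()` pair, and `saveInBatches` is a hypothetical name, not from the original answer.

```java
import java.util.List;
import java.util.function.Consumer;

public class FlushCount {

    // Persist entities one by one, flushing and clearing the persistence
    // context every `batchSize` records. Returns how many flushes happened.
    static <T> int saveInBatches(List<T> entities, int batchSize,
                                 Consumer<T> persist, Runnable flushAndClear) {
        int flushes = 0;
        for (int i = 0; i < entities.size(); i++) {
            persist.accept(entities.get(i));      // entityManager.persist(e);
            if ((i + 1) % batchSize == 0) {
                flushAndClear.run();              // em.flush(); em.clear();
                flushes++;
            }
        }
        if (entities.size() % batchSize != 0) {   // flush the leftover tail
            flushAndClear.run();
            flushes++;
        }
        return flushes;
    }
}
```

For example, with 500 records a flush count of 50 means 10 flush/clear cycles, while 250 means only 2; a common starting point is to keep the flush count equal to `jdbc.batch_size` and measure from there.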