How to mark certain rows in a table under concurrent access

Updated: 2023-12-02 08:06:46

In the related answer you are referring to:

The objective is to lock one row at a time. This works fine with or without advisory locks, because there is no chance for a deadlock - as long as you don't try to lock more rows in the same transaction.
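
For context, the one-row-at-a-time pattern looks roughly like this (a sketch assuming the same cargo_item columns as the queries below, not the exact query from the related answer):

UPDATE cargo_item item
SET    job_id = 'SOME_UUID', job_ts = now()
FROM  (
   SELECT id
   FROM   cargo_item
   WHERE  state = 'NEW' AND job_id IS NULL
   LIMIT  1
   FOR    UPDATE
   ) sub
WHERE  item.id = sub.id;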

Your example is different in that you want to lock 3000 rows at a time. There is potential for deadlock, unless all concurrent write operations lock rows in the same consistent order. Per the documentation:

The best defense against deadlocks is generally to avoid them by being certain that all applications using a database acquire locks on multiple objects in a consistent order.

Implement that with an ORDER BY in your subquery.

UPDATE cargo_item item
SET job_id = 'SOME_UUID', job_ts = now()
FROM  ( 
   SELECT id
   FROM   cargo_item
   WHERE  state = 'NEW' AND job_id IS NULL
   ORDER  BY id
   LIMIT  3000
   FOR UPDATE
   ) sub
WHERE  item.id = sub.id;

This is safe and reliable, as long as all transactions acquire locks in the same order and concurrent updates of the ordering columns are not to be expected. (Read the yellow "CAUTION" box at the end of this chapter in the manual.) So this should be safe in your case, since you are not going to update the id column.

Effectively only one client at a time can manipulate rows this way. Concurrent transactions would try to lock the same (locked) rows and wait for the first transaction to finish.

Advisory locks are useful if you have many or very long-running concurrent transactions (which does not seem to be your case). With only a few, it will be cheaper overall to just use the query above and have concurrent transactions wait their turn.
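
If you did want concurrent workers to skip rows another session has already claimed instead of waiting, an advisory-lock variant might look roughly like this (a sketch only; it assumes id is an integer or bigint, and that the advisory-lock key space is not used for anything else in this database):

UPDATE cargo_item item
SET    job_id = 'SOME_UUID', job_ts = now()
FROM  (
   SELECT id
   FROM   cargo_item
   WHERE  state = 'NEW' AND job_id IS NULL
   AND    pg_try_advisory_xact_lock(id)  -- skip ids another session has locked
   ORDER  BY id
   LIMIT  3000
   FOR    UPDATE
   ) sub
WHERE  item.id = sub.id;

Note that the planner is free to evaluate pg_try_advisory_xact_lock() before the other WHERE conditions, so this sketch can take advisory locks on rows it never returns; those locks are released automatically at the end of the transaction.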

It seems concurrent access isn't a problem per se in your setup. Concurrency is an issue created by your current solution.

Instead, do it all in a single UPDATE. Assign batches of n numbers (3000 in the example) to each UUID and update all at once. Should be fastest.

UPDATE cargo_item c
SET    job_id = u.uuid_col
     , job_ts = now()
FROM  (
   SELECT row_number() OVER () AS rn, uuid_col
   FROM   uuid_tbl WHERE  <some_criteria>  -- or see below
   ) u
JOIN (
   -- integer division assigns batch 1 to rows 1-3000, batch 2 to rows 3001-6000, ...
   SELECT ((row_number() OVER () - 1) / 3000) + 1 AS rn, id
   FROM  (
      SELECT id
      FROM   cargo_item
      WHERE  state = 'NEW' AND job_id IS NULL
      FOR    UPDATE   -- just to be sure; rows are locked below the window function
      ) locked
   ) c2 USING (rn)
WHERE  c2.id = c.id;

Major points

  • Integer division truncates. You get 1 for the first 3000 rows, 2 for the next 3000 rows, and so on.

  • I pick rows arbitrarily; you could apply ORDER BY in the window for row_number() to assign certain rows.

  • If you don't have a table of UUIDs to dispatch (uuid_tbl), use a VALUES expression to supply them instead. Example: see the sketch after this list.

  • You get batches of 3000 rows. The last batch will be short of 3000 if the number of qualifying rows is not a multiple of 3000.
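
For illustration, a VALUES expression can stand in for uuid_tbl as the source of the subquery u; the UUID literals below are made-up placeholders, one per batch of 3000 rows:

SELECT row_number() OVER () AS rn, uuid_col
FROM  (
   VALUES ('4f1c9f48-2a33-4c8e-9c2b-111111111111'::uuid)
        , ('9b6e2d05-7f4a-4b0d-8e3f-222222222222'::uuid)
   ) t(uuid_col);

Substituting this derived table for the subquery u in the UPDATE above leaves the rest of the query unchanged.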