且构网

CosmosDB lease collection is no longer created automatically

Updated: 2023-02-14 19:09:07

I'm having a very strange problem with CosmosDB & Azure Functions. I frequently delete my database and re-create it in DEV. I then re-deploy the function app. When I call the APIs in the app and CosmosDB triggers are invoked, I normally see the leases collection created. Here's a typical trigger:

[FunctionName("MyTrigger")]
public static async Task RunAsync(
    [CosmosDBTrigger("MyDatabase", "MyContainer",
        ConnectionStringSetting = "CosmosConnectionString",
        LeaseCollectionName = "leases",
        LeaseCollectionPrefix = "MyTrigger",
        CreateLeaseCollectionIfNotExists = true)] IReadOnlyList<Document> documents,
    ExecutionContext executionContext)
{
    // code
}

For some reason, the leases collection is no longer being created. I re-created the database, re-deployed the function app multiple times and made API calls with no luck. What am I missing?

EDIT: I looked at the logs and noticed a lot of Microsoft.Azure.Documents.ChangeFeedProcessor.Exceptions.LeaseLostException exceptions with the message "The lease was lost", so I'm not sure what's going on.

EDIT2: Here's a more detailed error message I was able to extract from the logs:

"Either the source collection 'MyContainer' (in database 'MyDatabase') or the lease collection 'leases' (in database 'MyDatabase') does not exist. Both collections must exist before the listener starts. To automatically create the lease collection, set 'CreateLeaseCollectionIfNotExists' to 'true'"

Note that CreateLeaseCollectionIfNotExists is already set to true.

The "Either the source collection..." error comes from here: https://github.com/Azure/azure-webjobs-sdk-extensions/blob/0683d1bd08a16680c70f982ad00c940b7e9c1fce/src/WebJobs.Extensions.CosmosDB/Trigger/CosmosDBTriggerListener.cs#L140 which reacts to a NotFound detected while trying to start the Trigger process.

The key here is understanding that the lease collection is created during Function initialization, not while the Function is running.

If you delete the lease collection (or the monitored collection) while the Function is running, you might see that error pop up, produced by the running instances. If a new instance comes up (due to scaling) or you restart the Function, then the creation kicks in at https://github.com/Azure/azure-webjobs-sdk-extensions/blob/0683d1bd08a16680c70f982ad00c940b7e9c1fce/src/WebJobs.Extensions.CosmosDB/Trigger/CosmosDBTriggerAttributeBindingProvider.cs#L155 .

So, when do these errors happen?

  1. Function initialization -> CreateIfNotExist checks for and creates the leases collection. If this fails, initialization stops here and produces an error message.
  2. Function running -> If the lease collection is deleted while instances are running, the resulting runtime errors make the Function retry starting the process. Since the retry does not run the initialization again, it outputs the "Either the source collection..." error.
  3. An occasional "The lease was lost" occurs in load-balancing scenarios where multiple Function instances are running and sharing the load, when a lease (from the lease collection) is redistributed to a new instance. This can also happen if the Trigger tries to update the checkpoint just after you deleted the lease collection.
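
To tell which of these cases you are hitting, it can help to check whether both containers actually exist. Here is a minimal sketch with the Azure CLI, assuming `az` is installed and logged in. The resource group and account names are placeholders I made up (only 'MyDatabase', 'MyContainer' and 'leases' come from the question), and by default the script only prints the commands; set DRY_RUN=0 to really run them.

```shell
#!/bin/sh
# Sketch: check that the monitored container and the lease container exist.
set -eu

# Placeholder names (assumptions); override them through the environment.
RESOURCE_GROUP="${RESOURCE_GROUP:-my-resource-group}"
COSMOS_ACCOUNT="${COSMOS_ACCOUNT:-my-cosmos-account}"
DATABASE="${DATABASE:-MyDatabase}"
MONITORED_CONTAINER="${MONITORED_CONTAINER:-MyContainer}"
LEASE_CONTAINER="${LEASE_CONTAINER:-leases}"

# DRY_RUN=1 (default) prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Ask Cosmos DB whether the container named in $1 exists.
container_exists() {
  run az cosmosdb sql container exists \
    --account-name "$COSMOS_ACCOUNT" \
    --resource-group "$RESOURCE_GROUP" \
    --database-name "$DATABASE" \
    --name "$1"
}

check_containers() {
  container_exists "$MONITORED_CONTAINER"
  container_exists "$LEASE_CONTAINER"
}

check_containers
```

If the lease container turns out to be missing while the app is already running, restarting the Function App (for example with `az functionapp restart`) re-runs initialization, which is where the creation happens.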

What you can do

If you are manually deleting the leases collection, then you are in control of what can happen. The recommendation is:

  1. Stop your Functions.
  2. Delete the leases collection.
  3. Start your Functions.
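
The three steps could be scripted roughly like this with the Azure CLI. Treat it as a sketch under assumptions: the resource group, Function App and Cosmos account names are hypothetical placeholders, and the script prints the commands by default (DRY_RUN=1) because the delete is destructive.

```shell
#!/bin/sh
# Sketch: stop the Function App, delete the lease container, start the app again.
set -eu

# Placeholder names (assumptions); override them through the environment.
RESOURCE_GROUP="${RESOURCE_GROUP:-my-resource-group}"
FUNCTION_APP="${FUNCTION_APP:-my-function-app}"
COSMOS_ACCOUNT="${COSMOS_ACCOUNT:-my-cosmos-account}"
DATABASE="${DATABASE:-MyDatabase}"
LEASE_CONTAINER="${LEASE_CONTAINER:-leases}"

# DRY_RUN=1 (default) prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reset_leases() {
  # 1. Stop the app so no running instance is using the leases.
  run az functionapp stop \
    --name "$FUNCTION_APP" --resource-group "$RESOURCE_GROUP"

  # 2. Delete the lease container (--yes skips the confirmation prompt).
  run az cosmosdb sql container delete \
    --account-name "$COSMOS_ACCOUNT" --resource-group "$RESOURCE_GROUP" \
    --database-name "$DATABASE" --name "$LEASE_CONTAINER" --yes

  # 3. Start the app; initialization runs again and, with
  #    CreateLeaseCollectionIfNotExists = true, recreates the lease container.
  run az functionapp start \
    --name "$FUNCTION_APP" --resource-group "$RESOURCE_GROUP"
}

reset_leases
```

With `set -e`, a failure in step 1 or 2 aborts the script when DRY_RUN=0, so the delete never runs against an app that could not be stopped.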

If you delete the lease store while the Function is still running, its behavior is totally undefined.