且构网


"Not found: Table" for a new BigQuery table

Updated: 2023-11-30 18:47:40

Per your answers to my question regarding using NOT_FOUND as an indicator to create the table, this is intended (though admittedly somewhat frustrating) behavior.

The streaming insertion path caches information about tables (and the authorization of a user to insert into the table). This is because of the intended high QPS nature of the API. We also cache certain negative responses in order to protect against buggy or abusive clients. One of those cached negative responses is the non-existence of a destination table. We've always done this on a per-machine basis, but we recently added an additional centralized cache, such that all machines will see the negative cache result almost immediately after the first NOT_FOUND response is returned.
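To make the effect of a negative cache concrete, here is a toy model in Python. It is only an illustration, not BigQuery's actual implementation: the class name, the function name, and the 60-second lifetime used in the usage note are all assumptions made up for this sketch, and the real cache duration is not stated above.

```python
import time


class NegativeCache:
    """Toy TTL cache that remembers which table names were recently missing."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds          # hypothetical lifetime of an entry
        self._clock = clock              # injectable so the sketch is testable
        self._missing_until = {}         # table name -> expiry timestamp

    def mark_missing(self, table):
        self._missing_until[table] = self._clock() + self._ttl

    def is_cached_missing(self, table):
        expires = self._missing_until.get(table)
        if expires is None:
            return False
        if self._clock() >= expires:
            # Entry has expired, so forget it and fall through to a real check.
            del self._missing_until[table]
            return False
        return True


def streaming_insert(table, existing_tables, cache):
    """Return "OK" or "NOT_FOUND", consulting the negative cache first."""
    if cache.is_cached_missing(table):
        # Answered from the cache without looking at the real table state.
        return "NOT_FOUND"
    if table not in existing_tables:
        cache.mark_missing(table)
        return "NOT_FOUND"
    return "OK"
```

In this model, an insert into a missing table records a negative entry. If the table is created right afterwards, further inserts still get NOT_FOUND until the entry expires (60 seconds if the cache was built with `ttl_seconds=60`), which mirrors the behavior described in the answer.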

In general, we recommend that table creation not occur inline with insert requests, because in a system that is issuing thousands of QPS of inserts, a table miss could result in thousands of table creation operations which can be taxing on our system. Instead, if you know the possible set of tables beforehand, we recommend some periodic process that performs table creations in advance of their usage as a streaming destination. If your destination tables are more dynamic in nature, you may need to implement a delay after table creation has been performed.
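The two recommendations above could be sketched roughly as follows. This is a minimal outline under stated assumptions, not an official pattern: `ensure_tables` and `insert_after_creation` are hypothetical helper names, and `create_table` and `insert_rows` are caller-supplied callables. With the `google-cloud-bigquery` Python client, `create_table` might wrap `Client.create_table(table, exists_ok=True)`, and `insert_rows` might wrap `Client.insert_rows_json` and return `False` when a not-found error is raised. The retry count and delays are arbitrary example values.

```python
import time


def ensure_tables(create_table, table_ids):
    """Create every known table ahead of time, e.g. from a periodic job.

    create_table is a caller-supplied callable (hypothetical) that must be
    idempotent, so re-running the job for existing tables is harmless.
    """
    for table_id in table_ids:
        create_table(table_id)


def insert_after_creation(insert_rows, table_id, rows, attempts=5,
                          initial_delay=1.0, sleep=time.sleep):
    """Insert rows into a freshly created table, waiting out NOT_FOUND.

    insert_rows is a caller-supplied callable (hypothetical) that returns
    True on success and False when the table is reported as not found.
    Note that it never creates the table itself; creation stays out of
    the insert path.
    """
    delay = initial_delay
    for attempt in range(attempts):
        if insert_rows(table_id, rows):
            return True
        if attempt < attempts - 1:
            sleep(delay)   # give any cached negative response time to expire
            delay *= 2     # back off exponentially between attempts
    return False
```

A scheduler could call `ensure_tables` before the tables are first needed, while code that handles dynamically named tables would create the table once and then go through `insert_after_creation` for the first insert. The point of the design is that a table miss leads to waiting and retrying rather than to another creation request.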

Apologies for the difficulty. We do hope to address this issue, but we don't have any timeframe yet for doing so.