且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

仅从数据库中的日志表中读取新行

更新时间:2023-11-30 18:34:28

正在使用的数据库不是指定,所以不清楚解决方案是否必须被锤击到现有部署中。有一些可以插入到MySQL中的队列引擎可能有效。其中一个是 Q4M 。一些商业数据库(如Oracle)具有临时数据库功能,可以确定事务时间与有效时间与实时时间。

The database being used wasn't specified so it's not clear as to whether the solution has to be hammered into an existing deployment or not. There are some queue engines that can be plugged into MySQL that could potentially work. One of them is Q4M. Some commercial databases like Oracle have temporal database functionality that allow for determining transaction time vs valid time vs real time.

使用Oracle时,伪列 ora_rowscn 或有用的组合 scn_to_timestamp(ora_rowscn)可以有效地提供一行提交的时间戳(发生SCN的时间戳) )。或者,Oracle Workspace Manager提供版本启用表,基本上如下所示:您可以在具有 DBMS_WM.EnableVersioning(...)的表上启用版本控制,指定有效时间范围的 WMSYS.WM_PERIOD(...)字段设置工作空间的有效范围在读取器 DBMS_WM上设置。 SetValidTime(...)

When using Oracle, either the pseudo-column ora_rowscn or the useful combination scn_to_timestamp(ora_rowscn) can effectively provide the timestamp for when a row was committed (the SCN in which it took place). Alternatively, Oracle Workspace Manager provides version-enable tables, basically it goes like this: You enable versioning on a table with DBMS_WM.EnableVersioning(...), rows are inserted with an aditional WMSYS.WM_PERIOD(...) field specifying a valid time range, set a valid range for the workspace is set on the reader DBMS_WM.SetValidTime(...).

您也可以通过将您的时间戳想法与提交时间启发式相关联来在一定程度上假冒此功能。这个想法只是将有效时间与数据一起存储,而不是使用now()的任意增量。换句话说,基于提交时间+某些可接受的延迟窗口(可能是平均提交时间+标准偏差的两倍)的启发式,将指定一些未来日期(有效时间)的辅助时间戳列。或者,使用一些ceil()的平均提交时间(至少提交时间,但舍入到30秒间隔)。后者将有效量化(合并?)时间日志记录将被读取。它看起来不太一样,但这样可以避免读取冗余行。它也解决了读书应用程序无法准确了解写作应用程序的提交时间,而无需编写更多代码的问题。

You could also fake this functionality to a certain degree by meshing your timestamp idea with the commit time heuristic. The idea is simply to store the "valid time" as a column along with the data instead of using an arbitrary delta from now(). In other words a secondary timestamp column that would specify some future date (the "valid time") based on a heuristic of commit time + some acceptable window of delay (perhaps the mean commit time + twice the standard deviation). Alternatively, using some ceil()ing of mean commit time ("at least the commit time but rounding up to, say, 30 second intervals"). The latter would effectively quantize (coalesce?) the time log records would be read. It doesn't seem too different but this way would save you from reading redundant rows. It also solves the problem that the reading application cannot accurately know the commit times of the writing application without writing a lot more code.