Add row insertion timestamp to tko_jobs table
Issue description
Among the three tables archived by ci_results_archiver (afe_jobs, tko_jobs, cidb_builds), tko_jobs is the only one without a row insertion timestamp column. I'd like to add one.
Background:
In ci_results_archiver, imported database rows have autoincrement integer primary keys. This means their primary keys are usually contiguous, but not always. For example, a row with ID=7 might be committed after a row with ID=8 is committed if the transaction for ID=7 takes more time than that for ID=8. If we scan the table during such concurrent transactions, we observe an "ID skip".
There are two notable causes of ID skips: concurrent transactions and rolled-back transactions. In the former case, the ID skips will eventually be filled; in the latter case, they will remain forever.
To avoid dropping rows, importers that encounter an ID skip wait for some time (the "grace period") for it to be filled. While waiting, importers do not process any rows after the ID skip.
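The blocking behavior described above can be sketched as follows. This is a minimal illustration, not the actual ci_results_archiver code: the function name `split_at_first_skip` and its arguments are hypothetical, and it assumes the importer remembers the last ID it imported and has a list of committed IDs from the current scan.

```python
def split_at_first_skip(last_imported_id, scanned_ids):
    """Split scanned IDs into those safe to import and those blocked by a skip.

    Hypothetical helper: `last_imported_id` is the highest ID already
    imported, `scanned_ids` are the committed IDs seen in the current scan.
    """
    ids = sorted(scanned_ids)
    safe = []
    expected = last_imported_id + 1
    for index, row_id in enumerate(ids):
        if row_id != expected:
            # An ID skip: everything from here on waits for the grace period.
            return safe, ids[index:]
        safe.append(row_id)
        expected += 1
    return safe, []


# ID=7 is not committed yet, so IDs 8 and 9 are held back.
safe, blocked = split_at_first_skip(5, [6, 8, 9])
```

Here `safe` holds only ID 6, while IDs 8 and 9 stay blocked until the skip at ID 7 is filled or the grace period expires.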
The grace period is defined by two constraints: timeout and capacity.
1. timeout: A duration. If the insertion timestamp of the row immediately
after an ID skip is older than this, we assume the skip will never be filled.
2. capacity: A number. If the number of committed rows after an ID skip is
more than this number, we assume the skip will never be filled.
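The two constraints can be sketched as a single check. This is only an illustration under stated assumptions: `skip_is_abandoned` and its parameters are hypothetical names, rows are represented as `(id, inserted_at)` tuples sorted by ID, and a skip is treated as abandoned as soon as any configured constraint is exceeded. The real importer may combine the constraints differently.

```python
import datetime


def skip_is_abandoned(rows_after_skip, now, timeout=None, capacity=None):
    """Decide whether to stop waiting for an ID skip to be filled.

    Hypothetical sketch. `rows_after_skip` is a list of (id, inserted_at)
    tuples for committed rows after the skip, sorted by ID; `inserted_at`
    may be None for tables without an insertion timestamp column.
    """
    if not rows_after_skip:
        return False  # Nothing is waiting behind the skip yet.
    if timeout is not None:
        inserted_at = rows_after_skip[0][1]
        # timeout: the row right after the skip is older than the timeout.
        if inserted_at is not None and now - inserted_at > timeout:
            return True
    # capacity: more committed rows have piled up than the limit allows.
    if capacity is not None and len(rows_after_skip) > capacity:
        return True
    return False


now = datetime.datetime(2018, 1, 1, 12, 0, 0)
rows = [(8, now - datetime.timedelta(minutes=10)), (9, now)]
expired = skip_is_abandoned(rows, now, timeout=datetime.timedelta(minutes=5))
```

In this example the row after the skip was inserted 10 minutes ago, which exceeds the 5-minute timeout, so the skip is given up on. A table without a timestamp column can only pass `capacity`, which is the situation tko_jobs is in.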
Whenever possible, the timeout constraint should be preferred over the capacity constraint because it is difficult to guess the maximum rate of insertion.
Currently, tko_jobs is the only table using the capacity constraint. This is fragile, and in fact, I observed dropped rows with capacity=100.
Jun 8 2018
Hi, this bug has not been updated recently. Please acknowledge the bug and provide status within two weeks (6/22/2018), or the bug will be closed. Thank you.
Comment 1 by bugdroid1@chromium.org, Oct 25 2017