Establish CrOS logs to BigQuery
Issue description

Before we can build alerts off of high-cardinality data that requires computation, we need to export that data. To do this, we'll use the BigQuery streaming API <https://cloud.google.com/bigquery/streaming-data-into-bigquery#bigquery-stream-data-python> to insert structured events into tables from CBuildBot/recipes.

At a minimum, we should have a table for stage event data, which is too high-cardinality for most Monarch use cases. That first table, stage events, will probably have a structure like this: {TIMESTAMP, ID, BUILDER, ARCH, STAGE, STATE_ENUM, DURATION, DEBUG_MESSAGE}, where DEBUG_MESSAGE is an unstructured string field we don't intend to query on (a minimal streaming sketch follows the list below).

Other tables we might immediately add include:
* GS client data: builder name, ID, upload/download enum, file size, remote URI
* BuildPackages and BuildImage metadata: builder name, ID, number of packages emerged, rebuilt, used binary, new version count
* Package-level data: builder name, Packages/Image phase enum, ID, package name, emerge time
* Package-level unit test data: builder name, ID, package name, unit test name, pass/fail/warning, unit test time
* CommitQueueSync/DetectRelevantChanges: builder name, CL included, CL relevant, time to cherry-pick, dependent CL count
* Signer stats
* Archived artifacts published
* Prebuilts published

Some inspiration should be drawn from our historical Cloud Datastore <https://pantheon.corp.google.com/datastore/entities/query/kind?organizationId=433637338589&project=cosmic-strategy-646>, which was a previous attempt to do something similar. There was even a now-broken integration <http://shortn/_Ezvsl7uWTu> into Viceroy. We should delete all usage of Cloud Datastore from CBuildBot at the same time.
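For illustration, a minimal sketch of such a streaming insert with the google-cloud-bigquery Python client; the project, dataset, table, and field values below are hypothetical placeholders, not decided names:

```python
# Minimal sketch, assuming a hypothetical "chromeos-build-events" project
# and a "cbuildbot.stage_events" table matching the schema above.
import datetime

from google.cloud import bigquery

client = bigquery.Client(project="chromeos-build-events")
table_id = "chromeos-build-events.cbuildbot.stage_events"

row = {
    "timestamp": datetime.datetime.utcnow().isoformat(),
    "id": 123456,                          # build ID
    "builder": "amd64-generic-paladin",    # illustrative builder name
    "arch": "amd64",
    "stage": "BuildPackages",
    "state_enum": "PASS",
    "duration": 1042.7,                    # seconds
    "debug_message": "emerged 312 packages",
}

# Streams the row via the tabledata.insertAll API; returns per-row errors
# (an empty list on success) instead of raising.
errors = client.insert_rows_json(table_id, [row])
if errors:
    raise RuntimeError("BigQuery streaming insert failed: %s" % errors)
```

Note that insert_rows_json reports failures through its return value, so callers need to check the error list explicitly.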
Comment 1 by jclinton@chromium.org, Nov 10

Nov 13

Nov 15
"Package-level data: builder name, Packages/Image phase enum, ID, package name, emerge time" could actually be a nested, repeated proto under "BuildPackages and BuildImage metadata: builder name, ID, number of packages emerged, rebuilt, used binary, new version count".
Nov 15
"CommitQueueSync/DetectRelevantChanges: builder name, CL included, CL relevant, time to cherry-pick, dependant CL’s count" could also include raw CL's patched vs stacks of CL's.
Nov 15
{TIMESTAMP, ID, BUILDER, ARCH, STAGE, STATE_ENUM, DURATION, DEBUG_MESSAGE} might already be available in
https://pantheon.corp.google.com/bigquery?project=cr-buildbucket&folder&organizationId=433637338589&p=cr-buildbucket&d=chrome&t=completed_builds_BETA&page=table
— see the "steps.*" fields; I assume stage = recipe step here (a query sketch follows below).
Note that this table schema is the same thing as go/build-proto.
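A hedged sketch of pulling stage-like data out of that table; the nested field names under "steps" (name, start_time, end_time) are assumptions to verify against the actual schema:

```python
# Hedged sketch: per-step durations from the cr-buildbucket table above.
# The nested field names under "steps" are guesses; check the real schema.
from google.cloud import bigquery

client = bigquery.Client(project="cr-buildbucket")

query = """
SELECT
  b.id,
  step.name AS stage,
  TIMESTAMP_DIFF(step.end_time, step.start_time, SECOND) AS duration_sec
FROM `cr-buildbucket.chrome.completed_builds_BETA` AS b,
     UNNEST(b.steps) AS step
LIMIT 100
"""

for row in client.query(query):  # iterating a QueryJob waits for results
    print(row.id, row.stage, row.duration_sec)
```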
The bqschemaupdater tool can create/update a table with a schema derived from a proto file.
Foundation never creates table schemas manually:
http://godoc.org/go.chromium.org/luci/tools/cmd/bqschemaupdater/
This is how we use the same go/build-proto in both buildbucket RPCs and tables.
Beware that a JOIN in BQ is much more expensive than in traditional relational databases. I see the bug talks about multiple tables. (A reduce-before-join sketch follows the links below.)
- https://cloud.google.com/bigquery/docs/best-practices-performance-compute#optimize_your_join_patterns
- https://cloud.google.com/bigquery/docs/best-practices-performance-communication#reduce_data_before_using_a_join
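To make that advice concrete, a hypothetical reduce-before-join query over two of the proposed tables; all table and column names are invented for illustration:

```python
# Hypothetical reduce-before-join: filter and pre-aggregate each side in a
# subquery so the JOIN shuffles as little data as possible.
from google.cloud import bigquery

client = bigquery.Client(project="chromeos-build-events")

query = """
SELECT s.builder, s.stage, g.bytes_uploaded
FROM (
  SELECT id, builder, stage
  FROM `chromeos-build-events.cbuildbot.stage_events`
  WHERE DATE(timestamp) = CURRENT_DATE()       -- filter before joining
) AS s
JOIN (
  SELECT id, SUM(file_size) AS bytes_uploaded  -- pre-aggregate before joining
  FROM `chromeos-build-events.cbuildbot.gs_client_events`
  GROUP BY id
) AS g
ON s.id = g.id
"""

rows = client.query(query).result()
```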
Please consider making the event-streaming part reusable by other recipes. Chromium might benefit from this.
Nov 16
> {TIMESTAMP, ID, BUILDER, ARCH, STAGE, STATE_ENUM, DURATION, DEBUG_MESSAGE} might already be available in
> https://pantheon.corp.google.com/bigquery?project=cr-buildbucket&folder&organizationId=433637338589&p=cr-buildbucket&d=chrome&t=completed_builds_BETA&page=table
> — see the "steps.*" fields; I assume stage = recipe step here.
> Note that this table schema is the same thing as go/build-proto.
Great; I didn't realize steps were there. The only thing that's really missing (and I even missed it when writing this) is a way to derive parent/child relationships, but I think I can work around that for now by using the gitiles commit ID as a cross-reference key.
> The bqschemaupdater tool can create/update a table with a schema derived from a proto file.
> Foundation never creates table schemas manually:
> http://godoc.org/go.chromium.org/luci/tools/cmd/bqschemaupdater/
> This is how we use the same go/build-proto in both buildbucket RPCs and tables.
Nice, that's a great tip.
> Beware that a JOIN in BQ is much more expensive than in traditional relational databases. I see the bug talks about multiple tables.
> - https://cloud.google.com/bigquery/docs/best-practices-performance-compute#optimize_your_join_patterns
> - https://cloud.google.com/bigquery/docs/best-practices-performance-communication#reduce_data_before_using_a_join
Yeah, I probably should have included in this bug description that we should try to use nested columns wherever possible. I think this might be a problem with the BQ streaming API, since I believe we would need a fully materialized row, with all of its nested data, to make that work. However, a solution would be to duplicate data in other tables to avoid doing joins.
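A sketch of that constraint, with hypothetical table and column names: the nested, repeated data has to be fully assembled into one row before the streaming insert, since rows in the streaming buffer can't be updated in place afterwards.

```python
# Sketch with hypothetical names: the repeated "packages" record must be
# materialized up front, in one row, at insert time.
from google.cloud import bigquery

client = bigquery.Client(project="chromeos-build-events")

row = {
    "builder": "amd64-generic-paladin",
    "id": 123456,
    "packages_emerged": 2,
    # All nested data present in this single insert.
    "packages": [
        {"name": "chromeos-base/chromite", "phase": "Packages", "emerge_time": 12.3},
        {"name": "chromeos-base/libbrillo", "phase": "Packages", "emerge_time": 45.6},
    ],
}

errors = client.insert_rows_json(
    "chromeos-build-events.cbuildbot.build_packages_events", [row])
assert not errors, errors
```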
> Please consider making the event-streaming part reusable by other recipes. Chromium might benefit from this.
Will do.