
Issue 904151

Starred by 2 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Feature

Blocking:
issue 904150
issue 904152



Hotlists containing this issue:
CrOSParallelCQ



Establish CrOS logs to BigQuery

Project Member Reported by jclinton@chromium.org, Nov 10

Issue description

Before we can build alerts off of high-cardinality data that requires computation, we need to export that data. To do this, we’ll use the BigQuery streaming API <https://cloud.google.com/bigquery/streaming-data-into-bigquery#bigquery-stream-data-python> to insert structured events into tables from CBuildBot/recipes. At a minimum, we should have a table for stage event data, which is too high-cardinality for most Monarch use cases.

That first table, stage events, will probably have a structure like this:

  {TIMESTAMP, ID, BUILDER, ARCH, STAGE, STATE_ENUM, DURATION, DEBUG_MESSAGE}

We don’t intend to query on DEBUG_MESSAGE, an unstructured string field.
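As a rough sketch of what the streaming insert could look like (the helper names, field names, and table ID below are hypothetical, not a settled schema; the client call follows the google-cloud-bigquery streaming API linked above):

```python
import datetime


def build_stage_event_row(build_id, builder, arch, stage, state, duration_s,
                          debug_message=""):
    """Assemble one stage-event row matching the proposed schema."""
    return {
        "timestamp": datetime.datetime.utcnow().isoformat(),
        "id": build_id,
        "builder": builder,
        "arch": arch,
        "stage": stage,
        "state_enum": state,
        "duration": duration_s,
        "debug_message": debug_message,  # unstructured; not intended for querying
    }


def stream_stage_event(table_id, row):
    """Insert one row via the BigQuery streaming API.

    Needs application-default credentials and an existing table, e.g.
    table_id = "my-project.cros_events.stage_events" (hypothetical).
    Returns a list of per-row insert errors; empty on success.
    """
    from google.cloud import bigquery  # third-party: google-cloud-bigquery
    client = bigquery.Client()
    return client.insert_rows_json(table_id, [row])
```

Row construction is split from the insert so the schema mapping can be unit-tested without credentials.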

Other tables we might immediately add include:
* GS client data: builder name, ID, upload/download enum, file size, remote URI
* BuildPackages and BuildImage metadata: builder name, ID, number of packages emerged, rebuilt, used binary, new version count
* Package-level data: builder name, Packages/Image phase enum, ID, package name, emerge time
* Package-level unit test data: builder name, ID, package name, unit test name, pass/fail/warning, unit test time
* CommitQueueSync/DetectRelevantChanges: builder name, CL included, CL relevant, time to cherry-pick, dependent-CL count
* Signer stats
* Archived artifacts published
* Prebuilts published

Some inspiration should be drawn from our historical Cloud Datastore <https://pantheon.corp.google.com/datastore/entities/query/kind?organizationId=433637338589&project=cosmic-strategy-646>, which was a previous attempt to do something similar. There was even a now-broken integration <http://shortn/_Ezvsl7uWTu> into Viceroy. We should delete all usage of Cloud Datastore from CBuildBot at the same time.

 
Blocking: 904152
Labels: -Type-Bug Type-Feature
Labels: Disable-Nags
"Package-level data: builder name, Packages/Image phase enum, ID, package name, emerge time" could actually be a nested, repeated proto under "BuildPackages and BuildImage metadata: builder name, ID, number of packages emerged, rebuilt, used binary, new version count".
"CommitQueueSync/DetectRelevantChanges: builder name, CL included, CL relevant, time to cherry-pick, dependent-CL count" could also include raw CLs patched vs. stacks of CLs.
{TIMESTAMP, ID, BUILDER, ARCH, STAGE, STATE_ENUM, DURATION, DEBUG_MESSAGE} might already be available in 
https://pantheon.corp.google.com/bigquery?project=cr-buildbucket&folder&organizationId=433637338589&p=cr-buildbucket&d=chrome&t=completed_builds_BETA&page=table
see "steps." fields. I assume stage=recipe step here.
Note that this table schema is the same thing as go/build-proto

The bqschemaupdater tool can create/update a table with a schema derived from a proto file.
Foundation never creates table schemas manually.
http://godoc.org/go.chromium.org/luci/tools/cmd/bqschemaupdater/
This is how we use the same go/build-proto in both buildbucket RPC and tables.

Beware that a JOIN in BQ is much more expensive than in traditional relational databases. I see the bug talks about multiple tables.
- https://cloud.google.com/bigquery/docs/best-practices-performance-compute#optimize_your_join_patterns
- https://cloud.google.com/bigquery/docs/best-practices-performance-communication#reduce_data_before_using_a_join
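One way to heed this advice (a sketch only; the event and field names are illustrative, not a settled schema) is to denormalize: carry per-package rows as a nested, repeated field inside the build-level event, so package-level queries UNNEST a RECORD column instead of joining a separate package table:

```python
def build_packages_event(build_id, builder, packages):
    """Denormalized event: per-package data nested under the build-level row.

    `packages` is a list of (name, emerge_time_s) tuples. Nested/repeated
    fields map to BigQuery RECORD columns, so downstream queries can use
    UNNEST(packages) rather than a JOIN against a second table.
    """
    return {
        "id": build_id,
        "builder": builder,
        "package_count": len(packages),
        "packages": [  # repeated RECORD
            {"name": name, "emerge_time": t} for name, t in packages
        ],
    }
```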

Please consider making the event-streaming part reusable by other recipes. Chromium might benefit from this.

> {TIMESTAMP, ID, BUILDER, ARCH, STAGE, STATE_ENUM, DURATION, DEBUG_MESSAGE} might already be available in 
> https://pantheon.corp.google.com/bigquery?project=cr-buildbucket&folder&organizationId=433637338589&p=cr-buildbucket&d=chrome&t=completed_builds_BETA&page=table
> see "steps." fields. I assume stage=recipe step here.
> Note that this table schema is the same thing as go/build-proto

Great; didn't realize steps were there. The only thing that's really missing (and I even missed it when writing this) is a way to derive parent/child relationships, but I think I can work around that for now by using the gitiles commit ID as a cross-reference key.

> The bqschemaupdater tool can create/update a table with a schema derived from a proto file.
> Foundation never creates table schemas manually.
> http://godoc.org/go.chromium.org/luci/tools/cmd/bqschemaupdater/
> This is how we use the same go/build-proto in both buildbucket RPC and tables.

Nice, that's a great tip.

> Beware that a JOIN in BQ is much more expensive than in traditional relational databases. I see the bug talks about multiple tables.
> - https://cloud.google.com/bigquery/docs/best-practices-performance-compute#optimize_your_join_patterns
> - https://cloud.google.com/bigquery/docs/best-practices-performance-communication#reduce_data_before_using_a_join

Yeah, I probably should have noted in this bug description that we should try to use nested columns wherever possible. That might be awkward with the BQ streaming API, since I believe we would need a fully materialized row with all of its nested data to make that work. However, a solution would be to duplicate data across tables to avoid doing joins.

> Please consider making the event-streaming part reusable by other recipes. Chromium might benefit from this.

Will do.
