Issue 407402 Telemetry perf uploads still use LKGR
Starred by 1 user Project Member Reported by, Aug 25 2014
Status: Started
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

issue 407399

I'm told that telemetry perf builds still use LKGR for selecting the reference build.

We should just build a more perf-specific "last known good revision" finder which better fits their needs.
Blocking: chromium:407399
Comment 2 by, Aug 26 2014
The reference builds don't use lkgr -- they're manually updated from time to time to the current stable release of chrome. Maybe you mean something about standalone telemetry? They're on the lkgr waterfall, but afaik, it would be fine to move it.
Comment 3 by, Aug 26 2014
Not the reference builds, the archives.

Yes, we don't need LKGR specifically. How would we do this?
I've learned that this customer only cares about the last build to pass the telemetry tests.
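A minimal sketch of what "the last build to pass the telemetry tests" could mean in code, assuming build results are available as simple records. The `Build` type, the `last_revision_passing` helper, and the revision numbers are hypothetical illustrations, not part of any existing infra tool; only the step name telemetry_perf_unittests comes from this thread.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class Build(NamedTuple):
    """Hypothetical record of one finished build."""
    revision: int
    step_results: Dict[str, bool]  # step name -> True if the step passed


def last_revision_passing(
    builds: Iterable[Build], step: str = "telemetry_perf_unittests"
) -> Optional[int]:
    """Return the highest revision whose `step` passed, or None if none did.

    A build that never ran the step (for example, because compile failed)
    does not count as passing.
    """
    passing = [b.revision for b in builds if b.step_results.get(step) is True]
    return max(passing) if passing else None


# Made-up sample data for illustration only.
builds = [
    Build(291000, {"compile": True, "telemetry_perf_unittests": True}),
    Build(291005, {"compile": True, "telemetry_perf_unittests": False}),
    Build(291010, {"compile": False}),
]
print(last_revision_passing(builds))
```

With this sample data the newest build is skipped because it did not compile, and the one before it is skipped because the telemetry tests failed.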
Note that there are also scripts that can download chrome builds and run page sets against them. Said builds support any of the channels (stable, beta, dev, canary) and lkgr. If we're dropping lkgr, we should probably at least try to make sure that we have equivalent canary coverage or something like that (I think there's no linux canary?)
It appears there is no linux canary:
Comment 7 by, Sep 13 2014
Labels: -Pri-2 Pri-1 Infra-CodeYellow CY-TreeAlwaysOpen
Status: Assigned
Who owns telemetry uploading? We need an owner for this. We'd like to kill LKGR in the next week or two. tonyg, assigning to you for now since you probably know who the right owner is.
Comment 8 by, Sep 13 2014
I see. This is running on a dedicated master that uses lkgr. For all the other bots on this master, we can likely switch this whole master over to lkcr, but that won't work for this case since you need the telemetry tests to pass. Or is compiling enough maybe? (lkcr == last known compiling revision)

We have very limited resources to work on this and lkgr is a significant drain on our time to do other more important work (e.g. make the CQ good, make git fast, etc). So, we need to find the sweet spot of a relatively low-effort fix to this problem that is minimally good enough for telemetry. That is, unless there is a non-infra team volunteer with the time to do a better fix.

tonyg, can you comment on what is the minimal thing you'd be OK with? A couple options I see:
1. Live with using LKCR for this. Loses telemetry_perf_unittests passing.
2. Move the zip uploading to one of the main waterfall bots that run the telemetry_perf_unittests (e.g. This wouldn't get you cross-config coverage, but it would get you telemetry_perf_unittests passing on one config.
3. ...?
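To make the trade-off between options 1 and 2 concrete, here is a rough sketch under stated assumptions: per-revision step results live in a plain dict, and a revision is "good" when every required step passed. The names `LKCR_STEPS`, `OPTION_2_STEPS`, `newest_good`, and the revision numbers are invented for illustration.

```python
from typing import Dict, FrozenSet, Optional

# Option 1: only require that the revision compiled (lkcr).
LKCR_STEPS = frozenset({"compile"})
# Option 2: additionally require telemetry_perf_unittests on one config.
OPTION_2_STEPS = frozenset({"compile", "telemetry_perf_unittests"})


def newest_good(
    results: Dict[int, Dict[str, bool]], required_steps: FrozenSet[str]
) -> Optional[int]:
    """Return the newest revision where all required steps passed, else None.

    A step that is missing from a revision's results is treated as failed.
    """
    good = [
        rev
        for rev, steps in results.items()
        if all(steps.get(step, False) for step in required_steps)
    ]
    return max(good, default=None)


# Made-up results: revision -> {step name: passed?}
results = {
    100: {"compile": True, "telemetry_perf_unittests": True},
    101: {"compile": True, "telemetry_perf_unittests": False},
    102: {"compile": False},
}

print("option 1 (lkcr):", newest_good(results, LKCR_STEPS))
print("option 2:", newest_good(results, OPTION_2_STEPS))
```

In this made-up data, the lkcr criterion selects a revision whose telemetry tests failed, which is exactly what option 1 gives up.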
Comment 9 by, Sep 16 2014
I think #2 is a better compromise than #1 and I don't have alternate ideas. My rationale is that Chrome compilation doesn't tell us anything about Telemetry suitability. However, passing Telemetry unittests on one platform does filter out at least some breakages.

Dave, could you please move the step to the bot that Ojan suggests in #c8? Note this is CY related.
Comment 10 by, Sep 24 2014
dtu, any progress?
Comment 11 by, Oct 5 2014
Friendly ping. dtu, will you be able to do this or should I try to find someone else?
Comment 12 by, Oct 6 2014
Status: Started
I'm on it.
Comment 13 by, Oct 17 2014
Comment 14 by, Oct 22 2014
Labels: -Infra-CodeYellow -CY-TreeAlwaysOpen
Is this still an issue, or did it get fixed?
Labels: Cr-Tests-Telemetry
As far as I can tell, we are still relying on the LKGR bot to upload the Moving it to one of the bots that run telemetry_perf_unittests should work.
Comment 17 by, May 28 2015
We don't use the zip internally at the moment. We should just kill it in favor of checking out the Catapult repo.
@dtu - is comment #17 still accurate? Can we just shut off this bot?
I don't think we can kill it just yet, because many clients are relying on (ChromeOS team being one of those). Checking out the Catapult repo doesn't help for these cases, because those teams need the benchmark code in tools/perf/

I think Annie's solution in #16 sounds sane to me. 
@nednyugen - yeah, I remember you telling me this. Wanted to cross-check what you said against what @dtu wrote, and use that as a way to lean on dtu@ to actually fix this bug :).
Dave: are you still working on this?
Components: -Infra Infra>Client>Chrome
Status: Assigned
Status: Started
Note that after the lkgr breakage in the lkgr waterfall has effectively been using lkCr for a few weeks. Has this caused any trouble here?

As I read the comments here, it's possible now to have uploaded for revisions with broken telemetry_perf_unittests. Effectively we're currently running with option 1 from comment 8.