Starred by 2 users
Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 3
Type: Bug

Blocking:
issue 407399



Remove cluster_fuzz dependencies on LKGR
Project Member Reported by eseidel@chromium.org, Aug 25 2014

I'm told cluster_fuzz still depends on having LKGR.  We should just have it build its own "last known good revision" from recent builds which is more targeted to cluster_fuzz's needs.
 
Blocking: chromium:407399
Cc: ojan@chromium.org
There is no work to be done from my side other than verifying. ClusterFuzz does not create builds, so I think we should just go with https://code.google.com/p/chromium/issues/detail?id=407399#c4.
Could you tell me which tests cluster_fuzz cares about passing?  Or does cluster_fuzz only care that the compile passes?  Or that tests pass with < N crashes?
Compile is a must-have, as otherwise we will archive builds without the binaries and mess up regression testing badly.

The other thing we care about is startup crashes. The browser should launch and basic pageload tests should pass.

All other things (like passing 90% of tests, etc.) are just add-ons. They are not needed and won't be blockers. If this is doable, then we should set a rule like "80% of the tests should pass" or something like that.
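The requirements above can be sketched as a single check. This is only an illustration: `BuildResult`, its fields, and `is_usable_for_fuzzing` are hypothetical names, not part of ClusterFuzz or the build infrastructure, and the optional pass-ratio threshold stands in for the "80% of the tests" idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BuildResult:
    """Hypothetical summary of one build; field names are assumptions."""
    revision: int
    compiled: bool
    browser_launched: bool
    pageload_passed: bool
    tests_passed: int
    tests_total: int


def is_usable_for_fuzzing(build: BuildResult,
                          min_pass_ratio: Optional[float] = None) -> bool:
    # Must-have: without binaries the archive is useless for regression testing.
    if not build.compiled:
        return False
    # Must-have: no startup crash, and basic pageload tests pass.
    if not (build.browser_launched and build.pageload_passed):
        return False
    # Optional add-on: a minimum overall pass ratio (e.g. 0.8).
    if min_pass_ratio is None:
        return True
    if build.tests_total == 0:
        # No test data, so the optional rule cannot be satisfied.
        return False
    return build.tests_passed / build.tests_total >= min_pass_ratio
```

Calling it without `min_pass_ratio` applies only the two must-have checks, so the pass-ratio rule never becomes a blocker unless it is explicitly turned on.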
Comment 5 by ojan@chromium.org, Aug 26 2014
It almost never happens that the binary compiles and fewer than 90% of the tests pass. It does happen, but it's *really* rare. Maybe you could just use LKCR? If that's not sufficient, I think any script that walks through the last 40 revisions and finds the best one will meet these criteria 100% of the time.
Thinking again, LKCR is more than enough. Compile breakage on the main build will also cause compile breakage on the memory build, and that already prevents the build from getting archived. Verified on http://build.chromium.org/p/chromium.lkgr/builders/ASAN%20Release/builds/8171.
Isn't there a problem with the number of LKCR releases there are, however? Given that you'd be building, instrumenting and archiving each and every LKCR release you'd have a lot more than currently occur with LKGR... is this an issue?
LKCR will be roughly one every 40 revisions, which is good enough for us. LKGR has sometimes been more frequent, and we have a rule in ClusterFuzz to only use the new build if it is more than 50 revisions apart (so that we don't waste too much time pulling new builds).
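For illustration, the 50-revision rule amounts to a one-line comparison. The function and constant names below are hypothetical and are not taken from the ClusterFuzz code.

```python
# Assumed threshold, taken from the "more than 50 revisions apart" rule above.
MIN_REVISION_GAP = 50


def should_pull_build(current_revision: int,
                      candidate_revision: int,
                      min_gap: int = MIN_REVISION_GAP) -> bool:
    """Return True only if the candidate is more than min_gap revisions newer."""
    return candidate_revision - current_revision > min_gap
```

With a new LKCR roughly every 40 revisions, this rule would skip some candidates and pick up a later one.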
Cc: glider@chromium.org kcc@chromium.org
[Speaking of the ASan bots as an example]
I wonder if the ASan builder that packages builds for CF can run a (small?) subset of browser_tests and skip packaging if it fails?  Or maybe one of the ASan tester bots can be used to package builds if all the previous steps are green?
Comment 10 by ojan@chromium.org, Aug 28 2014
If you were going to do that, it'd make more sense to me to invest that effort in writing a script that computed an LKBR or something that is the best revision in the last 40. It could update every 30 minutes or so. That would give you much better coverage than trying to run a relatively random subset of the tests.

Canary already has a script that basically does this. I don't think it'd be hard to extend it to do the same for a more generic thing.

That said, either way is fine with me. Totally up to you.
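A minimal sketch of the "best revision in the last 40" idea, assuming per-revision results are already available as a mapping. The input shape, the scoring (highest pass ratio, newest revision on ties), and the name `pick_best_revision` are all assumptions for illustration. This is not the Canary script mentioned above.

```python
from typing import Dict, Optional, Tuple


def pick_best_revision(results: Dict[int, Tuple[bool, float]],
                       window: int = 40) -> Optional[int]:
    """Pick the best revision among the most recent `window` known revisions.

    `results` maps revision number -> (compiled, test_pass_ratio).
    Revisions that failed to compile are never eligible.
    """
    # Consider only the newest `window` revisions that have results.
    recent = sorted(results)[-window:]
    candidates = [rev for rev in recent if results[rev][0]]
    if not candidates:
        return None
    # Highest pass ratio wins; ties go to the newer revision.
    return max(candidates, key=lambda rev: (results[rev][1], rev))
```

A job running this every 30 minutes or so would just feed in the latest results and publish the returned revision.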
Comment 11 by ojan@chromium.org, Sep 13 2014
Labels: -Pri-2 Pri-1 Infra-CodeYellow CY-TreeAlwaysOpen
Status: Assigned
OK, so it sounds like in the short term we should just use LKCR. inferno, can you make that change? We'd like to kill LKGR in the next week or two.
Comment 12 by aarya@google.com, Sep 13 2014
Cc: mbarbe...@chromium.org infe...@chromium.org
Owner: ----
Status: Untriaged
I don't know the Chromium build infrastructure scripts. Can you please find someone to work on this? I can make changes from the ClusterFuzz side as needed.
Comment 13 by ojan@chromium.org, Sep 13 2014
Abhishek, who is responsible for maintaining clusterfuzz? Where is the logic for deciding to use lkgr? If you didn't set that up, do you know who did?
Comment 14 by aarya@google.com, Sep 13 2014
Cc: cmp@chromium.org
Ojan, Marty (mbarbella@) and I are the developers and maintainers of ClusterFuzz. However, we don't manage the build archiving on the Google Cloud Storage bucket; that is done by the Chromium infrastructure team [we just pick up whatever gets stored there]. As far as I remember, some of the scripts were written by Alex (glider@) some time back, e.g. https://chromium.googlesource.com/experimental/chromium/tools/build/+log/4b00eb6ea9f5f7825a4b65c3312e56f3bb4063a7/scripts/slave/chromium/cf_archive_build.py. I don't know where the code hooks are for changing from lkgr to lkcr. Alex, do you have any idea? Also, cc'ing Chase for suggestions.
Comment 15 by ojan@chromium.org, Sep 13 2014
Status: Available
OK, that python file traces back to cf_archive_build in the master config file: https://code.google.com/p/chromium/codesearch#search/&q=cf_archive_build%20file:cfg&sq=package:chromium&type=cs.

So, the clusterfuzz uploaders actually run on the lkgr master. Maybe we just need to switch the lkgr master over to using lkcr and then we're done? We need to make sure the other bots on this waterfall are OK with that, e.g. issue 407402.
Comment 16 by aarya@google.com, Sep 13 2014
Cc: machenb...@chromium.org hinoka@chromium.org iannucci@chromium.org
Seems like it, but someone from chrome infra should confirm.

'cf_archive_build': ActiveMaster.is_production_host, // does ActiveMaster refer to lkgr master ??? what is needed for changing this to lkcr ??
Technically, I think the only thing that is needed is to change the branch from lkgr->lkcr here:
https://code.google.com/p/chromium/codesearch#chromium/tools/build/masters/master.chromium.lkgr/master_lkgr_cfg.py&l=347

And as a consequence, all other occurrences of "lkgr" should be changed to "lkcr" as well in master_lkgr_cfg.py, and slaves.cfg.

Last but not least, the whole master should probably be renamed, right? Is that possible, or do we need to recreate it as a new lkcr master? This is only cosmetic, but it might be confusing if a master called lkgr pulls its builds from lkcr.
Comment 18 by ojan@chromium.org, Oct 22 2014
Labels: -Infra-CodeYellow -CY-TreeAlwaysOpen
Owner: dpranke@chromium.org
Status: Assigned
Just FYI how we do this in V8 (maybe too simple for chromium): We have _one_ bot as a gate for a respective clusterfuzz builder, e.g. our own ASAN builder/tester must pass building/testing on a specific revision. The tester then triggers the clusterfuzz builder for uploading the same revision to CF using the trigger recipe_module.

I don't know why the notion of lkgr/lkcr would be useful here, as it means that a bunch of other unrelated bots have to be green. E.g. why would clusterfuzz linux testing care if windows has a compile error in the same revision, as long as the linux build compiles and tests cleanly?

I understand that it might make sense when a set of several different tester bots are the gate for a particular clusterfuzz build type. But since we are on swarming, we mostly have one bot that covers all tests of a particular build configuration.
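The difference between a per-configuration gate and a tree-wide LKGR-style gate can be shown with a small sketch. The data shape and bot names (`linux_asan`, `win_release`) are hypothetical and do not refer to real builders.

```python
from typing import Dict

# Hypothetical shape: revision -> {bot name -> whether the bot was green}.
Results = Dict[int, Dict[str, bool]]


def green_for_config(results: Results, revision: int, gate_bot: str) -> bool:
    """Per-configuration gate: only the one relevant bot must be green."""
    return results.get(revision, {}).get(gate_bot, False)


def green_for_all(results: Results, revision: int) -> bool:
    """Tree-wide gate: every bot must be green at this revision."""
    bots = results.get(revision, {})
    return bool(bots) and all(bots.values())
```

For a revision where `linux_asan` is green but `win_release` failed to compile, `green_for_config` with `linux_asan` as the gate accepts the revision while `green_for_all` rejects it, which is the case described in the comment above.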
Labels: -Pri-1 Pri-2
Downgrading to P2 to make sure any M53-blockers get precedence.
Components: -Infra Infra>Client>Chrome
Labels: -Pri-2 Pri-3
Labels: OS-All
Cc: -cmp@chromium.org
Labels: -Pri-3 Pri-2
I'd like to come back to this and make some progress. I know inferno@ and I discussed how we thought we'd do this months ago, but I've now forgotten what we discussed :(.

@inferno, let's see if we can find some time this week to discuss a path forward.
Comment 27 by aarya@google.com, Jan 15 2017
Sounds good, let's discuss this week.
Cc: estaab@chromium.org
Labels: -Pri-2 Pri-3
Owner: ----
Status: Available
Actually, I don't know when I'll have time to work on this :(.
Comment 29 by ojan@chromium.org, Mar 7 2017
Cc: -ojan@chromium.org
Owner: zhangtiff@chromium.org
Status: Assigned
Assigning this to myself because I've been looking into turning LKGR. 