Issue metadata
Sign in to add a comment
|
Regression: GN bots building all don't work with analyze |
||||||||||||||||||||||||
Issue descriptionHere's the patch: https://codereview.chromium.org/2042423002/ . This copies a file. The file isn't referenced in any files. The win_clang thinks it needs to compile all: { "compile_targets": [ "all" ], "status": "Found dependency (all)", "test_targets": [] }
,
Jun 7 2016
What's wrong with win_clang building all? It's faster than it needs to be, and having a bot that checks that everything builds seems useful.
,
Jun 7 2016
thakis@, let me know if comment #1 doesn't answer your question?
,
Jun 7 2016
It doesn't. I added enough slaves to the pool to pay for win_clang. Building all is useful.
,
Jun 7 2016
I'm not debating whether building all is useful; I agree that it is. And, I agree that we should ideally have the capacity to build 'all' for whatever configs we like as long as the cycle time stays within acceptable limits. I'm saying that if we only have the capacity to build all for one set of bots, IMO we should be building all w/ msvs rather than clang, unless for some reason clang is substantially faster and thus we can afford clang but not msvs. Do you disagree with that?
,
Jun 7 2016
Yes. The MSVC bots need to run tests. win_clang currently doesn't, so it can afford this. (And adding yet another bot that's msvc compile-only doesn't help with capacity issues. Also, I doubt an all-building msvc bot would find a very different set of issues as a clang-cl all-building bot; if something's broken in all-only it's usually "this doesn't work at all" stuff.)
,
Jun 7 2016
I guess you could argue that win_clang gets you x64 coverage, while win_chromium_compile_dbg_ng gets you x86 coverage, and that arguably 'all' is more important to test on x64 than x86. Given my druthers, then, we'd build x64 w/ msvs and x86 w/ clang, but that's quibbling at that point. However, it's not okay to be building 'all' in the CQ as long as we have capacity problems, period, until bug 555273 is fixed. And that bug isn't easy to fix, whereas switching to 'gn_all' is trivial and it's a bug that 'all' would build anything that we'd care about (or anything at all, really) that 'gn_all' doesn't build. AFAIK we don't currently have capacity problems while win_chromium_rel_ng is on GYP, but if we get the cycle time fixed well enough w/ GN but still have capacity issues, I would argue that we will still need to change this.
,
Jun 7 2016
We can reconsider then. I too think we currently don't have capacity issues. The idea to have all slaves in one pool is to smooth out spikes afaik. If someone regresses one bot's cycle time and then tells other bot owners that their bots need to do less, that's in my understanding not what the shared pool is for.
,
Jun 9 2016
555273 has more than this specific bug, so I'm reopening it. https://codereview.chromium.org/2049973002/ is my sample change that didn't touch any source. win_clang, linux_chromium_clobber_rel_ng, and linux_chromium_rel_ng are all building unnecessarily. Note the capacity issue is only secondary. The main point about having analyze work correctly is to severely discourage folks from sometimes skipping the CQ for cls that shouldn't be running compiles or tests. We have gotten broken builds from these kind of non-source changes landing without CQ coverage. It has been a lot of work to get analyze working for all recipes, so IMO this is a pretty big regression that bots on the CQ building all with gn don't work.
,
Jun 9 2016
Bumping up to P0 since analyze is broken for linux_chromium_rel_ng. We are running all tests and compiling all targets for every change. This is a major regression.
,
Jun 9 2016
I've posted https://codereview.chromium.org/2055853002/ which flips all of the builders that have matching GN trybots in the CQ that build "all" to build "gn_all" instead. I'll let y'all decide whether that should be landed. That said, I don't think this should be a P0, and I'm lowering the priority. The whole point of an SLO is that if you are meeting it, then things are okay. As far as I know, we are currently meeting the SLO, and I have no particular evidence that people are skipping the CQ because things are too slow and the world is breaking as a result. We've also been building 'all' w/o analyze on Linux for half a year. Obviously, it would be good to make analyze work with all for GN. It would also be good to make the SLO more aggressive if we can, because faster cycle times are good. But, what the priority of those things should be versus everything else we could possibly work on is a different discussion than whether this is a P0 thing that should cause everything else to be dropped. I'll also work on the idea sky@ suggested in https://bugs.chromium.org/p/chromium/issues/detail?id=555273#c5 to see if we can do better in the 'all' case (or if I can just fix this outright), but I have other broken things on my plate as well that seem to me to be higher priority.
,
Jun 9 2016
Let's discuss this in person when you get in. The SLO is not the only thing that matters. i.e. analyze's benefits are not just related to lowering CQ times. Running less tests for changes that don't impact a test also decreases observed flakiness. Again, getting analyze working correctly (even if it didn't impact SLO) for all bots was a lot of work. Regressing that is undoing a lot of work.
,
Jun 9 2016
We should fix this but it seems a 6-month-old bug is pretty much not P0, and I don't think needs to block the work we're doing this week.
,
Jun 9 2016
I agree doesn't have to be solve by tomorrow, but I put it as P0 because I think it's a big regression and it should be fixed asap. From Dirk's comment on a code review, he says this could be fixed in a day. AFAIK Windows's reverted GN conversion was also building all, so this bug would have an even bigger effect there because of slower building/linking/tests.
,
Jun 9 2016
> AFAIK Windows's reverted GN conversion was also building all, > so this bug would have an even bigger effect there because of > slower building/linking/tests. As we discussed in person, the CQ bots, apart from win_clang, were not building 'all'. win_clang was (and continues to) build all, but is a compile-only bot that is within the SLO, as described above.
,
Jun 9 2016
Brett said he'll take a look.
,
Jun 21 2016
,
Jul 1 2016
As suggested by laforge@ and a conversation w/ the monorail folks, I'm going to try tracking GN-Migration related issues by *just* using the Proj-GN-Migration label, and not using blocking/rollup bugs, so that we can use blocking for just tasks that truly need to be completed before other tasks can make progress.
,
Jul 1 2016
,
Jul 1 2016
I'm actually going to close this as a dup of 555273. |
|||||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||||
Comment 1 by dpranke@chromium.org
, Jun 7 2016Cc: thakis@chromium.org brettw@chromium.org brucedaw...@chromium.org
Components: -Infra Build
Labels: -Pri-3 Proj-GN-Migration OS-Windows Pri-1
Owner: thakis@chromium.org
Summary: win_clang shouldn't be building 'all' (was: Analyze returned targets for file not compiled)