Issue metadata
Sign in to add a comment
|
bot_update should fail to apply the patch in tryjobs that ran after Gerrit CL has landed |
||||||||||||||||||||||||
Issue descriptionOn CL https://chromium-review.googlesource.com/c/541221/ (patch 8) the layout test trybot ran twice inconsistently. The first time, it compiled, ran the tests and had a lot of failures. The second time, analyze decided that no compile was necessary. Though the CL indeed breaks layout tests. The two runs: https://build.chromium.org/p/tryserver.v8/builders/v8_linux_blink_rel/builds/22894 https://build.chromium.org/p/tryserver.v8/builders/v8_linux_blink_rel/builds/22904 In the second run, the problem seems to occur in "git diff to analyze patch", it didn't show any files. Though it seems to be equally called and also the two bot_update steps seem to do the right thing (just had a superficial look).
,
Jun 27 2017
Note that those two runs were on different bots, so this wasn't a case of the first run poisoning the checkout or anything like that.
,
Jun 27 2017
No, but maybe whatever ran before the second build on _its bot_ poisoned the checkout?
,
Jun 27 2017
This was the build that ran before on the same bot: https://build.chromium.org/p/tryserver.v8/builders/v8_linux_blink_rel/builds/22900 Also nothing out of the ordinary as far as I can see...
,
Jun 27 2017
I SSHed in and couldn't find anything weird with the checkout. I tried rebasing the checkout on the bot though. The git diff to analyze patch steps between the two runs look very similar. What's odd to me is that after step 15, the second bot seems to skip a ton of the steps that the first bot runs.
,
Jun 27 2017
Ignore my comment before about the git diff steps looking similar. I see the missing files now.
,
Jun 28 2017
Every difference after the "git diff" is a result of those missing files. I'm afraid this might not be very actionable, but still scary. Are there more sanity checks that we could add as a shot in the dark?
,
Jun 28 2017
Another example: https://build.chromium.org/p/tryserver.v8/builders/v8_linux_blink_rel/builds/22798 https://build.chromium.org/p/tryserver.v8/builders/v8_linux_blink_rel/builds/22961 Both from same CL/patchset. The second one had an empty "git diff". It was on a different host than the example from the description.
,
Jun 29 2017
It looks like the second build (22961) happened *after* the change landed (v8 was synced to 46251, but the change landed in 46249). v8_linux_blink_rel is an experimental bot, and as such, they don't block the commit queue. Depending on busy said experimental bot might be, this can mean that it doesn't get a chance to run until after the change might've landed. So, the empty diff is to be expected. I don't think there's an easy way to avoid this if you are capacity-constrained.
,
Jun 30 2017
Nice analysis, thanks Dirk! Yes, it all makes sense now. +tandrii for this CQ use case. This has the extra complication that a V8 patch is applied on a Chromium bot. CQ doesn't know anything about that. But I think we could maybe teach the recipe to detect that the CL has already landed? I assume 'git diff == no files' already gives us that answer? Maybe we could in this case bale out immediately with a different message. @tandrii, is CQ possibly re-triggering a failed experimental bot after the CL has landed? If that were possible, maybe we could avoid that. If not, then the second run must have been triggered just before the patch landed. And the patch managed to land before bot_update got to it. There's probably a max-1-minute window where that can happen.
,
Jul 3 2017
I think apply_issue for Rietveld patches was able to detect this is CL was marked as closed (IIRC). Current rebase logic for Gerrit patches in bot_update isn't that smart: it'd fail iff patch doesn't rebase cleanly. Hence you the observed results. I think best way to implement this is to add extra step: = git checkout refs/patch + # check if has been CL merged using Gerrit API. Race conditions still possible due to eventual consistency, but should be rare. + # For total protection, maybe check that commit message stays the same after rebase? = git rebase RETCH_HEAD onto refs/head/branch Also, IMO, CQ is WAI here, and since one can trigger tryjobs with git cl try with the same outcomes, I'm removing CQ component.
,
Jul 3 2017
,
Jul 5 2017
At this point, this bug is now a duplicate of 648343 |
|||||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||||
Comment 1 by machenb...@chromium.org
, Jun 27 2017