Findit should NEVER flag chrome-release-bot commits as culprits |
||||||
Issue descriptionWhat is the bug or feature: Findit flagged an automated VERSION bump as the culprit for a compile failure overnight: https://findit-for-me.appspot.com/waterfall/culprit?key=ag9zfmZpbmRpdC1mb3ItbWVyRAsSDVdmU3VzcGVjdGVkQ0wiMWNocm9taXVtLzdlNDEwNjBkODA3OTVhYjAyYzc0MGRkNWFiZThjOTk5MjZlZTk2MmEM which caused the sheriff to revert the commit: https://chromium.googlesource.com/chromium/src.git/+/1db6740622c361d982f8e708ac4f29fe0f7441d4 This is a TERRIBLE idea, and can cause all sorts of havoc with Chrome's entire release pipeline. As noted in the release CL, the VERSION bump wasn't even the cause of the problem, just something that triggered a bug in another piece of code: https://chromium-review.googlesource.com/c/chromium/src/+/803075#message-0e904390151eee16b8ae3ccd9fb35ce45fc49677 To sum up, chrome-release-bot commits are a normal, expected, DAILY part of Chrome's development and release process, and unless something goes horribly, horribly wrong (e.g. a file was somehow mangled before committing, which should be pretty obvious), they will NEVER be the true cause of a build failure, and should never be automatically flagged or suggested for revert.
,
Dec 1 2017
FYI, findit just tried to blame the reland for some new failures: https://findit-for-me.appspot.com/waterfall/culprit?key=ag9zfmZpbmRpdC1mb3ItbWVyRAsSDVdmU3VzcGVjdGVkQ0wiMWNocm9taXVtL2VhMmU1ZmUyNDU3ZWMyNmY4ZTAwNmQwMTg4OWZjYzI5N2I5ZjY5NTIM
,
Dec 1 2017
,
Dec 1 2017
,
Dec 1 2017
The following revision refers to this bug: https://chromium.googlesource.com/infra/infra/+/8381d259b0aa367bb27db17de3d29df86b22cf79 commit 8381d259b0aa367bb27db17de3d29df86b22cf79 Author: Chan <chanli@chromium.org> Date: Fri Dec 01 19:39:52 2017 [Findit] Make Findit not flag changes by certain accounts as culprit. Bug: 791050 Change-Id: I8a715b484892b459800d03f602c5e016f0b08952 Reviewed-on: https://chromium-review.googlesource.com/804197 Commit-Queue: Chan Li <chanli@chromium.org> Reviewed-by: Shuotao Gao <stgao@chromium.org> [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/services/compile_failure/test/compile_try_job_test.py [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/services/test/git_test.py [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/services/build_failure_analysis.py [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/services/test_failure/test/test_try_job_test.py [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/services/test/try_job_test.py [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/services/compile_failure/compile_try_job.py [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/services/git.py [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/services/test/build_failure_analysis_test.py [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/services/compile_failure/test/compile_failure_analysis_test.py [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/common/constants.py [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/services/gerrit.py [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/services/try_job.py [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/services/test_failure/test/test_failure_analysis_test.py [modify] https://crrev.com/8381d259b0aa367bb27db17de3d29df86b22cf79/appengine/findit/services/test_failure/test_try_job.py
,
Dec 1 2017
I tried to sort out what have happened as below. IMHO, there might be some changes we could make: * Make Findit not to flag chrome-release-bot's CLs as suspects or culprits. CL landed already as above and to be deployed soon today. * In the commit message of chrome-release-bot's CLs, make it clear that the CL should never be reverted and the failures should be fixed instead. In this case, sheriff tried to revert the faulty CL, but they failed to due to merge conflict and decided to revert the Chrome version bump instead. Findit's auto-creation of the reverting CL could have make sheriff more likely to revert. But even without Findit's reverting CL, sheriffs could do the same manually as well. * Not sure if possible, but changes that rely on Chrome Version should be landed after Chrome Version is bumped to the expected value. In that way, those changes will be tested in CQ and prevent the headaches. The last two bullets are just my two cents :) ---------------------(All PST time)--------------------------- Nov 23, 2017, 03:02:30 A faulty policy template for chrome.*:65- was committed as https://chromium-review.googlesource.com/c/chromium/src/+/781634 Nov 30, 2017, 22:55:10 Chrome Version was bumped from 64 to 65 by chrome-release-bot in https://chromium-review.googlesource.com/c/chromium/src/+/803075 * The CL was landed without going through CQ. This is unlike other auto-roller commits. (A lot of compile failures are *triggered* as shown in https://screenshot.googleplex.com/zUm487K5mR3.png) Nov 30, 2017 23:30:52 Findit finished its first analysis, and auto-created a reverting CL of the Chrome Version bump https://chromium-review.googlesource.com/c/chromium/src/+/803234 * Not sure why no sheriffs were added as reviewers, but this reverting CL should have shown up on sheriff-o-matic UI. Nov 30, 2017 23:40:30 Sheriff created a reverting CL of the faulty policy template and sent it to CQ https://chromium-review.googlesource.com/c/chromium/src/+/803156 Nov 30, 2017 23:45:22 Due to merge conflict, CQ failed for the reverting CL of the faulty policy. Nov 30, 2017 23:48:31 Sheriff decided to revert the Chrome Version bump https://chromium-review.googlesource.com/c/chromium/src/+/781634#message-45ade6d8e426b683283032efcd67e16eec3e352f Nov 30, 2017 23:49:32 Sheriff submitted Findit's reverting CL of the Chrome Version bump. (Compile failures started to disappear after the revert as shown in https://screenshot.googleplex.com/zUm487K5mR3.png) Nov 30, 2017 23:53:10 Sheriff commented findings and decision to revert in the CL of Chrome Version bump. Nov 30, 2017 23:53:56 tzik@ created a hotfix for the faulty policy template https://chromium-review.googlesource.com/c/chromium/src/+/802670 Dec 01, 2017 00:02:21 tzik@'s hotfix landed. Dec 01, 2017 00:11:38 Sheriff posted the hotfix in the CL of Chrome Version bump, and didn't reland it because Sheriff is not sure whether a reland will break some automation. Dec 01, 2017 07:43:42 The Chrome Version bump was relanded in https://chromium-review.googlesource.com/c/chromium/src/+/803734 (Test failures in PolicyPrefsTest.PolicyToPrefsMapping are *triggered* as shown in https://screenshot.googleplex.com/p7PnTbCSrvn.png) Dec 01, 2017 09:48:59 Findit finished its first analysis, and alerted on the relanding CL https://chromium-review.googlesource.com/c/chromium/src/+/803734#message-4fe707ee7fd2ba837f035b89658e3d63f371d5f6 * Not sure why, but the finding is not showing up on https://screenshot.googleplex.com/crV35We40H9.png Dec 01, 2017 11:41:00 Test failures in PolicyPrefsTest.PolicyToPrefsMapping are still there.
,
Dec 2 2017
,
Dec 2 2017
Change was deployed.
,
Dec 2 2017
Once chrome-release-bot@ makes the needed change as in the blocking bug, we will remove it from the backlist.
,
Sep 8
No longer on the Chrome team, e-mail me @google.com if any attention still required from me here, otherwise good luck! |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by st...@chromium.org
, Dec 1 2017Status: Assigned (was: Unconfirmed)