Try bots failing to apply Gerrit patches: "name consists only of disallowed characters"

Issue description

Examples:
https://build.chromium.org/p/tryserver.blink/builders/linux_trusty_blink_rel/builds/9554
https://build.chromium.org/p/tryserver.chromium.android/builders/android_clang_dbg_recipe/builds/269949
http://build.chromium.org/p/tryserver.chromium.linux/builders/chromeos_amd64-generic_chromium_compile_only_ng/builds/340444
http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_compile_dbg_ng/builds/297710
http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_tsan_rel_ng/builds/75532
https://build.chromium.org/p/tryserver.chromium.android/builders/cast_shell_android/builds/268995

Also from https://chromium-review.googlesource.com/c/503472
,
May 16 2017
It looks like the bots have figured out how to make progress again, triage as you see fit. :)
,
May 16 2017
I'm seeing this on another CL now https://chromium-review.googlesource.com/c/506519/ :(
,
May 16 2017
,
May 16 2017
https://chromium-review.googlesource.com/c/503472 got most bots to work but continues to fail to apply the patch on a few bots for 6 tries in a row so far:
http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_compile_dbg_ng/builds/297798
http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_asan_rel_ng/builds/371893
http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_layout_tests_slimming_paint_v2/builds/4481
http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presubmit/builds/438405
,
May 16 2017
I think the real error here is during the rebase itself, before the --abort failures:

===Running git checkout FETCH_HEAD (attempt #1)===
In directory: src
Previous HEAD position was 6791009b216d... DevTools: console is missing when debugging Node
HEAD is now at edefce3444d4... Remove cullRect() from PaintOpBuffer.
===Succeeded in 0.1 mins===
===Rebasing===
===Running git checkout -b tmp/d47863e1509945219f1a2969a53a8dfc (attempt #1)===
In directory: src
Switched to a new branch 'tmp/d47863e1509945219f1a2969a53a8dfc'
===Succeeded in 0.0 mins===
===Running git rebase 6791009b216d80e5e184cf074ff863c345399000 (attempt #1)===
In directory: src
First, rewinding head to replay your work on top of it...
fatal: name consists only of disallowed characters: "
===Failed in 0.1 mins===
===Running git rebase --abort (attempt #1)===
In directory: src
No rebase in progress?
===Failed in 0.0 mins===
<retries>

During the first rebase, it prints 'fatal: name consists only of disallowed characters: "'. That lone trailing double quote is worrisome. This seems to be complaining about the contents of "git config user.name", which it needs to access during the rebase in order to create the new commit object.

It's not just your CL that's failing. It's lots of runs on particular machines. Looks like maybe something bad got committed and reverted, but some machines got poisoned git configs in the mean time?

CCing some Recipes folks who might know how the bot_update module is going wrong here.
,
May 16 2017
Issue 722864 has been merged into this issue.
,
May 16 2017
,
May 16 2017
Taking this as a trooper. I'll see if I can log in to one of the bots and see what's up.
,
May 16 2017
Meanwhile, the bot I was going to look at passed the next build: https://build.chromium.org/p/tryserver.chromium.linux/buildslaves/slave1338-c4
,
May 16 2017
Does this ring any bells?

@@@STEP_LOG_LINE@exception@Traceback (most recent call last):@@@
@@@STEP_LINK@logdog-->exception@https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Ftryserver.chromium.linux%2Fchromium_presubmit%2F438336%2F%2B%2Frecipes%2Fsteps%2FUncaught_Exception%2F0%2Flogs%2Fexception%2F0@@@
@@@STEP_LOG_LINE@exception@ File "/b/build/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/run.py", line 373, in _old_run@@@
@@@STEP_LOG_LINE@exception@ recipe_result = recipe_script.run(api, properties)@@@
@@@STEP_LOG_LINE@exception@ File "/b/build/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/loader.py", line 98, in run@@@
@@@STEP_LOG_LINE@exception@ self.run_steps, properties, self.PROPERTIES, api=api)@@@
@@@STEP_LOG_LINE@exception@ File "/b/build/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/loader.py", line 619, in invoke_with_properties@@@
@@@STEP_LOG_LINE@exception@ **additional_args)@@@
@@@STEP_LOG_LINE@exception@ File "/b/build/scripts/slave/.recipe_deps/recipe_engine/recipe_engine/loader.py", line 580, in _invoke_with_properties@@@
@@@STEP_LOG_LINE@exception@ return callable_obj(*props, **additional_args)@@@
@@@STEP_LOG_LINE@exception@ File "/b/build/scripts/slave/recipes/run_presubmit.py", line 123, in RunSteps@@@
@@@STEP_LOG_LINE@exception@ return _RunStepsInternal(api)@@@
@@@STEP_LOG_LINE@exception@ File "/b/build/scripts/slave/recipes/run_presubmit.py", line 48, in _RunStepsInternal@@@
@@@STEP_LOG_LINE@exception@ upstream = bot_update_step.json.output['properties'].get(@@@
@@@STEP_LOG_LINE@exception@KeyError: 'properties'@@@
@@@STEP_LOG_END@exception@@@

https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Ftryserver.chromium.linux%2Fchromium_presubmit%2F438336%2F%2B%2Frecipes%2Fstdout
,
May 16 2017
FWIW, a passing build on the same slave doesn't have this crash: https://luci-milo.appspot.com/buildbot/tryserver.chromium.linux/chromium_presubmit/438444 (see steps > [stdout])
,
May 16 2017
I think that uncaught exception is just a consequence of bot_update not succeeding correctly and not producing the right output json document.
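
For illustration, a recipe can read bot_update's output defensively so that a failed checkout does not surface as a secondary KeyError. This is only a minimal sketch, assuming the JSON output is a plain dict; the helper name get_bot_update_property and the 'got_revision' key in the usage line are hypothetical and not taken from run_presubmit.py.

```python
def get_bot_update_property(json_output, key, default=None):
    """Return one property from bot_update's JSON output without raising.

    json_output is the parsed output document. It may be None, or may lack
    the 'properties' key entirely, when bot_update failed before it could
    write a complete result.
    """
    if not isinstance(json_output, dict):
        return default
    # Treat a missing or null 'properties' entry as an empty mapping.
    properties = json_output.get('properties') or {}
    return properties.get(key, default)


# Hypothetical usage: the key name here is an assumption.
upstream = get_bot_update_property({}, 'got_revision')
```

A caller would still need to treat a None result as a failed bot_update step; the sketch only keeps the original rebase error from being hidden behind the traceback.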
,
May 16 2017
And I think the bots succeed on builds that are trying Rietveld patches.
,
May 16 2017
I checked a few more builds, and indeed the same exception is causing the "rebase" error, while green runs don't have that exception. Looking through the bot_update code to figure out...
,
May 16 2017
Indeed, it seems like a gerrit issue - thanks, Aaron!
,
May 16 2017
Or, rather, a recipes issue related to Gerrit patches.
,
May 16 2017
,
May 16 2017
Basically, ensure_checkout fails somewhere in rebase (because Gerrit already did the rebase?) and fails to construct the step_result object. The above crash is a consequence of that (thx agable@ for helping figure this out).
,
May 16 2017
The only recent change in bot_update that might affect things seems to be this: https://chromium-review.googlesource.com/c/501849/ Not yet sure if it's related.
,
May 16 2017
Gerrit shouldn't have rebased; bot_update explicitly doesn't ask gerrit to rebase (even though it could) since we decided that presubmit (esp dry run) shouldn't modify the patch in question. The rebase failure seems due to the fact that the git client can't get a valid value for "user.name" when constructing the header metadata of the rebase commit: https://github.com/git/git/blob/e2cb6ab84c94f147f1259260961513b40c36108a/ident.c#L402
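
To make the check concrete, here is a rough Python approximation of the test git applies to the name in ident.c. The character set below is reproduced from memory and should be treated as an assumption; the linked source is authoritative. The function names are hypothetical.

```python
# Approximation of git's "crud" characters: anything at or below ASCII space,
# plus a handful of punctuation marks. Assumption: this mirrors ident.c.
_CRUD_CHARS = set('.,:;<>"\\\'')


def is_crud(ch):
    """Return True if git would strip this character from an ident name."""
    return ord(ch) <= 32 or ch in _CRUD_CHARS


def has_usable_name(name):
    """Return True if at least one character would survive the stripping.

    A name for which this returns False is the kind that triggers
    'name consists only of disallowed characters'.
    """
    return any(not is_crud(ch) for ch in name)
```

Under this approximation, a name consisting of a single double quote (as printed in the failing log) has no usable characters, while an ordinary name such as "chrome-bot" passes.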
,
May 16 2017
Issue 722905 has been merged into this issue.
,
May 16 2017
,
May 16 2017
slave505-c4$ cd /b/c/b/linux_layout/src
slave505-c4$ git config user.name
slave505-c4$

No output. What's not clear to me is why this only started failing now; why was this rebase succeeding for the past weeks/months of dogfood and infra usage?
,
May 16 2017
FWIW this is failing on classic Infra waterfalls, too. Maybe something to do with either Git version (are we pushing a new Git?) or a Gerrit backend change?
,
May 16 2017
6:58am slave503-c4 fails: https://build.chromium.org/p/tryserver.blink/builders/linux_trusty_blink_rel/builds/9542
7:10am slave503-c4 succeeds: https://build.chromium.org/p/tryserver.blink/builders/linux_trusty_blink_rel/builds/9546

Truly flaky, even on the individual host level, even when confined to just Gerrit patches
,
May 16 2017
Disregard the previous comment; the success was a Rietveld patch.
,
May 16 2017
tandrii has a possible fix here: https://chromium-review.googlesource.com/c/505495/ Unfortunately it is failing all of the tests, for similar reasons. Although the patch applies (because depot_tools has no intervening commits to rebase on top of), the tests themselves are failing, many with the same error message as the rebase.
,
May 16 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/tools/depot_tools/+/90e958b3b882b4d0c4457f8c86398bbde10a5208

commit 90e958b3b882b4d0c4457f8c86398bbde10a5208
Author: Andrii Shyshkalov <tandrii@chromium.org>
Date: Tue May 16 18:52:45 2017

bot_update: set fake user.{name,email} when running rebase.

R=agable@chromium.org,sergeyberezin@chromium.org,iannucci@chromium.org
BUG= 722853
Change-Id: I9bd8c259a87cc65e320e99d9c5e18838b970d543
Reviewed-on: https://chromium-review.googlesource.com/505495
Reviewed-by: Sergey Berezin <sergeyberezin@chromium.org>
Reviewed-by: Aaron Gable <agable@chromium.org>

[modify] https://crrev.com/90e958b3b882b4d0c4457f8c86398bbde10a5208/recipes/recipe_modules/bot_update/resources/bot_update.py
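
One way to supply an identity for a single rebase, without touching the bot's git config, is git's -c option. The sketch below is not the code from the commit above; the function names and the placeholder name/email values are hypothetical.

```python
import subprocess


def build_rebase_cmd(upstream,
                     name='placeholder-bot',              # hypothetical value
                     email='placeholder-bot@example.com'):  # hypothetical value
    """Build a 'git rebase' command line with a one-off committer identity.

    'git -c key=value' overrides configuration for that invocation only,
    so the rebase does not depend on user.name being set on the machine.
    """
    return [
        'git',
        '-c', 'user.name=%s' % name,
        '-c', 'user.email=%s' % email,
        'rebase', upstream,
    ]


def rebase(upstream, cwd):
    """Run the rebase in the given checkout and return git's exit code."""
    return subprocess.call(build_rebase_cmd(upstream), cwd=cwd)
```

Keeping the command construction separate from the subprocess call makes it possible to check the arguments without a git binary present.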
,
May 16 2017
The error message being printed was only introduced in git 2.13: https://github.com/git/git/commit/13b9a24e58f736b70e48846cf7e5b7cfa66c3fec Git 2.13 was only released a week ago: https://github.com/blog/2360-git-2-13-has-been-released It looks like the version of git updated on our GCE bots without warning. It also looks like we're going to have to deal with this permanently going forward, everywhere, whenever we get around to rolling up to 2.13 on other platforms that we do manage to control.
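
A quick way to tell which bots are affected is to parse the output of "git --version". The sketch below assumes the usual "git version X.Y.Z" output format and takes the 2.13 threshold from the comment above; the function names are hypothetical.

```python
import re


def parse_git_version(version_output):
    """Parse 'git version X.Y[.Z]' into a tuple of ints, or None if no match."""
    match = re.search(r'git version (\d+)\.(\d+)(?:\.(\d+))?', version_output)
    if not match:
        return None
    # A missing patch component is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def has_strict_ident_check(version_output):
    """Return True for git 2.13 or newer.

    Assumption: the stricter name check applies from 2.13 onward, as the
    thread reports.
    """
    version = parse_git_version(version_output)
    return version is not None and version >= (2, 13, 0)
```

The input would typically come from subprocess.check_output(['git', '--version']) decoded to text.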
,
May 16 2017
2.13 was pushed to the git-core ppa 9 hours ago: https://launchpad.net/~git-core/+archive/ubuntu/ppa
,
May 16 2017
Wait, are we updating git from the git-core ppa on GCE slave start?
,
May 16 2017
We shouldn't be, updating git is only done by either puppet, or on image creation.
,
May 16 2017
friedman@ says that our puppet config was pinning git-core at 'latest'. Working with him to fix that now.
,
May 16 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/infra/puppet/+/39645221841978946a21b43e376be0b126efb58a commit 39645221841978946a21b43e376be0b126efb58a Author: Elliott Friedman <friedman@google.com> Date: Tue May 16 19:15:18 2017
,
May 16 2017
The change above moves the pin from 'latest' to 'installed'. This will prevent any bots which haven't updated yet (unlikely) from getting 2.13. It will also prevent us from unexpectedly upgrading the whole fleet to 2.14, whenever that comes out. It won't, however, prevent *new* bots from upgrading to 2.14. So friedman is also working on finding .debs of 2.12 (the previous stable), and adding those to our private ppa, so that we can then pin an exact version and get puppet to do a rollback everywhere. Finally, the work that iannucci/nodir/I are doing on serving git via cipd in depot_tools will make the system version of git irrelevant.
,
May 16 2017
v8 custom recipe is failing for the same reason: https://build.chromium.org/p/tryserver.v8/builders/v8_node_linux64_rel/builds/2022/steps/update%20v8/logs/stdio
,
May 16 2017
Issue 722937 has been merged into this issue.
,
May 16 2017
I found a 2.11 for trusty amd64 which I added to the internal infra apt repo. Testing it in puppet now ... CL soon.
,
May 16 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/infra/puppet/+/3b180089259d8701806d5dfb27f40e0dd78ea1e6 commit 3b180089259d8701806d5dfb27f40e0dd78ea1e6 Author: Elliott Friedman <friedman@google.com> Date: Tue May 16 20:02:15 2017
,
May 16 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/infra/puppet/+/b8e53da449b7938fed34cc1bfdec892f104ddf96 commit b8e53da449b7938fed34cc1bfdec892f104ddf96 Author: Elliott Friedman <friedman@google.com> Date: Tue May 16 20:14:29 2017
,
May 16 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/infra/puppet/+/3257e7fa3792a38e874ca023c0eff045b4a2b16b commit 3257e7fa3792a38e874ca023c0eff045b4a2b16b Author: Elliott Friedman <friedman@google.com> Date: Tue May 16 20:18:45 2017
,
May 16 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/infra/puppet/+/d31de7370497b96a41fe2f39e5ff7753e7b925d8 commit d31de7370497b96a41fe2f39e5ff7753e7b925d8 Author: Elliott Friedman <friedman@google.com> Date: Tue May 16 20:28:20 2017
,
May 16 2017
Ok, 2.11 is rolling out now.
,
May 16 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/infra/puppet/+/d48770b0e4bbe8d8dd9f4591b4178322fdd7e6f3 commit d48770b0e4bbe8d8dd9f4591b4178322fdd7e6f3 Author: Elliott Friedman <friedman@google.com> Date: Tue May 16 20:41:50 2017
,
May 16 2017
,
May 16 2017
We are seeing successful tasks: https://luci-milo.appspot.com/swarming/task/362bf540e1923e10
,
May 16 2017
,
May 16 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/infra/puppet/+/da05e45b6849a033bf1ddfc4e98679a1bd10b756 commit da05e45b6849a033bf1ddfc4e98679a1bd10b756 Author: Elliott Friedman <friedman@google.com> Date: Tue May 16 21:59:50 2017
,
May 16 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/infra/puppet/+/414cef9e11b7fb1656c18fee5f7f227b42d4c2e2 commit 414cef9e11b7fb1656c18fee5f7f227b42d4c2e2 Author: Elliott Friedman <friedman@google.com> Date: Tue May 16 22:01:40 2017
,
May 16 2017
Some bots (notably, android bots which are in the process of doing an important official build) are failing to download the correctly-specified package from our PPA:

$ sudo puppet agent -t
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Fact file /var/lib/puppet/facts.d/windows_defender.ps1 was parsed but returned an empty data set
Info: Caching catalog for lin64-17-m0.official.chromium.org
Info: Applying configuration version '1494974916'
Notice: /Stage[main]/Chrome_infra::Packages::Debian/Exec[git_apt_update]/returns: executed successfully
Error: Could not update: Execution of '/usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold --force-yes install git=1:2.11.0-2~ppa0~ubuntu14.04.1' returned 100: Reading package lists...
Building dependency tree...
Reading state information...
E: Version '1:2.11.0-2~ppa0~ubuntu14.04.1' for 'git' was not found
Error: /Stage[main]/Chrome_infra::Packages::Debian/Package[git]/ensure: change from absent to 1:2.11.0-2~ppa0~ubuntu14.04.1 failed: Could not update: Execution of '/usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold --force-yes install git=1:2.11.0-2~ppa0~ubuntu14.04.1' returned 100: Reading package lists...
Building dependency tree...
Reading state information...
E: Version '1:2.11.0-2~ppa0~ubuntu14.04.1' for 'git' was not found
Notice: Finished catalog run in 10.23 seconds

bevc is working on making sure the ppa is serving the right stuff.
,
May 16 2017
(The result of the above is that those bots have no git binary at all, which is obviously a problem.) In the meantime, the number of bots with 2.13 installed has been falling; I expect it has hit zero right around now. Will check with friedman to confirm the number when he's back from his meeting.
,
May 16 2017
,
May 16 2017
Bots in the 0 vlan have been recovered. Looks like some other bots (e.g. in b-lab) also lost their git binaries at the same time. Working on recovering those.
,
May 17 2017
draft postmortem here: https://docs.google.com/document/d/18lQlAP5fmuqbLxcsQJfKkDw-c7awzbSymjRPfHkJ89M
,
May 17 2017
The following revision refers to this bug:
https://chromium.googlesource.com/v8/v8.git/+/68b81ff4fd7c1669d0284bfd92aaeeae8b1cf1df

commit 68b81ff4fd7c1669d0284bfd92aaeeae8b1cf1df
Author: Andrii Shyshkalov <tandrii@chromium.org>
Date: Wed May 17 07:48:25 2017

Fix update_node tool to work around git 2.14.

Example failure:
https://uberchromegw.corp.google.com/i/tryserver.v8/builders/v8_node_linux64_rel/builds/2022/steps/update%20v8/logs/stdio

R=machenbach@chromium.org
Bug: chromium:722853
Change-Id: I5483dd7e09ac20fce214cd90ca949118fe1e52b0
Reviewed-on: https://chromium-review.googlesource.com/505622
Commit-Queue: Andrii Shyshkalov <tandrii@chromium.org>
Reviewed-by: Michael Achenbach <machenbach@chromium.org>
Cr-Commit-Position: refs/heads/master@{#45359}

[modify] https://crrev.com/68b81ff4fd7c1669d0284bfd92aaeeae8b1cf1df/tools/release/update_node.py
,
May 17 2017
,
May 17 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/infra/puppet/+/3b0f13dbe8105db70b0b7ec466c94c65eac91140 commit 3b0f13dbe8105db70b0b7ec466c94c65eac91140 Author: Elliott Friedman <friedman@google.com> Date: Wed May 17 22:59:53 2017
,
May 17 2017
The following revision refers to this bug: https://chrome-internal.googlesource.com/infra/puppet/+/d5bab28bc08ecd38db18451af5317757089d2200 commit d5bab28bc08ecd38db18451af5317757089d2200 Author: Elliott Friedman <friedman@google.com> Date: Wed May 17 23:35:46 2017
Comment 1 by danakj@chromium.org
, May 16 2017