Frequent infra failures in content_unittests on fuchsia_x64 trybot, blocking CQ
Issue description

https://ci.chromium.org/buildbot/tryserver.chromium.linux/fuchsia_x64/?limit=200

This bot's been intermittently failing content_unittests all day; see attached screenshot in case it's cleared itself up. Jobs are all taking about 20 minutes and have an exit code of 0.

Just a couple of the many examples:
https://ci.chromium.org/buildbot/tryserver.chromium.linux/fuchsia_x64/38242
https://ci.chromium.org/buildbot/tryserver.chromium.linux/fuchsia_x64/38226

Is this running up against some global timeout for the bot?
,
Dec 19 2017
https://ci.chromium.org/buildbot/tryserver.chromium.linux/fuchsia_x64/38161 seems to be an example where it was a fair chunk less than 20 minutes. LUCI says "18 mins 55 secs". The VM's last log is "1084.017" == 18.07s.
,
Dec 19 2017
...18.07 minutes?
,
Dec 19 2017
Aha, looking at https://logs.chromium.org/v/?s=chromium%2Fbb%2Ftryserver.chromium.linux%2Ffuchsia_x64%2F38161%2F%2B%2Frecipes%2Fsteps%2Fcontent_unittests__with_patch_%2F0%2Fstdout there's some sneaky output:

[01084.017] 03759.03785> Tests took 1071 seconds.
+--------------------------------------------------------------------------------------+
| End of shard 0 |
| Pending: 0.6s Duration: 1112.0s Bot: gce-trusty-dd4ff4d6-us-west1-c-brqz Exit: 0 |
+--------------------------------------------------------------------------------------+
Total duration: 1112.0s
Missing or invalid gtest JSON file: /tmp/tmpiZI6Im/0/output.json
ValueError: Unterminated string starting at: line 51722 column 7 (char 4489213)
Task ran but no result was found: shard 0 test output was missing or invalid
some shards did not complete: 0

I downloaded the swarming output:

$ python swarming.py collect -S chromium-swarm.appspot.com --task-output-dir=foo 3a854cf8377a7710

and indeed the file is truncated:

[scottmg:~/work/cr/src/tools/swarming_client ((4bd9152...))]$ tail foo/0/output.json
},
"LocalStorageContextMojoTest.InvalidVersion": {
  "file": "../../content/browser/dom_storage/local_storage_context_mojo_unittest.cc",
  "line": 368
},
"LocalStorageContextMojoTest.MetaDataClearedOnDelete": {
  "file": "../../content/browser/dom_storage/local_storage_context_mojo_unittest.cc",
  "line": 440
},
"Lo[scottmg:~/work/cr/src/tools/swarming_client ((4bd9152...))]$

I think this is a Fuchsia/Chromium problem, not an Infra issue.
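
For anyone wanting to reproduce the "Unterminated string" symptom locally, here is a minimal sketch of the kind of check that fails on a cut-off results file. The function name check_gtest_json and the tiny sample payload are made up for illustration; this is not the recipe's actual validation code, just a plain json.loads on a complete and on a truncated string.

```python
import json


def check_gtest_json(text):
    """Return (ok, detail) for a JSON results summary passed as a string.

    Hypothetical helper for illustration only; it is not part of the
    Chromium recipes or of swarming.py.
    """
    try:
        json.loads(text)
    except ValueError as e:  # json.JSONDecodeError is a ValueError subclass
        return False, str(e)
    return True, "ok"


# A small stand-in for output.json (made-up content).
complete = '{"tests": {"A.B": {"file": "a.cc", "line": 1}}}'

# Cut the text in the middle of a string, the way the real file ended at "Lo.
truncated = complete[: complete.index("file") + 2]

print(check_gtest_json(complete))
print(check_gtest_json(truncated))
```

Running the same json.loads against the downloaded foo/0/output.json should show whether a given shard's file was cut off.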
,
Dec 19 2017
(sorry, yeah, 18.07 minutes)
,
Dec 19 2017
I CQ'd a fix for what I believe is the problem. I'll check on success rate for this bot in the morning PST.
,
Dec 19 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/c552dad1797351aa6b54998e2a8fca21d0c07758

commit c552dad1797351aa6b54998e2a8fca21d0c07758
Author: Scott Graham <scottmg@chromium.org>
Date: Tue Dec 19 07:47:06 2017

fuchsia: Move runner shutdown to after flush delay

The delay before VM termination was accidentally moved in https://chromium-review.googlesource.com/c/chromium/src/+/734128 and so on fast shutdown the results .json file might not be flushed to the disk image. This could result in a truncated output.json (see linked bug) causing bot failures.

Also, unify delay to longer of current values as they're arbitrary anyway.

TBR=wez@chromium.org
Bug: 796026
Change-Id: Ie8986ec928e37d251571590c9f05b58109df0f1f
Reviewed-on: https://chromium-review.googlesource.com/833757
Reviewed-by: Scott Graham <scottmg@chromium.org>
Commit-Queue: Scott Graham <scottmg@chromium.org>
Cr-Commit-Position: refs/heads/master@{#524966}

[modify] https://crrev.com/c552dad1797351aa6b54998e2a8fca21d0c07758/build/fuchsia/runner_common.py
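
The ordering the commit message describes (wait for the flush delay, then terminate the VM) can be sketched roughly as below. This is only an illustration of the ordering, not the contents of runner_common.py: stop_vm_after_flush, terminate_vm and flush_delay_secs are hypothetical names, and the injectable sleep argument exists only so the order can be observed without actually waiting.

```python
import time


def stop_vm_after_flush(terminate_vm, flush_delay_secs, sleep=time.sleep):
    """Wait for the flush delay, then terminate the VM.

    Hypothetical sketch: all names here are assumptions, not the real
    runner code. The point is only that the delay comes first, so the
    guest has a chance to write the results file to the disk image.
    """
    sleep(flush_delay_secs)
    terminate_vm()


# Record the order of operations using stand-in callables.
events = []
stop_vm_after_flush(
    terminate_vm=lambda: events.append("terminate"),
    flush_delay_secs=5,
    sleep=lambda secs: events.append("sleep"),
)
print(events)
```

If the two calls were swapped, the VM could be gone before the results file reached the disk image, which matches the truncated output.json seen above.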
,
Dec 19 2017
I believe this is resolved, looking at https://ci.chromium.org/buildbot/tryserver.chromium.linux/fuchsia_x64/?limit=200.
,
Dec 19 2017
Thanks Scott for fixing this so quickly.
Comment 1 by scottmg@chromium.org
, Dec 19 2017