Log thread backtraces for timed-out test job processes before killing them.
Issue description
We still see individual test job processes time out fairly regularly (about 10% of base_unittests runs under the Fuchsia/x64 FYI bot, for example), with behaviour that looks suspiciously like a deadlock or hang rather than things just taking too long. The TestLauncher logs a line every 15s to indicate that it is waiting for output, which means the system isn't completely locked up. We could therefore have the TestLauncher log backtraces of all the threads in the job process once it has timed out and is about to be torn down.
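As a rough illustration of the proposal, here is a minimal POSIX sketch of a launcher that runs a diagnostic hook on a timed-out child before killing it. It is not the TestLauncher code and does not use Fuchsia APIs; `RunWithTimeout`, `RunResult` and the `on_timeout` callback are hypothetical names invented for this example, and the hook is simply wherever a real launcher would dump the child's thread backtraces.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <chrono>
#include <functional>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>

// Hypothetical result type for this sketch.
struct RunResult {
  bool timed_out = false;
  int wait_status = 0;  // Raw status as reported by waitpid().
};

// Runs |argv| as a child process. If it has not exited within |timeout|,
// calls |on_timeout| (the place to log thread backtraces) and only then
// kills and reaps the child.
RunResult RunWithTimeout(const std::vector<std::string>& argv,
                         std::chrono::milliseconds timeout,
                         const std::function<void(pid_t)>& on_timeout) {
  assert(!argv.empty());

  // Build the exec argument array before forking.
  std::vector<char*> args;
  for (const std::string& arg : argv)
    args.push_back(const_cast<char*>(arg.c_str()));
  args.push_back(nullptr);

  const pid_t pid = fork();
  if (pid < 0)
    throw std::runtime_error("fork() failed");
  if (pid == 0) {
    execvp(args[0], args.data());
    _exit(127);  // Only reached if exec failed.
  }

  RunResult result;
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  for (;;) {
    const pid_t waited = waitpid(pid, &result.wait_status, WNOHANG);
    if (waited == pid)
      return result;  // Child exited on its own.
    if (waited < 0 && errno != EINTR)
      throw std::runtime_error("waitpid() failed");
    if (std::chrono::steady_clock::now() >= deadline)
      break;
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }

  // Timed out: collect diagnostics while the child is still alive.
  result.timed_out = true;
  if (on_timeout)
    on_timeout(pid);

  kill(pid, SIGKILL);
  while (waitpid(pid, &result.wait_status, 0) < 0 && errno == EINTR) {
  }
  return result;
}
```

The point of the ordering is that the hook runs before `kill()`, so the hung process's threads are still present to be inspected.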
Jan 6 2018
This is desired to debug issue 793412 and issue 755282.
Jan 9 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/6915579f3e38e2be1c3763653ee0ca75624f171d

commit 6915579f3e38e2be1c3763653ee0ca75624f171d
Author: Wez <wez@chromium.org>
Date: Tue Jan 09 03:17:36 2018

Fix PreExecHook test to actually verify that the hook is run.

Previously the test would pass under platforms on which the |LaunchOptions::pre_exec_delegate| parameter was ignored (e.g. Fuchsia).

(Also disables the test under Fuchsia)

Bug: 799268
Change-Id: I7a307010739faa3c1631897d24057014aebdfe84
Reviewed-on: https://chromium-review.googlesource.com/852980
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Commit-Queue: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#527886}

[modify] https://crrev.com/6915579f3e38e2be1c3763653ee0ca75624f171d/base/process/process_util_unittest.cc
Jan 9 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/dc9eb2b1266e19bbfd4b9c69ff288750fef4004c

commit dc9eb2b1266e19bbfd4b9c69ff288750fef4004c
Author: Wez <wez@chromium.org>
Date: Tue Jan 09 07:25:07 2018

Update TestLauncher to use Fuchsia jobs in place of POSIX process jobs.

TestLauncher was previously using the |LaunchOptions::new_process_group| to request LaunchProcess() to isolate each test job into a separate group, for easy process cleanup. Since the |new_process_group| was not implemented in the Fuchsia implementation of LaunchProcess, this had no effect besides causing errors to be logged when we attempted to kill(-pid).

We remove |new_process_group| and update TestLauncher to use native Fuchsia jobs to group test job processes.

Bug: 799268, 755282
Change-Id: Ia96cd77c5b4066d6da522cc7fe0e4e427229dac3
Reviewed-on: https://chromium-review.googlesource.com/852559
Commit-Queue: Wez <wez@chromium.org>
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Cr-Commit-Position: refs/heads/master@{#527925}

[modify] https://crrev.com/dc9eb2b1266e19bbfd4b9c69ff288750fef4004c/base/process/launch.h
[modify] https://crrev.com/dc9eb2b1266e19bbfd4b9c69ff288750fef4004c/base/test/launcher/test_launcher.cc
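For context, the POSIX pattern that the commit message refers to (a new process group per test job, cleaned up with kill(-pid)) looks roughly like the sketch below. This is plain POSIX, not Chromium's LaunchProcess() and not the Fuchsia job API; `LaunchInNewProcessGroup` and `KillProcessGroup` are hypothetical helper names used only for illustration.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Launches |argv| as the leader of a new process group and returns its pid,
// which is also the id of the new group.
pid_t LaunchInNewProcessGroup(const std::vector<std::string>& argv) {
  assert(!argv.empty());

  std::vector<char*> args;
  for (const std::string& arg : argv)
    args.push_back(const_cast<char*>(arg.c_str()));
  args.push_back(nullptr);

  const pid_t pid = fork();
  if (pid < 0)
    throw std::runtime_error("fork() failed");
  if (pid == 0) {
    setpgid(0, 0);  // Child: become leader of a new group.
    execvp(args[0], args.data());
    _exit(127);  // Only reached if exec failed.
  }

  // Parent: set the group as well, so it exists whichever side runs first.
  // This call can fail harmlessly if the child has already exec'd.
  setpgid(pid, pid);
  return pid;
}

// Sends SIGKILL to every process in group |pgid|. A negative pid argument
// to kill() addresses the whole process group.
bool KillProcessGroup(pid_t pgid) {
  return kill(-pgid, SIGKILL) == 0;
}
```

Any processes the test job spawns inherit the group, so a single `kill(-pgid, ...)` cleans up the whole tree. The caller still needs to `waitpid()` on the group leader to reap it.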
Jan 9 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/c6b685d005b61d767406ed41dc963e0a5cd8e327

commit c6b685d005b61d767406ed41dc963e0a5cd8e327
Author: Wez <wez@chromium.org>
Date: Tue Jan 09 17:34:49 2018

Implement or remove some process LaunchOptions under Fuchsia.

- Implement LaunchOptions::wait.
- Move |maximize_relimits| to exist conditional on OS_LINUX.
- Don't define |real_path| or |pre_exec_delegate| under Fuchsia.

Bug: 799268
Change-Id: I4ec2643fc16a72f99518e892e353be72e6329130
Reviewed-on: https://chromium-review.googlesource.com/852981
Commit-Queue: Wez <wez@chromium.org>
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Cr-Commit-Position: refs/heads/master@{#528031}

[modify] https://crrev.com/c6b685d005b61d767406ed41dc963e0a5cd8e327/base/process/launch.h
[modify] https://crrev.com/c6b685d005b61d767406ed41dc963e0a5cd8e327/base/process/launch_fuchsia.cc
Jan 10 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/660ff99a094be171290b0f1fe4a6a3e450ca55ef

commit 660ff99a094be171290b0f1fe4a6a3e450ca55ef
Author: Kevin Marshall <kmarshall@chromium.org>
Date: Wed Jan 10 01:59:59 2018

Roll Fuchsia SDK to 6b4cb32d100d2ecfaaa9642adfb0de451c5b9a69.

- Fixes argv[0] to report package-relative path for "main" binary.
- Fixes 'threads' utility not to hang if threads exit mid-dump.
- Adds tracing command & service to the SDK, for easier debugging.
- Fixes SSH to return valid exit codes from remote commands.
- Fixes "run" to correctly route program output via stdout/stderr.
- Fixes NET-354 (SSH leaving processes hanging after client disconnects.)

Bug: 707030, 799268, 793412, 798851, 778467
Change-Id: Ie3ab3fed54df1884089b57e1638883684de6836f
Reviewed-on: https://chromium-review.googlesource.com/857809
Commit-Queue: Wez <wez@chromium.org>
Reviewed-by: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#528206}

[modify] https://crrev.com/660ff99a094be171290b0f1fe4a6a3e450ca55ef/DEPS
Jan 10 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/26a004d0c98da273581c9d6953a35ea97874a204

commit 26a004d0c98da273581c9d6953a35ea97874a204
Author: Wez <wez@chromium.org>
Date: Wed Jan 10 08:27:08 2018

Dump list of all threads in timed-out/hung sub-processes.

Bug: 799268, 755282, 793412
Change-Id: I6737bbac53253205c6d31d32ce1a34a27e9ceee1
Reviewed-on: https://chromium-review.googlesource.com/853079
Commit-Queue: Wez <wez@chromium.org>
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Cr-Commit-Position: refs/heads/master@{#528261}

[modify] https://crrev.com/26a004d0c98da273581c9d6953a35ea97874a204/base/test/launcher/test_launcher.cc
Jan 11 2018
Verified that this works for a trivial faked "hung" test at ToT.
Jan 19 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/0503192a831de0fb2176f7a33efc90f85fbf1b9f

commit 0503192a831de0fb2176f7a33efc90f85fbf1b9f
Author: Wez <wez@chromium.org>
Date: Fri Jan 19 23:50:57 2018

Temporarily re-enable a load of flakey tests, to get some hang dumps.

Bug: 738275, 799268
Change-Id: I6a2c4b313879b7d690661fe95d4d84ed03315028
Reviewed-on: https://chromium-review.googlesource.com/876987
Reviewed-by: Scott Graham <scottmg@chromium.org>
Commit-Queue: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#530664}

[modify] https://crrev.com/0503192a831de0fb2176f7a33efc90f85fbf1b9f/testing/buildbot/filters/fuchsia.base_unittests.filter
Comment 1 by w...@chromium.org, Jan 5 2018