Issue 799268

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Jan 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Fuchsia
Pri: 1
Type: Bug




Log thread backtraces for timed-out test job processes before killing them.

Project Member Reported by w...@chromium.org, Jan 4 2018

Issue description

Individual test job processes still time out fairly regularly (in about 10% of base_unittests runs on the Fuchsia/x64 FYI bot, for example), and the behaviour looks suspiciously like a deadlock or hang rather than tests simply taking too long.

The TestLauncher keeps logging a line every 15s to indicate that it is waiting for output, so the system is not completely locked up. The TestLauncher could therefore log backtraces of all the threads in the job process once it has timed out, just before it is torn down.
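
The ordering proposed above (wait, then dump, then kill) can be sketched in portable C++. This is only an illustration under stated assumptions, not TestLauncher code: `TimeoutHooks`, `WaitForJobOrDump`, `dump_threads` and `kill_job` are hypothetical names, and the job is modelled as a `std::future` instead of a real child process.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>

// Hypothetical hooks. In a real launcher these would inspect and terminate
// the child job; here they are plain callbacks so the ordering is testable.
struct TimeoutHooks {
  std::function<void()> dump_threads;  // log diagnostics while the job is alive
  std::function<void()> kill_job;      // tear the job down afterwards
};

// Waits up to |timeout| for |job_done| to become ready.
// Returns true if the job finished in time. On timeout, runs the dump hook
// first and the kill hook second, then returns false.
inline bool WaitForJobOrDump(std::future<void>& job_done,
                             std::chrono::milliseconds timeout,
                             const TimeoutHooks& hooks) {
  assert(job_done.valid());
  if (job_done.wait_for(timeout) == std::future_status::ready)
    return true;
  hooks.dump_threads();  // must happen before the job is destroyed
  hooks.kill_job();
  return false;
}
```

The point of the sketch is the order of the two hooks: diagnostics are only useful if they are collected while the hung process still exists.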
 

Comment 1 by w...@chromium.org, Jan 5 2018

Status: Started (was: Assigned)

Comment 2 by w...@chromium.org, Jan 6 2018

This is desired to debug issue 793412 and issue 755282.

Comment 3 by bugdroid1@chromium.org, Jan 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/6915579f3e38e2be1c3763653ee0ca75624f171d

commit 6915579f3e38e2be1c3763653ee0ca75624f171d
Author: Wez <wez@chromium.org>
Date: Tue Jan 09 03:17:36 2018

Fix PreExecHook test to actually verify that the hook is run.

Previously the test would pass under platforms on which the
|LaunchOptions::pre_exec_delegate| parameter was ignored (e.g. Fuchsia).

(Also disables the test under Fuchsia)

Bug: 799268
Change-Id: I7a307010739faa3c1631897d24057014aebdfe84
Reviewed-on: https://chromium-review.googlesource.com/852980
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Commit-Queue: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#527886}
[modify] https://crrev.com/6915579f3e38e2be1c3763653ee0ca75624f171d/base/process/process_util_unittest.cc


Comment 4 by bugdroid1@chromium.org, Jan 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/dc9eb2b1266e19bbfd4b9c69ff288750fef4004c

commit dc9eb2b1266e19bbfd4b9c69ff288750fef4004c
Author: Wez <wez@chromium.org>
Date: Tue Jan 09 07:25:07 2018

Update TestLauncher to use Fuchsia jobs in place of POSIX process jobs.

TestLauncher was previously using the |LaunchOptions::new_process_group|
to request LaunchProcess() to isolate each test job into a separate
group, for easy process cleanup.

Since the |new_process_group| was not implemented in the Fuchsia
implementation of LaunchProcess, this had no effect besides causing
errors to be logged when we attempted to kill(-pid).

We remove |new_process_group| and update TestLauncher to use native
Fuchsia jobs to group test job processes.

Bug: 799268, 755282
Change-Id: Ia96cd77c5b4066d6da522cc7fe0e4e427229dac3
Reviewed-on: https://chromium-review.googlesource.com/852559
Commit-Queue: Wez <wez@chromium.org>
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Cr-Commit-Position: refs/heads/master@{#527925}
[modify] https://crrev.com/dc9eb2b1266e19bbfd4b9c69ff288750fef4004c/base/process/launch.h
[modify] https://crrev.com/dc9eb2b1266e19bbfd4b9c69ff288750fef4004c/base/test/launcher/test_launcher.cc
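
For context, the POSIX mechanism that the commit above says was never implemented on Fuchsia is the process group: a child is placed in its own group, and `kill()` with a negative pid signals every member of that group. The following is a minimal sketch using only standard POSIX calls. `SpawnInNewProcessGroup` and `KillGroupAndReap` are hypothetical helper names, not Chromium functions, and the Fuchsia job-based replacement is not shown.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that joins a new process group and then blocks forever.
// Returns the child's pid (which is also its pgid), or -1 if fork() failed.
inline pid_t SpawnInNewProcessGroup() {
  pid_t pid = fork();
  if (pid == 0) {
    setpgid(0, 0);  // child: become leader of a new group
    for (;;)
      pause();  // stand-in for a hung test process
  }
  if (pid > 0) {
    // Also set the group from the parent, so the group is guaranteed to
    // exist before the parent tries to signal it.
    setpgid(pid, pid);
  }
  return pid;
}

// Sends SIGKILL to every process in group |pgid| and reaps the leader.
// Returns the signal that terminated the leader, or -1 on error.
inline int KillGroupAndReap(pid_t pgid) {
  if (kill(-pgid, SIGKILL) != 0)  // negative pid targets the whole group
    return -1;
  int status = 0;
  if (waitpid(pgid, &status, 0) != pgid)
    return -1;
  return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

On a platform where the group is never created, the `kill(-pgid, ...)` call fails, which matches the logged errors described in the commit message.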


Comment 5 by bugdroid1@chromium.org, Jan 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c6b685d005b61d767406ed41dc963e0a5cd8e327

commit c6b685d005b61d767406ed41dc963e0a5cd8e327
Author: Wez <wez@chromium.org>
Date: Tue Jan 09 17:34:49 2018

Implement or remove some process LaunchOptions under Fuchsia.

- Implement LaunchOptions::wait.
- Move |maximize_rlimits| to exist conditional on OS_LINUX.
- Don't define |real_path| or |pre_exec_delegate| under Fuchsia.

Bug: 799268
Change-Id: I4ec2643fc16a72f99518e892e353be72e6329130
Reviewed-on: https://chromium-review.googlesource.com/852981
Commit-Queue: Wez <wez@chromium.org>
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Cr-Commit-Position: refs/heads/master@{#528031}
[modify] https://crrev.com/c6b685d005b61d767406ed41dc963e0a5cd8e327/base/process/launch.h
[modify] https://crrev.com/c6b685d005b61d767406ed41dc963e0a5cd8e327/base/process/launch_fuchsia.cc


Comment 6 by bugdroid1@chromium.org, Jan 10 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/660ff99a094be171290b0f1fe4a6a3e450ca55ef

commit 660ff99a094be171290b0f1fe4a6a3e450ca55ef
Author: Kevin Marshall <kmarshall@chromium.org>
Date: Wed Jan 10 01:59:59 2018

Roll Fuchsia SDK to 6b4cb32d100d2ecfaaa9642adfb0de451c5b9a69.

- Fixes argv[0] to report package-relative path for "main" binary.
- Fixes 'threads' utility not to hang if threads exit mid-dump.
- Adds tracing command & service to the SDK, for easier debugging.
- Fixes SSH to return valid exit codes from remote commands.
- Fixes "run" to correctly route program output via stdout/stderr.
- Fixes NET-354 (SSH leaving processes hanging after client disconnects.)

Bug: 707030, 799268, 793412, 798851, 778467
Change-Id: Ie3ab3fed54df1884089b57e1638883684de6836f
Reviewed-on: https://chromium-review.googlesource.com/857809
Commit-Queue: Wez <wez@chromium.org>
Reviewed-by: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#528206}
[modify] https://crrev.com/660ff99a094be171290b0f1fe4a6a3e450ca55ef/DEPS


Comment 7 by bugdroid1@chromium.org, Jan 10 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/26a004d0c98da273581c9d6953a35ea97874a204

commit 26a004d0c98da273581c9d6953a35ea97874a204
Author: Wez <wez@chromium.org>
Date: Wed Jan 10 08:27:08 2018

Dump list of all threads in timed-out/hung sub-processes.

Bug: 799268, 755282, 793412
Change-Id: I6737bbac53253205c6d31d32ce1a34a27e9ceee1
Reviewed-on: https://chromium-review.googlesource.com/853079
Commit-Queue: Wez <wez@chromium.org>
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Cr-Commit-Position: refs/heads/master@{#528261}
[modify] https://crrev.com/26a004d0c98da273581c9d6953a35ea97874a204/base/test/launcher/test_launcher.cc
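
The commit above targets Fuchsia and its implementation is not reproduced in this thread. As a rough Linux analogue of "list all threads in a sub-process", the thread IDs of a process can be read from `/proc/<pid>/task`. The helper name `ListThreadIds` is hypothetical, and the sketch assumes a Linux system with procfs mounted.

```cpp
#include <dirent.h>
#include <sys/types.h>
#include <unistd.h>

#include <algorithm>
#include <cstdlib>
#include <string>
#include <vector>

// Returns the sorted thread IDs of |pid|, read from /proc/<pid>/task.
// Linux-only. Returns an empty vector if the directory cannot be opened,
// for example because the process has already exited.
inline std::vector<pid_t> ListThreadIds(pid_t pid) {
  std::vector<pid_t> tids;
  const std::string task_dir = "/proc/" + std::to_string(pid) + "/task";
  DIR* dir = opendir(task_dir.c_str());
  if (!dir)
    return tids;
  while (dirent* entry = readdir(dir)) {
    char* end = nullptr;
    long tid = std::strtol(entry->d_name, &end, 10);
    // Skip "." and ".." and anything else that is not purely numeric.
    if (end != entry->d_name && *end == '\0' && tid > 0)
      tids.push_back(static_cast<pid_t>(tid));
  }
  closedir(dir);
  std::sort(tids.begin(), tids.end());
  return tids;
}
```

A launcher could log this list for a timed-out child before killing it. Threads may start or exit while the directory is being read, so the result is a best-effort snapshot.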

Comment 8 by w...@chromium.org, Jan 11 2018

Status: Fixed (was: Started)
Verified that this works at ToT, using a trivial test faked to hang.

Comment 9 by bugdroid1@chromium.org, Jan 19 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0503192a831de0fb2176f7a33efc90f85fbf1b9f

commit 0503192a831de0fb2176f7a33efc90f85fbf1b9f
Author: Wez <wez@chromium.org>
Date: Fri Jan 19 23:50:57 2018

Temporarily re-enable a load of flakey tests, to get some hang dumps.

Bug: 738275, 799268
Change-Id: I6a2c4b313879b7d690661fe95d4d84ed03315028
Reviewed-on: https://chromium-review.googlesource.com/876987
Reviewed-by: Scott Graham <scottmg@chromium.org>
Commit-Queue: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#530664}
[modify] https://crrev.com/0503192a831de0fb2176f7a33efc90f85fbf1b9f/testing/buildbot/filters/fuchsia.base_unittests.filter
